In a rare instance of collaboration among otherwise fierce rivals, Google, Yahoo and Microsoft said they would support a new web standard that will allow millions of publishers to remove duplicate pages from their websites. As a result, search engines would be able to make their search results more comprehensive.
"There is a lot of clutter on the web and with this, publishers will be able to clean up a lot of junk," said Matt Cutts, an engineer who heads Google's spam fighting efforts, the New York Times reported.
"I think it is going to gain traction pretty quickly," said Cutts.
The problem is this: many web publishers, especially those with large sites, such as e-commerce companies, have multiple URLs that all point to the same page. This confuses search engines, sometimes causing them to index the same page multiple times. As much as 20 percent of URLs on the web may be duplicates, according to some estimates.
Engineers at Google came up with a simple way for web publishers to indicate when a URL is a duplicate and, if so, which is the principal, or "canonical," URL that search engines should index. Yahoo and Microsoft, the No. 2 and No. 3 search engines, have agreed to support the same standard.
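In practice, the mechanism is a single tag that a publisher adds to the `<head>` of each duplicate page, pointing at the preferred URL. A minimal sketch (the example.com URLs here are hypothetical, chosen only to illustrate the pattern):

```html
<!-- Several URLs may serve the same product page, e.g.:
       http://www.example.com/product?id=42&sessionid=xyz
       http://www.example.com/product?id=42&ref=newsletter
     Each duplicate declares the one canonical URL in its <head>: -->
<head>
  <link rel="canonical" href="http://www.example.com/product?id=42" />
</head>
```

Search engines that support the standard treat this as a strong hint to consolidate indexing and ranking signals onto the canonical URL instead of splitting them across duplicates.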
"We are happy that everyone is going to support the same implementation," said Nathan Buggia, a lead program manager at Microsoft. "This is a clear benefit for publishers as it gives them an opportunity to get more exposure through search engines."
All of the search engines have developed more or less effective technologies of their own for detecting duplicates. The new standard, known as the Canonical Link Tag, should make it easier for both publishers and search engines to address the problem, NYT reported Thursday.
"It is an important step because all the search engines are coming out with it," said Priyank Garg, director of product management for web search at Yahoo.