Yahoo’s Link-Based Spam Detection Patent Application
Search Engine News, May 7th, 2006
In an effort to curb spam, TrustRank has been used as one of the ranking factors in search engines like Big G.
Now Yahoo is joining the game by using domain “trust” as a ranking factor.
From the latest patent abstract by Yahoo:
A computer implemented method of ranking search hits in a search result set. The computer-implemented method includes receiving a query from a user and generating a list of hits related to the query, where each of the hits has a relevance to the query, where the hits have one or more boosting linked documents pointing to the hits, and where the boosting linked documents affect the relevance of the hits to the query. The method associates a metric to each of at least a subset of the hits, the metric being representative of the number of boosting linked documents that point to each of at least a subset of the hits and which artificially inflate the relevance of the hits. The method then compares the metric, which is representative of the size of a spam farm pointing to the hit, with a threshold value, processes the list of hits to form a modified list based in part on the comparison, and transmits the modified list to the user.
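The comparison step in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Hit` class, the `spam_farm_size` field, the `rerank` function, and the `penalty` multiplier are hypothetical names, and the abstract does not say how the modified list is formed, so demoting flagged hits is a guess rather than Yahoo's actual method.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    url: str
    relevance: float        # relevance of the hit to the query
    spam_farm_size: float   # hypothetical metric: estimated number of boosting documents


def rerank(hits, threshold, penalty=0.5):
    """Return a modified list in which hits whose metric exceeds the threshold are demoted.

    Assumption: demotion multiplies relevance by `penalty`. The abstract only says
    the list is modified "based in part on the comparison".
    """
    scored = []
    for hit in hits:
        score = hit.relevance
        if hit.spam_farm_size > threshold:
            score *= penalty  # flagged as likely boosted by a spam farm
        scored.append((score, hit))
    # Highest adjusted score first
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [hit for _, hit in scored]


hits = [
    Hit("farm-boosted.example", 0.9, 500),
    Hit("honest.example", 0.6, 3),
    Hit("small.example", 0.5, 0),
]
modified = rerank(hits, threshold=100)
print([h.url for h in modified])
```

In this toy run the hit with the largest estimated spam farm starts at the top on raw relevance and drops to the bottom of the modified list.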
The patent provides some insight into how Yahoo would identify spam pages in search results, in conjunction with PageRank. The system separates reputable pages from spam pages by combining link-based scoring with input from human reviewers, who manually identify reputable seed pages.
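One well-known way to combine PageRank with a hand-picked seed set, and the idea behind TrustRank, is to let the random jump land only on seed pages, so that score flows outward from trusted sites. The sketch below is a minimal illustration of that idea, not the method claimed in the patent; the function name `biased_pagerank`, the toy graph, and the damping value of 0.85 are assumptions.

```python
def biased_pagerank(links, jump, damping=0.85, iterations=50):
    """Iterate PageRank with a custom random-jump distribution.

    links: dict mapping every page to the list of pages it links to
           (assumption: every link target also appears as a key).
    jump:  dict mapping pages to jump probabilities that sum to 1.
    """
    score = {page: jump.get(page, 0.0) for page in links}
    for _ in range(iterations):
        new_score = {page: (1 - damping) * jump.get(page, 0.0) for page in links}
        for page, targets in links.items():
            if not targets:
                continue  # simplification: score on pages with no outlinks is dropped
            share = damping * score[page] / len(targets)
            for target in targets:
                new_score[target] += share
        score = new_score
    return score


# Toy graph: two reputable pages link to each other, two spam pages link to each other.
links = {
    "seed": ["good"],
    "good": ["seed"],
    "spam1": ["spam2"],
    "spam2": ["spam1"],
}

# Ordinary PageRank: the random jump is uniform over all pages.
uniform = biased_pagerank(links, {page: 0.25 for page in links})

# Seed-biased variant: the random jump lands only on the human-reviewed seed page.
trust = biased_pagerank(links, {"seed": 1.0})

print(uniform["spam1"], trust["spam1"], trust["good"])
```

With a uniform jump the isolated spam pair still collects score, while the seed-biased run gives it none because no trusted page links into it.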
While link “trustability” is a fairly good indicator of overall site quality, it is still flawed, as the case of Expedia subdomains shows.
link spam, search engine patents








Update: Big G has cleaned up its results for Buy Viagra, Buy Cialis, and other keywords since then. The results seen in the screen capture no longer show up in the search results.