A Detailed Analysis of Google's Filters


Google SandBox

In early 2004, a mysterious new SEO concept appeared: Google SandBox, or the Google "sandbox". The name was given to a new Google spam filter aimed at keeping newly created sites out of the search results.


The SandBox filter manifests itself in new sites being absent from the search results for all phrases. This happens even when promotion is done correctly (without resorting to spam techniques) and the site has unique, high-quality content.

To date, SandBox affects only the English-language segment; sites in Russian and other languages are not subject to this filter. However, the filter may well expand its reach.

Presumably the goal of the SandBox filter is to remove spam sites from the results: indeed, no spammer will wait months for their site to show up in the search results. At the same time, though, a large number of high-quality new sites suffer as well.

Reliable information is still lacking, so we do not know exactly what the SandBox filter is.

There are a number of assumptions, derived from experience, listed below:

  1. SandBox is a filter on young sites. A new site falls into the "sandbox" and remains there for an indeterminate time, until the search engine transfers it to the "normal" category.
  2. SandBox is a filter on links placed to new sites. Note the fundamental difference from the previous assumption: the filter is applied not to the age of the site, but to the age of links to the site. In other words, Google has no complaints about the site itself, but it refuses to count external links pointing to it until about ten months have passed since those links appeared. Since external links are the main ranking factor, ignoring them is equivalent to the site being absent from the search results. Which of these assumptions is correct is hard to say; quite possibly both are.
  3. A site can be held in the "sandbox" from three months to a year or longer. It has been observed that sites leave the "sandbox" en masse. In other words, the "sandbox" term is set not for each site individually but for whole groups (sites created within a certain time range fall into one group). The filter is later lifted for the whole group at once, so the sites in a group spend different amounts of time in the "sand".

How to tell whether a site is in Google's "sandbox"

Typical symptoms that should raise suspicion:

  1. Your site is regularly visited by the search robot and is indexed by Google.
  2. Your site has been assigned PageRank, and the search engine finds and correctly displays external links to it.
  3. A search on the site's address (www.site.com) returns correct results, with the correct title and description (snippet) of the resource.
  4. Your site is found by searches for unique and rare phrases contained in the text of its pages.
  5. Your site does not appear in the first thousand results for any other queries, even those it was originally created for. Occasionally there are exceptions and the site sits at positions 500-600 for several queries, which, of course, changes nothing in essence.
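
The five symptoms above can be folded into a simple check. This is only an illustrative sketch: the function name and boolean inputs are my own, not any official Google diagnostic.

```python
def looks_sandboxed(indexed: bool,
                    has_pagerank: bool,
                    found_for_unique_phrases: bool,
                    ranks_for_target_queries: bool) -> bool:
    """Classic "sandbox" pattern: the site is indexed, has PageRank,
    and is findable by unique phrases from its own pages, yet is
    absent from the results for its actual target queries."""
    return (indexed
            and has_pagerank
            and found_for_unique_phrases
            and not ranks_for_target_queries)

# A new site that shows symptoms 1-4 but fails on symptom 5:
print(looks_sandboxed(True, True, True, False))  # True
```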

There is virtually no way to bypass the filter. There are a number of suggestions as to what might be done, but they remain no more than suggestions and are barely practicable for the average webmaster. The basic method is to keep working on the site (on-page optimization of the source code still matters most) and, of course, to wait until the filter is lifted.

When the filter is lifted, rankings jump by roughly 400-500 positions or more.

Google LocalRank

On 25 February 2003 Google patented a new page-ranking algorithm called LocalRank. It is based on a great idea: rank pages not by their global link citation, but by their citation within the group of pages that are thematically related to the query.

The LocalRank algorithm is not used in practice (at least not in the form described in the patent), but the patent contains a number of great ideas that every optimizer should read. Ranking boosts based on the topicality of referring pages are used by almost every search engine, though apparently via other algorithms. Neither Google nor Yandex reveals the details of its search algorithms; we only know that they apply these principles in their work. Therefore only a careful study of the patents will give the general ideas of how this can be implemented in practice.

While reading this post, keep in mind that it is theoretical information, not a practical guide to action.

The basic idea of the LocalRank algorithm is expressed in three points:

  1. An algorithm selects a certain number of documents relevant to the search query (denote this number N). The documents are initially sorted by some criterion (PageRank, relevance, some other criterion, or a combination of them). Denote the numerical expression of this criterion OldScore.
  2. Each of the N pages goes through a new ranking procedure, as a result of which it receives a new rank. Denote it LocalScore.
  3. In the third step, the values OldScore and LocalScore are multiplied, yielding NewScore, the new value by which the pages are finally ranked.

The key part of the algorithm is the new ranking procedure, as a result of which each page is assigned a new rank, LocalScore. Let us describe this procedure in detail.

0. Using some ranking algorithm, select the N pages matching the search query. The new ranking algorithm works only with these N pages. Each page in this group has a rank OldScore.

1. For the page whose LocalScore is being calculated, select from N all pages that have external links to it. Denote this set of pages M. Notably, M does not include pages from the same host as the page in question (host filtering is done by IP address), nor pages that are mirrors of the original.

2. Partition the set M into subsets Li. A subset combines pages sharing any of the following attributes:

  1. Belonging to the same or a similar host. For example, pages whose IP addresses have identical first three octets fall into the same group. In other words, pages whose IP addresses belong to the range xxx.xxx.xxx.0 - xxx.xxx.xxx.255 belong to the same group.
  2. Pages with identical or similar content (mirrors).
  3. Pages of the same site (domain).
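
The host grouping in the first attribute amounts to bucketing pages by /24 network, and can be sketched in a few lines. The function name and sample data below are illustrative.

```python
from collections import defaultdict

def group_by_subnet(pages):
    """Group (url, ip) pairs whose IP addresses share the first three
    octets, i.e. fall in the same xxx.xxx.xxx.0-255 range."""
    groups = defaultdict(list)
    for url, ip in pages:
        prefix = ".".join(ip.split(".")[:3])
        groups[prefix].append(url)
    return dict(groups)

pages = [("a.com/p1", "195.10.2.15"),
         ("b.com/p2", "195.10.2.200"),  # same /24 as a.com/p1
         ("c.com/p3", "82.146.33.7")]
print(group_by_subnet(pages))
# {'195.10.2': ['a.com/p1', 'b.com/p2'], '82.146.33': ['c.com/p3']}
```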

3. Each page in every set Li has a rank (OldScore). From each set, the single page with the largest OldScore is selected; the rest are excluded from consideration. This yields a certain set K of pages linking to the page in question.

4. The pages in the set K are sorted by OldScore, after which only the first k pages remain in K (k is some predefined number); the rest are excluded from consideration.

5. In the fifth step LocalScore is computed: the OldScore values of the remaining k pages, each raised to the power m, are summed. (The formula itself did not survive here; reconstructed from the surrounding description, it has the form:)

LocalScore(i) = OldScore(1)^m + OldScore(2)^m + ... + OldScore(k)^m

Here m is a given parameter that can vary from 1 to 3 (unfortunately, the patent describing the algorithm gives no detailed description of this parameter).
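
Steps 3-5 can be sketched as a single function. This is an assumption-laden illustration of the description above, not Google's implementation: the subset grouping of step 2 is reduced to a precomputed key per backlink, and the default values of k and m are my own.

```python
def local_score(backlinks, k=10, m=2):
    """Compute LocalScore for one target page.

    backlinks: list of (group_key, old_score) pairs, one per page in N
    that links to the target; group_key stands in for the host/mirror/
    domain grouping of step 2 (each distinct key is one subset Li).
    """
    # Step 3: keep only the highest-OldScore page from each subset Li.
    best_per_group = {}
    for group, old_score in backlinks:
        if old_score > best_per_group.get(group, float("-inf")):
            best_per_group[group] = old_score
    # Step 4: sort by OldScore and keep only the top k pages.
    top_k = sorted(best_per_group.values(), reverse=True)[:k]
    # Step 5: sum the m-th powers of the remaining OldScore values.
    return sum(score ** m for score in top_k)
```

With m = 1 this is simply the sum of the best OldScore per group; larger m weights strongly-cited backers more heavily.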

Finally, once LocalScore has been calculated for each of the N pages, the NewScore values are calculated and the pages are re-sorted by this new criterion. NewScore is calculated by the formula:

NewScore (i) = (a + LocalScore (i) / MaxLS) * (b + OldScore (i) / MaxOS)

i - the page for which the new rank is calculated;
a and b - certain numbers (the patent gives no further information about these parameters);
MaxLS - the maximum calculated LocalScore value;
MaxOS - the maximum OldScore value.
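
The final combination follows directly from the formula. In the sketch below the values chosen for a and b (0.5 each) are arbitrary, since the patent leaves them unspecified.

```python
def new_scores(local_scores, old_scores, a=0.5, b=0.5):
    """Combine per-page scores into NewScore using the formula
    NewScore(i) = (a + LocalScore(i)/MaxLS) * (b + OldScore(i)/MaxOS)."""
    max_ls = max(local_scores.values())  # MaxLS
    max_os = max(old_scores.values())    # MaxOS
    return {page: (a + local_scores[page] / max_ls)
                  * (b + old_scores[page] / max_os)
            for page in old_scores}
```

Normalizing by MaxLS and MaxOS puts both factors on the same 0-1 scale, so neither score dominates purely because of its magnitude.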

Let me step away from the math and restate the above in plain language. :)

At the initial stage, a certain number of pages matching the query is selected. This is done by algorithms that do not consider the topicality of links (for example, generalized link popularity and relevance).

Once the group of pages is determined, the local link popularity of each is calculated. All the pages are in one way or another related to the topic of the search query and therefore share a similar theme. By analyzing the links within the selected group (and ignoring all other pages on the Internet), we obtain their local, or thematic, link popularity.

After these steps we have OldScore (the page's rating based on relevance, overall link popularity, and other factors) and LocalScore (the page's rating among thematically related pages). The final ranking is based on a combination of these two factors.

Last modified: 10/08/2013 at 07:57
Published: Friday, January 2, 2009 at 14:15

