The Google Rating Guidelines first leaked by Pot Pie Girl are absolutely worth a half day of reading for any SEO. The document details how manual raters rate sites.
It is important to remember that these are what manual raters are asked to look for. Machine learning algorithms like Panda are completely separate, but both manual raters and the Panda classifier are probably working toward roughly the same goals.
Raters are given a query and a location, then asked to rate pages for that query and location based on the presumed user intent. Each site gets one of the following ratings: Vital, Useful, Relevant, Slightly Relevant, Off-Topic or Useless, and Unrateable.
Sites can also be flagged as Spam, Pornography and Malicious.
Raters are expected to evaluate local intent based on the query location and query itself.
Based on screenshots within the document, a number of raters will rate a particular query/page and they are supported by administrators and moderators.
Here are some highlights.
Six Months From Now, Lists Will Be a Spam Signal
The document identifies a class of queries called “queries that ask for a list.” So go spam it to death.
After typing a query, the search engine user sees a result page. You can think of the results on the result page as a list. Sometimes, the best results for “queries that ask for a list” are the best individual examples from that list. The page of search results itself is a nice list for users.
A landing page that provides links to many good individual results can also be very helpful to users.
“Queries that ask for a list” may be typed in singular or plural form. For example, the query may be [bank], English (US) or [banks], English (US)
The document lists credit cards, banks, bikes, airlines, hotels and London boutiques as examples of queries that ask for a list.
So now everyone will be running to create lists for this sort of query, which will quickly make lists a negative quality signal. That will force Google to focus on providing the list in the search results page itself rather than ranking pages with comprehensive lists.
The Relationship between Ratings and Spam
Spam flags do not depend on a relationship between the query and the landing page. A page should get a Spam flag if it is created using deceptive techniques – no matter what the query is or how helpful the page might be…
In some specific cases, it is also possible for a page to receive a Vital rating, and also be assigned a Spam flag.
This isn’t really a big surprise, but it shows how big sites like JC Penney and BMW can spam Google and get away with it, because they also get Vital or Relevant ratings for a lot of queries.
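To make that independence concrete, here is a minimal sketch of how a rater's output could be modeled. The rating and flag names come from the guidelines as quoted above; the class names, the field layout and the example URL are my own assumptions, not anything the document specifies.

```python
from dataclasses import dataclass, field
from enum import Enum


class Rating(Enum):
    VITAL = "Vital"
    USEFUL = "Useful"
    RELEVANT = "Relevant"
    SLIGHTLY_RELEVANT = "Slightly Relevant"
    OFF_TOPIC_OR_USELESS = "Off-Topic or Useless"
    UNRATEABLE = "Unrateable"


class Flag(Enum):
    SPAM = "Spam"
    PORNOGRAPHY = "Pornography"
    MALICIOUS = "Malicious"


@dataclass
class PageAssessment:
    """Hypothetical record: flags belong to the page, ratings to (query, location)."""
    url: str
    flags: set = field(default_factory=set)
    ratings: dict = field(default_factory=dict)  # (query, location) -> Rating


# A page can be Vital for one query and still carry a Spam flag.
page = PageAssessment("https://example.com/")
page.ratings[("example brand", "US")] = Rating.VITAL
page.flags.add(Flag.SPAM)
```

The point of the layout is that the flag lives on the page while the rating lives on the query, so one never overwrites the other.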
Not that I’ve ever felt the need to stuff a page full of keywords, but for clients that need some convincing…
“We ask you to assign a Spam flag if you think the number of keywords on the page is excessive and would be annoying and distracting to the real user.” p. 99
and, somewhat more interestingly
“URLs may also contain keyword stuffing. These URLs are computer-generated based on the words in the query and are often formatted with many hyphens (dashes) in them. They are a strong spam signal.” p. 99
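If you want a quick sanity check on your own URLs, a rough sketch like the one below counts hyphens in the host and path. The guidelines give no number for “many hyphens,” so the threshold of 4 is purely my assumption, as are the function names and example URLs.

```python
from urllib.parse import urlparse


def count_hyphens(url: str) -> int:
    """Count hyphens in the host and path of a URL (query string ignored)."""
    parts = urlparse(url)
    return parts.netloc.count("-") + parts.path.count("-")


def looks_keyword_stuffed(url: str, max_hyphens: int = 4) -> bool:
    """Return True when the URL has more hyphens than the assumed threshold."""
    return count_hyphens(url) > max_hyphens


stuffed = "https://example.com/cheap-credit-cards-best-credit-cards-apply-now"
normal = "https://example.com/about-us"
print(looks_keyword_stuffed(stuffed), looks_keyword_stuffed(normal))
```

This is only a crude proxy for what a human rater would notice at a glance. Plenty of legitimate URLs use a few hyphens, so treat a hit as a prompt to look closer, not a verdict.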
Redirects to domains under different registrants are a spam signal.
If you are changing domain names, make sure you use the same registrant for both domain names. This is a particular danger for freelancers who start under their own name and then create a proper business entity or businesses that change names.
What is Spam
Most of the answers shouldn’t come as a big surprise, but I found the technical definitions below quite instructive.
Feed-driven sites with PPC ads are a no-no.
“A page that just contains freely available feeds and PPC ads, and was created just to make money, is spam.” p. 102
And raters are asked to watch for template-driven sites that use keyword suggestion tools to generate related pages.
Some websites use templates to mass-reproduce webpages automatically. The content is usually copied from sources that provide such content. You will learn to recognize templates, which usually follow a generic format or pattern. Look for slight keyword variations that suggest automated use of a keyword suggestion tool. p. 103
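The “slight keyword variations” pattern is easy to spot in your own page titles. Below is a minimal sketch that pairs up titles differing by exactly one word. This is my own simplification of what a rater does by eye, and the function names and sample titles are hypothetical.

```python
from itertools import combinations


def differs_by_one_word(a: str, b: str) -> bool:
    """True when two titles have the same word count and exactly one word differs."""
    words_a, words_b = a.lower().split(), b.lower().split()
    if len(words_a) != len(words_b):
        return False
    return sum(x != y for x, y in zip(words_a, words_b)) == 1


def near_duplicate_pairs(titles):
    """Return every pair of titles that looks like one template with a swapped keyword."""
    return [(a, b) for a, b in combinations(titles, 2) if differs_by_one_word(a, b)]


titles = [
    "Cheap hotels in Boston",
    "Cheap hotels in Denver",
    "Cheap hotels in Austin",
    "About our company",
]
pairs = near_duplicate_pairs(titles)
print(len(pairs))
```

A site with hundreds of such pairs is not automatically spam, since store locators and city pages look like this too, but it is the sort of footprint the quoted passage tells raters to look for.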
Not all Affiliates are Thin
I know a lot of people are struggling with Google’s definition of “Thin Affiliate.” So, straight from the behemoth’s mouth…
Some affiliates are created to help users. Anyone can become an “affiliate” of merchant sites such as Amazon and link to Amazon products. Webmasters may do this to show products they like or to help users find a good deal.
For example, if the affiliate offers price comparison functionality, or displays product reviews, recipes, lyrics, etc., it is usually not a thin affiliate, and, therefore, not spam. Some websites that offer price comparisons or other helpful shopping features, in addition to the affiliate link p. 106.
The lesson here is to be a good website first and an affiliate second. I know, it’s not a great revelation, but it’s worth repeating.
Unique Content is King, Or at Least it is not Spam
Some webpages with content are created just for the purpose of putting ads on them; writers are paid by spammers to create articles on a wide range of topics. Often the articles are very generic and don’t provide a lot of good information, but they are original. You won’t find the articles on another website. Although you may be convinced that the intent is to deceive, if the content makes sense and appears to be original, you will not be able to assign a Spam flag to such pages. You will have to use your judgment. p. 107
Of course, unique but low-quality content can still get you a low rating; you just won’t get a Spam flag.
Ooh boy, I’m not sure how I feel about this. I know it will make some people very angry. But I don’t often see phishing sites in Google, so should I care about what they tell manual raters?
This landing page should make users (and raters) very suspicious and cautious. The spelling and grammar are bad and unprofessional, and the page feels “spammy”. What is most worrisome is that the page asks for the user’s bank password and pin number!
Even though we would not want to interact with the page, this type of phishing does not go against the Webspam Guidelines and the page should not be flagged as spam or malicious. p. 108
Conclusion, as in Google’s Conclusion, Not my Own
Remember to look at the page as a whole. Spam pages usually have some of these characteristics:
- PPC ads are usually very prominent on the page, and it is obvious that the page was created for them.
- If you do a text search, you will find that the content has been copied.
- If you visually remove all of the spam elements from the page (PPC ads and copied content), there is nothing of any value remaining.
Good pages usually have these characteristics:
- The page is well-organized. There may be ads on the page, but they are well identified and not distracting.
- If you do a text search, the original page is usually the first result displayed.
- The page will have value to the user. A good search engine would want the page in a set of search results.
I laughed at the “good pages have ads that are not distracting” part. So basically every single newspaper site in the world is a bad site because of those terrible flyover ads.
I found this document on another site, but it is no longer available there. If you are looking for the document, I recommend you take an excerpt from this post and search for it in quotes in either Bing or, if you love irony, Google. Please don’t ask me for the document.
Photo Credit: Brent Moore via Flickr