Query Forensics: Choosing the Right Synonyms for Long-Tail Key Phrases

Query expansion via synonyms provides clues to the highest-volume search terms.

Written by Damon on July 7, 2011

In the previous query forensics post on how Google handles synonyms, we discovered that, under the right conditions, a page can rank for a synonym that does not even appear in its text.

The great news is that you can see in Google which queries are returning synonyms and use this information to find synonyms that can rank you for the maximum number of key phrases.

Have a look at these results for “configure remote desktop” in Google.

Google SERPs: Configure Remote Desktop

The first thing to look at is the number of results returned (labeled “1” in the image). If Google returns results in the millions for a long-tail search term, there is a good chance you are looking at search results that have been expanded through synonyms.

The next thing to do is to look for highlighted synonyms (labeled “2” in the image) in the SERPs. While the example just goes to page 1, you will sometimes need to go deeper than the first page to find query expansion via synonyms.

For this particular query, “enable,” “setup,” and “set up” are all synonyms that Google is using. Because Google is highlighting synonyms and returning millions of results, we can conclude that this query is being expanded with synonyms. For more competitive queries, you won’t see any query expansion.
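The two signals above (a result count in the millions, plus highlighted terms that are not in the query) can be written down as a quick check. This is only a sketch of the manual heuristic, not anything Google exposes: the function names, the one-million cutoff, and the sample result count are assumptions for illustration, and the highlighted terms still have to be read off the SERPs by hand. It also does not tell synonyms apart from stemming variants.

```python
MILLION = 1_000_000  # assumed cutoff for "results in the millions"


def extra_highlights(query, highlighted):
    """Return highlighted terms that are not made up of the query's own words."""
    query_words = set(query.lower().split())
    extras = []
    for term in highlighted:
        words = term.lower().split()
        if not all(word in query_words for word in words):
            extras.append(term.lower())
    return extras


def looks_expanded(query, result_count, highlighted):
    """True when both signals are present: a big count and extra highlighted terms."""
    return result_count >= MILLION and bool(extra_highlights(query, highlighted))


# Highlighted terms noted by hand from the SERPs for the example query.
highlighted = ["configure", "enable", "setup", "set up", "remote desktop"]

# The result count here is a placeholder; the post does not give the exact figure.
print(extra_highlights("configure remote desktop", highlighted))
print(looks_expanded("configure remote desktop", 3_000_000, highlighted))
```

Anything the first function returns is a synonym candidate worth testing as a query of its own.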

The optional third step is to check the Google cache of these pages to see if the synonym candidate appears anywhere on the page. The reason we check the Google cache rather than the actual page is that if a page is ranking for a term because of backlinks, then Google shows a message at the top of the cache page.

Google Cache Message

In this particular example, the third document is ranking for “configure” because of anchor text from links pointing to the page. “Configure” does not appear anywhere on the page hosting the YouTube video, and a number of the results at the bottom of page 1 only have the term way down in the text.

All results in this set have “configure” either somewhere on the page or in links pointing to the page, except for the video search results, which would suggest that synonyms are even more powerful when optimizing for video.

The next step is to swap out the synonyms in the query to see if we can find a key phrase that is getting enough volume not to need query expansion. What you are looking for is key phrases that are not being expanded by synonyms.

Key Phrase               Documents     Query Expansion
enable remote desktop    4,670,000     yes
set up remote desktop    14,500,000    no
setup remote desktop     6,120,000     yes

Based on this data, “set up remote desktop” is the most competitive term, as it returns the highest number of pages without query expansion. While it would be advisable to have “configure,” “setup” and “enable” appear on the page, you can rank for these terms with links.
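The selection rule used here (drop the phrases that show query expansion, then take the one with the most documents) is simple enough to sketch in a few lines. The `Candidate` type and `pick_target` function are hypothetical names for illustration; the numbers are the ones from the table above, and the expansion flag is still something you judge by eye from the SERPs.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    phrase: str
    documents: int   # result count reported by Google
    expanded: bool   # did the SERPs show query expansion via synonyms?


def pick_target(candidates):
    """Return the unexpanded phrase with the most documents, or None if all are expanded."""
    unexpanded = [c for c in candidates if not c.expanded]
    if not unexpanded:
        return None
    return max(unexpanded, key=lambda c: c.documents)


candidates = [
    Candidate("enable remote desktop", 4_670_000, True),
    Candidate("set up remote desktop", 14_500_000, False),
    Candidate("setup remote desktop", 6_120_000, True),
]

target = pick_target(candidates)
print(target.phrase)  # set up remote desktop
```

If every candidate shows expansion, the function returns None, which is a hint to try more synonyms before settling on a target.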

One thing to note, “set up,” “setting up” and “setup” are all highlighted in the “set up remote desktop” and “setup remote desktop” data sets. This is not query expansion via synonyms, but rather an example of query expansion via stemming.
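To make the distinction concrete, here is a toy check that sorts highlighted terms into stemming variants and everything else. It is emphatically not how Google stems: `crude_stem` is a made-up rule that only strips a trailing “ing” and a doubled consonant, and joining the words without spaces is an assumption so that “set up” and “setup” compare equal. Anything labeled “synonym” below really just means “not a spelling or stem variant of the query term.”

```python
def crude_stem(word):
    """Toy stemmer: strip a trailing "ing", then undo a doubled final consonant."""
    w = word.lower()
    if w.endswith("ing") and len(w) > 5:
        w = w[:-3]
        if len(w) >= 2 and w[-1] == w[-2]:
            w = w[:-1]
    return w


def normalize(term):
    """Stem each word and join without spaces, so "set up" and "setup" match."""
    return "".join(crude_stem(word) for word in term.split())


def classify(query_term, highlighted):
    """Label each highlighted term as a stemming variant of query_term or a synonym."""
    base = normalize(query_term)
    return {
        term: "stemming" if normalize(term) == base else "synonym"
        for term in highlighted
    }


# Terms highlighted for "set up remote desktop": all variants of the query term.
print(classify("set up", ["set up", "setting up", "setup"]))

# Terms highlighted for "configure remote desktop": none share a stem with "configure".
print(classify("configure", ["enable", "setup", "set up"]))
```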

There are a few interesting points with this data set. The first is that in the “set up remote desktop” SERPs, “set” always appears adjacent to “up” in some form or another.

That adjacency can’t hold for every matching document, though. If “set up” and “setup” were treated as truly equal, the two queries would return roughly the same number of documents, but “set up” returns 14,500,000 while “setup” returns fewer than half that number. The likely reason is that documents where “set” and “up” appear separately are also included in the “set up remote desktop” count; however, Google doesn’t show any results past 1,000, so it is impossible to check with this query.

One really interesting query that shows how awesome Google is: “set remote desktop up,” which properly stems with “set up,” “setting up” and “setup.”

As SEOs, we should note that “set remote desktop up” doesn’t appear as an exact match anywhere in the search results, or at least not in the top dozen or so pages. So not only is optimizing for misspelled words no longer effective, but optimizing for bad grammar doesn’t work anymore either.

In the previous Query Forensics post on synonyms, every key phrase in the data set had the same number of keywords, so it better illustrates the pattern: higher-volume key phrases return fewer results because the lower-volume key phrases are being expanded through synonyms.

Key Phrase                    Documents     Query Expansion
cycling trails netherlands    60,300,000    yes
cycling routes netherlands    136,000       no
cycling paths netherlands     5,310,000     yes

In this data set, “cycling routes netherlands” gets the highest volume of searches, but it returns the fewest documents. So the best target in this set is “cycling routes netherlands,” while “trails” and “paths” can either be secondary on-page targets or appear in backlinks.
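A quick bit of arithmetic shows how much the expanded queries inflate the document count relative to the one unexpanded phrase. The function name below is hypothetical, and the sketch assumes the set contains exactly one phrase without query expansion, which is true of this table but not of every set.

```python
# (documents, shows query expansion) for each phrase, taken from the table above.
counts = {
    "cycling trails netherlands": (60_300_000, True),
    "cycling routes netherlands": (136_000, False),
    "cycling paths netherlands": (5_310_000, True),
}


def inflation_vs_baseline(counts):
    """Divide each expanded phrase's count by the single unexpanded phrase's count."""
    baseline = [docs for docs, expanded in counts.values() if not expanded]
    if len(baseline) != 1:
        raise ValueError("expected exactly one unexpanded phrase")
    return {
        phrase: docs / baseline[0]
        for phrase, (docs, expanded) in counts.items()
        if expanded
    }


ratios = inflation_vs_baseline(counts)
for phrase, ratio in ratios.items():
    print(f"{phrase}: about {ratio:.0f}x the unexpanded count")
```

Ratios this large are a reminder that a raw document count means little until you know whether the query was expanded.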

Key Takeaways

The technique doesn’t work for every query and in many cases I can’t figure out why one query returns more documents than another. However, for sufficiently long-tail and low-volume queries, the highest-volume key phrase is usually the one that returns the most documents while not showing signs of query expansion via synonyms.

Google can detect and fix poorly formed grammar, so a page that exactly matches the searcher’s poorly formed query will not be returned just because it contains the same bad grammar.

Pages can rank via synonyms if the SERP is showing query expansion via synonyms, but you will probably need links with that synonym. It is still better to have the synonym present on the page.

