Is Google Analytics Really Under-Reporting Twitter Traffic by 500%?

Written by Damon on July 18, 2009
Twitter Clients

Danny Sullivan has a flawed piece today titled “Is Twitter Sending You 500% To 1600% More Traffic Than You Might Think?” The sensational headline is no doubt good for traffic, and his basic information and method are sound, but the numbers are way off because he has a minuscule sample size.

Google Analytics, and other analytics tools, misreport Twitter traffic because many Twitter users use desktop clients that don’t send any referrer data. When these visitors open a link from their Twitter client, they get reported as direct traffic.
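To make the mechanism concrete, here is a minimal Python sketch of how a referrer-based report might bucket a single visit. The function name and bucket labels are my own illustration, not anything taken from Google Analytics, and real tools apply many more rules than this.

```python
from urllib.parse import urlparse

def classify_visit(referrer):
    """Bucket one visit by its referrer string (hypothetical helper)."""
    if not referrer:
        # No referrer at all, which is what a desktop client produces
        return "direct"
    host = (urlparse(referrer).hostname or "").lower()
    if host == "twitter.com" or host.endswith(".twitter.com"):
        return "twitter"
    return "other referral"

print(classify_visit("http://twitter.com/someuser"))  # twitter
print(classify_visit(""))                             # direct
```

A click from the Twitter web site carries a twitter.com referrer, while the same link opened from a desktop client arrives with nothing and lands in the direct bucket.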

Amongst the analytics community there is a debate over exactly how much traffic gets misreported. Tweet Stats shows 52% (at the time of writing) of visits coming from the web while TwitStats shows just 61% (at the time of writing).

Danny Sullivan’s Numbers

Danny does some testing to see how much direct traffic in Google Analytics actually comes from Twitter.

His method is to tweet a link to a post tagged with tracking parameters. He then compares his properly-reported tagged links against his Google Analytics referrer data and his Bit.ly stats against his log-files to get his numbers.
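For anyone who wants to repeat the test, tagging a link might look something like the sketch below. Danny's piece, as summarized here, doesn't say which parameters he used, so I'm assuming the standard Google Analytics campaign parameters (utm_source, utm_medium, utm_campaign). The URL and the values are placeholders.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

def tag_link(url, source, medium, campaign):
    """Append campaign tracking parameters to a URL, keeping any existing query."""
    parts = urlparse(url)
    query = parse_qsl(parts.query)
    query += [
        ("utm_source", source),
        ("utm_medium", medium),
        ("utm_campaign", campaign),
    ]
    return urlunparse(parts._replace(query=urlencode(query)))

# Placeholder URL and values, for illustration only
print(tag_link("http://example.com/post", "twitter", "social", "tweet-test"))
# http://example.com/post?utm_source=twitter&utm_medium=social&utm_campaign=tweet-test
```

Because the tag travels in the URL itself, a visit from a desktop client can still be attributed to the tweet even though it arrives without a referrer.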

His final numbers?

Google records 9 visits tagged with the tracking parameters, but just 2 of those come from Twitter (as in the web site), thus under-reporting by 450% (he rounds it to 500%; I’m cool with that).

Bit.ly records 58 clicks, but there are only 32 tagged visits in his logs. The unaccounted Bit.ly clicks are caused by URL-lengthener plugins that request the Bit.ly URL so that someone can see where the shortened URL goes before clicking, but don’t follow the redirect. (Bit.ly deserves props for being very up-front about the gaps in their data.)

Log-files record 32 visits tagged with tracking parameters, but just 2 of those come from Twitter (according to Google Analytics) thus under-reporting by 1600%.
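The percentages above are the ratio of total tagged visits to the visits credited to Twitter, expressed as a percentage. Here is a quick sketch of that arithmetic; the function name is mine.

```python
def reporting_ratio(total_tagged, credited_to_twitter):
    """Total tagged visits as a percentage of the visits credited to Twitter."""
    return 100 * total_tagged / credited_to_twitter

print(reporting_ratio(9, 2))   # 450.0  (Google Analytics tagged visits)
print(reporting_ratio(32, 2))  # 1600.0 (log-file tagged visits)

# Bit.ly clicks with no matching tagged visit in the logs
print(58 - 32)                 # 26
```

Strictly speaking, 9 actual visits against 2 reported is 4.5 times as many, which is where the 450% figure comes from.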

Google, in response to Danny’s questions about the differences, says that they are working on some issues caused by mobile devices.

Reconciling GA’s and Bit.ly’s Numbers

I give Google’s numbers more credit than Bit.ly’s.

I get notified every time there is a 404 on one of my sites. I’ve spent a lot of time investigating, and blocking, strange requests. Digging through log files and discovering all sorts of strangeness has given me a great appreciation for how much crap Google Analytics blocks out.

There are a ton of bots out there, both malicious and innocent, that can quickly inflate your stats. GA catches most of them. As a result, I’m inclined to believe the GA numbers whenever there is a big discrepancy.

Even with under-reporting caused by issues with mobile devices, I believe Google’s numbers are closer to the actual numbers than Bit.ly’s.

The Numbers in Greater Context

Danny’s 500% is in line with TwitStats’s numbers, and with what I would have guessed, because I can’t imagine not using a desktop client. But I’ve got some pretty good data, from a much larger sample than Danny’s, that aligns closely with Tweet Stats’s 52% web visits.

Soon after launching this site, I had a moderately influential Twitterer tweet my post on Involvement Device CAPTCHAs (thanks, Naomi).

At the time I was getting very little traffic. Out of the traffic to that post on that particular day (shown below), only 2 pageviews entered from elsewhere on the site and there was no search traffic to that page, so the data is relatively unpolluted.

Google Analytics Twitter Stats

The BeTwittered, Netvibes, and TwitterGadget traffic are all functionally Twitter traffic (bonus points to anyone who writes a filter to group all of this traffic together).
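I won't claim the bonus points myself, but the grouping logic could be sketched in Python along these lines. The hostnames are my guesses at how those sources appear in a referrer report, and the function names are hypothetical, so check them against your own data. A real Google Analytics filter would be configured inside GA rather than in code like this.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumed hostnames for the sources named above; verify against your own report.
TWITTER_LIKE_HOSTS = (
    "twitter.com",
    "betwittered.com",
    "netvibes.com",
    "twittergadget.com",
)

def source_group(referrer):
    """Map a referrer string to a grouped source label."""
    host = (urlparse(referrer or "").hostname or "").lower()
    for known in TWITTER_LIKE_HOSTS:
        if host == known or host.endswith("." + known):
            return "twitter (grouped)"
    return host or "direct"

def group_visits(referrers):
    """Count visits per grouped source."""
    return Counter(source_group(r) for r in referrers)

counts = group_visits([
    "http://twitter.com/someuser",
    "http://www.netvibes.com/",
    "",
    "http://example.com/page",
])
print(counts["twitter (grouped)"], counts["direct"])  # 2 1
```

Note that this only regroups visits that arrive with a referrer. Visits from desktop clients still land under direct.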

I got 45 Twitter visits with referral information (41 actual visits, plus 4 functionally Twitter visits) and no more than 51 direct visits that actually came from Twitter. Because I was just starting up, I was getting practically no direct traffic and I’d already filtered out my own visits, so nearly all, if not all, of these direct visits actually came from Twitter.

47% of my Twitter visits (45 of 96) come from the web and show up in my Google Analytics stats. By Danny’s measure, my Twitter stats are under-reported by about 200%.
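For anyone checking my math, here it is as a short Python sketch. It treats all 51 direct visits as Twitter traffic, which is the upper bound described above, so the real under-reporting can only be equal or lower.

```python
referred = 41 + 4   # visits with a Twitter or Twitter-like referrer
direct = 51         # upper bound on direct visits that came from Twitter
total = referred + direct

web_share = 100 * referred / total       # share visible as Twitter traffic
under_reporting = 100 * total / referred  # same ratio Danny uses

print(f"{web_share:.0f}%")        # 47%
print(f"{under_reporting:.0f}%")  # 213%
```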

It is impossible for this particular post to be under-reported by the 500% that Danny reports simply because all of my direct visits are already presumed (and likely) to be actually from Twitter.

Explaining the Differences

Why is there such a big difference between my numbers and Danny’s?

Here’s a hint. He’s only working with 2 Twitter visits in Google Analytics. If he had just one more visit reported in GA, his headline would be 300% to roughly 1000% instead of 500% to 1600%. Compared with my 45 Twitter visits, his sample size is just too small.
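A rough way to see how fragile a 2-visit sample is: credit one more visit to Twitter in each data set, holding the totals fixed (my assumption about how to model "one more visit"), and compare the ratios. The labels and function name below are mine.

```python
def ratio(total, credited):
    """Total visits as a percentage of visits credited to Twitter."""
    return 100 * total / credited

# (total visits, visits credited to Twitter), taken from the figures in this post
samples = {
    "Danny, GA tagged visits": (9, 2),
    "Danny, log-file tagged visits": (32, 2),
    "My post": (96, 45),
}

for name, (total, credited) in samples.items():
    before = ratio(total, credited)
    after = ratio(total, credited + 1)
    print(f"{name}: {before:.0f}% -> {after:.0f}%")
# Danny, GA tagged visits: 450% -> 300%
# Danny, log-file tagged visits: 1600% -> 1067%
# My post: 213% -> 209%
```

A single visit moves Danny's figures by a third, while it barely moves mine.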

My experiment was accidental, and Danny did a better job of setting his up by tagging links and comparing tagged values. Even so, my larger data set on a page with less noise makes my data better.

The 47% of visits coming from the web on my post is also a lot closer to the TwitStats and Tweet Stats numbers, which use an even larger sample size.

Photo Credit: The Next Web


2 replies to “Is Google Analytics Really Under-Reporting Twitter Traffic by 500%?”

  1. Traveller

    Oh, now it explains why I am getting bigger direct traffic on some days. Couldn’t understand from where it is coming. So it is twitter’s “fault” 🙂

  2. Andre Savoie

    Thanks for the well researched post. We just came across the same thing in our tracking codes and were starting to think we were crazy because the numbers didn’t add up. Our sample sizes are bigger, but the percentage difference is still in the range you described here.
