I think you can guess where I stand on the topic. However, I do hope that the ‘Do Not Track’ header becomes widely implemented. If it does, there will no longer be a grey area, because it’s an explicit user choice not to be tracked. That is a choice I will respect without question.
If you’re short on patience, want an easy solution, don’t like to read, or just don’t care how Google Analytics works, then this post isn’t for you. Maybe I should have warned you sooner. Please accept my apology for not doing so, and check out a service like this.
If You’re Adventurous, Keep Reading
I’m sure there are others. Those three are the first that come to mind. Leave others in the comments if you like.
How it’s Done
Disclaimer: I’m publishing this because I hope it will be useful. There’s a small chance I’m full of shit. I don’t think so, but it’s possible.
- Press Ctrl + F12 on the test page.
- Switch to the 3rd tab labeled “Network”.
- Press F5 to refresh the page. You’ll see calls for all the page resources required to render the page.
- Filter so you see just the images. Do this by clicking on the link in the bottom toolbar labeled “Images”. Look for a request called __utm.gif and click on it.
- By default, it will open to the “Preview” tab. Switch to the “Headers” tab.
You’re now looking at the actual GET request and parameters which transmit all the tracking information to Google Analytics. This is the default information sent to track a page view, and much of it isn’t required. Here’s a summary:
A GET request is made for http://www.google-analytics.com/__utm.gif. If you’re following along on the homepage of my site, you’ll see the following parameters and values. I’ve pasted them URL decoded (easier to read) and with comments describing what each represents.
utmwv:5.2.6 // Google Analytics code version
utms:2 // I'm honestly not sure, but it isn't required. I've heard it's used to count requests per session. I'd love to know if you know.
utmn:1509196652 // random number generated to make sure the gif isn't cached
utmhn:automateeverything.tumblr.com // the hostname
utmcs:UTF-8 // character encoding (not required)
utmsr:1680x1050 // size of the display (not required)
utmvp:1218x504 // size of the browser window (not required)
utmsc:24-bit // color depth (not required)
utmul:en-us // language (not required)
utmje:1 // Java enabled, 1=yes 0=no (not required)
utmfl:11.2 r202 // Flash version (not required)
utmdt:Automate Everything w/ Bash, Linux & Command Line // Page title tag (not required)
utmhid:405374938 // A random number used to link Analytics GIF requests with AdSense (not required)
utmr:- // referrer, -=none
utmp:/ // URI
utmac:UA-29271731-1 // Google Analytics Profile ID
utmcc:__utma=234084878.479851276.1333418536.1333418536.1333418536.1;+__utmz=234084878.1333418536.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); // Cookie information. More details on this below.
utmu:q~ // Again, I have no idea what it's used for, but it isn't required.
Here’s my source for what the parameter/value pairs represent. Scroll down about 75% of the way to where it says “The GIF Request Parameters”.
Before we can do that, there’s one parameter whose values I haven’t explained. Actually, I couldn’t find any explanation (from Google) of what the utmcc= parameter is used for, beyond the fact that it’s a representation of the stored cookies. Here it is again from the example above:
utmcc:__utma=234084878.479851276.1333418536.1333418536.1333418536.1;+__utmz=234084878.1333418536.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none);
After some Googling, I pieced together the following information about what each part means. This was the hardest part about getting all this to work. I’ve replaced the actual values from the example above with descriptive names in hopes that it will be easier to understand what each represents.
utmcc:__utma=DOMAINHASH.RANDOMNUM.TIMEFIRSTVISIT.TIMEPREVIOUSVISIT.CURRENTTIME.NUMBEROFSESSIONS;+__utmz=DOMAINHASH.CURRENTTIME.NUMBEROFSESSIONS.NUMBEROFSOURCES.utmcsr=SOURCE|utmccn=CAMPAIGN|utmcmd=MEDIUM;

DOMAINHASH // A static number that is unique for each site. Find it by inspecting the '__utma' parameter value on your site.
RANDOMNUM // A random number. You can generate this in any way you choose.
TIMEFIRSTVISIT // Time of first visit represented as seconds since 1970-01-01 00:00:00 UTC
TIMEPREVIOUSVISIT // Time of the visitor's previous visit represented as seconds since 1970-01-01 00:00:00 UTC
CURRENTTIME // Current time represented as seconds since 1970-01-01 00:00:00 UTC
NUMBEROFSESSIONS // The count of total sessions for this visitor.
NUMBEROFSOURCES // A count of the number of different sources the visitor has used to find your site.
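To make the template a little more concrete, here is a rough sketch of a shell function that fills it in. The function name build_utmcc and its argument order are made up for illustration. The defaults for source, campaign, and medium are simply the values from the example request, not anything Google documents.

```shell
#!/bin/sh
# Sketch only: build_utmcc is a hypothetical helper, not part of any GA tooling.
# Arguments: DOMAINHASH RANDOMNUM TIMEFIRSTVISIT TIMEPREVIOUSVISIT CURRENTTIME
#            NUMBEROFSESSIONS NUMBEROFSOURCES [SOURCE] [CAMPAIGN] [MEDIUM]
build_utmcc() {
  domain_hash=$1
  random_num=$2
  time_first=$3
  time_previous=$4
  time_current=$5
  sessions=$6
  sources=$7
  # Defaults copied from the example request (assumption: a direct visit)
  source=${8:-"(direct)"}
  campaign=${9:-"(direct)"}
  medium=${10:-"(none)"}

  # Fill in the __utma and __utmz parts exactly as laid out in the template
  printf '__utma=%s.%s.%s.%s.%s.%s;+__utmz=%s.%s.%s.%s.utmcsr=%s|utmccn=%s|utmcmd=%s;' \
    "$domain_hash" "$random_num" "$time_first" "$time_previous" "$time_current" "$sessions" \
    "$domain_hash" "$time_current" "$sessions" "$sources" \
    "$source" "$campaign" "$medium"
}

# Rebuild the cookie string from the example request
build_utmcc 234084878 479851276 1333418536 1333418536 1333418536 1 1
echo
```

Called with the numbers from my example, it should print the same utmcc value shown in the request. The result is still URL decoded, so it needs to be encoded before it goes into a real request.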
The date command provides a really easy way to get the current timestamp in seconds since 1970-01-01 00:00:00 UTC. Here it is:
date -u +%s
You can also generate a random number of proper length by running the following command:
< /dev/urandom tr -cd 0-9 | head -c 9
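If you want to reuse those two values, you can capture them in variables. This is just a sketch: the variable names are my own, and the LC_ALL=C prefix is an extra precaution I'm assuming is needed on systems where tr complains about raw bytes. Treating a brand new visitor as three identical timestamps and a session count of 1 is based on what the example request above looks like.

```shell
#!/bin/sh
# Current time in seconds since 1970-01-01 00:00:00 UTC
now=$(date -u +%s)

# Nine random digits; LC_ALL=C makes tr treat the input as plain bytes (assumption)
random_id=$(LC_ALL=C tr -cd 0-9 < /dev/urandom | head -c 9)

# First visit, modeled on the example request: all three timestamps match
# and the session count starts at 1
printf 'RANDOMNUM=%s TIMEFIRSTVISIT=%s TIMEPREVIOUSVISIT=%s CURRENTTIME=%s NUMBEROFSESSIONS=%s\n' \
  "$random_id" "$now" "$now" "$now" 1
```

Keep in mind the random number can start with a zero when generated this way. I don't know whether that matters to Google Analytics.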
Now that you have an adequate understanding of the minimum amount of information needed to recreate the GET request for __utm.gif, it’s completely up to you to figure out how you’re going to generate the values and keep track of session counts, source counts, detect referral sources, etc. If you’re using PHP on your site, you may want to have a look at Server Side Google Analytics. That isn’t an endorsement, since I don’t have any experience using it, but it was one of the tools I came across while doing my research.
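If you'd rather stay in the shell, here is a minimal sketch of how the request URL could be put together. It assumes the required set is simply every parameter I didn't mark "(not required)" in the list above: utmwv, utmn, utmhn, utmr, utmp, utmac and utmcc. The helper names urlencode and build_utm_url are made up, the encoder only handles plain ASCII, and example.com and UA-XXXXX-Y are placeholders.

```shell
#!/bin/sh
# Percent-encode everything except unreserved characters (ASCII input only)
urlencode() {
  s=$1
  out=
  while [ -n "$s" ]; do
    c=${s%"${s#?}"}   # first character of what is left
    s=${s#?}          # drop that character
    case $c in
      [A-Za-z0-9._~-]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;
    esac
  done
  printf '%s' "$out"
}

# Hypothetical helper. Arguments: HOSTNAME PAGE PROFILE_ID UTMCC CACHE_BUSTER
build_utm_url() {
  utm_host=$1
  utm_page=$2
  utm_account=$3
  utm_cookie=$4
  utm_nonce=$5
  # utmwv is the code version from the example; utmr is '-' for no referrer
  printf 'http://www.google-analytics.com/__utm.gif?utmwv=%s&utmn=%s&utmhn=%s&utmr=%s&utmp=%s&utmac=%s&utmcc=%s' \
    '5.2.6' "$utm_nonce" "$(urlencode "$utm_host")" '-' \
    "$(urlencode "$utm_page")" "$utm_account" "$(urlencode "$utm_cookie")"
}

# Placeholder values throughout; swap in your own
url=$(build_utm_url 'example.com' '/' 'UA-XXXXX-Y' \
  '__utma=1.2.3.4.5.1;+__utmz=1.5.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none);' \
  123456789)
printf '%s\n' "$url"

# To actually send it, something like this should do:
# curl -s -o /dev/null "$url"
```

The script only prints the URL, so you can compare it against what you see in the “Headers” tab before sending anything.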
I started working on this because I wanted to track open rates in Google Analytics for my email campaigns. I’ve been able to work out the kinks and confirm that this absolutely works when combined with Google Analytics Event Tracking. It’s likely that I will post the full details of that project as a follow-up to this post, so make sure to check back here or follow me on Google+ for updates.
Please post your questions and I’ll do my best to help. As always, happy automation!