Google Analytics – Hiding your own visits

As a web developer I’m interested in the visitor statistics for any site that I manage, but with this new blog I find that my own visits are swamping the results. On commercial sites I’ve used IP address filtering to exclude employee visits, since those usually come from a limited number of access points tied to company offices, but in this case I view my blog from several different devices with dynamic IP addresses.

Trawling the web I found a solution using _setVar on googlelytics.net, but that approach is now deprecated.

Several more searches turned up a solution that works using _setCustomVar, which is the replacement for _setVar.

Step 1 – Adding the exclusion to your site

Firstly, I assume you have the standard Google Analytics tracking code somewhere on your pages, like this:

<script type="text/javascript">
  // Standard asynchronous ga.js snippet – the _gaq command queue must be
  // declared before anything is pushed onto it.
  var _gaq = _gaq || [];
  _gaq.push(['_setAccount', 'UA-????????-?']);
  _gaq.push(['_trackPageview']);
  (function() {
    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
  })();
</script>

Then create a simple page with the following additional script:

<script>
  var _gaq = _gaq || [];
  _gaq.push(['_setCustomVar', 1, 'Me', 'yes', 1]);
</script>

(It doesn’t matter where the script goes; I put mine just below the tracking script, which means I can also drop the var _gaq = _gaq || []; line – see the merged sketch below.)
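Put together, the script block on the exclusion page ends up looking something like this (I’ve merged the two snippets into one block here purely for illustration – the placeholder account ID is the same as in the standard snippet above):

<script type="text/javascript">
  var _gaq = _gaq || [];
  _gaq.push(['_setAccount', 'UA-????????-?']);
  _gaq.push(['_trackPageview']);
  // Tag this visitor: slot 1, name 'Me', value 'yes', scope 1 (visitor-level).
  // As noted above, the exact position doesn't seem to matter because the
  // visitor-scope variable persists in a cookie for later visits.
  _gaq.push(['_setCustomVar', 1, 'Me', 'yes', 1]);
  (function() {
    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
  })();
</script>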

The page content doesn’t matter (I just show a message saying my own visits are now excluded), but it shouldn’t be linked from other pages on the site or from your sitemap, and it should have a URL you can remember.

To exclude your own visits you simply access this page from every browser you use, and it sets the custom variable in a cookie that is used in the next step. If you clear your cookies or use a different browser you need to visit the page again before accessing the rest of your site.
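As far as I can tell, ga.js stores visitor-scope custom variables in the __utmv cookie, so a rough way to confirm that a particular browser has been tagged is to look for that cookie – paste the lines inside the script tags into the browser’s JavaScript console, or drop the whole block onto a test page. This is just a sanity check based on my understanding of the cookie, not an official API:

<script>
  // Rough sanity check – assumes the visitor-level custom variable set above
  // (name 'Me') has been written to the __utmv cookie that ga.js maintains.
  var tagged = /__utmv=[^;]*Me/.test(document.cookie);
  if (tagged) {
    console.log('This browser is tagged - its visits should be excluded.');
  } else {
    console.log('No tag found - visit the exclusion page from this browser first.');
  }
</script>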

Step 2 – Configuring Google Analytics

To use this cookie, we create an “Advanced Segment” that will filter tracking based on it.

  1. View the Dashboard report for the site you are interested in (the one associated with the UA-????????-? in the tracking code).
  2. In the top right corner there is a dropdown labeled Advanced Segments; select this and it gives you the option to manage Advanced Segments.
  3. Select the Create a new advanced segment link, which gives you a report builder interface.
  4. Expand the Visitors dimension in the green part on the left, and drag the Custom Variable (Key 1) entry onto the report dimension or metric area.
  5. For the Condition choose “Does not match exactly” and for the Value enter the name from your _setCustomVar call (in this case Me). Note that it’s the name you enter, not the value.
  6. Save the segment as “Exclude own visits” and you can now apply it to your reports using the Advanced Segments selection described in step 2.
  7. You can create the opposite filter by using Condition “Matches exactly” instead, and name it “Own visits”.

One thing I haven’t yet discovered is how to set an Advanced Segment as the default view on the dashboard; comments on how to solve this would be very welcome.

Cause and Effect

One thing which really annoys me in articles is confusion or deliberate misrepresentation of cause and effect, so I thought I’d collect a few good examples to dissect.

Of course this is just my opinion, which colours my reading of articles and may not be the original intention of the writer.

Man dies after Taser arrest near Bolton    (guardian.co.uk)

IPCC to investigate police use of Taser to subdue man, 53 – the third fatal arrest using stun gun or pepper spray in a week.

Clearly tasers and pepper spray are dangerous: three fatalities in a week is something worth investigating, and the writer is right to raise serious concerns.

Or perhaps the first paragraph of the article is more relevant:

A man who stabbed himself in the abdomen has died after being Tasered by police officers.

I’m not a trauma specialist, but I’m fairly confident a knife in the abdomen is more likely to be the cause of death than being zapped by a taser, although I suppose being tasered as well may not improve things. I don’t imagine the chief constable who volunteered to be tasered (Top cop tastes a Taser) would have tried the same with a knife in the gut.

No doubt the post-mortem will clear things up, but it seems likely the journalist has conveniently ignored cause and effect to get a better headline.

  • Effect: Death
  • Claimed cause: Taser
  • Much more likely cause: Stab wound to abdomen

The BBC managed to find a more realistic title: Stab man Tasered by Greater Manchester Police dies

Algorithm Bashing

Strange article on the BBC site today – When algorithms control the world. I say strange because a lot of it seems just plain wrong and much of the rest is just waffle, which is relatively unusual from Jane Wakefield.

For example, on film rental recommendations by algorithms:

“The algorithms used by movie rental site Netflix are now responsible for 60% of rentals from the site, as we rely less and less on our own critical faculties and word of mouth…”

OK, I can accept the second bit about word of mouth, but I think it’s fair to say most people used to base their film viewing on trailers, charts and possibly critics’ reviews. But if you base your critical reasoning on a trailer you’ve failed at the first step; charts are more an indicator of lead popularity and advertising budget; and critics’ reviews are effectively based on algorithms not understood by the user (i.e. opinion). At least the Netflix/Amazon/Blockbuster algorithms are based on my opinions and actions, or those of real viewers, and I also suspect that the algorithms are far from difficult to understand.

There’s also the usual bit of Google bashing, with no alternative solution proposed to indexing the estimated 13 billion pages on the web; my view is that the most popular predecessor, Dewey Decimal, isn’t really going to cut it although deweybrowse.org gives it a go. Maybe alphabetical by web address would be cool, like the old phone books, thus resurrecting the aardvark as an unlikely company mascot and consigning the pointless www prefix to history.

As for this one:

“Meanwhile, a transatlantic fibre optic link between Nova Scotia in Canada and Somerset in the UK is being built primarily to serve the needs of algorithmic traders and will send shares from London to New York and back in 60 milliseconds.”

Kind of like ticker tape but a bit quicker. OK, the trades were done by phone, but we’re still talking evolution here not radically new applications.

And algorithmic trading is a buzzword-friendly way of saying calculating the best trade, which I’m pretty sure the real human traders do as well, and in a similarly impenetrable manner in most cases. Warren Buffett even publishes letters which go into some detail about Berkshire Hathaway’s choices, but you don’t see too many people taking this “algorithm” and making billions; there’s plenty more in the sage’s head to make it work.

Update: 26 August 2011 ~ 16:00

The corresponding Slashdot article has some interesting views.

Wiki: we have too many Wikis…

About eight years ago I implemented an intranet for a medium sized fund manager, and at that time we rolled the code ourselves to produce a pretty basic news, documents and pages site with a solid search function that most users were happy with. Not many people had experience of creating their own content, and a quick demo of whatever dreadful version of Sharepoint was around then soon put them off a “full feature” solution.

Now, every other person is on Facebook, Twitter and so on, and a significant number produce their own content in the form of blogs or personal sites. In that environment it’s difficult to implement a one-size-fits-all standard across teams even within a department, let alone across a very diverse organisation.

My current employer has an altogether different approach to team intranet sites and document storage which is surprisingly laissez-faire for a big organisation. There isn’t much standardisation and even the guidelines are pretty slim. However, what this means in practice is that in a team of thirty we have five team site/storage approaches all running in parallel that I’m aware of:

Of these only TWiki is officially deprecated, and even so there doesn’t seem to be any sort of deadline for migration.

I’ve used the latter three to varying degrees and can’t honestly say any of them is a hands-down winner. Jive is probably the easiest to use all round, but its focus on socialising everything gets on my nerves at times, and I’m not convinced that scoring me on putting data into the system is anything other than a sales and retention tool for the software house. Sharepoint is often slated for being Microsoft-centric and so on, but it does actually work pretty well and is much more integrated in the newer versions. Confluence as a pure Wiki is pretty good, but things have moved beyond that and people expect an integrated solution, which it isn’t.

Unfortunately, nowhere can I find a convincing tool to migrate between these options, and in a big organisation with a matrix approach to budgeting it’s impossible to close a knowledge system down if the migration can’t be automated at a sensible cost. This seems to be a definite area for a product – something that will export one or more knowledge systems into some centralised format and then import the data into the target(s).

Eight years ago it was viewed as an opportunity to have a clear out, and content was transferred manually. I wonder what my colleagues would think about doing that with several gigabytes and hundreds of documents…

Keyboard Hell!

I learned to type on a QWERTY UK keyboard layout, which, despite the odd foray into US layout and the resultant transposition of some minor characters (like @ to ", good fun for email addresses), was reasonably friendly for use as a developer.

Now, however, I’m using a Swiss German layout and all of a sudden the Alt Gr key has become a significant part of my life as I hunt for brackets (square and curly), where before it was only interesting after the introduction of the Euro.

I can’t help wondering if this has an impact on developers using this keyboard layout, or if they get used to having two shift keys. Friends have mixed views, and naturally research on the web gives a similar wide range of results, although the one area almost universally agreed is that having to use a different machine with a different layout is a pain in the posterior.

This hit home when I tried to use a couple of VMs that weren’t aware of my locale, with Swiss German keyboard labeling but UK layout in one and US layout in the other. At that point a Das Keyboard would have been useful, as would slightly less secure passwords without symbols – typing them into vi and then pasting rather defeats the purpose 🙂

Update: 21 February 2012

I’ve been using a Swiss German keyboard for six months now, and I am starting to get used to it even for programming tasks.

The one thing which still drives me nuts though is the ~ symbol. Not only does it require AltGr, but it’s also an accent key even though to the best of my knowledge German doesn’t use either ã or õ.

Free Wifi

I’m on holiday, and the holiday park where we’re staying has free wifi in their restaurant area, which is unfortunately not very reliable. Initially I was very annoyed about this – if you’re going to do something (and advertise it on your website), do it properly, otherwise it’s just going to get up customers’ noses.

However, looking around last night, just about every person between the age of 14 and 40 had a smartphone in front of them, no doubt all connected to some poor mid-level wifi system that was never designed to be a carrier quality provider for a hundred or more subscribers. I wonder how many they estimated when they put it in – counting tables and assuming one laptop per table on average I reckon 20 max, so maybe not building in 5x or 10x redundancy is forgivable.

I remember reading somewhere recently that the average household has more than one mobile phone per adult, although of course with the available wifi my chances of finding the article before drop-kicking my laptop in frustration are too small to risk. Suffice to say, even a reasonable proportion of those phones being “smart” must be a scary statistic for small wifi providers; just in my two-adult family we have six active phones (not including the several holdovers each from old contracts), of which four are smart to varying degrees. This may seem ridiculous, but consider one each in two countries, a work Blackberry and a spare for visitors, and it soon adds up.

Of course I could tether to my phone, but I’m in Europe and out of my home country, so ten days of tethered roaming would likely lead to a bill larger than the cost of the holiday; if I’d thought ahead I could have picked up a local USB 3G dongle, but then I was expecting wifi…

What is unforgivable in my view is that I’m typing this in notepad because they have a firewall in place that blocks my blog editor. I can’t imagine the blog admin page has much to offend a firewall, and my webmail is fine, attachments and all, so the real security holes are still nice and open… Why do access point providers do this? It’s not even a check for inappropriate content, which you could perhaps argue is valid at a holiday camp, but a “security risk” – which I know for a fact it isn’t, since I put the content there. Why would the firewall be any better at spotting this than my up-to-date virus software? And more to the point, it’s a public wifi – some of the systems attached are bound to be infected with something already, and the firewall isn’t going to help a bit with that.

Oh well, it’s annoying, but I’m on holiday and spending a couple of hours getting round it is likely to get me in trouble with the family; notepad will do for now.

Hello World!

So here I am writing a blog entry.

The process of getting here has been rather convoluted, but an interesting example of coincidences and how they can lead to a position not intended but probably beneficial.

My previous mail provider had support for webmail, but the interface was pretty dreadful, so given I have a fast cable connection I thought of using Roundcube on my own machine to give myself a decent interface via IMAP. Since I have some spare time on the train, I fired up Fedora on VMware Player and set to work to see if this was any good.

I fairly quickly decided that it wasn’t worth the bother – compared to Gmail or Yahoo the available open source webmail systems aren’t anything to write home about, and using an iPhone on the road and Outlook at home is difficult to beat. However, as an aside I checked out MovableType and thought I’d give it a go. A few minutes later I had an installation up and running, so here we are… apart from the fact that this clearly isn’t sitting on my laptop anymore.

During the above experimentation I managed to mess up my mail – by playing with DNS a bit too aggressively I transferred my provider, thus killing my mailboxes. Despite the mailboxes being paid for separately, and being inherently independent of DNS, my provider deleted them when the name servers changed and was deeply unhelpful when asked to restore them.

However, the new provider turned out to have a decent basic hosting offering with mailboxes, so the emergency fix was to grab that offer, and less than four hours later I had my email addresses back. The new provider also turned out to be very responsive – under 15 minutes for help versus two days (and likely to be useless) from the other.

The net result was beneficial:

  • I now have hosting, not just email.
  • New provider is more responsive by a factor of about 100.
  • I have a blog on a real host, not my laptop.
  • They even have Roundcube – OK, it’s not Gmail/Yahoo but it’s better than before.

Finally, my Outlook client had all my emails backed up bar maybe two. Phew.