Measuring Website Usage With Google Analytics, Part I

Knowing where to get started with reporting website statistics can often provide new webmasters with something of a challenge. In this post, I’ll quickly review the guidance provided by the Central Office of Information on Measuring Website Usage which:

describes a common approach to measuring website traffic [for central government]. This enables departments to answer Parliamentary Questions and Freedom of Information Requests about website usage consistently and reliably

I’ll also start to explore how to generate reports that satisfy those guidelines using Google Analytics.

The proposed metrics “are defined according to industry standards set by the Joint Industry Committee for Web Standards (JICWEBS)” and specify the following minimal level of reporting (Measuring Website Usage – Reporting requirements):

  1. The following web metrics, as defined by the Joint Industry Committee for Web Standards (JICWEBS), must be measured for each and every publicly accessible website operated by an organisation:
    • Unique User/Browsers
    • Page Impressions
    • Visits
    • Visit Duration
  2. Central government departments must measure Unique User/Browsers, Page Impressions, Visits and Visit Duration starting from 1 April 2009 for every website open on 1 April 2010.
  3. Executive agencies and non-departmental public bodies (NDPBs) must measure Unique User/Browsers, Page Impressions, Visits and Visit Duration starting from 1 April 2010 for every website open on 1 April 2011.
  4. The following information must be provided to COI at the end of each quarter:
    • Number of monthly Unique User/Browsers
    • Number of monthly Page Impressions
    • Number of monthly Visits
    • Number of Visits of at least two Page Impressions
    • Total time in seconds for all Visits of at least two Page Impressions
  5. Each report should contain figures for each of the previous three months. This information should be provided in the format shown in the reporting template in Appendix A.COI Website usage reporting template http://coi.gov.uk/guidance.php?page=237
  6. All figures should exclude internal web development activity, performance monitoring, automated broken link detection and other types of non-human activity (e.g. robots and spiders). Further details on what to exclude are found in the Page Impressions section.

So what does Google Analytics offer “out of the box”?

Headline report - Google Analytics

The Visitors Overview repeats these figures and additionally provides an indication of the number of ‘unique’ visitors:

Visitors Overview

At face value then, it would appear that the Google Analytics are providing at least some of the required stats (though we need to clarify that the numbers as recorded by Google Analytics conform to what the COI has in mind for those reports as described in their guidance on the Minimum standard for web metrics!) But what does that guidance relating to “at least two web pages” mean?

To understand the emphasis on “at least two pages”, it’s worth reflecting on the notion of bounces and the bounce rate. Bounce rate refers to the proportion of visitors to a site who only visit one page on a website before leaving that site, and as such tend to leave no meaningful analytics behind.

According to the ClickTale blog (What Google Analytics Can’t Tell You – Part 1), Google Analytics “has no way of knowing how long a bounced visitor, who only visits one page, spent on your website”. That is, it appears that the time spent looking at a page appears not to be based on the difference between the time when a page has fully loaded (and generated a trackable onload event) and its unload event; instead, it is calculated as the time between two loading one page and clicking through to and loading a second page on the sam site.

Which is why the emphasis on collecting stats from at last two pages: given the current crop of analytics tools that struggle to do anything meaningful with single page visits, specifying a two page visit means that not only visits to the site that are likely to be meaningful are reported, but also that the reports are more likely to contain meaningful data too. (There is an obvious problem here: if visitors visit two pages, and quickly click to the second from the first before exiting the site from the second page, the time spent on the second page won’t be captured? See for example Time on Site & Time on Page – Google Analytics metric mystery)

One of the nice things about Google Analytics is that it lets you create custom views, or “segments” of the data in which you can specify things such as the minimum number of pages visited when generating a particular report. In order to do this, you specify an “Advanced Segment”. Here’s what an Advanced Segment for a “minimum of two pages visited report” might look like:

GA Advancd segment - visited at last two pages

Applying this segment to the same data charted above gives these results:

Segmented goog stats

GA segmented view

So for example, in this version of the report we see that the average number of page views and the average time on site has gone up.

Something I don’t think Google Analytics report is the total time on site. Bearing in mind the lack of data regarding the time spent on exit pages, the best we can do is multiply the number of visits by the average time on site to get an estimate of the total time on site.

With just this single advanced segment, a simple calculation, and the out of the can reports from Google Analytics, I think we can deliver on the suggested stats based on a literal reading of the headings, though in a follow up post I’ll check to see if the more detailed spec on the metrics matches the way that Google ANalytics defines its metrics.

PS Unfortunately, the segmented report appears to have lost the number of absolute unique visitors (although I think the recommended report wanted the number of uniques, including bounces, to the site?) Anyway, let’s play: the number of visits gives the upper bound on the number of unique visitors, but can we also estimate the lower bound? One heuristic might be to look at the number of visits and uniques in the original report (176 uniques, 245 visits), see how many visits were lost in discounting the bounces (245-104 = 141), assume these were all unique and subtract these from the original number of uniques (176-141=35). I think this gives the lower bound on uniques as recorded by Google Analytics for non-bouncing visitors?