Tuesday, 3 May 2016

Using segments in Google Analytics to identify external users

As Google Analytics guru Avinash Kaushik frequently points out, "All data in aggregate is cr*p", and if you want to get anything useful out of it you need to segment it.

For us, a fairly simple-sounding bit of segmentation is to look at how external users use our site compared to our internal audience of staff and students.

Although we can segment quite easily based on whether a user was on campus or not, that doesn’t give us a completely clear picture, as staff and students both frequently access our site from off-campus. Students use it from private rented accommodation and when they’re away from York during vacations, and staff use the site from home and when travelling on university business too.

Our new approach is to try to infer whether someone is an internal user or not based on factors such as the content they look at and the things that they search for.

Our 'Probably an external user' segment


Some of the conditions used to identify external users

So to try to identify external users, we've got an advanced segment set to exclude sessions that meet any of the following criteria:

We're the service provider

An easy one to start with - exclude any sessions where the service provider is University of York.

Looked at pages that only internal users can see

Another easy one: exclude any sessions that include pages behind authentication, which users wouldn’t be able to access without logging in. For example, the Timetabling Gateway or Student Enquiry System.

Looked at pages that are aimed at internal users

This one is less clear cut as we have a lot of internally focussed pages that are publicly accessible, so external users can end up viewing them, accidentally or otherwise. We also know that there are some cases where our information architecture isn’t quite right and we end up directing prospective students to information that’s in the current students branch.

But still, if someone looks at anything in www.york.ac.uk/staff or www.york.ac.uk/students then it’s a pretty good signal that they’re an internal user, so we exclude them. We’re also excluding any sessions that viewed URLs that include the words intranet, internal or current-students (with some regular expressions to catch variants with and without hyphenation or an ‘s’ on the end).
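
The exact expressions in the segment aren't reproduced here, but as a minimal sketch of the idea, the Python below tests a pattern of this kind against a few page paths. The pattern itself, the function name looks_internal and the example paths are all assumptions made for illustration, and it assumes the page path is recorded without the hostname (for example /staff/hr/ rather than www.york.ac.uk/staff/hr/).

```python
import re

# Hypothetical pattern, written for illustration only: it is not the
# expression used in the real segment.
INTERNAL_PAGE = re.compile(
    r"^/(staff|students)(/|$)"       # anything in the staff or students branches
    r"|intranets?"                   # intranet, intranets
    r"|internals?"                   # internal, internals
    r"|current[-_]?students?",       # current-students, currentstudent, ...
    re.IGNORECASE,
)


def looks_internal(page_path: str) -> bool:
    """Return True if the page path looks like it is aimed at internal users."""
    return INTERNAL_PAGE.search(page_path) is not None


if __name__ == "__main__":
    # Made-up example paths, not real pages.
    for path in ["/staff/hr/", "/study/undergraduate/", "/maths/currentstudent/"]:
        print(path, looks_internal(path))
```

Trying a pattern against a handful of known paths like this is a cheap way to spot false positives (for instance, /staffing should not match the staff branch) before relying on it in a segment.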

Searched for things that internal users use

Again, it’s not 100% clear cut, but if someone searches for one of our internal systems then it’s likely that they’re a student or member of staff.

So we exclude any sessions where the user searched for things such as VLE, Moodle, eVision, VPN, email or Planon.
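
As a rough sketch of how this kind of search-term check behaves, the Python below flags a session if any of its searches mentions one of the terms listed above. Matching whole words case-insensitively is an assumption, as are the function names and the example searches; the real segment may match terms differently.

```python
import re

# Terms taken from the list above. Whole-word, case-insensitive matching
# is an assumption made for this sketch.
INTERNAL_TERMS = ["VLE", "Moodle", "eVision", "VPN", "email", "Planon"]

INTERNAL_SEARCH = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in INTERNAL_TERMS) + r")\b",
    re.IGNORECASE,
)


def is_internal_search(search_term: str) -> bool:
    """Return True if a single site search mentions an internal system."""
    return INTERNAL_SEARCH.search(search_term) is not None


def session_has_internal_search(search_terms) -> bool:
    """Return True if any search in the session mentions an internal system."""
    return any(is_internal_search(term) for term in search_terms)


if __name__ == "__main__":
    # Made-up example sessions, each a list of site searches.
    print(session_has_internal_search(["open days", "vpn setup"]))
    print(session_has_internal_search(["undergraduate courses"]))
```

A session is excluded as soon as one search matches, which mirrors the "any of the following criteria" logic of the segment as a whole.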

Using the segment

Here's how our audience overview looks with this segment applied (the orange line) compared to all of our traffic (the blue line):

It's not perfect, but that's not the point: you can see right away that usage patterns are different, with much less of a drop-off at the weekend for external users.
