Asking for personal data: doing the ‘right thing’ vs the ‘usable thing’

If you own a website, you probably collect personal data: email addresses and the traces users leave through cookies, if nothing else.

Many sites ask for a bit more. For instance, the Royal Opera House asked for my date of birth when I registered this morning.

And very properly they explained why: they want to make sure I’m old enough to register for their service. (Yes, there is a digital age of consent in the UK.) They also explained that they collect demographic data on users for reporting. I presume this is so they can explain to funders that their audience isn’t entirely made up of middle-aged middle-class folk. (Oh, wait…)

The Royal Opera House does the right thing when collecting personal data.

Only ask for data you need.

The Royal Opera House doesn’t need gender, so it doesn’t ask for it, either openly with a list or covertly by asking for a title such as Mr/Mrs/Miss/Ms.

Limiting the data you ask for is a good idea from many angles. Fewer boxes for people to fill out tends to create a more usable website. It’s also less data to store and protect. If you collect personal data, you’re responsible for looking after it. Why bother, unless you really need the data?

Explain why you need it.

Plus how you’ll use it, who you might share it with and why, and how I can check what you store and ask you to delete it. If you’re collecting personal data you need to state these things openly on your site. If you aren’t prepared to say what you’re doing with the data, you shouldn’t be doing it.

The Royal Opera House does this in a nice way. It has a ‘Why?’ link under the date of birth box. That opens a brief statement explaining their reasons for asking, just at the point where I wanted to know.

Ask the question using appropriate language.

Not hard with dates of birth, but it can be a minefield for more sensitive data like gender and ethnicity. It’s even harder when ideas on appropriate language change.

For instance, some people will argue that ‘Other’ or ‘Trans*’ are the correct alternatives to ‘Male’ or ‘Female’. But when I heard @Kitation’s excellent talk ‘Asking about Gender’, I learned that the option preferred by non-binary people is ‘Non-binary’. Or a whole range of other more useful and specific terms, depending on why you’re asking. I also frequently refer to the article ‘How you can make the gender question on an application form more inclusive’.

Even when you know all the guidelines, you might need to compromise.

By ‘you’ I mean ‘me’. Quite recently.

I’m working on a project that needs to collect student data. As on many websites, some of the data is necessary to deliver a service and some of it isn’t. Like many education services, we collect the unnecessary stuff to understand audience demographics and equality of access.

So far, pretty normal.

So I started with my usual guidelines, and Male/Female/Non-binary. But as we dug into where our data was coming from I changed my mind.

Our data is provided by teachers, from lists they already have. And those lists aren’t going to have non-binary as an option.

In fact, they probably won’t have anything other than ‘male’ or ‘female’. The UK Department for Education doesn’t recognise anything else in its school census data collection guidelines. So far as the DfE is concerned, students are either male or female, though at least in the UK they get to pick. The guidelines say: “In exceptional circumstances, a school may be unsure as to which gender should be recorded for a particular pupil. Where this occurs, gender is recorded according to the wishes of the pupil and / or parent.” Which made me wonder what happens if the pupil and parent disagree.

Should we be right, or should we be normal?

Other data collectors in the UK education sector do recognise a third option: ‘other’. If we used ‘non-binary’, whenever we matched our data with other education data, we’d have to convert ‘non-binary’ to ‘other’. That’s not so hard. But we also guessed that teachers are likely to use ‘other’, given that’s what their sector uses.

So: the data that teachers have may only include male or female. It may have other. It isn’t likely to have non-binary. If we insist on non-binary as the right option for our system, teachers will have to edit this value for every student who doesn’t identify as male or female. (Or they might just leave this data out entirely.)
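For what it’s worth, the conversion itself really is the easy part. Here is a minimal sketch of what matching our values to the sector’s three options might look like. The names (SECTOR_VALUE, to_sector_value) and the exact spellings of the values are my own assumptions for illustration, not anything taken from a real system or from the DfE guidelines.

```python
# Hypothetical lookup from the values we might store to the three-option
# vocabulary (male / female / other) used elsewhere in the sector.
# The spellings here are assumptions for illustration only.
SECTOR_VALUE = {
    "male": "male",
    "female": "female",
    "non-binary": "other",
    "other": "other",
}


def to_sector_value(value):
    """Return the sector-style value, or None if the field is blank or unrecognised."""
    if value is None:
        return None
    # Teachers' lists are typed by hand, so tolerate stray spaces and capitals.
    key = value.strip().lower()
    return SECTOR_VALUE.get(key)
```

Returning None for anything blank or unrecognised, rather than guessing, keeps ‘we don’t know’ separate from ‘other’. The hard part is still the one described above: what teachers have in their lists in the first place.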

Right isn’t just being weird. It’s being difficult.

Insisting on the correct form feels like it’s creating friction for our user, the teacher. Maybe we should be firm, and insist on the ‘right’ form. But we’re not trying to make a statement. We’re just trying to gather good data to support the expansion of a great education programme.

So we’re going with male/female/other. Not the right thing. But I think it’s the most usable thing and will result in better data for our organisation. (Though user testing might tell us different.)

Have you had to tackle a trade-off between the ‘right thing’ and the ‘usable thing’? What did you do?

Author: Emily O’Byrne
