Let’s start with my confession. I have been a user researcher working with games for over ten years. I have run hundreds of studies, and overseen thousands of hours of playtests (over 25,000 player hours at last count!). And yet, I know very little about stats.

There are two quantitative research things I know how to do. Today, I will explain both.


### Comparing two sets of numbers

The first thing is how to compare two sets of numbers. I was taught this by Cyril and Mirweis at PlayStation, and I am grateful to them both for teaching me the only stats I know. This is useful when comparing things such as ‘how many times did the player fail’ or ‘how long did it take people to complete this level’.

This method is appropriate for when the data is *numerical* rather than *categorical* (or *ordinal*). Here’s a short explanation of what that means.

When you have some numerical data, it’s quite common to want to compare it. This allows you to learn “is there a difference between these two things”, and then inspire conversations such as “do we want players to fail more times on this level than on the next one?”.

To do this, you want to find the average, and then work out some confidence intervals to anticipate whether the difference between them is real or whether it was potentially caused by not measuring enough people.

So, after counting how many times people died on level 3, you can take an average – which looks like this.

We can see that on average, players died around 2.5 times on level 3.
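As a minimal sketch of that step (using made-up death counts, since the study’s raw numbers aren’t shown here), the average is just the total divided by the number of players:

```python
# Hypothetical death counts for 10 playtest participants on Level 3
# (illustrative numbers, not the original study's data)
level_3_deaths = [2, 3, 1, 4, 2, 3, 2, 3, 2, 3]

average = sum(level_3_deaths) / len(level_3_deaths)
print(average)  # 2.5 – around 2.5 deaths per player, as in the example above
```

Any spreadsheet will do this for you too, of course; the template linked later in this post does exactly this.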

We can then do the same thing for the next level.

(This is probably a good moment to mention there is a template that does the maths for you later in this post…)

Looking at the average for Level 4 shows us that people died on average more often on Level 4 than they did on Level 3.

But we don’t know if this is because Level 4 causes more deaths, or just random chance that it occurred in this study.

To identify that bit, we calculate confidence intervals. Which looks like this…

And we can see that the confidence intervals (the uppey-downy bits) overlap. The top of Level 3 overlaps with the bottom of Level 4.

Level 5’s confidence intervals do not overlap with any of the other levels. If the confidence intervals don’t overlap, there is a real difference between them. It’s true that more people died, and will die, on level 5 than level 4.

This hopefully means that Level 5 is harder – although you should watch people play to understand why the difference in deaths actually occurred.

If the confidence intervals do overlap, we can’t tell if there is a difference. This is the case for Levels 3 and 4. This either means that the number of times people die is the same, or that we haven’t seen enough players to draw an appropriate conclusion.
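The whole comparison can be sketched in a few lines of Python. The death counts, the t value of 2.262 (the 95% value for ten players), and the overlap check are all my illustrative assumptions – the template later in this post has the actual spreadsheet formulas:

```python
import statistics

def confidence_interval(samples, t_critical=2.262):
    """95% confidence interval for the mean.

    t_critical=2.262 is the 95% t value for 10 samples (9 degrees of
    freedom) – look up the right value for other sample sizes.
    """
    mean = statistics.mean(samples)
    standard_error = statistics.stdev(samples) / len(samples) ** 0.5
    margin = t_critical * standard_error
    return mean - margin, mean + margin

def intervals_overlap(interval_a, interval_b):
    """True when the intervals overlap, i.e. we can't call the difference real."""
    return interval_a[0] <= interval_b[1] and interval_b[0] <= interval_a[1]

# Hypothetical per-player death counts (not real study data)
level_3 = [2, 3, 1, 4, 2, 3, 2, 3, 2, 3]
level_4 = [3, 4, 2, 5, 3, 4, 3, 4, 3, 4]

ci3 = confidence_interval(level_3)
ci4 = confidence_interval(level_4)
print(intervals_overlap(ci3, ci4))  # True – like Levels 3 and 4 above
```

With these made-up numbers the intervals overlap, so – just as with Levels 3 and 4 in the study – we can’t say the difference is real.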

(There are probably errors in the terminology above, but as I said, I know little about stats – I just know how to compare two sets of numbers).

I use this all of the time – to count and compare deaths, completion time, etc. I made a template that you can duplicate to see the formulas required, and to have a go at doing it yourself.

#### Go deeper on quantitative research

Beyond this one technique, I’ve found two other tools very helpful.

Adjusted Wald calculators like this allow you to state your completion rate (e.g. 3 out of 10 people encountered this issue), and from that anticipate how many people in the real world would encounter the same issue (between 10% and 60%, apparently).
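For the 3-out-of-10 example, the sums behind those calculators can be sketched directly. This is my reading of the standard adjusted Wald (Agresti–Coull) formula at 95% confidence, not the calculator’s exact code:

```python
import math

def adjusted_wald(successes, n, z=1.96):
    """Adjusted Wald (Agresti-Coull) interval for a proportion.

    z=1.96 corresponds to 95% confidence. Adds z^2/2 'phantom'
    successes and z^2 'phantom' trials before the usual Wald formula.
    """
    z2 = z * z
    p_adj = (successes + z2 / 2) / (n + z2)
    margin = z * math.sqrt(p_adj * (1 - p_adj) / (n + z2))
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)

low, high = adjusted_wald(3, 10)
print(f"{low:.0%} to {high:.0%}")  # roughly 10% to 61%
```

Which lines up with the ‘between 10% and 60%’ estimate above: seeing 3 out of 10 people hit an issue is consistent with anywhere from roughly one in ten to six in ten players hitting it in the real world.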

And the book ‘Quantifying the User Experience’, which has lots of nice decision maps like these that tell me what tools I should (and shouldn’t) be using … and includes a crash course in stats to explain how to use them!

### Avoid common quantitative research errors

The second thing I’ve learned is a collection of things not to do. Recognising some common stats errors helps me know when I should seek out someone better with stats than me to help out.

Common errors to avoid include:

- Don’t do the kind of maths I described above on ordinal data (such as Likert scales). People often do, and get away with it, but it’s somewhat inaccurate as you’re treating categories like they are numbers.
- Think about the sampling bias you have created in your study, and don’t over-emphasise how representative your conclusions are
- Don’t assume that because you are measuring what players say they think or do, you are actually measuring what they think or do.
- Recognise that when you are limiting the options you allow people to select from, you are limiting the range of results you will get back, potentially distorting the truth.
- Avoid dogmatic rules about sample sizes. There are lots of rules out there that have become dogmatic (‘quant studies need 30 users’, ‘qual studies need 5 users’), and many people repeat them without understanding the reasons behind them. Understand why those guidelines exist, think about what you are trying to learn, and make conscious decisions rather than following ‘rules’.

### The job is not just ‘qualitative research’

I sometimes encounter the idea that user researchers are synonymous with qualitative research. I don’t think that is appropriate or correct. Even if you are more comfortable with qualitative research, you shouldn’t allow your skillset to determine the method you apply for answering research questions.

Instead always lead with ‘what does the team want to know’, and then ‘what is the most appropriate way of discovering that’. If that method isn’t one you are comfortable with, use it as an opportunity to learn how to do a new thing, ask for help from the community, or bring in some help from someone who is comfortable with it. Our job is to “help the team make evidence-based decisions”, regardless of the methods we are most comfortable with.

### What quantitative research skills should I be ready for in the job interview?

If you can answer the following questions, I would say you would be a stand-out candidate…

- What is a p-value?
- How would you compare the difficulty between two levels? What would you measure, and how should that be interpreted?
- How would you measure if players are enjoying a game?
- How would you handle being asked ‘I think this study should have a larger sample size’?

You will notice that these questions are often not about ‘how do I do the stats’, but much more interested in ‘when is quantitative research appropriate, how should it be applied, how should I explain things to my colleagues, and what are the caveats for this kind of work’. Which I think is where the real challenge lies!

## Thanks, and hello…

Thanks for making it to the end, and particularly thank you to all of the new subscribers – we’ve had over 500 people sign up to this newsletter over the last few months. Many of the new readers joined this month, so welcome to all the new people starting their games user research careers.

I’ve written a book about how to be a games user researcher, so do take a look if you haven’t already. If you have, I’d really appreciate a review on Amazon – it has a huge impact on the book, and I value it very much.

As always, do email me or tweet me with feedback, questions, etc and I’ll see everyone next month!