An important part of our job is providing in-depth information about tests to prospective clients. Recently, an international corporation reached out with a Request For Information, specifying: “only occupational assessments with externally validated reliability and validity data will be considered.” Experience tells me that when clients make this statement on an RFI they are looking for a particular accreditation, and while this is perfectly understandable, there are reasons why my inner pedant finds it problematic.

So where’s the problem in requiring that a test be valid before you buy into it? Well, that’s not a problem at all. The problem is what constitutes “externally validated reliability and validity data” when taken at face value. By understanding where this notion of ‘validity’ comes from, we can analyse tests on our own terms rather than simply taking someone else’s word for it, leading us to make more informed choices and achieve better outcomes.

“The ‘Registered Test’ status is seen as something of a kitemark by prospective clients.”

Here in the UK, test publishers submit their tests for review through the British Psychological Society (BPS). EFPA (the European Federation of Psychologists’ Associations) has laid out a set of standards to guide the review process. Your submission is inspected and rated according to a standardised model from the EFPA and, once complete, the BPS will publish their findings in a ‘Test Review’. If your test meets certain minimum standards, it can also be ‘Registered’. Many see ‘Registered Test’ status as something they can rely on to ensure test quality, even though in reality that quality varies greatly.

“True ‘external validation’ would be where an independent body takes your test and gathers their own data for their own analysis.”

Maybe this is where the pedant in me gets in the way: such a review does not provide ‘an external validation of reliability and validity data’. For me, a true ‘external validation’ would be where an independent body uses your test to gather their OWN data and conducts their own analysis of internal consistency, retest reliability, construct validity, criterion validity and so on.
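To make concrete what one of those analyses actually involves, here is a minimal sketch of an internal-consistency check (Cronbach’s alpha, the standard estimate) on a small made-up dataset. The item scores below are entirely illustrative, not drawn from any real test; the point is simply that an external validator would run calculations like this on data they gathered themselves.

```python
import numpy as np

# Illustrative data only: rows = respondents, columns = test items.
scores = np.array([
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 4, 5, 5],
    [3, 3, 4, 3],
    [1, 2, 1, 2],
    [4, 4, 5, 4],
])

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha: compares item variances to the variance of total scores."""
    k = items.shape[1]                          # number of items
    item_vars = items.var(axis=0, ddof=1)       # sample variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # sample variance of respondents' totals
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

print(f"alpha = {cronbach_alpha(scores):.2f}")
```

A retest reliability check would be similar in spirit: correlate the same respondents’ scores across two administrations. The calculation itself is simple; the expensive part, as discussed below, is gathering the data independently.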

It would, of course, be hugely time-consuming and impractical for a body like the BPS or EFPA to gather their own data. What they do instead is review the information and data gathered and presented by the test publisher, which is not a bad place to start at all. I just want to be clear that a publisher’s own data about their own assessment is not, technically, an external analysis.

The kind of data presented by test publishers is also technically complex by nature, so again, due to time constraints, the review doesn’t reflect an in-depth analysis. Rather, it provides an overall seal of approval from an authoritative voice to say ‘I have looked at the information presented and, against a defined set of standards, the test is Excellent / Good / Adequate / Inadequate / not enough information presented to rate’ (these being the rating classifications from EFPA).

Whilst this can be really helpful for non-psychology professionals to gain a sense of the test’s qualities as described by the publisher, a closer look might reveal that not all tests are created equal, even though they are subject to the same standardised review process.

To illustrate my point, one BPS Review I read recently showed that the test publisher presented a number of studies with data, adding that ‘we’ve also done another ten studies this year’. No data was presented for these ten studies. They may have shown the test wasn’t predictive of anything; we simply don’t know. The spirit of the reviewer’s comments was that this was enough to be considered “Good” by EFPA standards, presumably because it cites ‘a lot of studies’. In my view, these ten studies shouldn’t have been mentioned as part of the review at all, as they were entirely without substance.

So, the Test Review and Test Registration scheme is great for what it is intended to be: a device to help those without specific technical insight gain some understanding of a test’s qualities. However, it is not an ‘external validation’ as such. And as I have been hinting, relying on the review alone doesn’t necessarily guarantee the integrity of the test.

As an educator I have delivered many courses on how we can evaluate the scientific studies that underpin test data, and it is often a revelation for students when they start to be able to critically evaluate tests for themselves.

I’ve witnessed learners become quite confused as to why long-standing tests endorsed by international and ‘big name’ businesses can fail to stand up to scrutiny, because ‘How can a major corporation be wrong?’ In developing test user training, I believe that teaching critical evaluation skills is as important as teaching people how to use the tests and understand results.

I’ve previously written about why I feel test user training is important (view article), and this adds to the message that the best person to evaluate a test is you, the trained user, on behalf of your organisation. Of course, test reviews are helpful, and we support and engage with them at Podium, but we also encourage end-users to evaluate tests for themselves.

By saying ‘we’ll only consider externally validated assessments’ without any further investigation, you may end up with a test for which the publisher did ‘ten studies last year’ — because that’s impressive, isn’t it?