Crap models and laughable claims: Immigration NZ’s spreadsheet fiasco

After the minister suspended its profiling pilot, Immigration NZ has released a spreadsheet and a briefing note. Tze Ming Mok is deeply, deeply unimpressed.

An RNZ producer contacted me last night to share a PDF of Immigration NZ’s profiling spreadsheet and briefing note that they had released to the media. She had been contacting a few nerds to get a sense-check on the “stats”, such as they were.

And wow, it was just as dumb as INZ had been playing it up to be after RNZ first reported on it. What they have released is an arbitrary points system. But, guess what? It’s still clearly discriminatory on the grounds of age and gender.

It’s impossible to know whether the spreadsheet did at any point, as implied by the official who originally spoke to RNZ, contain columns and data that profiled based on nationality. Immigration NZ is at pains to say in the briefing statement that they “never” racially profile, and of course, they have to say that. From my firsthand experience as a former frontline officer of INZ, however, that is a laughable claim. Specific countries are even described as “high risk” within the briefing, in the context of the whole rationale for the profiling work in the first place.

I suspect that grounds such as nationality were included in the deleted personal characteristics column, or coded in the “other concerns” column. Immigration lawyer Alistair McClymont expressed similar concerns on RNZ today. The senior INZ official initially interviewed by RNZ specifically stated that nationality was used, and government chatter on this last week focused on trying to distinguish between country of origin risk-assessment and “racial profiling”. Anyone smell a rat?

Now, as points systems go, in nerd talk it is still a statistical “model”. As data scientist Aaron Schiff noted to me, “So it’s a linear model with a coefficient of 1 on every variable? That is still a model, and a possibly terrible one.” Statistics professor Thomas Lumley observed on Twitter this morning that “‘the model’s pretty crap and isn’t really based on data’ isn’t a defence; it’s an indictment”.

Here’s how the pretty crap model works. The points are tallied up, minus the visa category, and there are arbitrary cut-offs to assign the individual to low, medium, and high priority for deportation. A number of the criteria are sensible enough. The inclusion of DHB debt is disturbing, on both privacy and humanitarian grounds. But the difference between being a man or a woman, all else being equal, will result in individual cases being “high priority” for deportation or not. Similarly, being one year older or younger at whatever the age category thresholds are, all else being equal, will result in individual cases being “high priority” for deportation or not. This is discrimination.
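For the nerds, here is a minimal sketch of how an additive points system with fixed cut-offs behaves. Everything in it is an assumption made up for illustration: the factor names, the point values and the cut-offs are hypothetical, and none of them are taken from the released spreadsheet. All it shows is the mechanism: when every factor is simply added up, a single point for a personal characteristic can tip an otherwise identical case over a threshold.

```python
# Hypothetical illustration only: the factor names, point values and cut-offs
# below are invented and are NOT the values in the released spreadsheet.

# Cut-offs are checked from highest to lowest (assumed values).
CUTOFFS = [(6, "high"), (3, "medium")]


def total_points(case: dict[str, int]) -> int:
    """Additive score: every factor has an implicit coefficient of 1."""
    return sum(case.values())


def priority_band(points: int) -> str:
    """Map a points total to a priority band using fixed cut-offs."""
    for threshold, band in CUTOFFS:
        if points >= threshold:
            return band
    return "low"


# Two cases that are identical except for one binary personal characteristic.
# "gender_flag" is deliberately unlabelled: the source gives no key saying
# which gender attracts the point.
case_a = {"other_factors": 5, "gender_flag": 0}
case_b = {**case_a, "gender_flag": 1}

for name, case in (("case_a", case_a), ("case_b", case_b)):
    points = total_points(case)
    print(name, points, priority_band(points))
```

With these made-up numbers, the two cases land in different bands even though the only difference between them is the flag. The same thing happens with any categorical age score sitting near a cut-off.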

As there is no explanation provided about the variable values, I have no idea whether they are discriminating against men (assumed to be more violent?) or women (have babies that drain the health system?). Similarly, I don’t know if they are discriminating against younger or older people, or both (assumed to be more in need of health services?), in favour of working-age people (assumed to be more likely to commit crime?), or vice versa. If only they had provided a key to explain the rationale behind their discrimination!

Given the language used by INZ in the initial story about this on RNZ, which discussed “modelling” and “prediction” at length, it’s still unclear to me whether there was statistical risk modelling work done in the background, to do some kind of validation of risk associated with certain characteristics that they based some of the scales on. I mean, they spent 18 months working on just this spreadsheet?

However, we can see that some characteristics are just binary out of convenience (eg gender), or simply categorical in terms of age, but the “model” remains additive, so it’s doubtful that there has been any actual thought put into how much extra “risk” being a particular gender, or being in a broad age group, actually represents.

If they are denying that they used more complex statistical modelling to predict risk based on these characteristics, then the points assigned for age and gender are completely arbitrary, based on discriminatory stereotypes and not on the actions of the individual.

If they did use more complex statistical modelling to predict risk based on these characteristics, the points assigned are less arbitrary, but still discriminatory in exactly the same way.

So it’s funny that the minister has gone to such lengths to protest that it’s not “an algorithm” and not “a model”, and “just a spreadsheet”, because it’s exactly as bad, wrong and discriminatory, but maybe just more dumb.

And now we look forward to the full OIA.

Tze Ming Mok has a background in human rights and social research methods. She is also a former sworn officer of the NZ Immigration Service.  


