Trusting machines to predict citizens’ need for targeted resources can be damaging and can increase bias. Even so, New Zealand has no choice but to get on board.
When you think about it, a lot of the services the state provides are ones that you might not wish to be party to: criminal prosecution, incarceration, tax investigation, deportation, and child protection services all come to mind. Being on the receiving end of these services when you really don’t qualify can be harmful – perhaps an example of that puzzling phrase “unnecessary harm”. Even the process of establishing that you are not in “need” of such services can be unpleasant, expensive, and stigmatising.
Most people are law-abiding, so directing these services where they are needed is like finding a needle in a haystack. Relying entirely on human expertise can introduce bias or blind spots. It also stops scarce resources being directed to best effect, since no individual has a view of the totality of an issue, nor necessarily of what combination of policies and actions would most usefully address it.
Using algorithms to target resources can cause, and has caused, harm.
Since the Cambridge Analytica scandal, we are more aware than ever that the information which comes to us via social media or internet search is selected by algorithms trained on our past behaviour and the behaviour of those like us. Generally, the data upon which such algorithms are trained are biased, and if the creators of the algorithms aren’t diligent, such bias will get built into the algorithm. The algorithm then becomes an instrument to perpetuate bias or prejudice. In the Facebook and Microsoft image libraries, it’s more common for people to be labelled as women in images containing kitchen objects; this builds in a propensity to identify men as women when they are next to stoves. In the US, Google searches for African-American-sounding names such as Trevon, Lakisha, and Darnell are 25 percent more likely to return arrest-related advertisements than searches for white-sounding names.
In 2014, the county in Pennsylvania that contains Pittsburgh initiated a project called the Allegheny Family Screening Tool (AFST) – a statistical model to predict when a child is at risk of abuse or neglect. It was deployed through a call-centre – a child abuse and neglect hotline – and as each call came in, the hotline worker was shown a “risk score” intended to inform their work. The model was frequently wrong, yet hotline workers tended to take its “advice” even though they were supposed to exercise their own judgement. Note that being wrong in this case means either that children who needed help didn’t get it, or that the families of children who were not in danger were needlessly subjected to child welfare services.
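To see how a model can be "frequently wrong" in both of these ways at once, here is a minimal sketch in Python. The numbers are invented for illustration and are not the AFST's actual performance: it assumes 2% of calls involve a child genuinely at risk, that the model flags 80% of those, and that it correctly clears 90% of everyone else.

```python
def screening_outcomes(calls, prevalence, sensitivity, specificity):
    """Split a batch of hotline calls into the four possible outcomes.

    Every input is a hypothetical value, not a figure from the AFST.
    """
    at_risk = calls * prevalence
    not_at_risk = calls - at_risk
    true_pos = at_risk * sensitivity        # children at risk, flagged
    false_neg = at_risk - true_pos          # children at risk, missed
    true_neg = not_at_risk * specificity    # families correctly cleared
    false_pos = not_at_risk - true_neg      # families needlessly flagged
    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "true_neg": true_neg,
        "false_pos": false_pos,
    }


def share_of_flags_that_are_right(outcomes):
    """Fraction of flagged calls that really involve a child at risk."""
    flagged = outcomes["true_pos"] + outcomes["false_pos"]
    return outcomes["true_pos"] / flagged


# Hypothetical scenario: 10,000 calls, 2% prevalence, 80% sensitivity, 90% specificity.
outcomes = screening_outcomes(10_000, 0.02, 0.80, 0.90)
precision = share_of_flags_that_are_right(outcomes)
print(f"missed: {outcomes['false_neg']:.0f}, "
      f"needlessly flagged: {outcomes['false_pos']:.0f}, "
      f"flags that are right: {precision:.0%}")
```

Under these made-up assumptions, only about one flagged family in seven is a genuine case, and 40 of the 200 children at risk are still missed. The rarer the thing being searched for, the worse that ratio gets.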
Also, several of the predictive variables were skewed towards African-Americans simply because of bias already present in the data: 48% of children in foster care in Allegheny were African-American, yet African-Americans made up only 18% of the population of children. Race wasn’t used as a predictor, but several of the variables that were used were strong predictors of race, so they acted as proxies for it.
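The scale of that existing skew can be worked out from the two percentages above. This is a minimal sketch, assuming (as a simplification of mine, not something the source states) that every child is counted as either African-American or not, and that both figures describe the same population of children.

```python
def relative_rate(share_of_group_in_care, share_of_group_in_population):
    """How many times more likely a child in the group is to be in care
    than a child outside it, given only the two shares."""
    in_group = share_of_group_in_care / share_of_group_in_population
    out_group = (1 - share_of_group_in_care) / (1 - share_of_group_in_population)
    return in_group / out_group


# The 48% and 18% figures come from the text; the two-group split is an assumption.
representation = 0.48 / 0.18
rate = relative_rate(0.48, 0.18)
print(f"over-represented by {representation:.1f}x; per-child rate {rate:.1f}x")
```

On those figures, African-American children were over-represented in foster care by a factor of about 2.7, and were roughly four times as likely to be in care as other children. A model trained on records like these can pick up that pattern through correlated variables even when race itself is left out.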
What does this have to do with New Zealand? As it happens, the AFST was developed by an international team including researchers from AUT. I have attended conferences where the AFST was presented as a great success; it is difficult for a non-expert to know what to think. The AFST mimicked a model built by the same researchers for the Ministry of Social Development in 2012. That model was extensively reviewed from both a technical perspective and an ethical one, and in the end it wasn’t used. It’s worth remarking that the ethics review by Prof. Tim Dare recommended actions that would have mitigated most of the harm done in Pennsylvania. The AFST needn’t have been harmful; it was just badly built and poorly deployed.
NZ is too small to not use this technology.
It’d be tempting to ban all use of algorithms for targeting “services” where a false positive or false negative is harmful. That wouldn’t get rid of bias and prejudice, but it would reduce the scale of the impact and not systematise this prejudice. Unfortunately, New Zealand does not have that luxury. Because of our small size and relative lack of wealth, our future relative standard of living depends on the effective adoption of algorithms for resource allocation.
Government expenditure falls into two categories. It’s either expenditure on things that scale with population, such as front-line delivery (police, teachers, firefighters etc.); or expenditure on things that scale with the complexity of the society, such as a regulated currency, a legislature, social support programmes, economic development policy etc.
Unfortunately for us, government resources (money) more or less scale with population (though Singapore and the OPEC countries, for example, buck that trend).
We are a complex society, at least as complex as much larger or wealthier countries – think Japan, Australia, UK, even Singapore – so we have just as much need for the stuff that scales with complexity as those bigger countries. Except we have much less money to spend and far fewer people to do the work.
Something has to give. If we don’t find smart ways of efficiently delivering services and making decisions, we won’t be able to attend to all the needs of people living in a healthy, prosperous and happy society. And over time this relative lack will result in other countries having better standards of living, all other things being equal.
Which means our young people will leave for places with better jobs, education, and healthcare, and we’ll be less attractive to immigrants. It’s a downward spiral from there.
That’s the downside of not using AI and the like to do things smarter. But there is an additional upside if we do adopt this technology. Large countries have to struggle with issues that we don’t, at least not to the same degree: coordination and communication, physical distance or multiple time zones, jurisdictional issues, or extreme societal heterogeneity. These issues can be a real drag on efficiency and effectiveness. Perhaps the judicious and effective use of algorithms in the public sector will level the playing field, or even allow our small size to become an advantage.
While there are justified concerns about using algorithms for the targeted delivery of government services, we really have no choice in the matter. We just need to figure out how to do it well and ensure that the public servants responsible have sufficient maturity and expertise to do the job.
We need fewer generalists and more specialists in public sector leadership.
We dodged the bullet that hit Allegheny, Pennsylvania, but arguably only because our senior public servants are risk-averse, not because they acted prudently. The ethical reviews of the MSD model provided a set of recommendations that, if implemented, would have mitigated most of the risk. So why were these recommendations not taken up, and the model deployed once the improvements had been made? Why not get some value out of the significant investment made in the MSD model?
I speculate that one reason is that the public service is largely led by generalists, so it’s rare for specialist skills to be present when they’re needed. It’s harder to take measured risks when you need to rely entirely on someone else for your information. The State Services Commission has a deliberate policy of selecting generalists for public sector leadership, ostensibly to promote stability in the public service by forming a large pool of experienced leaders able to be parachuted into vacancies as needed. This policy, while directed at chief executives and their direct reports, will have an effect on every layer of management, and will discourage specialists from pursuing leadership careers. We expect hospitals to be led by doctors, universities to be led by academics, laboratories to be led by scientists – why should positions in the public service that are accountable for technical work, such as building machine learning tools, be filled by generalists?
Whether we call it AI, neural networks, algorithmic decision support, machine learning models, or predictive analytics, the public sector must adopt this technology if we are to flourish as a nation. But there’s great risk if it’s done poorly. To ensure that it’s done well we need appropriate checks, balances and ethics frameworks – which, to its credit, the public sector is already creating – and we need those responsible for this technology to understand it well enough to provide effective oversight while pushing forwards.