It is unknown how many of Stats NZ’s stick figures completed the census

Society | July 10, 2018

Stats NZ under fire over ‘very serious’ shortfall in digital first census data

Statistics NZ has conceded that the 2018 census response rate may be down by almost 5%, sparking concerns that some groups might not be captured by the survey. It has also led to a delay in the release of first results till March next year, and an apology from the government statistician.

The 2018 Census, conducted using a much heralded “digital first” model, has fallen short, with the announcement of a 4.5% decline in responses, a rethink on methodology and a delay in the issuing of results.

The need for further “imputation”, in which supplementary information is used in an attempt to plug the census gaps, has raised concerns that some groups of New Zealanders could be unsatisfactorily captured in results.

The shortfall, enough to require what Stats NZ called a “revised methodology to compensate for the missing data”, invites fresh questions about the wisdom of emphasising online responses in 2018, with the old-fashioned door-knock approach only used as a follow-up.

The announcement came in a paper published late last week by Statistics NZ, titled “2018 Census: Potential impacts of revised methodology”. It confirmed the announcement – largely unreported – that the census had seen a significant drop in response. “Interim figures show that we have full or partial information for at least 90% of individuals, compared with 94.5% for the 2013 Census,” the paper explained.

While the target for issuing first results had been October 2018, that had been pushed back. “Because individual responses are lower than we had planned, we need more time than we’d originally anticipated to draw on other information sources and new methods to achieve the highest quality dataset … We are now looking towards a first release of census data in March 2019.”

Indications of a 4.5% drop in response were “very serious”, said Thomas Lumley, professor of statistics at the University of Auckland. “The point of the census is that it’s complete, and it’s what you benchmark everything else to. Ninety per cent is really not good.”

Statistics NZ said that in some cases it will need to draw on “information from the 2013 Census and administrative data to populate missing variables”. Among the “administrative data” sources used for imputation are the Department of Internal Affairs, MBIE, the Department of Labour, the Ministry of Education and Inland Revenue.

“The government has a lot of data on nearly everyone,” explained Lumley.

“Anyone who has for example paid taxes or got medical care or immigrated or anything like that, there are records the government has. And a lot of that data is linked together in the Integrated Data Infrastructure [IDI] …

“Suppose you need to know somebody’s income. If you know their age, and where they live, and their occupation, that gives you a range of plausible incomes, and you can either do a best guess or you can have multiple plausible values, which accounts for the uncertainties, and that lets you fill in quite a bit of information,” he said.

“It helps that the administrative data is fairly complete. But there is going to be some people who are double counted and some aren’t counted at all. The real problem is if there are groups who are poorly represented – if for example there are particular ethnic groups or particular age groups who are missed more often. If you’ve got a small minority group and a 10% hole, you can lose quite a lot of people into it. So it depends how good the coverage of the administrative data is to get the counting right.

“To some extent some vulnerable groups will be captured quite well, because the IDI has more data on them. So people on benefits, for example, the IDI will probably capture quite well. But some other groups may be underrepresented.”
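To make the idea concrete, here is a minimal Python sketch of the two options Lumley describes: a single best guess, or several plausible values. It is an illustration only and not Stats NZ's method. The records, field names, dollar figures and function names are all invented, and random draws from matching "donors" are a crude stand-in for a full multiple-imputation procedure.

```python
import random
from statistics import median

# Hypothetical records: (age_band, region, occupation, income).
# None marks an income missing from the census response.
records = [
    ("30-39", "Auckland", "teacher", 62000),
    ("30-39", "Auckland", "teacher", 58000),
    ("30-39", "Auckland", "teacher", 66000),
    ("30-39", "Auckland", "teacher", None),
    ("50-59", "Otago", "farmer", 71000),
    ("50-59", "Otago", "farmer", 49000),
    ("50-59", "Otago", "farmer", None),
]

def donors_for(record, records):
    """Observed incomes of people matching on age band, region and occupation."""
    key = record[:3]
    return [r[3] for r in records if r[:3] == key and r[3] is not None]

def best_guess(record, records):
    """Single imputation: fill the gap with the median income of matching donors."""
    return median(donors_for(record, records))

def plausible_values(record, records, draws=5, seed=0):
    """Several plausible values: draw donor incomes at random, with replacement."""
    rng = random.Random(seed)
    donors = donors_for(record, records)
    return [rng.choice(donors) for _ in range(draws)]

for r in records:
    if r[3] is None:
        print(r[:3], best_guess(r, records), plausible_values(r, records))
```

The spread of the drawn values is what carries the uncertainty: a single best guess looks precise even when the matching group is small or varied. The sketch also assumes every missing record has at least one matching donor, which is exactly what can fail for small or hard-to-reach groups.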

The full impact would not be measurable in the short-term, said Lumley.

“The thing about imputation is while it can work really well, it’s very hard to tell how well it has worked in a particular setting … In five years’ time, looking at this data retrospectively, assuming the next census works better, they’ll be able to redo imputation and look retrospectively, and then the data will improve. But the next five years it’s going to be lower quality data, and we won’t know where it’s lower quality.”

Lumley said it was unknown whether the digital first approach was to blame for the drop in responses.

“What it may reveal is that they tried to save too much money with digital-first. Because if you’re going to do a census at all, it’s worth paying to get it right. There’s a lot of value to a good census and much less value to a not very good one.”

Government statistician Liz MacPherson acknowledged there were shortcomings in the “digital first” approach, in which the census was conducted predominantly online, with traditional door-knockers coming into play only in a follow-up phase to reach those who had not responded. She also issued an apology to those who struggled with the new system.

“While we can’t be sure yet why we have a lower response from individuals, there are a number of factors we will explore as part of our planned review. We already know we didn’t get everything right,” she said in a statement.

“We built new systems and processes to run this census, and while the majority of New Zealanders were able to take part without a hitch, we know that some people did not have a good experience this year. I have had mixed feedback from people. For some it was the easiest census ever; for others it has been a frustrating experience. For that I am sorry.

“As with every census, we will be undertaking a full independent review to ensure we can make improvements next time.”

In response to questions from the Spinoff, Statistics NZ said that the 4.5% figure was not yet reliable.

“It’s too early to confirm the final response rates yet. We expect to confirm the final response rate with our first Census data releases, scheduled for March 2019. In each census the final response and coverage rates are calculated by the Post Enumeration Survey which estimates the number of people the census should have counted,” it said in an email.

It added: “Stats NZ applies robust analysis and processing to the data to ensure it is accurate, and of high quality. The interim individual response rate we reported on was based on full or partially completed individual responses from field operational data.”

Stats NZ said it was confident that the reduced census response would not make the data incapable of “accurately representing hard to reach groups”.

“We have been investigating the use of administrative data to complement census data for a number of years. This work underpins the revised methodology for 2018 Census and will deliver high quality data and high value data for our users,” it said.

Statistics NZ pointed out that there was “a long term, international trend of declining census response rates”. Because of this the 2018 Census “has a strategic objective to make more use of administrative data to improve the quality of census data”.

Asked whether the drop could be attributed to the digital-first shift, Statistics NZ said: “The digital-first approach was a big change for 2018 Census. We set a target of 70% online participation and our interim figures are showing that more than 82% of responses were online.”

Asked whether budget constraints had an impact on the drop in response, Stats NZ said: “We are undertaking a comprehensive review of the 2018 Census operation to gain a full understanding of why people did or didn’t respond. This will include a review of the budget.”

In last week’s paper, Statistics NZ said the chief advantage of a national census is that “data is available at a neighbourhood level and provides detailed characteristics of small population groups. For the 2018 Census we used a new model for collecting the information. We focused on online participation followed up with postal reminders and household visits for those who had not taken part on census day.”

It added: “One of the goals of the 2018 Census was to improve data quality while modernising. The objective was to ensure accuracy of national counts and reduce variation in subnational response rates. Our interim calculations show that we have not reached our target coverage rates of 94 percent or higher … However, our interim calculations show full or partial information for about 90 percent of individuals. The interim response rate varies across subpopulations and small geographic areas. To be able to provide good-quality information for all subpopulations and small geographic areas, we need to have very high coverage and response rates across all of New Zealand.

“Given the interim position of individual response rates for the 2018 Census, we are looking at expanding our imputation approach. We are investigating how we can impute households, and cases of item non-response. Both item and unit imputation will improve data coverage and, occasionally, data quality, but not for all census variables. If we do not impute, there will be large amounts of missing data that will affect the overall quality of the dataset.”
