The process by which information from the Treasury website was extracted has been the subject of much speculation, and a lot of confusion, writes Alexander Stronach in a post that has exploded since originally being published at The Understatesman.
The Treasury data breach has been a shitshow. I don’t think I’ve ever seen a bigger disconnect between the experts and the pundits, and I don’t say that lightly. I’m not a security guy, for what it’s worth: I’m a writer at a tech firm, but I’m fascinated by security and over the last few days I’ve been talking to people who actually know their stuff. Almost unanimously they’re calling this a breach. Almost unanimously, the pundits are off shouting that it’s “not a hack!”.
Right from the start, I’m setting a rule: we’re not going to talk about “hacking”. It means totally different things to the IT sector (anything from coding at all to randomly kludged spaghetti code that really shouldn’t work) and the public (a man in a trenchcoat saying “I’m in!”), and most InfoSec types shy away from it anyway. I’m not going to bore you with the whole hacking vs cracking debate, but we’re going to call this thing what it is: a data breach.
So what happened? This is a web server:
Its job is to display web content. Every time you go online, you’re accessing content from web servers. Simple enough? This is a staging server:
It serves as a testing environment. Content intended for the public but not yet released goes on the staging server to make sure it runs smoothly for when the time comes to make it public. Some staging server content never goes live: it either didn’t work as expected or it wasn’t meant to be there, or something changed and it got pulled.
Treasury cloned their web server to use as a content staging server, then added the budget to it for testing. The problem is, they also cloned the index configuration: the instructions that the search used to store search data. Both web and staging server stored their search information in the same place and SOLR—the program running the search function—wasn’t properly configured with this in mind. That gave the web server access to the search information about documents on the staging server via the search bar, though not the staging documents themselves.
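If it helps to see the shape of the problem, here is a toy sketch in Python. It is not Treasury's setup and it is not how SOLR or Drupal actually work internally: the `SharedIndex` class, the `site` field, the 60-character snippet length and the placeholder documents are all made up for illustration. All it shows is the general idea that two sites writing to one index, with nothing separating them at query time, lets the public search see entries about staging content.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    site: str   # hypothetical label: "web" or "staging"
    title: str
    body: str


class SharedIndex:
    """Toy stand-in for one search index that two sites both write to."""

    def __init__(self):
        self.entries = []

    def add(self, doc):
        # The index keeps the title and a short snippet, never the full document.
        self.entries.append(
            {"site": doc.site, "title": doc.title, "snippet": doc.body[:60]}
        )

    def search(self, query, site_filter=None):
        hits = [e for e in self.entries if query.lower() in e["title"].lower()]
        if site_filter is not None:
            # Only return entries that belong to the requested site.
            hits = [e for e in hits if e["site"] == site_filter]
        return hits


index = SharedIndex()
index.add(Doc("web", "Example Report 2018", "Placeholder text that is already public on the live site."))
index.add(Doc("staging", "Example Report 2019", "Placeholder text that is unreleased and only exists on the staging server."))

# Cloned config: the public search bar queries the shared index with no filter.
leaky = index.search("Example Report 2019")

# Configured with the clone in mind: public search is limited to public entries.
safe = index.search("Example Report 2019", site_filter="web")
```

In this sketch `leaky` contains the title and snippet of the staging entry, while `safe` comes back empty. Either way the full staging document is never returned, which matches the "snippets, not documents" point above.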
To illustrate, here’s the Spinoff on Sunday:
See how you get the title and the first few lines? Using the exploit on the Treasury’s site, somebody pulled snippets of the budget like that from the staging server. Critically, to do this, you would need to know the title of the section. You search for a specific heading in the web server, and it comes up with the title and the first 4-5 lines. It was, all things considered, a pretty small hole:
1. It required the attacker to know the content was on the staging server
2. It required the attacker to know the specific wording on the staging server
3. Even then, it only gave them snippets
So what happened? Well, a leak. The actual leak. The budget didn’t leak: the budget’s search index leaked. That’s essentially a table of contents. The budget ToC being out in the open covered points 1 and 2 above: the fact the budget was ready to go public (thus, probably on the staging server) and a list of searchable titles and subtitles.
“Leak” is a strong word, too: it used the same headings as the 2018 budget. I’m still a little fuzzy on whether the actual index leaked (as in, got sent to the wrong place/got left out somewhere irresponsible/got made public too early) or whether somebody just heard it was the same as last year’s via the Thorndon grapevine and started punching in queries.
What about #3? Well, that’s why there were 2000 searches. They pulled 2000 snippets and put the budget together like a jigsaw. It’s not “just a search”: it’s using a leaked search index to perform 2000 searches, to take advantage of an exploit that pulled small pieces of content from a staging server, then stitching that content together in post. It’s not something Johnny Q Public could do by accident. It’s not an “open door” at all. That’s also why National got some details wrong: they didn’t have a complete picture. They had a very good outline, though. All the titles and subtitles, and the first few lines after each.
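The jigsaw idea can be sketched in a few lines of Python. To be clear, this is a made-up model, not a reconstruction of what anyone actually ran: `fake_search`, the `STAGED` dictionary, the 40-character cutoff and the placeholder headings are all hypothetical. The point is only that a list of known headings plus one search per heading gives you an outline with the opening lines of each section, and nothing more.

```python
# Hypothetical unreleased content, keyed by heading. Placeholder text only.
STAGED = {
    "Section A": "Placeholder text for section A that runs well past the snippet cutoff.",
    "Section B": "Placeholder text for section B that also runs well past the cutoff.",
}

SNIPPET_LENGTH = 40  # assumed cutoff, standing in for "the first few lines"


def fake_search(heading):
    """Stand-in for one search: returns a short snippet, or None if no match."""
    body = STAGED.get(heading)
    if body is None:
        return None
    return body[:SNIPPET_LENGTH]


def assemble(headings):
    """Run one search per known heading and keep whatever snippets come back."""
    outline = []
    for heading in headings:
        snippet = fake_search(heading)
        if snippet is not None:
            outline.append((heading, snippet))
    return outline


# The list of headings plays the role of the leaked (or guessed) index.
outline = assemble(["Section A", "Section B", "Section Z"])
```

Here `outline` ends up with two entries, each a heading plus a truncated snippet, and the guess that didn't match ("Section Z") returns nothing. Every piece is incomplete, which is why the assembled picture had gaps.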
It’s all a bit rubbish but – to quote InfoSec luminary Adam Boileau (aka metl) – “it’s not rubbish if it works”.
Metaphors about the door being unlocked do us no favours, unless we really want pundits to be better-equipped to twist the actual events. Whether or not it’s a “hack” doesn’t really matter: it’s an intentional attempt to gain access to private data. It utilised an exploit to pull content that wasn’t meant to be public. It’s a breach. More than that, there are established protocols for what happens if somebody finds an exploit in government software. These rules were written by the National Party in 2014, and National failed to follow them. Their failure to follow protocol merits investigation: they let the particular use of an exploit go undetected for their own political gain. Even if the content was delivered to them anonymously by a no-good samaritan, they bear at least partial responsibility for this because they went public instead of reporting it.
Where did the Treasury fuck up?
- They should’ve considered their SOLR configuration when they cloned their data to the staging server.
- They probably shouldn’t have cloned their web server to begin with—making a staging server from scratch with the same dependencies might have been a pain in the ass (I’m honestly not sure: I don’t know what their dependencies look like) but it would’ve been a lot safer.
- They could’ve been jazzier about this year’s subtitles.
Where did the National Party fuck up?
- They identified an exploit but—instead of following CERT protocol—they used it for their own personal gain.
Still, I don’t believe Simon Bridges has committed a crime, nor has he committed breach of confidence. He has violated his CERT obligations, which at worst means he’ll get a strongly-worded nonbinding letter from MBIE telling him not to do it again. He did a bad thing, but not all bad things result in him being removed from parliament in a paddy wagon. To quote one of my anonymous sources: “he’s an asshole, not a criminal.”
It’s ridiculous that pundits are calling for heads to roll. At the end of the day, it wasn’t a big deal. Grant Robertson shrugged and moved on. The Treasury were right: what harm could somebody actually do by using that exploit? Release a half-complete version of the document a day early?
By the by, it’s not dodgy or extreme that anybody called it a ‘hack’. If there’s a problem with the word, it’s not that it doesn’t mean this, it’s that it does mean this because it’s a vague word that means wildly different things to different people. Not all hacking is a man in a trenchcoat typing into a green/black Linux CLI then saying “I’m in!”—remember, it’s not rubbish if it works. Makhlouf and Robertson could’ve maybe been more precise with their language but that’s not a crime either.
And then, of course, the pundits got to it. Either the Treasury were little angels who did no wrong, or they were cringing fools who dropped a box of printed budgets off at the top of Lambton Quay. What we actually have here is a pattern pretty typical of data breaches: a small screwup like improper SOLR config gave an attacker access to data they shouldn’t have had. I’m sure somebody is going to shout at me that it wasn’t a small mistake, but unless they can explain how to correctly configure Apache SOLR in a Drupal installation so it doesn’t allow partial read access to cloned data in a staging server then they can fuck right off with their piety and condescension. It’s a screwup for sure, but the people talking about “open doors” need to pull their heads in.
What’s really happening is that the pundits smell blood in the water, and they don’t care what actually happened—they just want an excuse to sink their teeth in.
Same old #NZPol, I guess.
Credit for assistance to Sana Oshika, and several others who preferred to go unnamed.