As open data continues to play a pivotal role in supporting the response to the COVID-19 crisis, Datopian spoke with Open Data NZ’s Paul Stone about the extraordinary work they’re doing in supporting open data to address real-world challenges.
Audrey: Welcome Paul, it’s great to have you with us at Datopian. Could you please tell us about Open Data New Zealand and your role?
Paul: Well, we’re all about encouraging and supporting government agencies to proactively release open data. Back in 2011, we had a declaration passed by Cabinet on open and transparent government which directed central government agencies to do that [the wider state sector encouraged and local government invited]. So we’ve set about essentially trying to win the hearts and minds of government agencies to help them see the value of open data. And we do that in a number of ways.
We established a data champion network - data champions are people appointed by the chief executive from each agency to be our conduit into that organisation. Now they don’t have to be a data expert at all - they just need to know their business and be influential across the organisation, and we help them to see the value of data and support them where needed. Through that network we communicate via newsletters and occasional workshops, and we do one-on-one chats where we can. We’ve started with central government in Wellington because that’s easier, but we try to get out to local government as well.
The one thing that they all have in common is that they all operate on public funds. So, one of the key messages I give them all - whether they are directed or invited or encouraged under the declaration - is that they are all managing a public asset.
And that’s our key message over the last 2 years. We put it simply and frankly:
“The data that you have in your organisation is not your organisation’s - it’s our data, it’s everyone’s data - and you have the role of being custodians of a public asset. And so you have a responsibility to maximise the value from that asset. And that is where open data comes in.”
Closed data is data that can be reused a small number of times within an organisation. As soon as it’s shared with other organisations under trusts, it can start expanding on the value generated from that data. Once data is open, there is no boundary. It can be used an unlimited number of times by an unlimited number of people. So the potential for value generation is maximised.
And then of course, not all data can be open, some data needs to stay closed, some data should only be shared under trusts. But we need to start looking by default at how data can be open. Data about people at a micro-level, when it’s about individuals, needs to be protected. But we don’t stop there - we need to think about how that data can be aggregated in a way that provides value in a broader sense.
So much of what I do with Government is getting hearts and minds towards the bigger picture about getting the best out of data for New Zealand.
Audrey: Paul, could you tell us if you’ve seen a shift in public sentiment or attitudes on open data over the years?
Paul: That segues into where Open Data NZ fits - and that’s Stats New Zealand. In 2017, the Government Chief Data Steward (GCDS) role was established. In fact, one role was split - we used to have a Government Chief Information Officer (CIO), and then the State Services Commissioner split that role and created a Government Chief Digital Officer and a Government Chief Data Steward. That has its pros and cons - but what it’s done is create a focus on data, and that role was allocated as a functional lead across government. That is essentially where an agency is given responsibility to lead development with a ‘whole of government’ approach.
So the Government Statistician was given the role of Government Chief Data Steward. And that role is essentially to try and lift the capability across government on the management and use of data, as well as keep accelerating open data. So the open data programme moved in to become part of supporting this role.
And so we’ve been trying to embed open data as part of the everyday management and governance of data by government agencies - and also embed open data into the publication processes wherever possible. We’ve also shifted our strategy a bit towards doing data inventories, because to get good-quality open data we need to start with well-managed data. You can’t manage what you don’t know you’ve got! And what we’ve found is that very few organisations really know and understand all of the data that they hold.
Audrey: What’s the main reason you see behind that?
Paul: I think it’s been a slow evolution towards data - in the ’90s we were very much document management oriented, and that was the big driver in trying to be better record keepers. Data is just ‘beneath the surface’ - and so what we’ve ended up with is a great mess of data and all sorts of repositories, from hundreds of Excel spreadsheets maintained on people’s network drives to systems that have databases behind them that are locked into proprietary access.
And that’s where the open data message has been helping agencies unlock the data in their own systems - by helping them see that open standards and open APIs can actually free up the data within their organisation for their benefit as well as get it out into the public. So I think it’s been a slow evolution, but it really comes down to individuals. You get pockets of great work and then they can fade away again.
But certainly over the last 12 months there’s been a lot more awareness. We were offering agencies up to $40,000 worth of data inventory work - and we made that offer a couple of years ago - and the uptake was really slow. Then all of a sudden people got it and realised the value of what we were offering!
And there are some lessons learnt! For example, agencies have discovered that they’re sometimes not sure if they actually own the data that they hold - especially if they’ve collected it from industry. They’ve also realised they sometimes collect duplicate data, and they’ve found data gaps - so it’s been a very valuable exercise!
There are two reasons why we’re doing the inventory - one is that we help agencies manage their data better, but the other aspect of what we do is to try and understand the demand for data. If we can understand that demand, it helps us make the case to government agencies to put more resources into releasing it.
Every time we go out to conferences or workshops in the public domain and ask, “Hey, what data do you want?”, we quite often get the question back, “What data have you got?” And the reality is people don’t know what data Government holds or even which agency holds the data. For example with student loans - do you go to the Ministry of Education? Do you go to the Ministry of Social Development that gives out the benefits? No, Inland Revenue has it! So with the inventory, our aim was to have that data in the public so that people could know and understand what data is held by which agency…
One of the variables of the inventory would be whether it was open or not, but the inventory would be all the data that was held - so long as it’s not classified - simply to list the data held in order to enable a conversation about appropriate access. So if somebody had a problem to solve and they discovered in the inventory that an agency had just the data they need - and it’s not open now and might not be able to be open - they might be able to have a conversation about how to use it in a trusted arrangement to solve the problem they’ve got. We’re still pushing for open data wherever we can, but it’s helping people understand that you can still create that value or solve problems with shared data.
Audrey: Paul, what do you see as the biggest challenges that data practitioners face today?
Paul: I think data quality is a big issue. And I think the challenge is that there are so many standards. It’s almost that we need to simplify the landscape of standards - because it becomes a real challenge for those managing the data at the operational level. But the standards are also there for a reason, and they help with interoperability. So I guess that’s the crux - getting data from across government to be interoperable, to minimise the amount of cleaning and re-structuring of data in bringing it together. And the current situation with the virus is a great example of that.
# Open Data and the COVID-19 Response
Audrey: Could you tell us more about how open data has been helpful in the response to the current COVID-19 crisis?
Paul: Yeah, I guess this is a situation where there’s a real mix of both shared and open data. Where I’m seeing open data being sought after is to initially communicate the situation. People are building dashboards and visualising the data to help clarify the messages. The reality is that we’re “information first” - when Government wants to communicate information they tend to go to publishing a static web page.
At the very best we get HTML tables of information, which are hard to re-use - or we get PDFs. And that’s been the real challenge - people want to be able to take the information and repackage it and communicate it to help a broader range of people understand what’s going on. Or look for different insights and ask different questions.
There are two sets of data - there’s data that’s being collected now during the crisis, but there’s also data that’s been available since before the crisis. And there’s one website that a group of volunteers have built from census data examining the different categories of work that people do - across different demographics like age, gender and ethnicity - to show the impact on people that are vulnerable to the isolation.
So both of those are examples around having insights into what the current problem is. One example is using the current data coming out - and they’ve had to scrape it from a website initially - and we’ve managed to get it a little more re-usable in the process. But the challenge is you’ve got people working under the hammer in a crisis situation trying to get information to Ministers - and they’re not going to re-invent their processes now.
I think there’s a real opportunity to get governments around the world before the next time - and there will be a next time - to be more ready to publish data by default. And then from that allow information to be evolved for all purposes - for internal government purposes, but also to allow the community to use the data for their own resilience.
I recently ran a workshop, “Open data for resilient communities”, because I’m seeing a real need for people to find a way to help. There are a lot of people out there that can do stuff with data - and that’s their way of contributing. So the need to communicate better comes from a need to help others at this time. It’s our own way of feeling better - when we’re helping others.
Another example: Wellington City Council has brought together data and created a map called the social services map. They’ve plotted all of the social services you can get access to, such as mental health support and women’s refuges, on a map, as well as all the essential services that are running, such as service stations, pharmacies, police stations and supermarkets. This is how we use the data that is already open to support people.
Leading on from that, I’m starting to see bank economists releasing graphs about credit card use and which industries are profiting from the current lockdown situation. And that highlights the real need for the private sector to be thinking about releasing data to support everybody to make good decisions through this time as well.
More about Open Data NZ’s workshop on “Open Data for resilient communities” may be found here.
Datopian delivers outstanding solutions that enable your organization to realise your data’s potential. From hosted data portals powered by CKAN to specialised data engineering, from agile data practices to data strategy development, Datopian empowers you to transform data to insight.
© Datopian (CC Attribution-Sharealike (by-sa))