Making information public always bears the risk that people discover mistakes. But that’s what will make it better, too. It’s what happened when we started publishing open EITI data on a large scale.
I wish I had come up with the slogan used in the title of this blog. But the credit goes to the Government Digital Service (GDS) team in the UK. It’s one of their design principles. I learned more about it when I visited them in March this year, and the mindset that giving things away for free will improve your work guides everything they do. I’ll return to that later.
About a year ago, the EITI took the step to publish EITI data through our website. And even though we could have spent more time tinkering with it to make it better, it was the right move to go ahead and publish it via our API.
We’ve made our data available before, of course. On our old website, you could download the Excel sheets per country and year, or even all of them in one massive Excel file. This time around, though, we are using an API.
APIs (application programming interfaces) allow other people to pull our data in a raw form and analyse it with their tools, or use it with their data. By “raw” I mean that it’s not neatly laid out in Excel sheets (as we also offer in our Google folder) but just a string of values separated by commas, which makes it machine readable. For some users of data, this format is much more convenient than formatted Excel files, because they can choose which data points they are interested in and feed these directly into their own systems and databases exactly as they want them. This allows them to crunch data covering many years in many countries.
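To give a feel for what “choosing your own data points” can look like, here is a minimal Python sketch. It is only an illustration: the column names (country, year, company, revenue_usd) and the values are made-up placeholders, not the actual fields or figures returned by our API.

```python
import csv
import io

# Placeholder sample in comma-separated form. The column names and
# values are invented for illustration; they are NOT real EITI data.
SAMPLE = """country,year,company,revenue_usd
AA,2013,EXAMPLE CO,100
AA,2014,EXAMPLE CO,150
BB,2014,OTHER CO,70
"""


def total_by_country(raw_csv, value_column="revenue_usd"):
    """Sum one numeric column per country across all years."""
    totals = {}
    for row in csv.DictReader(io.StringIO(raw_csv)):
        country = row["country"]
        totals[country] = totals.get(country, 0.0) + float(row[value_column])
    return totals


if __name__ == "__main__":
    print(total_by_country(SAMPLE))
```

The point is simply that once the values arrive as plain rows, a few lines of code can pick out one column and aggregate it across years and countries, with no spreadsheet layout in the way.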
The data that can be queried through our API is collected in Excel sheets from all our implementing countries that have published an EITI Report. At the International Secretariat, we do a quality check of that data and then upload it to our website. Once the data is in our database, the API feeds it into our graphs and charts (on our country pages for example). Others can also pull it directly.
Our data is “fuzzy”
One of our data “super-users” is the Natural Resource Governance Institute (NRGI). Their data team has been of invaluable support in giving feedback on our – uhm, not always so tidy – data.
For example, they found that the company names listed in the sheets vary across years in the same country, even though it’s probably the same company reporting. One year, a company active in the Democratic Republic of the Congo is called “BOLFAST COMPANY", and another year it’s called "BOLFAST COMPANY SPRL".
We face the same issue with names of government agencies and ministries. NRGI calls this data “fuzzy” (are you also picturing this?). The team shared a list of companies with their fuzzy name variants, and was thus able to remove the problem for the countries that are a high priority for them. Check out the way they’ve prepared EITI data here.
Not only have they helped us identify issues with the data, they have also helped us identify ways to avoid entering different names for the same company or government agency in the first place. You could, for example, add a data validation step for when the Excel file is uploaded to the website and the API: a pop-up box saying “similar company name found – create new one or change?”. Or even better, have a little programme that independent administrators (the ones who put together the EITI Reports in countries) use to send us the data, so that data verification happens right then and there.
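A check like that “similar company name found” step could be sketched roughly as follows. This is not how our upload process works today; it is a minimal illustration using Python’s standard difflib module, and the list of legal-form suffixes, the function names and the 0.85 similarity cutoff are all assumptions chosen for the example.

```python
import difflib
import re

# Hypothetical list of legal-form suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"SPRL", "SARL", "SA", "LTD", "LIMITED", "INC", "PLC"}


def normalize(name):
    """Upper-case, drop punctuation and strip trailing legal-form suffixes."""
    tokens = re.sub(r"[^\w\s]", " ", name.upper()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def find_similar(new_name, known_names, cutoff=0.85):
    """Return already-known names that look like the same entity as new_name."""
    by_norm = {}
    for known in known_names:
        by_norm.setdefault(normalize(known), []).append(known)
    # get_close_matches ranks candidates by SequenceMatcher similarity ratio.
    matches = difflib.get_close_matches(
        normalize(new_name), list(by_norm), n=3, cutoff=cutoff
    )
    return [original for match in matches for original in by_norm[match]]


if __name__ == "__main__":
    known = ["BOLFAST COMPANY", "OTHER MINING SA"]
    print(find_similar("BOLFAST COMPANY SPRL", known))
```

With the example from above, “BOLFAST COMPANY SPRL” would be flagged as similar to the existing “BOLFAST COMPANY”, and the person uploading the data could then decide whether to reuse the existing name or create a new one. The cutoff would need tuning on real data, since too low a value produces false alarms and too high a value misses variants.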
Only data that is used is useful
Of course, the NRGI team has its own interests. If our data gets better, they spend less time cleaning it up. And that’s exactly the point. We want people to use what we produce.
Otherwise it’s, well, useless.
We want EITI data to be used so that extractives can be harnessed for the benefit of all. You can use EITI data to track payments from companies to governments, combine it with macroeconomic data to understand relationships, or pursue whatever idea and interest you have. The main point is to make the data available in good enough quality that people can use it.
By being open and interested in feedback, we can make our work better. So let’s make things open. It makes things better.
Anyone can copy our website code
We've also recently published our website code on GitHub. If an EITI country is building a new website, it can use our code. Basically, anyone can copy the EITI website. The website itself was built by Development Gateway and is licensed under the GNU General Public License v3.0.
And with this, back to London
Oh, and about GDS. Their default is sharing. They have a blog that tells the stories of what they learned in leading the digital transformation of government. On the GDS Tumblr account you can download their remarkable posters and sticker templates. I’ve printed the “make things open” poster and hung it up in the office. It sends a somewhat subliminal message, as do stickers on laptops. They have also developed an excellent style guide for all content published on gov.uk. Follow them on Twitter to keep up with their excellent work.