No shortcuts for good data

Data. Precise, complete, structured, unbiased. That’s the dream; although, if you’re into Data Values, you know this isn’t always the reality.

We work with the data we can get, not with the data we want. And the journey from the former to the latter is fraught with challenges that are rarely technical, and more often than not involve politics, interests, priorities, and resources. Getting to the point where most data is not only available but open, and inclusive in the way that the Data Values Project proposes, requires us to work with people and their motivations, not only technical standards.

That’s probably the reason why – like so many other people – my work on open data quickly got me involved in open government. The early dream of all data being open by default merely by technical achievement (and the automatic creation of value through that publication) is now well over a decade past due, joining a long list of abandoned techno-optimistic trends (yes, “Big Data”, I’m talking about you; and “AI”, don’t laugh, you might be next). Open data is still a great idea (don’t get me wrong), but when it comes to reaping its very real and proven benefits, the human and social factors that hold openness back have turned out to be much bigger challenges than the technical ones.

When faced with the option of working on issues with available data, or working on issues I cared about that required significant effort to acquire the relevant data, the choice was clear: I’d rather work twice as hard on a relevant issue than take an easy win on something nobody cares about. Data Values exists precisely because what’s really important is not always reflected in available data, and most projects that use open data need some form of involvement from the data “consumer” to improve the source data or complement it with other available sources.

Thanks for reading The Data Values Digest! Subscribe for free to receive new posts and support our work.

When Data Values are an afterthought

Rarely – if ever – am I able to search for, find, download, and use a dataset I need in a straightforward way. Most of the data I have used through my work with Data Uruguay has actually ended up being made open after processes of collaboration and co-creation with partners who own or collect the data, but this is only possible when the value of the data has been made visible, political obstacles overcome, technical shortcomings fixed, and resources obtained. Those processes don’t usually aim to publish the data, but to create a tool, a site, or something else that uses that data. That product is what allows us to justify opening and publishing the data, which becomes a happy by-product rather than the original aim.

It’s not the easiest or the shortest route, but this strategy has allowed many civil society organizations to have insight into the decision-making stages where Data Values are needed in data publication. If we’re designing a product together, I can show you why gender or ethnicity is valuable in the data you’re handling, even if you don’t currently need that information for your internal use. 

A good example of that is something we’ve worked on in Uruguay with Agesic (Agency for Electronic Government, Information and Communications Society), through OGP’s open government plans, involving data from the National Civil Service Office (ONSC). ONSC handles public servants’ contracts, where ethnicity and gender identity are not relevant to the work itself, and where recording them even adds a risk of discrimination.

However, two laws created a quota for public service reserved for people of Afro-Uruguayan ethnicity and transgender people. Although the responsibility for meeting that quota belongs to each agency and ministry, ONSC is the only centralized source of information to support compliance. Notable advancements were achieved by modifying the way ONSC registers and publishes data, allowing government and civil society to track compliance with these laws. 

If you ask me for a feature in our joint open data-based application that can’t be developed because the quality or granularity of the data is insufficient, we can start talking about improving that. If exporting your data out of your own systems every year to update the website is a huge headache, you will most likely consider “open by default” the next time you update your systems in order to save yourself all that time and effort. And so on.

Those opportunities for transformation are trickier in the private sector, where “the client’s always right” and the wrong incentives might be in place (if something’s a huge headache every year, it might also be quite lucrative). That’s why co-creation with civil society is so important, and why civil society needs to find ways to build equal partnerships (e.g. obtaining external project funding) to grow our stake in the project, and grow our influence on data structure and management. The more input you have at the project’s design and development stages, the more you can nudge or even force long-term systemic changes.

The road to good data is long, but worth pursuing

In Latin America, as in most of the Global South, challenges start much earlier than data publication. Data has to be collected, and the infrastructure and resources needed for robust data collection, management, and publication systems are rarely in place. Sometimes, if you want the data, you have to collect it yourself, so that you can shape that data from the start. And sometimes things go really well and that data (structured around Data Values) ends up being adopted officially – as happened to us with the gender-sensitive classification of streets and public spaces we use in ATuNombre.uy.

Back in 2015, Data Uruguay wanted to use street naming as a way to show structural inequities for women. Streets are one of the most visible ways to recognize relevant people, and they reflect centuries of invisibilization of women and non-cis identities. Data on the names of the streets was available, but we had to add a classification layer for what they were named after (manually, mind you).

After several iterations of the project and collaborations with Intendencia de Montevideo, the data we created was officially incorporated and published in 2023, as part of a commitment within Uruguay’s 5th Open Government Plan. But most importantly, the analysis based on that data led to a regulatory change: from that point on, three of every four streets named after people must carry the names of women or non-cis people, until the distribution goes from the current 7-93 to 50-50.

This sort of work needs patience, dialogue, flexibility, and collaboration. It unavoidably involves the risk of not getting what we want out of it, but when it works, it builds much more than products or datasets. It builds relationships that can keep shaping data publication for a long time.

Getting this right is a long game. We’ve had plenty of time for low-hanging fruit and easy wins; the road ahead needs deep involvement and hard work to transform structures, prejudices, and institutions that have survived for centuries. It needs a lot of patience too, because on top of the work taking a long time, at some point in the future we’ll have a whole new definition of what “good data” means.
