The hidden dangers of diversity data

On January 7th, the UK’s Office for National Statistics (ONS) published the first batch of results from new census questions on sexual orientation and gender identity. The addition of these questions to the 2021 census of England and Wales marked the first time a national census asked about sexual orientation (though Malta, New Zealand and other countries have since asked or plan to ask about it). It was also among the first to ask about gender identity, following countries including Canada, Nepal, Pakistan and India.

The most surprising result from the UK census was how closely data aligned with previous national surveys of lesbian, gay, bisexual, trans, and queer (LGBTQ) people. Around 1.5 million people aged 16 and older (3.2 percent) identified with a sexual orientation other than straight or heterosexual. For the gender identity question, around 262,000 people (0.5 percent) said the gender they identify with is not the same as their sex registered at birth.

While there’s much to celebrate about the recognition of some LGBTQ people in more national data collection exercises (more on who is still omitted later), warm words about “being counted” obscure some of the more dangerous trends at play. There’s a history of using the language of diversity and inclusion to prevent the introduction of progressive policies and legislation. A few weeks after the publication of census data, the UK Government’s Secretary of State for Scotland, Alister Jack, torpedoed legislation passed by the devolved Scottish Parliament to simplify the administrative process for trans people who wish to change the sex marker on their birth certificate. As justification for his actions, Jack cited abstract and unevidenced concerns about ‘the potential impact of the bill on women and girls’. No data exists to substantiate Jack’s claims, underscoring how governments and other policymakers pick and choose data to advance particular political agendas.

Who is missing?

The UK’s diversity industry is booming (think training and mentorship programs, charter marks and accreditation schemes, awards and league tables). Data on identity characteristics, whether from a census, staff records, or surveys, is the fuel that powers this suite of metrics, indicators, baselines, and targets. There’s increased interest across all sectors in “asking the right questions” and in creating detailed and disaggregated datasets. The thinking behind this is perhaps the belief that we can remedy historical injustices by translating messy problems into neat categories, or the view that we need more knowledge about something before action can follow. But despite this apparent turn to increased visibility for historically marginalized communities, inclusion is only partial. The LGBTQ people now counted in the UK census were invited to join a long-established system with existing rules as to “who counts” and who does not.

I have previously written about who was not counted in what ONS described as “the most inclusive ever” census, including non-binary people (who were required to identify as either ‘male’ or ‘female’), people under 16 years old, and individuals not “out” to others in their household. The appearance of inclusive action means that the organization responsible for delivering the census can present itself as having done something for LGBTQ communities. This has the effect of declawing more radical challenges to existing systems, such as whether the state should collect information about people’s sexual orientation and gender identity and whether a survey or census can ever present a comprehensive account of people’s identity characteristics.

Counting is not enough

Data is presented as a powerful tool in the fight against inequality, exclusion, and injustice. “We can’t fix what we don’t know,” as the saying goes. Whether organizations are attempting to address inequalities associated with poverty, race, disability, or geography, the promise to plug data gaps is a common feature in organizational action plans. In some situations, improving the coverage of existing datasets to include historically excluded communities is explicitly a prelude to further action. But when plugging data gaps is understood as an end in itself, this does not go far enough.

We need to depart from the assumption that more or better data about those previously left behind is a prerequisite for change and that—after reaching data saturation—collecting more data is the best use of limited resources.

As the recent census results illustrate, in many UK industries and sectors, we already have enough knowledge and understanding of the problems that face many LGBTQ people. Think about the 24 percent of young people experiencing homelessness who identify as LGBT+, the 50 percent of LGBTQ people with experience of depression, or the 93 percent of trans people who report that media transphobia contributed to their experiences of transphobia from strangers on the street. Once enough is known about a problem, collecting more data and publishing more statistics does nothing to resolve these injustices. Diversity data is not intrinsically good or bad.

Despite its potential for good, when data is only used to describe problems, it can stall meaningful action and support the interests of those invested in the status quo.

Data practices as a new frontier for diversity and inclusion

Tweaking the design of data practices and systems is a new frontier for diversity and inclusion, whether it’s changing a list of response options or reconfiguring digital architecture to not presume a default user is cisgender and straight (or male, white, non-disabled and affluent). But reforming data practices is not enough when they continue to exclude the most minoritized in minority groups or fail to translate into meaningful action.

Talking about “diversity and inclusion” can sometimes serve as a smokescreen for ulterior motives, particularly when a gap exists between data subjects, producers, and users—groups that are not always equally invested in using data to change the world for the better. There’s a missing step between using data to describe a problem and using data to inform actions that respond to the problem described. The existence of detailed datasets does not necessarily change the thinking of those responsible for making decisions that impact LGBTQ lives. Decisions about who to count, what to count, and how to count are not value-neutral but bring to life a particular vision of the world around us. Data should reflect the vibrancy, messiness, and ups and downs of our experiences, not define who we are or how we live our lives. 
