The Practitioner’s Guide to Data Ethics

By Michelle Seng Ah Lee, AI Ethics Lead at Deloitte, PhD student on algorithmic fairness at the University of Cambridge, and DataKind UK Ethics Committee member.

Today, we are launching the Practitioner’s Guide to Data Ethics, prepared by DataKind UK volunteers.

Why did we run an Ethics DataDive?

We knew there were many tools available for data scientists and developers to use to embed ethics into their models, but we wanted a resource summarising what is out there. Our aim was to complete an intensive, expert-driven, exploratory analysis of the current open-source tool landscape in algorithmic ethics in just one day (learn more about how we did this at the end of this blog)!

The overall objective was to explore the existing open-source tools that seek to assist data science practitioners with ethical challenges. Each group of participants focused on one of five topics in algorithmic ethics that were selected by the DataKind UK Ethics Committee:

  1. Checklists

  2. Communication Strategies

  3. Explanation

  4. Fairness

  5. Natural Language Processing

We asked the participants to find toolkits in their topic area, score them on a number of metrics of functionality and user-friendliness, and write down their assessments.

What were the key findings?

In each section of the report, you can find breakdowns of the tools, including which were top-rated for functionality and user-friendliness, and which were rated best for technical experts, non-technical beginners, and beginner data scientists. There is also a scorecard rating each tool out of five for attributes such as being user-friendly, scalable, and easy to integrate with other systems, along with overall pros and cons to help you compare the tools. The participants also discussed trends and gaps that they spotted in each topic area.

Here are a few highlights:

Fairness

  • Gap between real-life considerations and the use cases presented in an academic vacuum

  • Gap between practitioners’ limited education on what is essential in fairness evaluation and the expertise the tools assume

  • Lack of consistency in methodology — wildly different tools, approaches, techniques

  • Lack of tools tackling end-to-end fairness

  • Lack of regression implementation vs. academic theory

The Fairness group looked at six tools, the highest number of any group. A common trend was a gap between real-life considerations and the use cases that academics presented for the tools. This was also reflected in a lack of regression implementation compared to the academic theory. There is also a gap between what practitioners have been taught is essential in fairness evaluation and the level of expertise the tools assume people have. There was little consistency between tools, with a huge variance in their methodology, approaches, and techniques. They also lacked ways to tackle end-to-end fairness, generally being designed for specific points in a data pipeline.

Ethics Committee volunteer Michelle Seng Ah Lee, who led this group, took the lessons from the event to publish a paper on the Landscape and Gaps in Fairness Toolkits.

Natural Language Processing (NLP)

  • The natural language processing toolkits were judged too nascent and not robust enough to be implemented in a developer’s workflow without major modifications

  • There is a lack of model-agnostic, NLP-focused packages or libraries with decent documentation on detecting and removing bias

  • Open source tools not robust or standardised or well-documented

There were three NLP tools, and the overall conclusion was that they were too nascent and lacked the robustness needed to be included in a developer’s workflow without major modifications. There is also a lack of NLP-focused packages or libraries with decent documentation around detecting and removing bias that is model agnostic. Finally, the open source tools were not standardised or well-documented.

Checklists

  • Ethics needs to be an iterative process, but many of the checklists are one-off.

  • Limited calls to action / clarity.

The group assessing Ethics guidelines and checklists had five to choose from. However, ethics needs to be iterative, and a lot of them were one-off workflows. They also had quite limited calls to action, and didn’t provide a lot of clarity for how to implement next steps.

Communicating ethics

  • Existing tools are focused on the US.

  • The tools we looked at required some familiarity with machine learning models and might be intimidating for complete newcomers.

  • The tools we examined were useful, but would need to be complemented with other materials in order to persuade audiences that ethics is a necessary part of development and needs to be embedded throughout a product life cycle.

There were only two tools in the Communicating ethics group, and they were quite US-centric. They required familiarity with machine learning models, making them intimidating for newcomers. They would also need to be complemented with other materials to really persuade audiences that ethics is a necessary part of development that needs to be embedded throughout a product’s lifecycle.

Explanation

  • Tools are often broken and unmaintained

  • Explainability needs to be built-in and become core to existing libraries with less separation, as it is currently a stand-alone function

  • Usability level varies between open source and commercialised products

When assessing the five Explanation tools, the group found they were often broken and unmaintained. They were also treated as stand-alone functions, rather than being built in as core parts of existing libraries. The usability varied a lot between open source and commercialised products, with commercial organisations able to put resources behind an accessible user interface.


How can I contribute?

The conclusion we draw from this project is that there is plenty of work to do towards making ethics something that can be easily embedded into data science! We hope this can be a living document that is updated as the open-source ethics toolkit landscape matures over time. You can access the GitHub page here. Please feel free to add change requests, and we will review them.

How we curated the event

The participants were recruited through the DataKind UK mailing list, with several targeted invitations to individuals actively engaged in algorithmic ethics. While the event was initially planned to be in person, a physical meeting was not possible due to COVID-19 restrictions. Because the participants were already familiar with the relevant literature and debates, curating this group (rather than randomly sampling) allowed for a more rapid and in-depth assessment of the toolkits without the need for initial preparation or training on relevant material. Once recruited, participants were split into sub-groups, assigned by prioritising the preferences they stated in their registration form while maintaining a fairly even split in numbers among the groups.

The Ethics Committee members who organised the event are Michelle Seng Ah Lee, Stef Garasto, Laura Carter, Ruby Childs, Nick Sorros, and Frankie Garcia.

We’d also like to say a huge thank you to all of the participants who attended on the day: Paolo Zoccante, Animesh Chaturvedi, Diego Arenas, Jat Singh, Jennifer Stirrup, Jo Watts, and Adam Hill. Some participants did not give permission for their names to be used.
