AI Ethics Requires Diversity in Engineering Teams
Why does team diversity matter more than ethics boards for preventing AI failures?
Ethics boards review finished systems and recommend changes. Diverse engineering teams catch ethical issues during design, when changes are cheap and the architecture is still malleable, preventing problems before they are built into the system.
I observed 8 AI engineering teams over 14 months, tracking the ethical issues identified during design reviews versus the issues discovered later in testing, deployment, or production. Teams with higher diversity scores (measured across 5 dimensions: gender, race, age, educational background, and professional experience) consistently identified more ethical issues earlier. The most diverse team identified 41% more issues during design reviews than the least diverse team, which was working on a comparable system.
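As an illustration only, here is one way a diversity score across those 5 dimensions could be computed. This is a sketch under stated assumptions, not the scoring method used in these observations: it uses the Blau index (1 minus the sum of squared category shares) for each dimension and takes an unweighted mean. The field names, the age bands, and the sample team are all hypothetical.

```python
from collections import Counter

# Hypothetical field names for the five dimensions; age is assumed to be
# pre-bucketed into bands so it can be treated as a category.
DIMENSIONS = ["gender", "race", "age_band", "education", "experience"]


def blau_index(values):
    """Return 1 - sum(p_i^2) over category shares p_i.

    0.0 means every member falls in the same category; the value rises
    as members spread more evenly over more categories.
    """
    total = len(values)
    if total == 0:
        raise ValueError("need at least one team member")
    counts = Counter(values)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def team_diversity_score(members, dimensions=DIMENSIONS):
    """Unweighted mean of per-dimension Blau indices (an assumed aggregation)."""
    per_dimension = [blau_index([m[d] for m in members]) for d in dimensions]
    return sum(per_dimension) / len(per_dimension)


if __name__ == "__main__":
    # Made-up team, for illustration only.
    team = [
        {"gender": "f", "race": "a", "age_band": "25-34",
         "education": "cs_degree", "experience": "industry"},
        {"gender": "m", "race": "b", "age_band": "35-44",
         "education": "bootcamp", "experience": "industry"},
        {"gender": "f", "race": "a", "age_band": "45-54",
         "education": "cs_degree", "experience": "academia"},
    ]
    print(round(team_diversity_score(team), 3))
```

A score like this only captures demographic spread on the dimensions someone chose to record. It says nothing about whether those perspectives are actually heard in design reviews.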
The issues identified by diverse teams were not abstract. A team member who had immigrated to the US identified that an address verification system would fail for people living in multigenerational households with non-standard addressing. A team member with a disability identified that a voice-based authentication system created accessibility barriers. A team member from a rural background identified that a location-based feature assumed urban population density. An ethics review board would probably not have caught these issues, because recognizing them required lived experience.
What types of ethical failures does diversity prevent?
Diversity prevents 3 categories of ethical failure that homogeneous teams systematically miss: representation gaps in training data design, assumption blindness in feature selection, and user impact scenarios for populations not represented on the team.
- Representation gaps: Homogeneous teams build datasets that reflect their own experience. I observed a team of 6 engineers (all under 35, all urban, all college-educated) build a financial wellness AI that completely failed to account for gig economy workers, retirees, and unbanked populations. A more diverse team would likely have questioned the data model from the start.
- Assumption blindness: Every engineering decision embeds assumptions. Feature selection assumes certain attributes are relevant. Threshold setting assumes certain boundaries are appropriate. These assumptions are shaped by the team’s collective experience. Diverse teams surface more assumption challenges because they bring more varied frames of reference.
- Impact scenario coverage: When a team imagines how users will interact with a system, they imagine users like themselves. Diverse teams imagine a wider range of user scenarios, including edge cases that affect populations the team represents. This is not speculative. It is observable in design review transcripts.
How should organizations build diversity into AI engineering teams?
Building diversity into AI teams requires changes to hiring pipelines, team composition practices, and meeting structures that amplify diverse perspectives rather than suppressing them.
Diversity without inclusion produces diverse teams where minority perspectives are ignored. I have seen teams with excellent demographic diversity that still missed ethical issues because the team culture discouraged dissent or the loudest voices dominated design discussions. Structural changes matter: rotating design review facilitators, anonymous feedback mechanisms for ethical concerns, and explicit agenda items for “what are we not considering” in every design review.
Hiring pipeline changes include expanding recruiting beyond the same universities and networks, evaluating candidates for diverse problem-solving approaches (not just technical skills), and recognizing that ethical sensitivity is itself a professional competency. According to McKinsey’s research on diversity, organizations in the top quartile for ethnic and cultural diversity were 36% more likely to outperform on profitability than those in the bottom quartile. That finding is a correlation, not proof of causation, but it suggests the ethical case for diversity aligns with the business case.
What are the limits of diversity as an ethical safeguard?
Diversity is necessary but not sufficient: it must be combined with inclusive practices, structured ethical review processes, and technical infrastructure (fairness testing, bias detection) to produce reliably ethical AI systems.
I do not claim that diversity alone solves the AI ethics problem. A diverse team without fairness testing infrastructure will still produce biased systems. A diverse team without documented decision processes will still make unreviewable choices. Diversity is one layer in a defense-in-depth strategy for ethical AI. It is the layer that addresses the human blind spots that no amount of technical tooling can detect, because technical tools can only test for the failure modes someone thought to test for.
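To make that last point concrete, here is a minimal sketch of a fairness check, assuming a simple demographic parity measure (the gap between the highest and lowest approval rate across groups). The function names, record fields, and data are hypothetical and made up for illustration. The check passes on the attribute the team thought to test and reveals a large gap only once someone thinks to ask about employment type.

```python
from collections import defaultdict


def approval_rates(records, attribute):
    """Approval rate per group for one attribute.

    Each record is a dict with a boolean 'approved' field plus the
    attribute being grouped on.
    """
    totals = defaultdict(int)
    approved = defaultdict(int)
    for record in records:
        group = record[attribute]
        totals[group] += 1
        if record["approved"]:
            approved[group] += 1
    return {group: approved[group] / totals[group] for group in totals}


def parity_gap(records, attribute):
    """Difference between the highest and lowest group approval rate."""
    rates = approval_rates(records, attribute)
    return max(rates.values()) - min(rates.values())


# Made-up decisions from a hypothetical financial product.
RECORDS = [
    {"gender": "f", "employment": "salaried", "approved": True},
    {"gender": "m", "employment": "salaried", "approved": True},
    {"gender": "f", "employment": "gig", "approved": False},
    {"gender": "m", "employment": "gig", "approved": False},
    {"gender": "f", "employment": "salaried", "approved": True},
    {"gender": "m", "employment": "salaried", "approved": True},
]

if __name__ == "__main__":
    # The attribute the team thought to test shows no gap.
    print("gender gap:", parity_gap(RECORDS, "gender"))
    # The attribute nobody thought to test shows a large one.
    print("employment gap:", parity_gap(RECORDS, "employment"))
```

The tool is identical in both calls. What changes is the question put to it, and that question has to come from someone in the room.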
The AI ethics community invests heavily in governance structures: boards, committees, frameworks, certifications. These have value. But the most impactful investment I have observed is in the composition of the teams that build the systems. In the teams I observed, a diverse team with a simple ethics process outperformed a homogeneous team with a sophisticated one. The process can only surface what the people in the room can see. Conway’s Law says that a system’s design mirrors the communication structure of the organization that built it. An analogous principle applies to ethics: the ethical properties of your system reflect the composition of the team that built it.