Philosophy

The Trolley Problem Is the Wrong Framework for AI Ethics

· 4 min read · Updated Mar 11, 2026
The trolley problem has dominated AI ethics discourse since at least 2016, when MIT’s Moral Machine experiment launched and went on to collect 40 million decisions from participants in 233 countries and territories. Yet the trolley problem was designed to test intuitions about individual moral agents facing binary choices. AI systems are sociotechnical systems embedded in institutions, markets, and power structures. Applying the wrong framework does not just produce wrong answers. It prevents the right questions from being asked.

Why is the trolley problem the wrong framework for AI ethics?

The trolley problem assumes a single agent, two clear options, perfect information, and an immediate outcome. AI systems operate under none of these conditions. The framework’s simplicity is not a feature. It is a distortion.

The trolley problem, introduced by Philippa Foot in 1967 and refined by Judith Jarvis Thomson, is a thought experiment in which a person must choose between allowing five people to die or diverting a trolley to kill one person. It tests intuitions about the moral distinction between action and inaction.

I have sat in 3 separate corporate ethics workshops where the facilitator opened with the trolley problem. Each time, the room debated enthusiastically. Each time, the debate produced nothing applicable to the actual ethical challenges the engineering team faced. The team was not choosing between two outcomes. They were designing systems whose effects would compound across millions of users over years, under conditions of radical uncertainty.

The trolley problem presupposes what philosophers call a “thin” moral scenario: stripped of context, history, institutional structure, and feedback loops. Real AI ethics operates in “thick” scenarios where the moral weight is distributed across hundreds of decisions made by dozens of people over months. No one person pulls the lever. The lever does not exist.

What happens when we apply individual moral frameworks to institutional systems?

We get moral theater. Organizations perform the appearance of ethical reasoning while the structural conditions that produce harm remain untouched.

Hannah Arendt observed that the most dangerous moral failures are not dramatic choices but bureaucratic routines. The engineer who selects a training dataset is not facing a trolley problem. They are making a technical decision within institutional constraints, under time pressure, with imperfect knowledge of downstream effects. The banality of algorithmic harm is precisely that it emerges from ordinary decisions, not dramatic ones.

According to a Stanford Encyclopedia of Philosophy analysis, the trolley problem has generated over 2,500 academic papers. The volume of scholarship has not produced a single operational framework for governing AI systems. This is not a failure of effort. It is a failure of framing.

When I reviewed the ethics documentation for 4 AI products at different organizations, I found trolley-adjacent language in all of them: “when the system must choose between competing outcomes.” But in none of the 4 cases was the system actually choosing between discrete outcomes. In every case, the system was producing probability distributions that shaped downstream human decisions. The moral weight was diffused across a pipeline, not concentrated at a junction.
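To make that contrast concrete, here is a minimal sketch in Python of the pipeline shape I kept encountering. All names, thresholds, and logic are illustrative assumptions, not drawn from any of the 4 products. Notice that nothing in it “chooses between outcomes”: the model emits a probability, a threshold someone set in a planning meeting routes it, and a human works a queue under time pressure.

```python
# Illustrative pipeline sketch: no component faces a trolley-style choice.
# All names and values are hypothetical.

from dataclasses import dataclass

@dataclass
class Application:
    applicant_id: str
    risk_score: float  # model output: a probability, not a decision

def route(app: Application, threshold: float = 0.7) -> str:
    """The 'decision' is diffused across the pipeline: the model emitted
    a score, someone chose the threshold months earlier, and a reviewer
    will eventually work the queue. No single lever exists."""
    return "manual_review" if app.risk_score >= threshold else "auto_approve"

apps = [Application("a-1", 0.82), Application("a-2", 0.31)]
review_queue = [a for a in apps if route(a) == "manual_review"]
print([a.applicant_id for a in review_queue])  # -> ['a-1']
```

The moral weight lives in the training data behind the score, the threshold, and the staffing of the queue, which is exactly why trolley language finds nothing to grip.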

What frameworks should replace the trolley problem?

Systems ethics, institutional design, and care ethics offer frameworks that match the actual structure of AI harm: distributed, cumulative, and embedded in organizational routines.

  • Systems ethics: Instead of asking “what should the algorithm decide?”, ask “what are the feedback loops, incentive structures, and power dynamics that shape how this system affects people?” This is the approach I use when conducting architecture reviews with ethical dimensions.
  • Institutional design: Ethics is not a property of individual decisions but of the institutions that structure decision-making. The question is not “did the engineer make the right choice?” but “does the organization create conditions where ethical outcomes are likely?”
  • Care ethics: Developed by Carol Gilligan and Nel Noddings, care ethics shifts attention from abstract principles to concrete relationships of responsibility. Who is affected by this system? What do they need? How do we remain accountable to them over time?
  • Virtue ethics at the institutional level: Rather than asking what the algorithm should do, ask what kind of organization we are becoming through the systems we build. This connects directly to Aristotelian thinking about character formation.

Why does the framing of the question matter so much?

The framework you choose determines the questions you ask. The wrong framework does not merely produce wrong answers. It prevents the right questions from ever being formulated.

Ludwig Wittgenstein argued that philosophical problems often dissolve when you examine the language that created them. The “AI ethics problem” as framed by the trolley problem is a pseudo-problem. It asks “how should AI choose?” when the real question is “how should organizations govern the systems they deploy?” These are fundamentally different questions requiring fundamentally different methods.

I spent 6 months working with a team that had been paralyzed by an “ethical AI” initiative. They had formed a committee, read papers, debated scenarios. They had not changed a single line of code or a single organizational process. The trolley problem framework had given them a way to feel ethical without being ethical. When we shifted to systems ethics, asking “what are the 5 points in our pipeline where bias could enter, and what detection mechanisms exist at each point?”, they produced actionable changes within 3 weeks.
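To give a flavor of what such a detection mechanism can look like, here is a minimal sketch in plain Python. The metric, threshold, and stage names are my illustrative assumptions, not the team’s actual code: a parity check that runs at a named pipeline stage and fails loudly instead of letting a gap diffuse silently into downstream decisions.

```python
# A minimal bias checkpoint (names and threshold are illustrative
# assumptions, not the team's actual mechanism).

def demographic_parity_gap(predictions, groups) -> float:
    """Largest difference in positive-prediction rate between any two groups."""
    counts = {}
    for pred, group in zip(predictions, groups):
        n_pos, n = counts.get(group, (0, 0))
        counts[group] = (n_pos + (pred == 1), n + 1)
    rates = [n_pos / n for n_pos, n in counts.values()]
    return max(rates) - min(rates)

def checkpoint(stage: str, predictions, groups, max_gap: float = 0.05):
    """One of several named pipeline checkpoints: raises at the stage
    where the gap appears, rather than leaving it to be discovered
    downstream after decisions have compounded."""
    gap = demographic_parity_gap(predictions, groups)
    if gap > max_gap:
        raise RuntimeError(f"{stage}: parity gap {gap:.3f} exceeds {max_gap}")

# e.g. run after model scoring, before routing; passes when rates match
checkpoint("post-scoring", predictions=[1, 0, 1, 0], groups=["a", "a", "b", "b"])
```

The specific metric matters less than the structure: each check lives at a named stage with a named owner, so responsibility is located rather than diffused.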

“The question is not what the algorithm should decide. The question is what kind of institution you are building around the algorithm.”

The trolley problem is a beautiful thought experiment for teaching undergraduates about moral intuitions. It has no place in the governance of sociotechnical systems. AI ethics needs frameworks that match the structure of AI harm: distributed, institutional, cumulative, and embedded in the ordinary routines of engineering work. The lever is not the problem. The tracks are. And we built the tracks.