The Carbon Cost of Large Language Models Is an Ethics Problem

Training a single large language model emits an estimated 300 to 500 metric tons of CO2, which at the low end of that range is roughly the lifetime emissions of five average cars. Running inference on GPT-4 class models at scale consumes an estimated 1.5 to 3 GWh annually per major deployment. The environmental cost of large language models is not an externality to be dismissed. It is an ethical obligation to be managed.

Why is the carbon cost of LLMs an ethics problem rather than just an operational concern?

The carbon cost of large language models is an ethics problem because the environmental burden is spread across the globe while the benefits accrue to specific organizations and users. That mismatch between who pays and who gains mirrors other environmental justice issues.

I calculated the carbon footprint for two AI deployments I manage. For one of them, a customer-facing LLM application running on GPU clusters, inference alone consumed approximately 840 MWh annually. At the average US grid carbon intensity, that is roughly 370 metric tons of CO2 per year. For one application. The environmental cost is real, measurable, and almost never included in the project’s cost accounting.
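
The arithmetic behind that figure is simple enough to sketch. The snippet below is a rough illustration, not a measurement tool: the function name and constants are hypothetical, and the grid intensity of 0.44 kg CO2 per kWh is an assumption back-derived from the two numbers above (370 tons divided by 840 MWh), not an official value. Substitute the intensity published for your own region and year.

```python
# Back-of-the-envelope conversion from annual energy use to annual emissions.
# All names and constants here are illustrative assumptions.

def annual_emissions_tonnes(energy_mwh: float, intensity_kg_per_kwh: float) -> float:
    """Return metric tons of CO2 for a given energy use and grid intensity."""
    energy_kwh = energy_mwh * 1_000                   # 1 MWh = 1,000 kWh
    emissions_kg = energy_kwh * intensity_kg_per_kwh  # kg of CO2
    return emissions_kg / 1_000                       # 1 metric ton = 1,000 kg


# Assumption: implied by the article's figures (370 t / 840 MWh), not an official number.
ASSUMED_GRID_INTENSITY_KG_PER_KWH = 0.44
INFERENCE_ENERGY_MWH = 840.0

tonnes = annual_emissions_tonnes(INFERENCE_ENERGY_MWH, ASSUMED_GRID_INTENSITY_KG_PER_KWH)
print(f"{tonnes:.0f} metric tons of CO2 per year")
```

Because the result scales linearly with grid intensity, running the same workload on a cleaner grid lowers the estimate proportionally, which is one reason the choice of region matters.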

The ethical dimension emerges when you ask who bears the cost. The CO2 is emitted globally. The climate effects fall disproportionately on populations in the Global South, who are the least likely to benefit from the AI systems generating the emissions. This is the structure of an environmental justice problem, and the AI industry has not confronted it with anything approaching the seriousness it deserves. The FinOps conversation about AI costs is incomplete if it excludes the environmental dimension.

I am not arguing that LLMs should not exist. I am arguing that their environmental cost should be measured, reported, and actively reduced as an ethical obligation, not ignored as an externality. A model that costs $2 million to train and emits 400 tons of CO2 should report both numbers. Engineers making model selection decisions should have access to both cost dimensions. Currently, they almost never do.
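
One way to put both numbers in front of engineers is to carry them in the same record. The sketch below is a hypothetical illustration, not an established reporting format: the `TrainingFootprint` class, its field names, and the model name are all invented for this example, and the figures are the ones from the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingFootprint:
    """Hypothetical record pairing financial and carbon cost for one model."""

    model_name: str
    training_cost_usd: float
    training_emissions_t: float  # metric tons of CO2

    def kg_co2_per_usd(self) -> float:
        """Kilograms of CO2 emitted per dollar of training spend."""
        return self.training_emissions_t * 1_000 / self.training_cost_usd

    def summary(self) -> str:
        """Render both cost dimensions in a single line."""
        return (
            f"{self.model_name}: ${self.training_cost_usd:,.0f} to train, "
            f"{self.training_emissions_t:,.0f} t CO2 "
            f"({self.kg_co2_per_usd():.2f} kg CO2 per USD)"
        )


# Figures taken from the example in the text; the model name is a placeholder.
example = TrainingFootprint("example-model", 2_000_000, 400)
print(example.summary())
```

Keeping the two values in one structure means a model comparison cannot show the dollar figure without the emissions figure sitting next to it.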

According to the International Energy Agency’s 2024 report, data center electricity consumption is growing at 20-30% annually, driven significantly by AI workloads. This trajectory is not sustainable without either massive clean energy investment or significant improvements in model efficiency. The engineers building these systems bear some responsibility for choosing efficient architectures and questioning whether every use case truly requires the largest available model.