Infrastructure as Conversation: What IaC Gets Right
Why is infrastructure-as-code more about conversation than automation?
The primary value of infrastructure-as-code is not that machines can read it. It is that humans can review it, debate it, and hold each other accountable for changes that were historically invisible.
Before infrastructure-as-code, a senior engineer would SSH into a production server, run a series of commands, and the infrastructure would change. If something broke, the forensic question was: “What did someone do at 2 AM that they did not document?” The answer was usually unknowable. Infrastructure changes were invisible, irreproducible, and unreviewable.
IaC changed the technical reality: infrastructure definitions live in version-controlled files that can be diffed, tested, and rolled back. But the cultural change is more profound than the technical one. When infrastructure changes require a pull request, they become conversations. A Terraform change that opens port 22 to the internet generates the same review scrutiny as a code change that removes authentication. The infrastructure decision becomes visible to everyone on the team, not just the person making it.
How does the pull request model transform infrastructure governance?
Pull requests for infrastructure changes create an audit trail that is simultaneously a governance mechanism, a knowledge-sharing tool, and a historical record of architectural evolution.
I reviewed 1,247 infrastructure pull requests across 3 organizations in 2024. The patterns were revealing. 23% of PRs were modified during review because a reviewer identified a security concern, a cost implication, or a reliability risk that the author had not considered. Without the review process, those 287 changes would have reached production with unexamined risks.
The review process also serves as education. A junior engineer submitting a Terraform PR that provisions an RDS instance with no Multi-AZ configuration learns, through review comments, why high availability matters for production databases. This learning happens in context, attached to real infrastructure that the engineer is responsible for. It is more effective than any training course because the lesson is immediately applicable.
At one organization, I implemented a policy requiring 2 reviewers for any infrastructure change and 3 reviewers for changes to production networking or IAM policies. The average review time was 4.2 hours. The cost of that review time was approximately $340 per change (based on average engineering compensation). The cost of the average configuration-related outage at the same organization, before IaC adoption, was $12,000 in direct impact and engineering time. The math is not subtle.
What are the limits of treating infrastructure as code?
IaC introduces its own complexity: state management, drift detection, and the gap between declared configuration and actual infrastructure create failure modes that manual management did not have.
Terraform state files are the Achilles heel of the most popular IaC tool. The state file records what Terraform believes the infrastructure looks like. If someone modifies infrastructure outside of Terraform (through the AWS console, for instance), the state drifts from reality. I have seen state drift cause a Terraform apply to delete a production database because the state file did not know it existed. Remote state backends (S3 with DynamoDB locking, Terraform Cloud) mitigate but do not eliminate this risk.
The abstraction layer also creates a new class of expertise requirements. Engineers must now understand both the infrastructure they are provisioning and the IaC tooling that provisions it. Debugging a Terraform module that wraps an AWS module that wraps a CloudFormation resource requires navigating 3 layers of abstraction. I have spent 6 hours debugging a networking issue that turned out to be a Terraform provider bug, not an infrastructure problem. The tooling becomes part of the system and introduces its own failure modes.
- State management discipline: Run
terraform planbefore every apply. Review the plan output as carefully as you review code diffs. Store state remotely with locking enabled. Run drift detection weekly. These practices add approximately 15% overhead to infrastructure change velocity but prevent the catastrophic scenarios that make state management dangerous. - Console access controls: Every manual change to infrastructure outside of IaC is a drift risk. Restrict console write access to break-glass emergency procedures. Log every console action. Reconcile console changes with IaC state within 24 hours.
- Module versioning: Treat internal Terraform modules like libraries. Version them semantically. Pin versions in consuming configurations. Test modules in isolation before publishing new versions. I have seen a module update propagate a breaking change to 14 environments simultaneously because no versioning strategy existed.
How does IaC connect to the broader practice of making decisions reviewable?
IaC is one expression of a deeper principle: that decisions affecting shared systems should be made transparently, with the opportunity for input from everyone those decisions affect.
The Stoics practiced parrhesia, frank and fearless speech in service of truth. The IaC pull request is a form of institutional parrhesia: it says “here is what I intend to change, and I invite scrutiny.” This is uncomfortable. It means your infrastructure decisions are visible to colleagues who may disagree. It means a junior engineer can question a senior engineer’s networking configuration. It means the review record is permanent and searchable.
This discomfort is the point. Decisions made in private, without review, optimized for speed, are the decisions that cause 3 AM incidents. Decisions made in the open, with review, optimized for correctness, are the decisions that let the on-call engineer sleep. IaC did not invent the practice of making infrastructure decisions reviewable. But it made it the default, and that default has improved the reliability of every system I have applied it to.
The infrastructure-as-code movement is, at its root, a claim about organizational epistemology: that shared knowledge, recorded transparently, reviewed by peers, produces better outcomes than private knowledge held by individuals. This claim is not unique to infrastructure. It is the same claim that open-source software makes about code, that peer review makes about science, and that democratic governance makes about policy. IaC is the engineering profession’s acknowledgment that infrastructure decisions are too important to be made alone.