When Building AI Fails: The Incident Response Playbook Nobody Has
- James W.
- 6 days ago
- 8 min read

*Your building's AI will make a bad decision. The question isn't whether — it's whether you'll know what to do when it happens.*
---
The 3 AM Call Nobody Planned For
It's a Tuesday at 3:14 AM. Your building management system's AI-driven HVAC optimization algorithm decides that a pharmaceutical cold storage facility on the 14th floor doesn't need cooling. The algorithm's logic is sound by its own metrics — occupancy sensors show zero humans, energy prices are peaking, and the model's cost-optimization function is doing exactly what it was trained to do.
By 6 AM, $2.3 million in temperature-sensitive biologics are compromised.
The facility manager gets the call. She checks the BMS dashboard. Everything shows green — because the AI achieved its optimization target. The system didn't fail. It succeeded at the wrong objective.
Now what?
She calls the BMS vendor. They check the algorithm logs — but there are no decision logs. The AI made 847 optimization decisions overnight. Nobody can tell her which one triggered the cooling shutdown, what data it used, or why the cold storage exception wasn't honored.
This isn't a hypothetical. Variations of this scenario play out across commercial real estate every month. The difference between a $2.3 million loss and a caught-in-time near-miss is almost never better AI. It's whether anyone had a playbook for when AI decisions go wrong.
Almost nobody does.
Recent Case Study: Building AI Failure in Action
In August 2023, a high-rise office building suffered a critical incident when its AI-driven energy management system prioritized energy savings over tenant comfort. Employees reported extreme temperatures, and parts of the building evacuated spontaneously. The facility team scrambled to override the AI manually, but the override process was cumbersome and slow, prolonging the disruption and frustrating tenants.
The post-incident analysis found two gaps: inadequate decision logging and no predefined escalation process. The incident was a wake-up call for stakeholders: AI systems need a comprehensive incident response plan, just as cybersecurity breaches do.
Why Traditional Incident Response Doesn't Work for AI
If you manage commercial buildings, you probably have incident response procedures. Fire. Flood. Power failure. Elevator entrapment. Active threat. These are well-rehearsed, code-mandated, insurance-required.
But AI incidents are fundamentally different from physical infrastructure failures, and the playbooks don't translate.
Physical failures are observable. A pipe bursts — you can see the water. An elevator stops — you can hear the alarm. AI failures are invisible. The system continues operating normally by every metric it tracks. The HVAC is running. The access control is functioning. The occupancy sensors are reporting. The AI simply made a decision that optimized for the wrong thing, and nothing in the system flags that as a problem.
Physical failures have clear causation. The pipe burst because of freezing temperatures. The elevator stopped because of a mechanical fault. AI decisions emerge from the interaction of training data, model weights, real-time inputs, and optimization objectives. When an AI-driven system makes a bad decision, the cause isn't a single broken component — it's the emergent behavior of a complex system. Tracing causation requires decision logs that most building AI systems don't generate.
Physical failures have established liability chains. The plumbing contractor is responsible for pipe failures within warranty. The elevator maintenance company carries liability per the service agreement. AI decision liability is currently a void. When the HVAC optimization algorithm shuts down cooling to a pharmaceutical tenant, who's liable? The BMS vendor who built the algorithm? The systems integrator who configured it? The building owner who approved the optimization targets? The facility manager who didn't set up cold storage exceptions? The answer, in most buildings today, is: nobody knows.
Physical failures trigger automatic responses. Sprinklers activate. Emergency lighting engages. Backup generators start. AI incidents trigger nothing — because the system doesn't know it made a mistake.
The AIRS Framework: What an AI Incident Response System Actually Looks Like
Cognitive Corp's AI Incident Response System (AIRS) was developed specifically for this gap. It's not a technology product — it's a governance framework that tells you what to build, what to document, and what to rehearse before your building's AI makes a costly mistake.
AIRS operates across four phases:
Phase 1: Detection — Knowing Something Went Wrong
The hardest part of AI incident response is recognizing that an incident occurred. Unlike a fire alarm, there's no siren when an AI makes a bad decision.
Detection requires three capabilities most buildings lack:
Decision logging. Every AI-driven system decision above a defined risk threshold must generate a log entry: what decision was made, what data triggered it, what alternatives were considered, what the expected outcome was. Without decision logs, you can't investigate an incident — you can only observe the damage.
Outcome monitoring. A separate system must track whether AI decisions produced the expected outcomes. If the HVAC optimization algorithm predicted a 12% energy reduction and cooling was maintained, did that actually happen? Outcome monitoring catches the cases where the AI "succeeded" by its own metrics but failed by human standards.
Anomaly boundaries. Pre-defined thresholds that trigger human review regardless of what the AI's own metrics say. If any zone drops below a temperature floor, or access control denies more than N entries in an hour, or occupancy predictions diverge from actuals by more than a set percentage — flag it. These aren't AI-generated alerts. They're human-defined guardrails.
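A minimal sketch of what these three capabilities might look like in code. Every name, field, and threshold here is illustrative, not taken from any vendor's API; the point is the shape of the data and the checks, not the specifics:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class DecisionLogEntry:
    """Decision logging: one record per AI decision above the risk threshold."""
    system: str             # which AI made the call, e.g. "hvac_optimizer"
    decision: str           # what was decided
    inputs: dict            # data that triggered it
    alternatives: list      # options the model considered
    expected_outcome: dict  # what the AI predicted would happen
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def outcome_matches(expected: dict, actual: dict, tolerance: float = 0.5) -> bool:
    """Outcome monitoring: did the decision deliver what the AI predicted?

    Runs separately from the AI's own metrics, comparing predicted values
    against measured ones. A missing measurement counts as a mismatch.
    """
    return all(
        abs(actual.get(key, float("inf")) - value) <= tolerance
        for key, value in expected.items()
    )

def check_boundaries(readings: dict) -> list:
    """Anomaly boundaries: human-defined guardrails that flag for review
    regardless of what the AI's own metrics say."""
    flags = []
    temp = readings.get("cold_storage_temp_c")
    if temp is not None and not 2.0 <= temp <= 8.0:
        flags.append(f"cold storage out of 2-8 C range: {temp} C")
    if readings.get("access_denials_last_hour", 0) > 50:
        flags.append("access control denial spike")
    return flags
```

Note that `check_boundaries` knows nothing about the optimization algorithm. That separation is the point: the guardrail fires on the measured state of the building, so it catches the case where the AI "succeeded" at the wrong objective.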
Phase 2: Triage — How Bad Is It?
Once an AI incident is detected, triage determines the response level. AIRS uses a four-tier severity model aligned to the Building Constitution's seven principles and modeled on established cybersecurity incident-response frameworks such as NIST's:
Tier 1 (Safety/Life Safety): AI decision directly affects occupant safety. Examples: HVAC serving a healthcare facility, access control during emergency evacuation, fire system interactions. Response: immediate human override, full system quarantine, incident commander activation.
Tier 2 (Regulatory/Compliance): AI decision creates regulatory exposure. Examples: energy reporting violations, accessibility standard breaches, environmental compliance gaps. Response: system pause, compliance officer notification, documentation preservation.
Tier 3 (Financial/Operational): AI decision causes measurable financial or operational harm. Examples: the pharmaceutical cold storage scenario, energy cost spikes from bad optimization, tenant service disruptions. Response: targeted system adjustment, stakeholder notification, loss quantification.
Tier 4 (Performance/Optimization): AI decision underperforms but causes no direct harm. Examples: slightly suboptimal energy usage, marginal scheduling inefficiencies. Response: logged for review, addressed in next governance cycle.
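The tier model is deliberately simple enough to encode as a decision rule. A sketch, using hypothetical incident attributes (the field names are illustrative, not part of the AIRS framework itself):

```python
def triage(incident: dict) -> int:
    """Map an AI incident to an AIRS severity tier (1 = most severe).

    Checks run in strict priority order: life safety first, then
    regulatory exposure, then financial/operational harm; anything
    left is a performance-only Tier 4.
    """
    if incident.get("affects_life_safety"):
        return 1  # immediate override, quarantine, incident commander
    if incident.get("regulatory_exposure"):
        return 2  # system pause, compliance officer, preserve docs
    if incident.get("financial_loss_usd", 0) > 0 or incident.get("tenant_disruption"):
        return 3  # targeted adjustment, notification, loss quantification
    return 4      # logged for review in next governance cycle
```

The ordering matters: an incident that is both a regulatory breach and a financial loss should rank as Tier 2, because the more severe check wins.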
Most buildings can't even do this triage — because they don't have the decision logs to determine what the AI decided or the outcome monitoring to measure the impact.
Phase 3: Response — What to Do
Each severity tier has a defined response protocol. The critical elements:
Human override capability. Every AI-driven system must have a documented, tested, accessible override mechanism. Not "contact the vendor's support line." A button, a switch, a command that the on-site team can execute at 3 AM without waiting for anyone's approval. Cognitive Corp's assessments consistently find that the vast majority of building AI systems lack documented override procedures that facility staff actually know how to execute.
Decision preservation. Before anyone starts fixing the problem, preserve the evidence. Decision logs, sensor data, configuration state, model version. This is the AI equivalent of not disturbing a crime scene. You'll need this data for root cause analysis, insurance claims, and regulatory reporting.
Stakeholder notification chain. Who gets told, in what order, with what information. The tenant whose pharmaceuticals were compromised. The insurance carrier. The building owner. Legal counsel. The BMS vendor. Each gets different information at different times — and the wrong notification sequence can create unnecessary liability.
Containment scope. AI decisions are interconnected in ways physical systems aren't. The HVAC optimization that shut down cooling to Floor 14 may have simultaneously adjusted ventilation on Floors 12-16. Containment must cover not just the obvious failure but the cascade of related decisions.
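Decision preservation in particular benefits from being automated, so nobody has to remember forensic discipline at 3 AM. A minimal sketch of an evidence snapshot, with a content hash for chain of custody (the function name, directory layout, and artifact keys are all assumptions for illustration):

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

def preserve_evidence(incident_id: str, artifacts: dict,
                      out_dir: str = "evidence") -> str:
    """Freeze decision logs, sensor data, config state, and model version
    to an immutable snapshot BEFORE remediation begins.

    Returns the snapshot path; the SHA-256 digest embedded in the filename
    lets anyone later verify the file was not altered.
    """
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    snapshot = {
        "incident_id": incident_id,
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "artifacts": artifacts,  # e.g. decision logs, sensor series, model version
    }
    payload = json.dumps(snapshot, sort_keys=True, indent=2)
    digest = hashlib.sha256(payload.encode()).hexdigest()
    path = Path(out_dir) / f"{incident_id}_{digest[:8]}.json"
    path.write_text(payload)
    return str(path)
```

Calling this as the first step of any Tier 1-3 response keeps the "crime scene" intact even when the fix itself overwrites configuration or retrains the model.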
Phase 4: Recovery and Learning — Making It Not Happen Again
The recovery phase is where AIRS diverges most sharply from traditional incident response. Physical system recovery means fixing the broken thing. AI incident recovery means understanding *why the system made that decision* and ensuring the governance framework prevents recurrence.
Root cause analysis for AI decisions requires examining the full decision chain: training data, model architecture, real-time inputs, optimization objectives, and — critically — what governance constraints were or weren't in place. The pharmaceutical cooling failure didn't happen because the algorithm was broken. It happened because nobody defined a governance constraint that said "cold storage cooling is non-negotiable regardless of energy optimization targets."
Governance gap remediation. Every AI incident reveals a governance gap. The AIRS recovery protocol requires documenting the specific gap and implementing the governance control that would have prevented the incident. This feeds directly into the Building Constitution's continuous improvement cycle.
Tabletop replay. Within 30 days of any Tier 1 or Tier 2 incident, AIRS requires a tabletop exercise where the team replays the incident with the new governance controls in place. Does the detection phase catch it earlier? Does the triage correctly severity-rank it? Does the response protocol work? This is how you build institutional muscle memory for AI incidents.
What Your Building's AIRS Score Looks Like Today
Cognitive Corp has assessed AI incident response readiness across commercial real estate portfolios using the BAGI framework. The results are sobering.
The average building scores between 15 and 25 on a 100-point governance scale. On the incident response dimension specifically, most score below 10. That means:
- No AI decision logging in place
- No outcome monitoring separate from the AI's own metrics
- No defined severity tiers for AI-specific incidents
- No documented human override procedures for AI systems
- No AI incident tabletop exercises conducted
This isn't because facility managers don't care about incident response. They do — for fires, floods, and power failures. It's because nobody told them that AI decisions in their building need incident response protocols too.
The vendors certainly didn't tell them. Why would they? Admitting that your AI system needs an incident response playbook is admitting that your AI system can fail in ways you can't predict. That's not a great sales pitch.
But it's the truth. And the buildings that acknowledge it — and prepare for it — are the ones that turn a $2.3 million loss into a near-miss that gets caught at 3:15 AM instead of 6:00 AM.
The Five Things to Do This Week
You don't need to build a complete AIRS implementation overnight. Start here:
1. Inventory your AI decision-makers. List every system in your building that makes autonomous or semi-autonomous decisions. HVAC optimization. Access control. Predictive maintenance scheduling. Occupancy-based lighting. Energy load balancing. Most facility managers discover 3-5x more AI decision points than they expected.
2. Ask your vendors for decision logs. For each AI system, ask the vendor: "Can you provide a log of every decision this system made yesterday, including the data inputs and alternatives considered?" The answer will tell you everything about your incident response readiness.
3. Define your non-negotiable boundaries. What are the outcomes that no AI optimization should ever compromise? Cold storage temperatures. Minimum ventilation rates in occupied spaces. Access control during emergency events. Write them down. These become your anomaly boundaries.
4. Document your override procedures. For each AI system: who can override it, how, and without waiting for whom? If the answer involves "call the vendor's support line," you don't have an override procedure — you have a hope.
5. Run a tabletop. Pick the pharmaceutical scenario — or your industry's equivalent — and walk through it with your team. What would you notice first? Who would you call? What would you do at 3 AM? The gaps will become immediately obvious.
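Steps 1 and 3 reinforce each other: the inventory tells you which systems need boundaries, and a written-down boundary list can be checked against it. A sketch, with entirely hypothetical system names and rules:

```python
# Step 1: inventory of autonomous decision-makers (illustrative names).
AI_SYSTEMS = ["hvac_optimizer", "access_control", "lighting_scheduler"]

# Step 3: non-negotiable boundaries, written down and version-controlled.
NON_NEGOTIABLES = [
    {"system": "hvac_optimizer", "rule": "cold storage stays within 2-8 C"},
    {"system": "hvac_optimizer", "rule": "minimum ventilation in occupied zones"},
    {"system": "access_control", "rule": "egress never blocked during evacuation"},
]

def uncovered_systems(systems: list, boundaries: list) -> list:
    """Systems with no written boundary: gaps in your guardrail coverage."""
    covered = {b["system"] for b in boundaries}
    return [s for s in systems if s not in covered]
```

Running the coverage check after every inventory update turns "write them down" from a one-time exercise into a standing control: any newly discovered AI decision point shows up as uncovered until someone defines its boundaries.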
The buildings that do this aren't the ones with the smartest AI. They're the ones that will survive when smart AI makes a dumb decision.
---
*James C. Waddell is President of Cognitive Corp and author of the AI Incident Response System (AIRS) framework. Cognitive Corp provides AI governance assessments for commercial real estate, including incident response readiness scoring and AIRS implementation planning.*
*→ Download the full AIRS Framework Guide: [link]*
*#AIGovernance #IncidentResponse #SmartBuildings #BuildingAI #CRE #AIPlaybook*
