Research · Working Paper · February 2026
System Decision Mapping
A justification engine for AI progress in regulated systems.
AI capability evolves rapidly, while autonomy decisions in regulated systems are typically made early and changed cautiously. As a result, many organizations default to limiting AI use — not because models are incapable, but because expanding autonomy requires evidence that is difficult to produce and defend.
System Decision Mapping proposes an approach for organizations that want to revisit those boundaries over time. It focuses on generating decision-level evidence that can support incremental AI enablement — without changing regulatory accountability.
The approach combines two elements: a structured inventory of system decisions informed by governing compliance standards, and a design pattern that treats autonomy as a configurable property rather than a permanent architectural choice.
Target Audience: People designing, building, or governing AI-enabled enterprise systems in regulated environments.
The problem this solves: Enterprises can't justify expanding AI autonomy over time because no one has inventoried which decisions are constrained by regulation, which are eligible for AI ownership, and how those boundaries shift as model capabilities improve. This is that inventory, plus a design pattern for making the boundaries configurable.
What This Is
A staircase, not a guardrail
Most AI governance work asks: "Should AI do this?" System Decision Mapping asks: "Which decisions does this system actually make, which ones are structurally eligible for AI ownership today, and how does that change as models improve?"
That inversion matters. Governance without a decision inventory is policy theater. Governance with one becomes evidence — the kind regulators can evaluate, auditors can review, and architects can act on.
If you're familiar with governance frameworks like NIST AI RMF: this work doesn't replace them — it starts where they stop.
Before you can govern AI decisions, you have to know where they are. System Decision Mapping extracts them from real codebases — consistently, before reading a line of business logic.
Nine evaluation questions surface which decisions are safe to automate today, which require human approval, and which are regulatory hard floors. The governing standard predicts the structure.
The output is not a report — it's a design pattern. Decisions become configurable units. Autonomy is a property that can be upgraded with evidence, not a one-time architectural commitment.
Two Core Capabilities
System Decision Mapping delivers two things that are distinct but share the same foundation.
Decisions are stable. AI capability changes quickly. The missing piece is a measurement that lets organizations justify upgrading AI autonomy over time without renegotiating regulatory accountability. That's what this produces.
The capability is not the bottleneck. The justification is. Regulators don't block AI because they distrust the technology — they block it because there's no documented, auditable basis for the autonomy decision. This system produces that documentation.
Four Emergent Decision Types
These types weren't designed in advance — they emerged from analyzing real production codebases. Every decision point extracted from a codebase falls into one of these four categories.
The Nine Questions
Applied to every decision point extracted from a codebase. Questions are weighted — reversibility and regulatory floor carry the most signal.
Weighting note: Questions are not equal. Reversibility (Q1) and regulatory requirement (Q3) function as hard floors — a single "yes" on either overrides all other signals. Money movement (Q2) compounds irreversibility. Volume/pattern (Q4) and feedback velocity (Q9) are the primary AI-positive signals. All others are confidence modifiers.
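The weighting logic above can be sketched in code. This is a minimal illustration, not the framework's actual scoring: the field names, the four question signals shown (of the nine), and the returned labels are assumptions chosen to mirror the note's description of hard floors, compounding, and AI-positive signals.

```python
from dataclasses import dataclass

@dataclass
class DecisionSignals:
    irreversible: bool         # Q1: can the outcome be undone?
    moves_money: bool          # Q2: does the decision move funds?
    regulatory_floor: bool     # Q3: does regulation mandate human ownership?
    high_volume_pattern: bool  # Q4: high-volume, pattern-rich decision?
    fast_feedback: bool        # Q9: are outcomes observable quickly?

def classify(sig: DecisionSignals) -> str:
    # Hard floors: a single "yes" on Q1 or Q3 overrides all other signals.
    if sig.regulatory_floor:
        return "human-owned"
    if sig.irreversible:
        # Money movement (Q2) compounds irreversibility.
        return "human-owned" if sig.moves_money else "human-approves"
    # Primary AI-positive signals: volume/pattern (Q4) and feedback velocity (Q9).
    if sig.high_volume_pattern and sig.fast_feedback:
        return "ai-owned"
    # Everything else stays supervised; the remaining questions act as
    # confidence modifiers and are omitted from this sketch.
    return "supervised"
```

For example, a high-volume, fast-feedback decision with a regulatory flag still classifies as human-owned, because the floor overrides the AI-positive signals.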
Four Autonomy Tiers
Each decision point in the inventory carries an autonomy tier — a starting position, not a permanent label. The tier is a configurable property. Upgrades require evidence. Rollback is structural.
- **AI owns from day one.** Safe to automate without a track record. No regulatory flag. Fast feedback. Fully reversible.
- **AI owns after demonstrated performance.** Starts supervised, graduates when confidence is established across a meaningful sample.
- **AI recommends, human approves.** The tier for decisions with real downstream consequences, or that precede an irreversible action. The human is a cosigner, not the sole decision-maker.
- **Human owns, AI advises only.** Regulatory floor, irreversible money movement, or deep discretion. AI cannot trigger execution — it prepares the summary and surfaces the impact. The human decides.
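A sketch of what "tier as a configurable property" can look like in practice. Tier names follow the ones the text uses (Supervised, Earned, Aspirational); "autonomous" is a placeholder for the unnamed day-one tier, and the sample-size and agreement thresholds are illustrative assumptions, not the framework's values.

```python
# Ordering runs from least to most AI autonomy.
TIER_ORDER = ["aspirational", "supervised", "earned", "autonomous"]

class DecisionPoint:
    def __init__(self, name: str, tier: str = "supervised",
                 regulatory_floor: bool = False):
        self.name = name
        self.tier = tier
        self.regulatory_floor = regulatory_floor
        self.agreement = []  # 1 if the AI matched the human decision, else 0

    def record_outcome(self, ai_matched_human: bool) -> None:
        self.agreement.append(1 if ai_matched_human else 0)

    def try_upgrade(self, min_samples: int = 200,
                    min_agreement: float = 0.98) -> bool:
        # Upgrades require evidence: a meaningful sample with high agreement.
        if self.regulatory_floor:
            return False  # hard floor: the decision stays human-owned
        n = len(self.agreement)
        if n < min_samples or sum(self.agreement) / n < min_agreement:
            return False
        i = TIER_ORDER.index(self.tier)
        if i + 1 < len(TIER_ORDER):
            self.tier = TIER_ORDER[i + 1]
            self.agreement = []  # each tier earns its own track record
            return True
        return False

    def rollback(self) -> None:
        # Rollback is structural: a single call reverts to the previous tier.
        i = TIER_ORDER.index(self.tier)
        if i > 0:
            self.tier = TIER_ORDER[i - 1]
```

The point of the sketch is the shape, not the numbers: autonomy lives in data, upgrades are gated on recorded evidence, and demotion is one operation rather than a re-architecture.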
Every human or Supervised decision is a labeled example. The system trains as you work — not as a separate project. Conservative tier assignments aren't holding patterns; they're structured data collection. The path from Supervised to Earned is paved by the decisions already being made.
For Supervised and Aspirational decisions, the AI classification is deliberately withheld until after the human has made their decision. This prevents decision complacency — the well-documented tendency for humans to anchor on and defer to an AI recommendation even when their independent judgment would differ. The result is a cleaner training signal and a more defensible audit trail: human decisions are genuinely independent, and divergence from the AI is measurable rather than suppressed.
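The withholding pattern is simple to enforce in code. A minimal sketch, with illustrative names: the AI classification is computed up front but only revealed after the human commits, so the human label stays independent and divergence is measurable.

```python
class BlindReview:
    """Holds an AI classification back until a human decision is recorded."""

    def __init__(self, ai_classification: str):
        self._ai = ai_classification  # computed up front, not shown
        self.human = None

    def submit_human_decision(self, decision: str) -> str:
        # Reveal the AI's classification only after the human commits.
        self.human = decision
        return self._ai

    @property
    def diverged(self) -> bool:
        # Divergence is only defined once both decisions exist.
        if self.human is None:
            raise RuntimeError("AI classification is withheld until "
                               "a human decision is recorded")
        return self.human != self._ai
```

Because the API offers no way to read the AI output before `submit_human_decision`, anchoring is prevented structurally rather than by policy, and every divergence event is a clean data point for the audit trail.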
Monitoring, Learning & Time Dynamics
Autonomy is not a one-time classification. It is earned, monitored, and can be revoked. Feedback velocity determines how fast that earning can happen — and how fast revocation must happen when things go wrong.
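One way to make the feedback-velocity constraint concrete is as an autonomy ceiling keyed to how long outcomes take to observe. The thresholds below are illustrative assumptions, and the tier names reuse the placeholder ordering from earlier (with "autonomous" standing in for the unnamed day-one tier).

```python
def autonomy_ceiling(feedback_latency_days: float) -> str:
    """Cap the reachable tier by how fast outcomes can be observed
    and corrected. Slower feedback means slower earning, and it also
    means revocation arrives too late to be safe at high autonomy."""
    if feedback_latency_days <= 1:
        return "autonomous"    # same-day signal: fastest earning and revocation
    if feedback_latency_days <= 30:
        return "earned"
    if feedback_latency_days <= 180:
        return "supervised"
    return "aspirational"      # outcomes surface too slowly; human owns
```

Under this sketch, a decision whose errors only surface at year-end audit can never graduate past advisory status, regardless of how well the model scores on the other questions.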
NIST AI RMF → System Decision Mapping (Operational Use)
This mapping shows how System Decision Mapping can be used as structured input — by humans or AI — to identify the decision landscape implicit in the NIST AI Risk Management Framework, providing a concrete place to start without defining policy or permissions.
| Question NIST Helps You Answer | Where NIST Stops | NIST Function | System Decision Mapping Feature | How This Helps You Move Forward |
|---|---|---|---|---|
| What risks exist in this AI system? | NIST MAP calls for identifying risk sources and system context | MAP | Decision-Point Extraction | Surfaces where decisions actually occur in a live codebase, making risk locations explicit instead of inferred |
| How should AI risk be assessed? | NIST MEASURE requires risk assessment but leaves methods open | MEASURE, MANAGE | Nine Evaluation Questions | Provides a repeatable way to examine each decision point for reversibility, discretion, regulation, and impact |
| Who is accountable for AI behavior? | NIST GOVERN establishes accountability as a requirement | GOVERN, MANAGE | Four-Tier Autonomy Model | Defines when AI may act, when it must defer, and when humans retain ownership — with clear thresholds and upgrade criteria |
| What happens when confidence is unclear? | NIST does not define a default posture | GOVERN, MANAGE | Low-AI-Confidence Defaults | Applies a simple rule: if confidence can't be established, the decision remains human-owned until evidence supports an upgrade |
| How do we monitor and respond to issues over time? | NIST calls for monitoring and response | MEASURE, MANAGE | Feedback Velocity & Learning Loops | Limits autonomy based on how quickly decision outcomes can be observed and corrected |
NIST tells you governance is required. This tells you where it lives in the code. That's the gap — and it's where AI adoption either stalls in policy documents or moves forward with an auditable basis.
Cross-Domain Validation
Predictions derived from standards documentation — SWIFT CBPR+, ISO 20022, BMC Medical Informatics (2020), CMS ICD-10 guidelines — then tested against real codebase analysis. The governing standard predicted the decision structure before the code was read.
Validation metrics:
- Decision points classified across banking and healthcare
- Domains validated (banking, healthcare)
- Max tier prediction delta vs. source-derived predictions
- Agreement between heuristic and reasoning-based classification
| Codebase | Domain | Architecture Pattern | Governing Body | Decision Points | Status |
|---|---|---|---|---|---|
| Apache Fineract | Core banking | @CommandType Java annotations | SWIFT, ISO 20022, NACHA | 416 | ✓ Validated |
| OpenMRS | Electronic medical records | @Authorized service methods | ICD-10, SNOMED, LOINC | 266 | ✓ Validated |
| Apache OFBiz | Supply chain / ERP | XML service definitions | GS1, UNSPSC, HS codes | 3,493 | ⟳ In research |
Banking and healthcare standards (SWIFT, ICD-10) carry mandatory compliance enforcement — messages are rejected when they don't conform. GS1 and UNSPSC are voluntary adoption standards. The framework's predictive signal is not the existence of a governing standard but the enforcement mechanism behind it. Supply chain requires a separate evaluation methodology and is planned research.
The Tool: Decision-extractor
A static analysis tool that extracts and classifies AI decision points from enterprise codebases. The tool is evidence for the framework — not the contribution itself.
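The validated codebases mark decision points with framework annotations (`@CommandType` in Fineract, `@Authorized` in OpenMRS), which is what makes extraction possible before reading business logic. A toy sketch of that first step, not the actual tool, which does considerably more than pattern matching:

```python
import re
from pathlib import Path

# Candidate decision-point markers seen in the validated codebases.
ANNOTATIONS = re.compile(r"@(CommandType|Authorized)\b")

def extract_decision_points(root: str) -> list[dict]:
    """Scan Java sources under `root` for annotation-marked decision points.

    Returns one record per marker hit: file path, line number, and which
    annotation matched. Classification (the nine questions) happens later.
    """
    points = []
    for path in Path(root).rglob("*.java"):
        lines = path.read_text(errors="ignore").splitlines()
        for lineno, line in enumerate(lines, start=1):
            m = ANNOTATIONS.search(line)
            if m:
                points.append({
                    "file": str(path),
                    "line": lineno,
                    "marker": m.group(1),
                })
    return points
```

The inventory this produces is deliberately shallow: it locates decisions consistently across a codebase, and the nine questions are applied to each record afterward.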
Feedback Welcome
System Decision Mapping documents a recurring pattern observed across production systems in regulated environments. The most useful feedback is failure cases: where the enforcement mechanism distinction breaks, where the nine questions give ambiguous results, and where the tier model doesn't map cleanly to a real decision. If you've tried to apply this to your own systems, I'd like to compare notes.