Library

Notes from the decision system.

Essays, artifacts, and field notes — on decision-system architecture, measurement, and the operating teams that ship them.

Essay

The contracts between systems

Integration is two questions stacked on top of each other: do the bytes move, and when they arrive, can anyone act on them. Institutions have answered the first across three eras and skipped the second, and the agentic era is about to make that gap load-bearing.

Essay

The numbers don’t agree because the words don’t.

Two people read different student-persistence numbers from the same data. The governance council is functioning. The framework looks complete. What's broken is definitional, and the work to fix it is the work most councils skip.

Field Note

The Take-Home Test

More than a dozen interview take-home tasks, done cold across a decade, read as one experiment. The same few failures showed up in almost every one — and none of them was a skills gap.

Essay

What is this system actually measuring?

Universities have built the scaffolding to govern AI and left out a load-bearing pillar: evaluation. The measurement-science question every adopted system should face — what is this actually measuring, and is that what we meant?

Essay

Actions, Not Answers

Agentic AI produces actions, not answers — and the human checkpoint that came free with every answer is gone unless you design it back in. Why agentic adoption is a decision-system question, not a technology one.

Field Note

Burden, Disparity, and the Next Dollar

Burden and disparity are two different signals in the same CDC mortality data. The priority list you build from one is not the list you build from the other — and a framework that shows both changes where the next prevention dollar goes.

Essay

Where Should Data Sit?

Who owns data infrastructure is one of the org chart’s most muddled questions. The fix is not a better title; it is a principle — report to the integration seat, never to a single function.

Field Note

LO 2.0 — Stitching the Layers

Why national education data, classroom assessments, and local instruments are most useful stitched together — and what the integration architecture looks like.

Essay

Grounding the AI Layer

Where AI belongs in the modern data stack, and the single contract that keeps every AI feature honest.

Field Note

When GenAI Redesigned My Dashboard

A GenAI redesign of my own dashboard came back uglier — and clearer. What that taught me about data teams and AI tools.

Essay

Three Surfaces, One Keystone

Why BI tool selection is the last decision, not the first — and the three reporting surfaces most analytics products owe their audiences.

Artifacts

Diagrams and figures from the work — architecture, measurement, and decision-system frames.

The Decision System — reference architecture
Artifact

The Decision System — reference architecture

A tool-agnostic reference architecture: sources through integration, warehouse, and the semantic-layer keystone to AI and the reporting surfaces, with a governance rail across every layer and a learning loop that closes the system.

One Architecture, Three Stacks
Artifact

One Architecture, Three Stacks

The same six-layer architecture instantiated three ways — Microsoft/Fabric, the modern data stack, and lean/open — showing the tools swap while the architecture holds. The semantic layer is the keystone in all three.

The Agent System
Artifact

The Agent System

An agentic-AI architecture: five named agents — Data, Analysis, Insight, Execution, Monitoring — operating the Signal–Decision–Action loop, with monitoring closing the loop and a human-in-the-loop rail across every agent. The system around the model, not the model itself, is the architecture.

Decision Load vs Decision Capacity
Artifact

Decision Load vs Decision Capacity

AI raises both an organization's decision load and its decision capacity. Whether the gap closes or opens is a design choice. Deploy without redesign and a leader quietly becomes the buffer the system never built. Design for capacity expansion and the system absorbs what was previously personal.

Reliability vs Validity
Artifact

Reliability vs Validity

The four-target view of the AI scoring trap: a model can agree with human raters at a high rate (reliable) and still measure the wrong thing (invalid) — a tight cluster, off the bullseye.

The Validity Ladder
Artifact

The Validity Ladder

Five rungs of evidence for an AI system. Most AI scoring stops at rung three — agreement with human raters — when the real bar is rung four: does the score predict the outcome it was built to predict?

Fair for Whom?
Artifact

Fair for Whom?

Fairness reframed as validity asked one subgroup at a time. An aggregate accuracy number can look fine while the model quietly degrades for smaller groups — differential prediction hiding under the average.

The Evidence Spine
Artifact

The Evidence Spine

The measurement-and-evaluation architecture that turns monitoring into learning: a living theory of change as keystone, harmonized assessments, and one semantic layer so every audience sees numbers that agree.

Measurement = Diagnostics
Artifact

Measurement = Diagnostics

A sixteen-row translation table from educational measurement vocabulary to medical diagnostics — across foundations (validity, reliability), models (IRT and ROC, standard setting and thresholds, equating and calibration), bias and equity, stakes and decisions, standards and integrity, and the inferential closer: validity argument and differential diagnosis. Different instruments; the discipline is the same.

Higher Ed = Healthcare
Artifact

Higher Ed = Healthcare

An eighteen-row translation table mapping higher-education data and analytics vocabulary onto healthcare equivalents — across outcomes, throughput, advising and care navigation, support programs, infrastructure (SIS/EHR, NSC/HIE, 1EdTech/FHIR), regulation, accountability, equity, and integrative philosophy. Different sectors; the discipline is the same.

K-12 = Healthcare
Artifact

K-12 = Healthcare

An eighteen-row translation table mapping K-12 data and analytics vocabulary onto healthcare equivalents — across outcomes, intervention workflow, infrastructure, regulation, accountability, and integrative philosophy. Different sectors; the discipline is the same.

Commercial = Mission-Driven
Artifact

Commercial = Mission-Driven

A fourteen-term translation table from commercial vocabulary — GTM, audience, segmentation, funnel, conversion, KPIs, OKRs, ROI, LTV, runway, churn, A/B testing, MVP, CI/CD — to its mission-driven equivalents. Different bottom line; the discipline is the same.

More entries coming. Have a topic you want me to write about? Send a note.