The numbers don’t agree because the words don’t.

In brief

Two people in the same meeting cite a different student-persistence number. Both are reading from a real report, prepared by competent people, drawn from the institution’s real data. A third person at the table has a third number. The institution has a data governance council. It has a published data policy. It has named stewards. None of that stops the meeting from getting stuck.

Most “data problems” are not data problems. They are definitional disagreements misread as technical ones. “Student persistence” (the rate at which enrolled students continue from one period to the next) can mean keeping a student from the first day of the school year to the last day. It can mean keeping them from one annual official census date to the next census date a year later, which is how state accountability typically counts. It can mean keeping them from the first day of school to that same year’s census date, which is a different and shorter window. Each of those is a valid and useful definition. Each is what some real obligation requires. They do not agree with one another, and the institution cannot act on data it cannot agree on.
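A minimal sketch of how the three windows described above diverge on the same records. The calendar dates, the six-student roster, and every name here (Enrollment, persistence, WINDOWS) are hypothetical and chosen only for illustration; the sketch also assumes one continuous enrollment spell per student.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Enrollment:
    student_id: str
    entry_date: date                   # first day enrolled
    exit_date: Optional[date] = None   # first day no longer enrolled; None = still enrolled


def enrolled_on(e: Enrollment, day: date) -> bool:
    return e.entry_date <= day and (e.exit_date is None or day < e.exit_date)


def persistence(records: List[Enrollment], start: date, end: date) -> float:
    """Share of students enrolled on `start` who are also enrolled on `end`."""
    cohort = [e for e in records if enrolled_on(e, start)]
    if not cohort:
        raise ValueError("no students enrolled on the start date")
    return sum(enrolled_on(e, end) for e in cohort) / len(cohort)


# Hypothetical calendar and roster, for illustration only.
FIRST_DAY, CENSUS = date(2023, 8, 14), date(2023, 10, 2)
LAST_DAY, NEXT_CENSUS = date(2024, 6, 7), date(2024, 10, 1)

ROSTER = [
    Enrollment("A", FIRST_DAY),                      # stays throughout
    Enrollment("B", FIRST_DAY, date(2023, 9, 15)),   # leaves before census
    Enrollment("C", FIRST_DAY, date(2024, 3, 1)),    # leaves mid-year
    Enrollment("D", date(2023, 9, 5)),               # arrives after the first day
    Enrollment("E", FIRST_DAY, date(2024, 7, 15)),   # leaves over the summer
    Enrollment("F", FIRST_DAY, date(2024, 8, 20)),   # leaves over the summer
]

WINDOWS = {
    "first_day_to_last_day": (FIRST_DAY, LAST_DAY),
    "census_to_census": (CENSUS, NEXT_CENSUS),
    "first_day_to_census": (FIRST_DAY, CENSUS),
}

RATES = {name: persistence(ROSTER, start, end) for name, (start, end) in WINDOWS.items()}

for name, rate in RATES.items():
    print(f"{name}: {rate:.0%}")
```

On this invented roster the three windows yield 60, 40, and 80 percent. Each figure is correct under its own window, and none of them matches the others.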

The framework is necessary but insufficient

Governance councils default to architecture, and they have well-developed frames to draw on. DAMA’s Data Management Body of Knowledge organizes the field into eleven knowledge areas with governance at the center: architecture, modeling, integration, quality, metadata, master and reference data, and the rest. EDUCAUSE’s data-empowered-institution model distills the higher-education version to five components — data quality, integration, governance, management, and literacy. Both frames are correct about what to build. Both are necessary. Both are also insufficient. The actual operational work that makes governance hold (getting the registrar, financial aid, institutional research, and the deans into the same room to decide which version of “persistence” gets used where, and why each version exists) is slow, unglamorous, and often unwritten. Most councils never do it. The framework looks complete. The numbers still do not agree.

Three definitions, one number, eight recalculations

At a K–8 charter network operating across multiple campuses, student persistence was formally defined at least three different ways at the same time. From day one of the school year to the last day, for program-completion reporting. From one annual state census day to the next year’s census day, for state accountability. From the first day of school to that same year’s census date, for early-year persistence reporting. Those definitions fed into still more obligations (S&P bond-rating reporting, principal incentive calculations, enrollment forecasting, federal accountability), each of which required a particular version. None of the definitions could be discarded; each existed because a real obligation required exactly that version. In one year alone, the same persistence number was independently recalculated eight or more times across the network’s reports: every recalculation correct under its own definition, none of them agreeing with the others. And yet the institution still had to be able to talk about persistence without the conversation fragmenting into a definitional argument every time it came up.

[Figure: One metric, eight recalculations: how a single word fragments across obligations. One word, eight obligations, eight numbers, each correct under its own definition, none of them agreeing.]

The same gap now carries a second cost, and it arrives as consumption rather than confusion. When an analyst recalculated persistence eight ways, the institution was spending labor it had already paid for in salary. When an agent does that work, each version it computes is a metered call against a model. Ask an agent for the persistence rate against ungoverned definitions and it has no basis to pick one; it can compute every plausible version, or recompute the same one on every request, because nothing in the data told it which definition the question meant. The stuck meeting was the human-scale symptom of missing definitional reconciliation. Metered recomputation is the machine-scale symptom of the same missing thing. A governed number that names which definition feeds which obligation does not just end the disagreement; it keeps the agent from paying to generate versions no one asked for. The semantic layer was always a correctness control. Pointed at an agent that runs at machine cadence and bills per call, it becomes a consumption control too, and the spend, like the disagreement, traces back to whether the words were settled first.
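The consumption argument above can be made concrete with a rough sketch. Everything in it is an assumption made for illustration: the GOVERNED mapping, the Meter counter standing in for a billed model call, and the request list. It is not a description of any particular agent framework.

```python
from functools import lru_cache

# Hypothetical governed mapping: which definition each obligation means.
GOVERNED = {
    "program_completion": "first_day_to_last_day",
    "state_accountability": "census_to_census",
    "early_year_reporting": "first_day_to_census",
}


class Meter:
    """Counts calls to a stand-in for a metered, billed computation."""

    def __init__(self):
        self.calls = 0


def make_compute(meter):
    def compute(definition):
        meter.calls += 1
        return f"persistence[{definition}]"  # placeholder result, not a real rate
    return compute


def ungoverned_answers(requests, compute):
    # With no basis to pick a definition, compute every plausible one, every time.
    plausible = sorted(set(GOVERNED.values()))
    return [[compute(d) for d in plausible] for _ in requests]


def governed_answers(requests, compute):
    # Resolve the obligation to its one definition and reuse earlier results.
    cached = lru_cache(maxsize=None)(compute)
    return [cached(GOVERNED[obligation]) for obligation in requests]


REQUESTS = ["state_accountability"] * 4 + ["program_completion"]

ungoverned_meter, governed_meter = Meter(), Meter()
ungoverned_answers(REQUESTS, make_compute(ungoverned_meter))
governed_answers(REQUESTS, make_compute(governed_meter))

print(ungoverned_meter.calls, governed_meter.calls)
```

For these five hypothetical requests the ungoverned path makes fifteen metered calls and the governed path makes two. The reduction comes from the mapping more than from the cache: a cache cannot help until something has settled which definition the question meant.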

The work that closed the gap was not a framework. It was definitional reconciliation. The team mapped every reporting obligation to its required definition. Each definition was named explicitly. The relationships among them were established: how one translated to another, which report drew which number from where, what shared baseline assumptions sat underneath. Then a small set of trusted, governed numbers fed every obligation with the correct version, so that one set of dashboards could serve all of them without any of them being wrong. With that definitional reconciliation as the foundation, data completeness rose from 60 percent to 90 percent, principal dashboard adoption reached 70 percent, and reporting lag dropped by 40 percent. The framework looked the same after that work as it had before. The numbers themselves did not become identical; the framework still required several different ones, for several different obligations. What changed was that the words underneath each one started meaning the same thing in the same place, and the disagreements stopped.
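One way to picture the mapping step is a small registry with a coverage check. This is a sketch, not the network's actual tooling. The three obligation-to-definition pairs come from the case as described; the remaining obligations are deliberately left as None because the text does not say which version each required. All identifiers are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Named definitions, each stated explicitly.
DEFINITIONS: Dict[str, str] = {
    "first_day_to_last_day": "first day of the school year to the last day",
    "census_to_census": "one annual state census day to the next year's census day",
    "first_day_to_census": "first day of school to that same year's census date",
}

# Obligation -> required definition. None means "not yet recorded".
OBLIGATIONS: Dict[str, Optional[str]] = {
    "program_completion_reporting": "first_day_to_last_day",
    "state_accountability": "census_to_census",
    "early_year_persistence_reporting": "first_day_to_census",
    "bond_rating_reporting": None,
    "principal_incentive_calculations": None,
    "enrollment_forecasting": None,
    "federal_accountability": None,
}


def audit(
    obligations: Dict[str, Optional[str]], definitions: Dict[str, str]
) -> Tuple[List[str], List[str], Dict[str, List[str]]]:
    """Return (unmapped obligations, obligations naming an unknown definition,
    and, for each definition, the obligations it feeds)."""
    unmapped = sorted(o for o, d in obligations.items() if d is None)
    unknown = sorted(
        o for o, d in obligations.items() if d is not None and d not in definitions
    )
    feeds = {
        name: sorted(o for o, d in obligations.items() if d == name)
        for name in definitions
    }
    return unmapped, unknown, feeds


unmapped, unknown, feeds = audit(OBLIGATIONS, DEFINITIONS)
print("still to reconcile:", unmapped)
print("feeds:", feeds)
```

The useful output is the unmapped list. Reconciliation is finished when it is empty, not when the framework document is approved.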

The same problem across time: crosswalks

The other shape of the same problem appears across time. Any organization that runs longitudinal measurement of latent constructs (a foundation tracking program outcomes, a youth-mental-health team tracking adherence and well-being, a behavioral-health agency tracking clinical change) runs into the same definitional friction. Survey versions get updated as the theory of change matures. The constructs being measured shift as the field learns what matters. Items are added, edited, retired. Data-quality standards tighten. Each of those changes is legitimate; none can be paused while the field catches up. And yet the institution still has to be able to look at three years of program data and say something true about it.

What holds that work together, in modern data-stack terms, is a semantic layer above the raw data tables: explicit canonical definitions for each construct, maintained as the underlying instruments evolve, with the discipline of crosswalks. The crosswalks document how a question asked in one survey version maps to the same construct asked slightly differently in the next, with documented limits of comparability and documented gaps where comparison is not warranted. In practice the semantic layer might live as a separate analytics schema sitting above raw tables, reachable from dashboards through whatever read interface fits the stack (PySpark or otherwise), and the field-mapping work of bringing parallel collection platforms together (for example two separate survey-collection environments feeding the same warehouse) is itself a definitional discipline before it is an engineering one. The framework is not the answer. The semantic layer, the crosswalks, and the discipline of owning every definitional change are what hold the reporting foundation honest as the underlying questions keep evolving.
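As a rough illustration of what a crosswalk records, the sketch below maps survey items to canonical constructs and refuses to trend a construct across versions where comparability is not documented. The survey versions, item codes, constructs, and notes are all invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CrosswalkEntry:
    version: str
    item: str
    construct: Optional[str]  # None = item has no canonical construct (e.g. retired)
    comparable: bool          # safe to compare with other versions of the construct
    note: str = ""


# Hypothetical crosswalk between two survey versions.
CROSSWALK = [
    CrosswalkEntry("v1", "q3", "well_being", True),
    CrosswalkEntry("v2", "q5", "well_being", True, "reworded, same response scale"),
    CrosswalkEntry("v1", "q7", "adherence", True),
    CrosswalkEntry("v2", "q9", "adherence", False, "response scale changed"),
    CrosswalkEntry("v1", "q8", None, False, "retired in v2, no successor"),
]


def lookup(version: str, item: str) -> CrosswalkEntry:
    for entry in CROSSWALK:
        if entry.version == version and entry.item == item:
            return entry
    raise KeyError(f"no crosswalk entry for {version}/{item}")


def can_trend(construct: str, versions: List[str]) -> bool:
    """True only if every version measures the construct with comparable items."""
    for version in versions:
        entries = [
            e for e in CROSSWALK if e.version == version and e.construct == construct
        ]
        if not entries or not all(e.comparable for e in entries):
            return False
    return True


print(lookup("v2", "q5").construct)
print(can_trend("well_being", ["v1", "v2"]))
print(can_trend("adherence", ["v1", "v2"]))
```

The note field matters as much as the flag: it is where the documented limits of comparability live, so a later analyst can see why a trend line stops.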

The same problem across systems: rostering

Different systems hold overlapping versions of the same field, and the drift compounds under load. In one K–8 network, enrollment moved through four holders between June and August: the SIS held one version of who was enrolled, the applicant-tracking system held another (still processing applications that had been accepted but not yet rolled forward), the operations office kept a spreadsheet the ops team used for building assignments and family communications, and the enrollment team kept its own spreadsheet, the one they used to track summer targets and plan late-summer marketing efforts. All four were correct against their own local definition of “enrolled.” None of the four agreed on the first day of school. The reconciliation ran through August and rarely caught up before students walked in. The definitional gap was not that anyone was wrong; “enrolled” was doing four different jobs across four systems, and no one had written the mapping down.
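A minimal sketch of the comparison that the August reconciliation amounts to, assuming each holder can export a set of student identifiers. The holder names mirror the four described above; the student IDs are invented.

```python
from typing import Dict, List, Set, Tuple


def reconcile(rosters: Dict[str, Set[str]]) -> Tuple[Set[str], Dict[str, List[str]]]:
    """Return the students every holder agrees on, and for each disputed
    student the holders that list them as enrolled."""
    everyone = set().union(*rosters.values())
    agreed = set.intersection(*rosters.values())
    disputed = {
        student: sorted(name for name, ids in rosters.items() if student in ids)
        for student in sorted(everyone - agreed)
    }
    return agreed, disputed


# Hypothetical exports from the four holders of "enrolled".
ROSTERS = {
    "sis": {"s1", "s2", "s3"},
    "applicant_tracking": {"s1", "s2", "s4"},
    "ops_spreadsheet": {"s1", "s2", "s3", "s5"},
    "enrollment_spreadsheet": {"s1", "s3", "s4"},
}

agreed, disputed = reconcile(ROSTERS)
print("agreed:", sorted(agreed))
for student, holders in disputed.items():
    print(student, "listed only by", holders)
```

The set arithmetic is the easy part. The disputed list is only useful once someone has written down what "enrolled" means in each holder, because that is what explains why a student appears in one and not another.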

Granularity is its own governance problem

Granularity is its own governance problem, and aggregation is where many institutions quietly compromise it. At the same K–8 network, daily attendance was a single data stream with at least three different operational lives. A single absence on a given day triggered an immediate workflow (outreach, follow-up, resolution), owned by an operations coordinator. Three consecutive days of absence triggered a different workflow, owned by a teacher or student-support counselor. Chronic absenteeism (eighteen or more days in a year, or more than ten percent of school days as a running rate) triggered a third workflow, owned by the principal. The same data, three aggregations, three views, three sets of decision rights, three stakeholders. Governance at the granularity layer was not deciding whether to compute these numbers. It was deciding which view triggered which workflow, who owned each decision, and what the legitimate translation among them was, knowing that aggregation runs one way: a chronic-absenteeism rate can be built from daily counts, but the daily counts cannot be recovered from the rate.
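The three views can be sketched as three functions over one daily stream. The thresholds and owners follow the description above; the function names, the boolean-list representation, and the choice to fire the consecutive-absence workflow whenever the trailing run is three days or longer are assumptions made for the example.

```python
from typing import List

# Owner of each workflow, as described in the text.
OWNERS = {
    "daily_absence": "operations coordinator",
    "consecutive_absence": "teacher or student-support counselor",
    "chronic_absenteeism": "principal",
}


def trailing_absences(absent: List[bool]) -> int:
    """Length of the run of absences ending on the most recent school day."""
    run = 0
    for day in reversed(absent):
        if not day:
            break
        run += 1
    return run


def is_chronic(absent: List[bool]) -> bool:
    """Eighteen or more days absent, or more than ten percent of days so far."""
    if not absent:
        return False
    total = sum(absent)
    return total >= 18 or total / len(absent) > 0.10


def triggered_workflows(absent: List[bool]) -> List[str]:
    """`absent` holds one flag per school day so far, oldest first."""
    workflows = []
    if absent and absent[-1]:
        workflows.append("daily_absence")
    if trailing_absences(absent) >= 3:
        workflows.append("consecutive_absence")
    if is_chronic(absent):
        workflows.append("chronic_absenteeism")
    return workflows


# Hypothetical student: 40 school days, absent on the last three.
recent_run = [False] * 37 + [True] * 3
# Hypothetical student: 20 school days, three scattered absences, present today.
scattered = [False] * 20
for index in (2, 9, 15):
    scattered[index] = True

for label, days in (("recent_run", recent_run), ("scattered", scattered)):
    for workflow in triggered_workflows(days):
        print(label, "->", workflow, "owned by", OWNERS[workflow])
```

The two students have the same number of absences and trigger entirely different workflows with different owners, which is the governance point: the aggregation chosen decides who acts.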

Architecture is governance

Architecture is governance too. When a student-information system is replaced, or a behavioral-health electronic record is migrated to a new platform, the definitional question is not the migration. It is whether what the new system records is the same thing the old one recorded. Field mappings have to be made explicit; new fields added where the schema changed; legacy fields retired only after every use case is accounted for; data-entry personnel trained on the new system’s expectations for completeness, accuracy, and timeliness. None of that is technical work. It is definitional work conducted at the architectural layer — the layer the contract between systems has to govern.
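A sketch of the check implied above: every legacy field is either mapped to a field that exists in the new system or retired with no use case still depending on it. The field names, the mapping, and the open use case are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def audit_migration(
    legacy: Set[str],
    new: Set[str],
    field_map: Dict[str, str],
    retired: Set[str],
    open_use_cases: Dict[str, List[str]],
) -> Tuple[List[str], List[str]]:
    """Return (problems, new fields with no legacy source)."""
    problems = []
    for field in sorted(legacy):
        if field in field_map:
            if field_map[field] not in new:
                problems.append(f"{field}: mapped to missing field {field_map[field]}")
        elif field in retired:
            uses = open_use_cases.get(field, [])
            if uses:
                problems.append(f"{field}: retired but still used by {', '.join(uses)}")
        else:
            problems.append(f"{field}: neither mapped nor retired")
    added = sorted(new - set(field_map.values()))
    return problems, added


# Hypothetical schemas for an old and a new student-information system.
LEGACY_FIELDS = {"stu_id", "enroll_dt", "exit_cd", "homeroom"}
NEW_FIELDS = {"student_id", "entry_date", "exit_reason", "preferred_language"}
FIELD_MAP = {"stu_id": "student_id", "enroll_dt": "entry_date", "exit_cd": "exit_reason"}
RETIRED = {"homeroom"}
OPEN_USE_CASES = {"homeroom": ["bus_roster_export"]}

problems, added = audit_migration(
    LEGACY_FIELDS, NEW_FIELDS, FIELD_MAP, RETIRED, OPEN_USE_CASES
)
print("problems:", problems)
print("added fields needing a definition:", added)
```

A check like this confirms only that a mapping exists. Whether exit_reason in the new system means what exit_cd meant in the old one is still a question for the people who own the definition.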

Stewardship is what makes it stick

Quality is a continuous practice owned by the people closest to the data. The principal dashboard adoption rate of 70 percent at the K–8 network was not just a usage statistic. It was evidence of distributed stewardship. The principals were not merely consumers of their numbers; they were the ones who noticed anomalies, raised corrections, pushed back on definitions that did not serve their schools, and held the institution accountable to its own standards. A central data office that owns quality alone is fragile. A network of stewards who own their own data, with shared definitions they help refresh, is durable. This is the model that survives leadership turnover, budget cycles, and reorganization.

Stewardship of that kind requires operational discipline at the configuration and training layer, because the biggest risk in any compliance or reporting process is rarely one dramatic mistake. It is drift. If a gradebook is configured one way at one school and a different way at another, GPA calculations diverge before any dashboard sees them. The same instance has to be replicated across sites (same scales, same formulae, same business rules) with a checking cadence and alerts in place to catch unintended drift or misconfiguration before errors propagate. Definitional work also has to reach the personnel who actually enter the data. Whether an in-house suspension is coded as “present” or “absent” in the student-information system is a definitional decision data-entry personnel make every day. If they have not been trained on which version the institution is using, no framework above them can compensate. Quality is held together one configuration, one alert, and one training conversation at a time.
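The checking cadence described above could start as something as small as the comparison below: each site's gradebook configuration is diffed against a reference copy, and any difference is reported. The configuration keys, values, and campus names are invented; a real check would read them from the gradebook system's own export.

```python
from typing import Any, Dict, Tuple

Config = Dict[str, Any]


def find_drift(
    reference: Config, sites: Dict[str, Config]
) -> Dict[str, Dict[str, Tuple[Any, Any]]]:
    """For each site that differs, return {setting: (expected, actual)}.
    A setting missing on either side shows up as None."""
    drift = {}
    for site, config in sites.items():
        differences = {
            key: (reference.get(key), config.get(key))
            for key in sorted(set(reference) | set(config))
            if reference.get(key) != config.get(key)
        }
        if differences:
            drift[site] = differences
    return drift


# Hypothetical reference configuration and three campuses.
REFERENCE = {"grading_scale": "A-F", "gpa_formula": "weighted", "passing_mark": 60}
SITES = {
    "campus_a": {"grading_scale": "A-F", "gpa_formula": "weighted", "passing_mark": 60},
    "campus_b": {"grading_scale": "A-F", "gpa_formula": "unweighted", "passing_mark": 60},
    "campus_c": {"grading_scale": "A-F", "gpa_formula": "weighted"},
}

drift = find_drift(REFERENCE, SITES)
for site, differences in drift.items():
    for setting, (expected, actual) in differences.items():
        print(f"ALERT {site}: {setting} expected {expected!r}, found {actual!r}")
```

Run on a schedule, a comparison of this kind catches the divergence at the configuration layer, before GPA calculations at two campuses stop being comparable.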

Why universities need this most

Universities are structurally decentralized in ways most organizations are not. School autonomy, faculty governance, and distributed authority by design are how the institution is meant to work. Governance imposed from the center has a poor track record in higher education because the autonomy is rightly defended. Governance embedded through definitions has a much better one. A definition agreed across the registrar, financial aid, institutional research, and the relevant deans is much harder to walk back, because each domain steward owns it. A policy written by the governance council, however thorough, can be politely ignored by a department running its own numbers. The framework’s real authority is not the document. It is the working set of shared definitions that domain leaders maintain together.

What working governance actually looks like

When a dean and the registrar can agree on what a number means, and can recover, on demand, why three other versions of the same number exist, where each one is used, how this year’s definition relates to last year’s, and how the system that produces it connects to the systems that consume it, governance is working. When the policy document is elegant and the numbers still do not agree, it is not. The work to do is not a better framework. It is the slow, distributed work that lives inside the framework: definitional reconciliation, crosswalks, aggregation governance, architectural mapping, and stewardship. That work turns fragmented words about the institution into decision-ready meaning. It is the work most governance councils skip, and the institution’s hardest questions cannot be answered until someone does it.

Written June 2026 for the Analytic Bytes Library by Chaitanya Ramineni. Cases described are drawn from the author’s practice across a K–8 charter network and longitudinal-measurement settings; organizational details are abstracted and no individual record, person, or proprietary number is reproduced. The performance figures cited from the K–8 case — data completeness, principal dashboard adoption, reporting lag — come from the author’s working records and internal reports compiled during the engagement; the figures appear consistent with their documented use in the author’s case-studies record.

Analytic Bytes

From fragmented to decision-ready.

Questions, pushback, or a problem that looks like this one? Write to chai@analyticbytes.systems.