Introduction
Ask an institutional research director what would break if their most senior analyst gave two weeks' notice tomorrow. The honest answers are uncomfortable. The board report nobody else knows how to reproduce. The IPEDS submission that depends on a filter only one person remembers to apply. The standing dashboard whose numbers are correct for reasons that exist nowhere in writing.
The Clema research team studied 20 IR and IE professionals across 13 states to understand how data definitions get built, stored, and maintained. The single most widespread finding was not about tools or budgets. It was about people. In 85% of the institutions we studied (17 of 20), the meaning behind the numbers was person-dependent: definitions that lived in someone's head, on a personal laptop, or in a document nobody else could find. As one respondent put it, definitions are "sometimes just hosted in the end of the IR person's brain, and they go with them when they leave."
Put a number on the exposure and it gets worse. 95% of the institutions we looked at carried moderate-or-higher risk of knowledge loss, and 55% sat at very high risk. Only 5% were genuinely insulated. So the question in the title is not rhetorical. For most IR shops, a single resignation really can take a meaningful slice of institutional knowledge out the door. This post gives you a way to figure out where you stand and how worried to be.
This is the second post in a series. The first one explained why a single term like "enrolled student" can carry five valid definitions at once, and why that fragmentation is structural rather than a failure of any one office. Here we move from the problem to the diagnosis. The third post lays out the framework for fixing it.
What the Institutional Intelligence Gap is
We call the thing you are trying to measure the Institutional Intelligence Gap. It is the distance between the intelligence your institution needs to make reliable decisions and the intelligence it actually has in a documented, accessible, survivable form.
One word in that sentence is doing the heavy lifting. "Has" is not the same as "knows." Plenty of institutions know a great deal; the knowledge is real, hard-won, and accurate. The gap is about whether that knowledge exists in a form that outlives the person who holds it. A definition a veteran analyst can recite from memory counts for nothing on the diagnostic if it is recorded nowhere, because the institution does not have it. The analyst does.
One clarification matters before we go further. This is a diagnostic, not a maturity model. A maturity model implies a ladder everyone should climb, with the top rung as the goal for all. That framing does not fit here. A two-person community college office and a 28-institution association face different risks and need different responses; telling the small shop it has "failed" to reach enterprise-grade governance is neither fair nor useful. The gap is a snapshot of fragility, read against your own institution's needs. The point is to see clearly where your knowledge is exposed, then decide what is worth shoring up.
The three tiers
Large Gap
55% of the sample (11 of 20), very high risk
Intelligence is person-dependent and fragile. Definitions live in one person's head, on a personal laptop, or in a document that was written once and shelved. There is no governance, no review cadence, no record of what changed or why. The knowledge is often excellent, but it is undocumented and trapped. A key departure here triggers an estimated 6–12 months of recovery while the institution reconstructs what one person used to carry. This is the majority of the field.
Moderate Gap
40% of the sample (8 of 20), medium risk
Definitions exist and some are written down, but they are fragmented and only partly governed. Which definition applies in which context is real institutional knowledge, and experienced staff carry it reliably; the problem is that the knowledge itself is undocumented. The raw material is there. It has not yet been organized, made accessible, and made survivable. A departure hurts but does not erase the institution's memory.
Small Gap
5% of the sample (1 of 20), low risk
Definitions are documented comprehensively, governed by a formal council, and survivable beyond any single person. The one institution at this tier maintained 495 definitions under active governance with versioning and campus accessibility. Note how rare this is: one shop in twenty. A Small Gap is achievable, but almost nobody in the field has reached it, which tells you more about how hard the maintenance is than about how attainable the goal is.
Where the 20 institutions landed
One detail from the study is worth sitting with: governance status predicted an institution's challenge profile more strongly than team size did. A small team with governance beat a large team without it. So if you are a solo shop reading the 55% number and assuming you are doomed to the Large tier, you are not. The variable that moves you is whether definitions are documented and governed, not how many people you have.
The five-factor self-assessment
Documented
Foundational
Are your definitions written down at all, and written down completely? A complete definition carries both halves: the plain-language business meaning a non-technical consumer can read, and the technical calculation with its inclusion and exclusion rules. A term name with no logic behind it is a label, not a definition.
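To make the "both halves" test concrete, here is a minimal sketch of what a complete definition record could look like. The field names, the completeness rule (a calculation plus at least one inclusion or exclusion rule), and the sample term are all illustrative assumptions, not a format taken from the study.

```python
from dataclasses import dataclass, field


@dataclass
class Definition:
    """One dictionary entry. Field names are hypothetical, for illustration only."""
    term: str
    business_meaning: str = ""  # plain-language half, readable by a non-technical consumer
    calculation: str = ""       # technical half: how the number is actually produced
    inclusions: list[str] = field(default_factory=list)
    exclusions: list[str] = field(default_factory=list)

    def is_complete(self) -> bool:
        """Assumed rule: both halves present, and at least one inclusion or exclusion rule."""
        has_meaning = bool(self.business_meaning.strip())
        has_logic = bool(self.calculation.strip()) and bool(self.inclusions or self.exclusions)
        return has_meaning and has_logic


# A term name with no logic behind it: a label, not a definition.
label_only = Definition(term="Fall enrolled student")

# The same term with both halves filled in (contents are made up for the example).
full = Definition(
    term="Fall enrolled student",
    business_meaning="A student registered for at least one credit course on the fall census date.",
    calculation="Count distinct student IDs with credit registrations as of the census date.",
    inclusions=["Degree-seeking students", "Non-degree students in credit courses"],
    exclusions=["Students who withdrew before the census date"],
)

print(label_only.is_complete())  # False
print(full.is_complete())        # True
```

A spreadsheet with the same five columns would serve just as well; the point is that an entry missing either half gets flagged rather than counted as documented.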
Accessible and integrated
Can anyone outside the IR office find a definition when they need it, without asking you? Better still, are definitions embedded where people actually work: in the dashboard tooltips, report headers, and tools they already open? In the study only 15% of institutions maintained dictionaries the wider campus could reach.
Literacy and adoption
Are your definitions written so a non-technical user understands them, and are they actually used rather than bypassed? A dictionary nobody reads is shelfware. One respondent described putting every definition into a tool and then facing the harder question: "now what?" Adoption is its own factor, separate from documentation.
Governance and institutional memory
Is there a real process for change? That means named owners, a review cadence, an approval step, and notification when something moves. And critically, is the how and the why recorded, so the reasoning behind a definition survives the person who set it? 55% of the institutions we studied had no review cadence at all; they updated definitions only when something broke.
Reproducible and survivable
Advanced
Does the same request yield the same output no matter which analyst runs it? As one respondent observed, without shared definitions two analysts on the same team can interpret an identical request differently. If the answer depends on who happens to pick up the ticket, the knowledge is trapped, not distributed, and it leaves when they do.
Read the five factors as questions and answer them honestly for your own shop. If three or more of your answers land in the Large-Gap column (not documented, not accessible, not adopted, not governed, not reproducible), your definition infrastructure is fragile, which puts you with the 55% of the sample. A mix of Large and Moderate answers means documentation has begun but governance and memory have not. Mostly Moderate answers mean the raw material exists; it is just not yet organized, accessible, and survivable. None of these are verdicts. They are starting points, and the factors are roughly sequential: you document before you can govern, and you govern before you can guarantee reproducibility.
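If it helps to see that reading rule written out, here is a rough sketch. Only the "three or more Large answers" threshold comes from the text above; the factor keys, the answer labels, and the cut-offs for the other readings are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical keys for the five factors, in the order they appear in this post.
FACTORS = (
    "documented",
    "accessible_and_integrated",
    "literacy_and_adoption",
    "governance_and_memory",
    "reproducible_and_survivable",
)
VALID_ANSWERS = {"large", "moderate", "small"}


def read_assessment(answers: dict[str, str]) -> str:
    """Turn five per-factor answers into a starting point, not a verdict."""
    missing = [f for f in FACTORS if f not in answers]
    if missing:
        raise ValueError(f"missing answers for: {missing}")
    counts = Counter(answers[f] for f in FACTORS)
    unknown = set(counts) - VALID_ANSWERS
    if unknown:
        raise ValueError(f"unrecognized answers: {sorted(unknown)}")

    if counts["large"] >= 3:
        return "fragile: definition infrastructure is person-dependent"
    if counts["large"] >= 1:
        # Assumed cut-off for "a mix of Large and Moderate answers".
        return "mixed: documentation has begun, governance and memory have not"
    if counts["moderate"] >= 3:
        # Assumed cut-off for "mostly Moderate".
        return "mostly moderate: raw material exists but is not yet organized"
    return "mostly small: documented, governed, and survivable"


example = {
    "documented": "moderate",
    "accessible_and_integrated": "large",
    "literacy_and_adoption": "large",
    "governance_and_memory": "large",
    "reproducible_and_survivable": "large",
}
print(read_assessment(example))
```

Because the factors are roughly sequential, a Large answer on the first factor is usually the one to work on first, whatever the overall reading says.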
What it costs you, immediately and later
The cost of the gap shows up in two timeframes, and the second one is bigger.
The immediate cost is rework. Reports that should take hours take days, because the analyst stops to reconcile which version of a number is the right one. Some teams told us they produce every version of a figure on a request, just to avoid sending the wrong one, which is its own quiet tax on the office. One respondent estimated that 20–30% of annual IR time goes to definition-related clarification, rework, and reconciliation. That is a fifth to a third of a team's capacity spent not on analysis but on disambiguation.
The delayed cost is the one that compounds. 75% of the institutions we studied reported trust-erosion events, moments when conflicting or incorrect data reached leadership and the office's credibility took the hit. 50% said inconsistencies had invalidated their peer benchmarking outright. Over time, undocumented definitions breed shadow interpretations across offices: parallel versions of the truth that are each internally consistent and mutually incompatible. The institutional memory erodes quietly until a trigger event brings the whole deferred bill due at once.
The events that bring the debt due
| Trigger event | Why it brings the debt due |
|---|---|
| Staff departure | The most cited and most consequential trigger. Knowledge that lived in one person leaves with them, and the institution spends months reconstructing what it used to have for free. |
| System migration | Moving off a legacy SIS forces every undocumented definition into the open at once. Teams that succeed start definition work one to two years ahead, not during the migration. |
| Leadership turnover | A new provost or president asks new questions, and those questions expose old ambiguities that nobody had to resolve while the prior leader was content with the existing reports. |
| Accreditation or external reporting | Deadlines do not wait for you to figure out which definition is correct. The evidence has to be defensible on a fixed date, ambiguity and all. |
These triggers do not just surface the debt; they magnify it. A definition that lives in one person's head is an IR-level risk on a normal Tuesday. That same person leaving during an accreditation year, in the middle of a system migration, is an institutional crisis. The risks correlate, and they tend to arrive together. The chain can reach all the way to students, too: when an early-alert program acts on whatever definition decides who is "at risk," a wrong definition groups the wrong students. One institution counted retention as same-semester enrollment rather than next-semester re-enrollment, and targeted the wrong population as a result.
Putting numbers on it
IR directors keep telling us the same thing: they cannot get a governance project funded because nobody can name the return. As one respondent framed the central unanswered question, "what does a data governance structure look like, and what is the return on investment?" The table below is an attempt to give you language for that conversation.
A heavy caveat first. These are illustrative ranges, conversation starters, not financial projections. They are not empirical findings from the study. They exist to help you reason about the order of magnitude, and to give a CFO something concrete to react to. Treat them as a way to start the budget discussion, not as a quote.
Three illustrative cost ranges
| Cost area | Illustrative range | How the range is built |
|---|---|---|
| Rework and reconciliation | $28K–$84K per year | Roughly 20–30% of annual IR team time spent on definition-related clarification and reconciliation, scaled by team size and salary. |
| Key-person departure | About $52K–$104K per departure | IR-specialized consulting at $100–$200 per hour, around 20 hours per week, for roughly 6 months to rebuild what one person carried. |
| Decisions on wrong data | About $21,000 per retained student | Roughly $7,000 per year of net tuition and aid across 3 years at a four-year college, when the wrong students get grouped and an intervention misses. |
The point of these figures is not precision. It is to make the abstract concrete. "We have key-person risk" rarely moves a budget meeting. "A single departure could cost us roughly $52,000 to $104,000 to recover from, and we have no documentation to soften the blow" tends to land differently. Build your own version of these numbers with your own salaries and your own team size; the act of building them is half the argument.
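As a starting point for building your own version, the arithmetic behind the table can be sketched in a few lines. The consulting rates, the 20 hours per week, and the $7,000 over 3 years come from the table; the 26-week reading of "roughly 6 months" and the sample office (three analysts, $75,000 average salary, 20% of time on rework) are placeholder assumptions to replace with your own figures.

```python
def rework_cost(team_size: int, avg_salary: int, share_pct: int) -> float:
    """Annual salary cost of time spent on definition-related rework."""
    return team_size * avg_salary * share_pct / 100


def departure_cost(hourly_rate: int, hours_per_week: int = 20, weeks: int = 26) -> int:
    """Consulting cost to rebuild what one person carried.

    weeks=26 is an assumed reading of "roughly 6 months".
    """
    return hourly_rate * hours_per_week * weeks


def lost_student_revenue(net_revenue_per_year: int = 7_000, years: int = 3) -> int:
    """Net tuition and aid forgone when an intervention misses one student."""
    return net_revenue_per_year * years


# Hypothetical office: 3 analysts, $75,000 average salary, 20% of time on rework.
my_rework = rework_cost(team_size=3, avg_salary=75_000, share_pct=20)

# Consulting at $100 to $200 per hour, as in the table.
low = departure_cost(hourly_rate=100)
high = departure_cost(hourly_rate=200)

per_student = lost_student_revenue()

print(f"Rework: ${my_rework:,.0f} per year")               # Rework: $45,000 per year
print(f"Departure: ${low:,} to ${high:,}")                  # Departure: $52,000 to $104,000
print(f"Wrong-data decision: ${per_student:,} per student")  # Wrong-data decision: $21,000 per student
```

Swapping in your own salaries, team size, and local consulting rates takes a minute, and the resulting numbers are yours to defend in the budget meeting.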
Closing the gap
The good news in the research is that the path out is short and sequential, and the returns compound as you go.
Moving from Large to Moderate is the most important step, and it needs no budget, no tool, and no committee. It is the move from undocumented to documented. Write down what is in your head today. A shared Google Sheet is enough to start. The only requirement that matters is that at least one other person can read it, because the entire point is to get the knowledge out of a single skull and into a place that survives a departure. Begin with your highest-risk variables: the ones feeding federal submissions, board reports, and accreditation evidence.
Moving from Moderate to Small is where accessibility, governance, and reproducibility come in (factors 2 through 5). You connect definitions to the reports that use them, set a review cadence of at least quarterly, and consolidate the same term defined in three places into one authoritative version. This stage takes more deliberate effort, but it builds on a foundation that already exists.
The compounding is the part worth holding onto. Each definition you formalize lowers the cost of the next one, because the process, the template, and the precedent are now in place. The first definition is expensive. The hundredth is cheap. As the study put it, creation is achievable; maintenance is where efforts collapse. The institutional intelligence gap is not a gap in knowledge. It is a gap in infrastructure. For more on building the request workflows that make documentation a habit rather than a heroic project, our best practices for IR and IE teams covers the day-to-day mechanics.
Where Clema fits in
We built Clema because the gap this post describes is a real, recurring problem in the field, and most of the work to close it is unglamorous and easy to defer. Clema is an AI data-intelligence platform for IR and IE teams. It connects to nine federal data sources, including IPEDS, College Scorecard, EADA, Pell Grants, and DAPIP, and lets you query across your institutional and federal data in plain language.
The part that speaks to the Institutional Intelligence Gap directly is the memory. Clema flags discrepancies across sources, and it builds institutional memory that persists beyond any individual's tenure, with both a technical dictionary for analysts and a consumer-facing glossary for requestors drawn from the same underlying definitions. That is factors 2 and 5 of the self-assessment, made operational. None of this replaces the human judgment about what a definition should be; it gives that judgment a place to live that does not walk out the door. Thirty-five institutions use Clema today.
If you want the full picture behind the numbers in this post, including the methodology, the case studies, and the cost model, the research is worth reading in full. The next post in this series turns the diagnosis into a build plan: a six-step framework for governing data definitions.
Read the research behind the gap
The Institutional Intelligence Gap is based on interviews with 20 IR and IE professionals across 13 states. Get the three-tier diagnostic, the five-factor self-assessment, the cost model, and the full case studies in one report.
Read the whitepaper