SchoolDecoder Methodology

SchoolDecoder publishes the Decoded Rank — a weighted composite of five school-level signals. School Alpha (context-adjusted performance) is the largest single component at 35%; the other 65% comes from cohort learning rate, equity-within, multi-year trajectory, and participation. The full composite methodology and the per-component breakdown are documented on the Decoded Rank composite page; this page explains the longest-running component — School Alpha — in detail.

The voice here is plain first, technical second. Parents and journalists should be able to read the first half. Researchers and educators should be able to read the second half and decide whether to cite it.

Lineage — keeping the SEDA methodology current and parent-readable

SchoolDecoder is the public, parent-facing successor to a methodology the academic community has used for nearly a decade. The Stanford Education Data Archive (SEDA), built under Sean Reardon, established the modern playbook for context-adjusted school comparisons: cross-state SBAC scales placed on a common metric, demographic adjustments published transparently, and the pandemic carve-out treated as a data gap rather than a noise-ridden observation. SEDA's most recent public update lags by years, and its data is delivered as research-grade CSVs intended for academics.

SchoolDecoder picks up where SEDA leaves off:

Same carve-out. The 2019-20, 2020-21, and 2021-22 school years are excluded from the data window. This matches the approach Reardon's Education Recovery Scorecard takes to pre/post-pandemic comparisons.
Same comparable-scale instinct. SBAC scores from a clean five-year pre-pandemic window are placed alongside a clean three-year post-pandemic window, on the same composite, with the gap visible to the reader.
Plain-language presentation. Where SEDA hands you a download, SchoolDecoder hands a parent a school page with a sparkline that shows the gap, a trajectory headline that names the direction, and a methodology section they can read in five minutes.

The full data-window contract is in Decoded Rank → Data window and the internal docs/methodology-calibration.md file in the repository.

The question SchoolDecoder asks

Most school rating sites answer one question: how did students at this school score on the state test?

That question matters, and SchoolDecoder shows the answer too — we call it the Achievement Rank. But raw test scores reflect at least two things at once: what students learned, and what students brought with them when they walked in the door. Family income, parent education, neighborhood stability, and access to outside-of-school resources are all visible in test scores. None of those are the school's doing.

So SchoolDecoder asks a different question:

How did students perform compared with what we'd expect given the school's context?

That is the question the Decoded Rank tries to answer. It is one lens, not a verdict.

What School Alpha is

In methodology and API documentation, the technical name for the context-adjusted residual is School Alpha (also called Context-Adjusted Performance). School Alpha is the largest component of the K-8 Decoded Rank, not the whole rank. For high schools in the current release, the HS Decoded Rank is Alpha-only and uses a separate high-school pool.

School Alpha is the residual between a school's actual test performance and its expected test performance given public socioeconomic and enrollment data. In equation form, the idea is straightforward:

School Alpha = standardized actual performance
             − standardized expected performance

For each state, grade, subject, and year, SchoolDecoder:

Takes the school's achievement measure and standardizes it within the assessment context. Oregon and Washington use an achievement composite computed from the four published performance levels because they do not publish school-level mean scale scores; states that publish a verified mean scale score use that value directly. The data sources page describes the composite formula, provenance, and limitations.
Predicts a standardized score from socioeconomic and enrollment context variables.
Computes School Alpha as the gap between the actual standardized score and the predicted standardized score.
Aggregates School Alpha across grades and subjects into a school-level score, weighted by sample size.
Applies shrinkage so that small or volatile schools do not produce extreme rankings on noise alone.

A positive Alpha means the school performed above what its context would predict. A negative Alpha means the school performed below. Either way, Alpha is a comparison against expectation, not a causal estimate of teaching quality.
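The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the production model: it assumes a single context predictor (a hypothetical `context_index`), an ordinary least-squares fit, and made-up function names. Shrinkage is left out here.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize values to mean 0 and standard deviation 1 within one cell."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def fit_line(x, y):
    """Ordinary least squares with one predictor: y ~ intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def cell_alphas(scores, context_index):
    """Raw Alpha per school within one state/grade/subject/year cell."""
    actual = zscores(scores)                      # standardized actual performance
    intercept, slope = fit_line(context_index, actual)
    expected = [intercept + slope * c for c in context_index]
    return [a - e for a, e in zip(actual, expected)]

def school_alpha(cell_results):
    """Sample-size-weighted mean of (alpha, n_tested) pairs for one school."""
    total_n = sum(n for _, n in cell_results)
    return sum(alpha * n for alpha, n in cell_results) / total_n
```

For example, a school with an Alpha of 0.4 across 100 tested students in one cell and an Alpha of -0.2 across 50 tested students in another aggregates to roughly 0.2, because the larger cell counts twice as much.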

Inputs

The model uses public data only. The four input families are state assessment data, federal school directory and demographic data, federal civil rights and enrollment context data, and Census tract-level socioeconomic data.

We do not use race as a predictor variable. We use income, neighborhood education levels, free/reduced-price lunch status (with adjustments for the Community Eligibility Provision), English-language-learner share, and special-education share, plus enrollment and NCES locale category. Locale is treated as a category, not as a city-to-rural numeric ladder.

The English-language-learner and special-education shares are not available for every state. In California, the largest state we publish, neither is used: no school-level English-learner share is available in our pipeline yet, and the federal special-education figure that does exist covers about 93% of California schools, so using it would drop roughly 127 schools from the published rankings entirely. We would rather rank those schools with a slightly simpler model than remove them for a data gap that is not theirs. California is therefore adjusted for income, neighborhood education, free/reduced-price lunch status and locale only. Each state's exact predictor set is recorded with its published scores, and every gap carries a dated expiry rather than being left open-ended.

For complete source descriptions, vintages, update cycles, and licensing notes, see SchoolDecoder data sources.

Shrinkage and confidence

A small school with twenty tested students can produce an unusually high or unusually low residual by chance. A larger school with two hundred tested students is far less likely to. Treating both as equally trustworthy would let tiny schools dominate the top and bottom of a ranking on noise alone.

To prevent that, SchoolDecoder applies shrinkage: the published School Alpha is pulled toward zero based on tested sample size, coverage across grade/subject cells, and year-to-year volatility when prior Alpha history is available. A small or volatile school's published Alpha is closer to zero than its raw residual. A large, stable school's published Alpha is close to its raw residual.
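As a rough illustration of the idea, here is one common way to shrink an estimate toward zero. The full formulas are not published on this page, so everything below is an assumption: the function name, the way coverage and volatility reduce the effective sample size, and the `prior_strength` constant are all hypothetical.

```python
def shrink_alpha(raw_alpha, n_tested, cell_coverage=1.0, volatility=0.0,
                 prior_strength=50.0):
    """Pull a raw Alpha toward zero. All constants here are hypothetical.

    n_tested       : tested students behind the estimate
    cell_coverage  : share of grade/subject cells with data, in (0, 1]
    volatility     : year-to-year spread of prior Alphas (0 if no history)
    prior_strength : pseudo-count controlling how hard small schools are pulled
    """
    # Thin coverage or high volatility makes the sample count for less.
    effective_n = n_tested * cell_coverage / (1.0 + volatility)
    # Weight is near 0 for tiny schools and approaches 1 for large, stable ones.
    weight = effective_n / (effective_n + prior_strength)
    return weight * raw_alpha
```

With these placeholder settings, a raw Alpha of 0.8 at a school with 200 tested students is published as 0.64, while the same raw Alpha at a school with 20 tested students is pulled much closer to zero.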

Confidence is also reported on every school page. We show data confidence as a single qualitative line — High, Medium, or Low — based on tested count, available testing years up to the ranked year, participation rate, and volatility. Lower confidence does not mean the school is bad. It means the ranking should be read with wider uncertainty.
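A confidence line of this kind could be derived along the following lines. This is only a sketch: the four inputs come from the paragraph above, but every cutoff and the "count the checks passed" rule are hypothetical stand-ins, not the published thresholds.

```python
def confidence_label(n_tested, years_available, participation_rate, volatility):
    """Return 'High', 'Medium', or 'Low'. Every cutoff below is hypothetical."""
    checks = [
        n_tested >= 100,             # enough tested students
        years_available >= 3,        # enough testing years up to the ranked year
        participation_rate >= 0.95,  # most enrolled students were tested
        volatility <= 0.25,          # Alpha has been stable year to year
    ]
    passed = sum(checks)
    if passed == 4:
        return "High"
    if passed >= 2:
        return "Medium"
    return "Low"
```

The label describes the data, not the school: a school can have a strong Alpha and still carry a Low label if few students were tested.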

A dedicated confidence page with the full shrinkage formulas and a worked example will be published in the weeks following launch. Until then, the short version is: smaller schools get pulled toward zero more aggressively, and confidence is shown next to the rank, not buried.

Story types

Every school's pair of ranks — Achievement Rank and Decoded Rank — falls into one of five story types. The labels are deliberately plain.

Story label	When it fires
Strong by Both Measures	High raw achievement and high decoded performance. Students score well, and the school still performs above expected once context is considered.
Outperforming Expectations	Medium or low raw achievement, high decoded performance. Raw scores may understate the school's performance for its context.
High Scores, Context Matters	High raw achievement, medium or low decoded performance. Students score well, but the school performs closer to expected once neighborhood context is considered.
Lower by Both Measures	Low raw achievement and low decoded performance.
Typical Profile	Everything else. The school performs roughly as expected for its context.

Most schools will get Typical Profile. That is correct. Most schools perform roughly as expected for their context, and SchoolDecoder will not invent a story where there is not one.

A non-Typical label only fires when both conditions are met together: the Decoded Rank (the all-factors-combined number) and the Achievement Rank each clear the percentile threshold that the label requires by a comfortable margin, and the school's data confidence is not Low. The threshold and gate parameters are kept in configuration, not hardcoded into editorial templates, so the rules can be tuned without rewriting copy. The intent is stability: schools sitting near a boundary stay in Typical Profile rather than flipping labels year over year on small movements.
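The label rules in the table can be sketched as a small function. The percentile cutoffs, the margin, and the names `StoryConfig`, `band`, and `story_type` are hypothetical; the real parameters live in configuration and are not listed on this page.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StoryConfig:
    high: float = 75.0    # hypothetical percentile cutoff for "high"
    low: float = 25.0     # hypothetical percentile cutoff for "low"
    margin: float = 5.0   # hypothetical buffer a rank must clear beyond a cutoff

def band(pct, cfg):
    """Classify a percentile (0-100, higher is better) as high, low, or mid."""
    if pct >= cfg.high + cfg.margin:
        return "high"
    if pct <= cfg.low - cfg.margin:
        return "low"
    return "mid"

def story_type(achievement_pct, decoded_pct, confidence, cfg=StoryConfig()):
    """Map a pair of percentile ranks plus a confidence label to a story label."""
    if confidence == "Low":          # confidence gate: no story on weak data
        return "Typical Profile"
    a, d = band(achievement_pct, cfg), band(decoded_pct, cfg)
    if a == "high" and d == "high":
        return "Strong by Both Measures"
    if d == "high":                  # raw achievement is medium or low here
        return "Outperforming Expectations"
    if a == "high":                  # decoded performance is medium or low here
        return "High Scores, Context Matters"
    if a == "low" and d == "low":
        return "Lower by Both Measures"
    return "Typical Profile"
```

Because of the margin, a school at the 78th percentile on both ranks stays in Typical Profile under these placeholder settings, even though it sits above the nominal 75th-percentile cutoff.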

Tier vs exact rank

A school's Decoded Rank is shown two ways at once: as a tier and as a precise rank within that tier. For example:

Top 3% decoded · #8 of 312

The tier is the precision claim. "Top 3%" describes the school's standing with the kind of certainty the model genuinely supports.

The exact rank is a locator within the tier. "#8 of 312" lets a parent see where this school sits relative to neighbors, but a one- or two-place difference between two schools in the same tier should not be read as a meaningful gap. Both the tier label and the exact rank are shown so the page is honest about where the precision is and where it is not.

The tier label and the exact rank are computed from the same underlying Decoded Rank — the composite for K-8 schools, the Alpha-only HS rank for high schools. The tier reflects how confident we are in the rank's neighborhood; the exact rank is the address inside that neighborhood. The tooltip on the rank card carries the rank's 95% confidence interval — that interval is the uncertainty number.

Cross-state metro handling

Some metros cross state lines. The Portland-Vancouver metro is the launch example: Multnomah, Washington, and Clackamas counties are in Oregon, and Clark and Skamania counties are in Washington.

Oregon and Washington both administer Smarter Balanced for grades 3–8 in ELA and math, but their public school files do not contain school-level mean scale scores. SchoolDecoder computes the Achievement signal within each state first, standardizes it, and then forms the full Portland-Vancouver metro pool. The displayed cross-state rank is therefore a pooled standardized placement, not a comparison of raw OR and WA composite points.

The same fail-safe applies when a metro spans states with different assessments: state-test results are standardized within their own assessment context before any pooled placement. School Alpha is also computed within each state, grade, subject, and year before aggregation. Every cross-state metro carries a methodology disclosure.
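The standardize-then-pool order can be illustrated as follows. This is a sketch under assumptions: schools are plain dictionaries with hypothetical keys `id`, `state`, and `signal`, and standardization is a simple z-score across every school in the state, not only those in the metro.

```python
from statistics import mean, pstdev

def standardize_within_state(schools):
    """Return copies of the school dicts with a within-state z-score added."""
    by_state = {}
    for s in schools:
        by_state.setdefault(s["state"], []).append(s["signal"])
    stats = {st: (mean(v), pstdev(v)) for st, v in by_state.items()}
    return [
        {**s, "z": (s["signal"] - stats[s["state"]][0]) / stats[s["state"]][1]}
        for s in schools
    ]

def pooled_metro_rank(schools, metro_ids):
    """Rank metro schools (1 = highest) on their within-state z-scores."""
    standardized = standardize_within_state(schools)
    metro = [s for s in standardized if s["id"] in metro_ids]
    ordered = sorted(metro, key=lambda s: s["z"], reverse=True)
    return {s["id"]: i for i, s in enumerate(ordered, start=1)}
```

The point of the ordering is that raw composite points never meet across the state line. A school that is one standard deviation above its own state's average places the same way whether that state's raw numbers run high or low.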

We do not publish national raw achievement percentiles. National raw comparison requires NAEP-linked calibration, which is on the roadmap but not in the launch.

What School Alpha is NOT

School Alpha is a context-adjusted residual. It is not a causal estimate, not a teaching-quality score, and not a national achievement percentile.

A high School Alpha is consistent with stronger-than-expected performance. It does not prove the school caused the result. A school may serve a student population whose advantages are not fully captured in public data; the residual will reflect that.

A low School Alpha is consistent with weaker-than-expected performance for the school's context. It does not prove the school is failing students, and it does not mean students at the school are doing poorly in absolute terms — many high-achievement schools have modest decoded scores because their raw scores are largely explained by the advantages students bring.

For the full list of what the score can and cannot claim, the known omitted variables, and the cases where the model is most cautious, see SchoolDecoder limitations.

FAQ

What is a Decoded Rank?

A Decoded Rank is a school's context-aware ranking. For K-8 schools, it is a weighted composite: School Alpha is the largest component, and the rank also includes cohort learning rate, equity-within, recent trajectory, and participation. For high schools, the current release publishes a separate Alpha-only HS Decoded Rank when the high-school assessment data clears eligibility.
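For readers who want to see the arithmetic of a weighted composite, here is a minimal sketch. Only the 35% weight on School Alpha comes from this page. The split of the remaining 65% across the other four components is a placeholder invented for illustration, as are the component names used as dictionary keys; the real weights are on the Decoded Rank composite page.

```python
ALPHA_WEIGHT = 0.35  # stated on this page

# Placeholder split of the remaining 0.65. These four values are NOT the
# published weights; they exist only so the example runs.
EXAMPLE_WEIGHTS = {
    "school_alpha": ALPHA_WEIGHT,
    "cohort_learning_rate": 0.20,
    "equity_within": 0.15,
    "trajectory": 0.15,
    "participation": 0.15,
}

def composite_score(components, weights=EXAMPLE_WEIGHTS):
    """Weighted sum of standardized component scores for one K-8 school."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    missing = set(weights) - set(components)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(weights[name] * components[name] for name in weights)
```

A school that scores 1.0 on School Alpha and 0.0 on everything else gets a composite of 0.35, which is what "largest component, not the whole rank" means in practice.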

How does SchoolDecoder adjust for income?

SchoolDecoder builds a composite socioeconomic index for each school from Census tract-level median household income, poverty rate, and adult educational attainment, plus school-level enrollment and demographic context such as free/reduced-price lunch share and, where the state publishes them, English-language-learner share and special-education share. (California's model uses neither of those two shares, because usable figures are not available for every school; it adjusts for income, neighborhood education, free/reduced-price lunch status and locale only.) The model then predicts standardized test performance from that index and reports School Alpha as the residual between actual and predicted performance. The composite index is more robust than any single income measure, and it does not let one variable dominate the prediction.
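One simple way to build an index like this is to standardize each input and average the results. The sketch below assumes equal weights and uses only the three Census inputs; the function names, the `pct_bachelors` stand-in for adult educational attainment, and the equal weighting are illustrative assumptions, not the published formula.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize values to mean 0 and standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def ses_index(median_income, poverty_rate, pct_bachelors):
    """Equal-weight mean of z-scored inputs, one value per school.

    Poverty is sign-flipped so that a higher index always means a more
    advantaged context.
    """
    z_inc = zscores(median_income)
    z_pov = [-z for z in zscores(poverty_rate)]
    z_edu = zscores(pct_bachelors)
    return [mean(triple) for triple in zip(z_inc, z_pov, z_edu)]
```

Because every input is put on the same scale before averaging, a school in a tract with an unusually high income but ordinary education levels does not get an extreme index from income alone.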

Why don't you use race in the model?

Race is not used as a predictor in School Alpha. We use income, neighborhood education levels, free/reduced-price lunch status (adjusted for the Community Eligibility Provision), and — in states that publish usable school-level figures — English-language-learner share and special-education share to capture school context. This is an editorial choice as well as a modeling one: a school's Decoded Rank should not be lower or higher because of the racial composition of its students. SchoolDecoder publishes a bias audit subpage that reports the distribution of School Alpha by racial composition of served students, so any residual relationship is documented honestly rather than implied away.

Can School Alpha prove a school causes better outcomes?

No. School Alpha is a context-adjusted residual, not a causal estimate. It can show that a school performed above or below expectation given available public context. It cannot prove the school caused the difference. SchoolDecoder uses language like "outperforming expectations," "performs above expected for its context," and "raw scores may understate this school's performance" — never "this school causes better outcomes."

What if my school is small?

Confidence is lower for schools with fewer tested students. SchoolDecoder applies shrinkage so that small schools' published Alpha is pulled toward zero, and the confidence line on the school page reports a lower tier with a wider rank interval. A small school can still be ranked, but the ranking should be read with the wider uncertainty in mind. Schools below the state's suppression threshold for tested counts are not publicly ranked.

How current is the data?

Data for monitored launch states is updated within two weeks of each state's release. National coverage updates state-by-state as data becomes available. Each school page displays the assessment year, the Census ACS vintage used for socioeconomic context, the NCES Common Core of Data year used for enrollment, and the page's last-updated date.