Special education runs on paperwork and promises. Districts write Individualized Education Programs (IEPs) for every qualifying student, then have no reliable way to prove those plans worked. The services happen. Whether they produce results anyone can measure is a different question. I’ve spent the last two years at Parallel Learning building the data systems to answer that. The answer turned out to be more technical, and more human, than I expected.

The stakes are not abstract. Sixteen thousand more special education teachers leave the profession each year than enter it. Student numbers keep climbing: 7.5 million in the 2022-2023 school year, up from 7.1 million four years earlier, now roughly 15% of all public school students. And the $190 billion in federal pandemic aid that propped up programs expired in September 2024.

Parallel Learning works with school districts to fill that gap through virtual service delivery. During the 2024-2025 school year, our providers delivered more than 77,000 sessions to students across 18 states. These weren’t supplemental. They were the legally mandated services written into each student’s IEP: speech-language therapy, behavioral mental health counseling, specialized instruction, and psychoeducational evaluations. Eighty percent of the communities we served scored High or Very High on the CDC’s Social Vulnerability Index, meaning we were reaching the districts least equipped to solve their own shortages.

Our platform, Pathway, is where all of this gets tracked. Goals, metrics, session notes, and progress charts live inside the same telehealth interface providers use to deliver services. No toggling between a video call and a separate system. That design choice would matter later, when we started measuring what we’d built.

Every student’s IEP looks different. One kid has a single articulation goal. Another has four goals spanning reading comprehension, written expression, and social-emotional behavior. The goals vary in type: over 90% of ours are accuracy-based, but we also track frequency, prompting levels (how much adult support a student needs), and duration. The targets differ. The timelines differ.

Completion rates seemed like the obvious starting metric. Did the student meet their goal or not? Simple. It didn’t work. Providers consistently under-report completion. Many inherit students mid-cycle from other clinicians and don’t feel authorized to mark objectives as achieved. Out of thousands of tracked objectives, only 170 had ever been manually marked complete. Completion data told us almost nothing about whether students were actually advancing.

So we shifted to baseline-to-peak progress: compare each student’s first recorded metric against their highest recorded value. A student who started at 26.7% accuracy and reached 92.8% made progress, regardless of whether anyone clicked an “achieved” button.

This required a few deliberate choices. We use actual baseline performance, not a zero baseline. Some districts advocate for zero-baseline calculations, but our clinical team argues the first recorded metric is more accurate given provider assessment protocols. When a student starts above their IEP target and stays there, we count that as progress too, because targets are sometimes set below a student’s actual capability. And we filter for data sufficiency: a minimum of two recorded metrics per objective, with a clinical standard of five sessions, representing roughly a month of intervention.
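The rule is simple enough to fit in a few lines. Here’s a minimal sketch in Python, assuming metrics arrive in chronological order; the `Objective` shape and the `shows_progress` helper are illustrative names, not Pathway’s internals:

```python
from dataclasses import dataclass

MIN_METRICS = 2        # hard floor: two recorded metrics per objective
CLINICAL_STANDARD = 5  # preferred: five sessions, roughly a month of intervention

@dataclass
class Objective:
    metrics: list[float]  # chronological recordings, e.g. percent accuracy
    target: float         # the IEP target for this objective

def shows_progress(obj: Objective, floor: int = MIN_METRICS) -> bool | None:
    """Baseline-to-peak: compare the first recorded metric against the
    highest recorded value. Pass floor=CLINICAL_STANDARD to enforce the
    stricter five-session standard. Returns None when the objective
    fails the data-sufficiency filter."""
    if len(obj.metrics) < floor:
        return None
    baseline, peak = obj.metrics[0], max(obj.metrics)
    # Starting above target and staying there counts as progress,
    # because targets are sometimes set below actual capability.
    if baseline >= obj.target and peak >= obj.target:
        return True
    return peak > baseline

# The example from the text: 26.7% baseline, 92.8% peak.
print(shows_progress(Objective(metrics=[26.7, 92.8], target=80.0)))  # True
```

Returning `None` rather than `False` for thin data matters: an objective with a single recorded metric isn’t a failure, it’s an unknown, and folding unknowns into the failure count would recreate the completion-rate problem in a new form.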
Since Pathway’s release, we’ve captured 102,400 individual metrics across 4,697 students with quantified IEP objectives. Students average four sessions that generate progress data. Across 18,709 tracked objectives, metrics get recorded every 16.7 days on average, and the median gap between consecutive recordings is eight days.

The methodology comes from a white paper we published in late 2025, written by Dr. A. Jordan Wright and Stacie Corder. Every student’s IEP year starts at a different calendar point, so we standardized goals across students, weighted them equally, and scaled target expectations by how long each student had been working. A student whose IEP was set in February and whose school year ended in June would have a target of completing 50% of their goals by year’s end. We defined three categories: below expectations (more than 10% under target), on target (within 10%), and above expectations (more than 10% over). The 10% band was chosen to absorb the natural session-to-session variance in performance; a code sketch of the scaling and banding appears below.

Across all service lines, 98% of students were at or above target expectations. Only 2.1% of speech-language pathology students fell below; for behavioral mental health and specialized instruction, the figure was zero.

We also run satisfaction surveys at the end of every session: a slider on a 0-10 scale calibrated for children. The average score for how students feel about their provider is 8.9 out of 10. People assume a screen makes therapeutic relationships harder. The students don’t seem to agree.

The clearest pattern in our data: more appointments mean more progress. Students who showed progress averaged 13.88 appointments; students without progress averaged 5.70. The medians tell the same story: 13 versus 4. What’s interesting isn’t the direction of the correlation, which is obvious. It’s the magnitude of the gap. The distributions barely overlap: students with fewer than five sessions almost never show progress, and students with more than ten almost always do. That makes the case for consistent service delivery concrete and numerical. In a field plagued by cancellations, no-shows, and staffing gaps, the data now says exactly what’s at stake when a session doesn’t happen.

Age and geography, by contrast, aren’t strong predictors. Mean age for students with and without progress is nearly identical: 10.14 versus 10.20. Regional progress rates hold steady across the Midwest, Northeast, South, and West, ranging from 82% to 86%. Whatever drives progress, it isn’t where the student lives or how old they are. It’s whether they show up, and whether someone shows up for them.

Progress measurement was the foundation, not the end. The next step is normalized regression metrics: progress velocity calculations and cross-student comparisons that could eventually power an intervention recommendation engine. If we know that students with certain profiles respond faster to specific approaches, we can stop guessing and start prescribing.

We’re also restructuring goal writing itself. AI-assisted goal creation and a goal bank should standardize the format while preserving clinical flexibility. Data capture gets cleaner when goals are written in ways that naturally produce comparable outputs.

Further out, we’re training the system to flag whether poor performance stems from individual providers or from systemic, account-level factors, using statistical benchmarks against organization-wide averages. When a whole account underperforms, the diagnosis changes.
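That flagging logic could start as something as simple as two z-score checks: benchmark the account against the organization first, and only then the provider. A sketch under assumed names and an assumed cutoff of two standard deviations; none of this is Pathway’s actual implementation:

```python
from statistics import mean, stdev

def z_score(value: float, population: list[float]) -> float:
    """Standard deviations between a value and a population mean."""
    mu, sigma = mean(population), stdev(population)
    return 0.0 if sigma == 0 else (value - mu) / sigma

def diagnose(provider_rate: float, account_mean: float,
             org_account_means: list[float], org_provider_rates: list[float],
             cutoff: float = -2.0) -> str:
    """Separate systemic from individual underperformance: check the
    account against org-wide account averages before blaming a provider."""
    if z_score(account_mean, org_account_means) < cutoff:
        return "systemic: the whole account trails the organization"
    if z_score(provider_rate, org_provider_rates) < cutoff:
        return "individual: the provider trails organization-wide norms"
    return "within normal variance"

# A provider at a 58% progress rate inside an account averaging 60%,
# against an organization clustered in the low 80s: the account, not
# the provider, is the story.
print(diagnose(0.58, account_mean=0.60,
               org_account_means=[0.82, 0.85, 0.84, 0.86, 0.83, 0.81],
               org_provider_rates=[0.80, 0.88, 0.79, 0.84, 0.58, 0.83]))
```

The ordering of the checks encodes the point above: when a whole account underperforms, docking individual providers for it would misdiagnose a systemic problem as a personal one.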
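To close the loop on the expectation model from earlier: it reduces to two small functions, one that pro-rates the year-end target by elapsed time and one that buckets actual completion into the three bands. The names and period lengths here are illustrative assumptions; the white paper’s exact scaling may differ:

```python
def expected_completion(days_elapsed: int, iep_year_days: int) -> float:
    """Pro-rate the goal-completion target by how long the student has
    been working, clamped to [0, 1]. An IEP set in February measured at
    a June year-end lands near 0.5, matching the 50% example above
    (assuming a school-year-length IEP window)."""
    return min(max(days_elapsed / iep_year_days, 0.0), 1.0)

def categorize(actual: float, expected: float, band: float = 0.10) -> str:
    """Three bands around the scaled expectation; the 10% band absorbs
    session-to-session variance."""
    if actual < expected - band:
        return "below expectations"
    if actual > expected + band:
        return "above expectations"
    return "on target"

# Five months into a ten-month IEP window with 55% of equally weighted
# goals complete: expected 0.5, so the student is on target.
print(categorize(0.55, expected_completion(150, 300)))  # "on target"
```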
These tools make the clinical team faster. They don’t replace clinical judgment.

Federal law requires districts to deliver special education services. It says almost nothing about proving those services work. The gap between mandate and measurement has persisted because the problem is hard and nobody was forced to close it. We’re closing it. And the question that should follow isn’t whether 98% is impressive. It’s why every provider of special education services isn’t held to the same standard of proof.