The Educational Randomized Controlled Trial (ERCT) Standard is a rigorous framework addressing key challenges in educational studies. With 12 criteria across 3 progressive levels, it addresses issues such as bias, limited scope, and short-term focus, enabling researchers to produce actionable results that improve education systems worldwide.
The ERCT Standard has 3 levels, each containing 4 criteria (a code sketch of this structure follows the criteria list below):
Class-level RCT: Tests interventions at the classroom level to prevent cross-group contamination
Uses standardized exams for objective and comparable results
Ensures studies last at least one academic term to measure meaningful impacts
Requires detailed control group data for proper comparisons
Expands testing to whole schools for real-world relevance
Removes bias by using third-party evaluators
Ensures studies last at least one academic year to measure meaningful impacts
Ensures equal time and resources for both groups to isolate the intervention's impact
Independently replicated study: requires the study's findings to be reproduced by a different research team
Assesses effects across all core subjects, avoiding imbalances
Tracks students until graduation to evaluate long-term impacts
Increases transparency by publishing study plans before data collection
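As a rough illustration only, the structure above and the prerequisite rules cited in the evaluations below can be sketched in Python. The criterion keys and the cumulative-level scoring are assumptions made for illustration, not official ERCT definitions.

# Minimal sketch of the ERCT Standard's structure: 12 criteria in 3
# progressive levels, plus the two prerequisite rules referenced in the
# evaluations below ("Criterion A is not met because criterion E is not
# met"; "since Year Duration is not met, Graduation Tracking cannot be
# met"). Keys and scoring are illustrative assumptions.

ERCT_LEVELS = {
    1: ["class_level_rct", "exam_based_assessment",
        "term_duration", "documented_control_group"],
    2: ["school_level_rct", "independent_evaluation",
        "year_duration", "balanced_resources"],
    3: ["independent_replication", "all_subject_assessment",
        "graduation_tracking", "pre_registration"],
}

# A criterion cannot be met unless its prerequisite criterion is met.
PREREQUISITES = {
    "all_subject_assessment": "exam_based_assessment",
    "graduation_tracking": "year_duration",
}

def apply_prerequisites(verdicts):
    """Force a criterion to 'not met' when its prerequisite is not met."""
    adjusted = dict(verdicts)
    for criterion, prerequisite in PREREQUISITES.items():
        if not adjusted.get(prerequisite, False):
            adjusted[criterion] = False
    return adjusted

def erct_level(verdicts):
    """Highest level reached, assuming levels are cumulative: all four
    criteria of a level and of every lower level must be met."""
    adjusted = apply_prerequisites(verdicts)
    reached = 0
    for level in sorted(ERCT_LEVELS):
        if all(adjusted.get(c, False) for c in ERCT_LEVELS[level]):
            reached = level
        else:
            break
    return reached

Under this sketch, a study whose outcomes are not exam-based (for example, one relying on grades or GPAs) automatically fails the all-subject assessment criterion, and a study that does not meet year duration cannot meet graduation tracking, mirroring statements such as "Criterion A is not met because criterion E (standardized exam-based assessment) is not met" in the evaluations that follow.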
The study is randomized at the school level, which satisfies class-level randomization.
Outcomes were measured using Texas STAAR, a standardized state assessment.
The study ran across two school years, exceeding the minimum one-term requirement.
The control condition is clearly described and supported by detailed baseline and implementation documentation.
Random assignment was implemented at the school level (64 schools).
RAND conducted the evaluation independently of Zearn, supported by IES funding.
Outcomes were tracked over two full academic years (2022-2023 and 2023-2024).
Although Zearn provided additional implementation supports, teachers reported similar total math instructional time across groups and the supports appear integral to the tested intervention package.
An independent, peer-reviewed randomized study by a different author compared Zearn Math to another program, serving as an external replication effort on the intervention.
The paper reports standardized outcomes only for mathematics, not for all core subjects.
The study reports outcomes through the end of the second study year and does not track students to graduation.
The paper states a REES registry ID, but the public registry entry date could not be verified to be before study start.
Zearn Math is a popular software platform for K-8 mathematics learning, designed to enable all students to successfully access grade-level content. RAND researchers collaborated with Zearn, the product's developer, to design this evaluation. Then RAND conducted the study independently, randomly...
Randomisation occurred at the school level, which meets or exceeds the class‑level RCT requirement.
The study uses the nationally standardized ENLACE exam for objective, comparable assessment.
Outcomes were collected approximately three years after the intervention began, exceeding the minimum of one academic term.
Baseline demographics and pre‑intervention scores for the control group are provided in detail.
Randomisation at the school level fulfills the School‑level RCT requirement.
The evaluation was conducted by researchers unaffiliated with SEP, ensuring independence of the study’s implementation and analysis.
Participants were followed for approximately three years, which is longer than one academic year from intervention start to final outcomes.
Any difference in time or resources between groups is trivial and integral to the intervention, and is unlikely to bias the results.
The paper does not reference any independent replication studies, and none were found in external literature.
Only math and language outcomes are measured, omitting other main subjects.
The RCT measured on-time high school completion, meaning participants were followed through the completion of that educational level.
No pre-registration statement or registry ID is provided in the paper or its references.
We use data from the randomized control trial of the Percepciones pilot to study whether providing 10th grade students with information about the average earnings associated with different educational attainments, life expectancy, and obtaining funding for higher education can contribute...
Randomization occurred at the school (above-class) level, avoiding within-class mixing and meeting the class-level RCT standard.
Student learning was measured with the TerraNova standardized test, a well‑established, nationally normed exam.
The intervention spanned a full academic year of seventh-grade instruction, exceeding the one-term minimum.
The control group’s practices, demographics, and baseline scores are clearly documented for proper comparison.
Entire schools (not just classes) were randomized, satisfying the school-level RCT requirement.
An independent research team evaluated the intervention, and no authors had a financial interest in ASSISTments.
The trial ran for a full academic year (with data from the second year cohort), meeting the one‑year duration requirement.
Both groups followed identical homework policies and content; the ASSISTments tool and training were the core treatment.
An independent replication by another research team found similar positive results, confirming the original findings.
Only mathematics achievement was assessed; other subjects were not tested.
Students were originally tracked only through 7th grade; a separate follow-up measured outcomes at the end of 8th grade, but tracking did not extend to high school graduation.
The study was not pre-registered; no registry or preregistration statement is provided.
In a randomized field trial with 2,850 seventh-grade mathematics students, we evaluated whether an educational technology intervention increased mathematics learning. Assigning homework is common yet sometimes controversial. Building on prior research on formative assessment and adaptive teaching, we predicted that...
Randomization was at the school level, which satisfies the class-level RCT requirement.
Outcomes were measured using EGRA and EGMA, which are standardized assessment systems.
Midline outcomes were measured about 1.5 years after implementation began, exceeding one academic term.
The control group is described with sample sizes and baseline characteristics, including a baseline balance table.
Treatment was assigned at the school level, meeting the school-level RCT requirement.
Data collection involved an external firm and the authors analyze secondary data rather than directly implementing the intervention.
Midline outcomes were measured about 1.5 years after implementation began, exceeding one academic year.
The intervention adds facilitated discussions, and those added resources are the treatment being tested rather than an uncontrolled add-on.
No independent reproduction of this specific study was identified in the paper, and none could be verified from accessible sources.
The study measures mathematics and reading only, not all core subjects.
Participants were followed through endline in 2016, which is not tracking the cohort to graduation.
The study reports AEA RCT Registry registration, but the registry entry timing could not be verified from accessible sources for this check.
This article evaluates the impact that facilitated discussions about girls’ education have on education outcomes for students in rural Zimbabwe. The staggered implementation of components of a randomized education project allowed for the causal analysis of a dialogue-based engagement campaign....
Centers (preschool sites) were randomly assigned, which is class-level or stronger randomization.
The outcomes use established, widely used assessments (IGDI, FastBridge, HTKS) rather than a custom test created for this study.
Outcomes are measured from fall to spring within a school year, which is at least one academic term (and in practice close to a full year).
The paper clearly defines a comparison control group (including sample sizes) and reports balance checks and baseline equivalence statements.
Randomization occurred at the preschool center level, which meets the school-level RCT requirement in this context.
The evaluation is conducted by NORC at the University of Chicago, while the program is implemented in collaboration with Kidango as the implementation partner.
The study measures outcomes from fall to spring within a school year, and the intervention is described as occurring during the 2017-2018 school year.
Any additional resources (PD and coaching) are the intervention being tested, and the comparison group is business-as-usual with delayed treatment.
No peer-reviewed, independent replication of this specific SEEDS PD RCT was found in the available literature.
The study focuses on language/literacy and executive function outcomes and does not assess all core subjects (for example, mathematics).
The study reports outcomes within preschool years (fall-to-spring) and does not track students through graduation from the educational stage.
The paper does not cite a pre-registration record or registry identifier, and no verified pre-registration entry was found.
NORC at the University of Chicago designed and implemented an impact evaluation of the SEEDS of Learning (SEEDS) professional development (PD) program on behalf of the Kenneth Rainin Foundation, in collaboration with Kidango. SEEDS of Learning is an evidence-based PD...
Whole schools were randomly assigned to intervention and control, exceeding the class-level randomisation requirement.
Outcomes were measured using the standardised NWEA MAP Growth assessment rather than a custom test.
The study followed students from Autumn 2023 through the July 2024 post-assessment, exceeding one full term.
The report documents the control group definition, baseline school characteristics, and control schools' business-as-usual practices.
Twenty schools were randomised at the school level, satisfying the school-level RCT requirement.
The report separates the developer from the evaluator, with WhatWorked conducting randomisation and evaluation activities.
Outcomes were tracked from September 2023 to July 2024, covering an academic year.
The intervention did not add unbalanced time or resources: homework frequency and duration were similar and controls used other platforms.
No independent peer-reviewed replication of this specific school-level RCT was found.
Only mathematics attainment was assessed, not standardised outcomes in all core subjects.
Follow-up stops in July 2024 for the Year 7 cohort, with no graduation tracking reported.
The report provides no trial registration or protocol ID, and no public pre-registration could be verified.
This report evaluates the impact of Eedi, a digital mathematics platform, on raising maths attainment amongst Key Stage 3 students (Year 7). The study used a randomised controlled trial design with 20 schools, where schools were randomly assigned to either...
Randomization was conducted at the school level, satisfying the class-level requirement.
The study uses the ECE, Peru's national standardized assessment for primary schools.
Outcomes were measured approximately 9 months after the intervention began, exceeding the one-term requirement.
The control group is clearly defined as non-participating schools and their baseline characteristics are extensively documented in Table 2.
Randomization occurred at the school level across 6,218 schools.
The study evaluation was conducted by independent academics, distinct from the Ministry that designed the intervention.
The study measured outcomes after 1 and 3 years of program implementation.
The intervention explicitly tests the impact of adding significant resources (coaching), making the resource imbalance integral to the study design.
The study has not been independently reproduced.
Assessments were limited to mathematics and reading, omitting other core subjects like science.
Tracking ended at Grade 4, prior to primary school graduation.
No pre-registration of the study protocol is mentioned.
We evaluate the impact of a large-scale teacher coaching program in Peru, a context with high teacher turnover, on teachers' pedagogical skills and student learning. Previous studies find that small-scale coaching programs can improve teaching of reading and science in...
The study randomized entire villages (clusters), satisfying the requirement for class-level or stronger randomization.
The study used EGRA and EGMA, which are widely recognized standardized assessments.
The study duration spanned multiple years, significantly exceeding the one-term requirement.
The control group is well-documented, including demographics and confirmation that they received no educational intervention.
Randomization occurred at the village level, which serves as the implementation unit, satisfying the school-level RCT criterion.
Data collection was conducted by an independent organization (GHTC) with blinded administrators, ensuring independent conduct.
The study assessed outcomes approximately 30 months after the intervention began, satisfying the year duration criterion.
The study explicitly tested the impact of the additional resources (para-instructor intervention) as the treatment variable.
The intervention methodology has been replicated by independent teams (e.g., J-PAL/Banerjee et al.) as cited in the paper.
The study assessed outcomes in Reading and Mathematics, covering the main subjects for the target grade levels.
The study explicitly states that no further follow-up was conducted after the intervention ended, failing the graduation tracking requirement.
The paper cites a protocol published after the trial start but does not provide a specific pre-registration registry link or date in the text.
In common with many other low- and middle-income countries (LMICs), India has witnessed a massive expansion in school enrolment over the last 20 years, and yet many students finish primary education without the foundational literacy and numeracy skills that would...
Randomization was conducted at the school (cluster) level, satisfying the class‑level RCT criterion.
The study measured outcomes using national examination scores, a standardized assessment.
Intervention and follow‑up lasted ten months, exceeding a full academic term.
The control group’s size and baseline characteristics are clearly documented in the methods and tables.
Entire schools were randomized, satisfying the school‑level RCT criterion.
The same team that designed the intervention conducted and analyzed the trial without independent oversight.
Participants were tracked from February through December—one full academic year.
HIIT was integrated into standard PE time without additional class time or resources, keeping groups balanced.
No independent replication of this intervention has been reported.
Outcomes were measured only in mathematics and Mongolian language, not all main subjects.
Participants were not followed until graduation; follow-up ended at study completion.
Trial registration was completed before the study began (registered 1st February 2018).
OBJECTIVES: Physical inactivity is an important health concern worldwide. We examined the effects of an exercise intervention on children’s academic achievement, cognitive function, physical fitness, and other health-related outcomes. METHODS: We conducted a population-based cluster RCT among 2301 fourth‑grade students...
The study randomised treatment at the grade‐by‐school level, satisfying the class‐level RCT requirement.
They used the Smarter Balanced standardized exams for Math and ELA as outcome measures.
The intervention lasted from late October through May, satisfying the full academic term requirement.
Control group characteristics and communications are clearly documented in the methods and Table 1.
Randomisation occurred at the grade‐by‐school level rather than entire schools.
The study was designed, implemented, and analyzed by the same team without external evaluation.
The study intervention covered a full academic year.
No additional instructional time or budget was provided, only low‑cost informational text messages.
A separate research team reproduced the intervention in another context and reported similar positive results.
Only Math and ELA were assessed via standardized exams, failing to cover all core subjects.
Participants were tracked only through the end of the school year, not until graduation.
The study’s analysis plan and outcomes were publicly pre-registered before data collection began.
While leveraging parents has the potential to increase student performance, programs that do so are often costly to implement or they target younger children. We partner text‐messaging technology with school information systems to automate the gathering and provision of information...
Randomization occurred at the department-grade (cohort) level, meeting the class-level RCT requirement.
The study measured outcomes using official final grades from institutional examination records.
The study ran for the full Spring 2024 semester (February-May), meeting the term-duration requirement.
The control condition is clearly described as business-as-usual with unrestricted phone access and documented baseline balance.
Randomization was within institutions (departments/cohorts), not between institutions, so it is not a school-level RCT.
The authors designed, implemented, and analyzed the study themselves, with no independent evaluator described.
The study covers one semester rather than a full academic year, so the year-duration requirement is not met.
Phone boxes were installed in all classrooms and the intervention was a policy enforcement rather than added instructional resources, so inputs are balanced.
No peer-reviewed independent replication of this specific RCT was found.
The outcome reflects overall grades across (nearly) all courses, providing an all-subject style assessment rather than a single subject test.
The paper reports only semester-end outcomes and does not track students to graduation.
The paper links to a pre-registration, and the registry date is before the Spring 2024 data collection period.
Widespread smartphone bans are being implemented in classrooms worldwide, yet their causal effects on student outcomes remain unclear. In a randomized controlled trial involving nearly 17,000 students, we find that mandatory in-class phone collection led to higher grades - particularly...
Randomisation occurred at the school level with classes equally divided into intervention and control groups, satisfying the Class‑level RCT criterion.
The study used GL Assessment Progress Tests—standardized instruments in English, mathematics, and science—scored blind by the test publisher, fulfilling the ERCT Standard’s Exam‑based Assessment criterion.
Primary outcomes were measured 20 weeks after intervention start, exceeding a single academic term.
Control cohorts are described alongside intervention cohorts, with baseline comparisons implied.
The study’s RCT was conducted at the whole-school level: entire schools were randomly assigned to either the dialogic teaching intervention or the control condition, fulfilling the School‑level RCT criterion.
The RCT was conducted by an independent evaluation team, separate from the intervention’s designers.
The study lasted 20 weeks, well short of a full academic year.
Control group teachers did not receive the intervention’s teacher induction, training, or mentoring support.
No independent replication of this dialogic teaching trial is reported in the paper or elsewhere.
Student performance was assessed in English, mathematics and science, covering the core primary curriculum.
No follow‑up tracking to graduation is described in the study or in any subsequent publications.
No pre‑registration or protocol identifier is provided in the paper (the trial was only registered retrospectively on ISRCTN, after completion).
This paper considers the development and randomised control trial (RCT) of a dialogic teaching intervention designed to maximise the power of classroom talk to enhance students’ engagement and learning. Building on the author’s earlier work, the intervention’s pedagogical strand instantiates...
The unit of randomization was the classroom (teacher), meeting the class-level RCT requirement.
The primary outcomes were measured using established, standardized assessments (for example PAT-2, Woodcock-Johnson III, and PPVT-IV).
Outcomes were measured across the school year, exceeding a single academic term.
The control condition and baseline characteristics are documented, with both demographics (Table 1) and descriptions of control instruction.
Randomization was at the classroom (teacher) level rather than at the school level.
The intervention was developed and evaluated by the same research team, so the evaluation was not independent of the intervention designers.
The intervention and measurement spanned the full academic school year.
The treatment replaced part of the normal literacy block rather than adding extra student instruction time, and the study evaluates the full implementation package (curriculum plus required teacher training and coaching) against business-as-usual.
No independent replication by a different research team was identified.
Outcomes were limited to language and literacy, with no standardized measures reported for other core subjects.
The study did not track participants through graduation or an equivalent endpoint.
The paper does not report a pre-registration record or registry identifier for the trial.
The goal of the present study was to assess the effectiveness of Foundations for Literacy for deaf and hard-of-hearing (DHH) children. Forty-eight teachers in 14 states were randomly assigned to intervention or control groups. Teachers in the intervention group used...
Randomisation was student-level, but the intervention is targeted small-group instruction outside normal English lessons, fitting the tutoring exception.
Outcomes were measured using recognised standardised digital reading tests (NGRT and ART).
The outcome measurement occurred after approximately two terms (about six months), exceeding the one-term minimum.
The control group and its business-as-usual condition were described, and the paper reports baseline equivalence plus detailed counterfactual information about interventions used with control students.
Randomisation was at the individual student level, not the school level.
The intervention is attributed to Fischer Family Trust Literacy (FFTL), while the trial is authored by university researchers and funded by an independent foundation.
Outcomes were measured after approximately two terms (about six months), which is shorter than a full academic year.
The intervention adds time, training, and materials, but these resources are integral to the intervention being tested against business as usual.
No peer-reviewed independent replication of this specific high-school FFTL Reciprocal Reading evaluation by a different research team was found.
Only reading outcomes were assessed; impacts on all main subjects were not measured.
The study reports post-test at the end of the intervention and provides no evidence of tracking participants to graduation; also, criterion Y is not met so G cannot be met under the ERCT rules.
The trial was registered after enrolment had begun, so it does not meet the requirement for a prospectively pre-registered protocol.
Targeted reciprocal reading instruction can lead to improved reading attainment. Though tested in elementary schools, the technique is less studied with older students. This paper reports results from a Phase 3 definitive trial designed to detect attainment gains previously identified...
The study randomized entire schools to treatment or control, satisfying the requirement for class‑level RCT.
The study employed official state standardized exams for ELA and math, meeting the exam‑based assessment requirement.
Outcomes were measured in subsequent grades, well after at least one full academic term had elapsed, fulfilling the term‑duration requirement.
The control group’s makeup, baseline statistics, and alternate program are clearly documented, satisfying the requirement for a documented control group.
Randomization at the school level fulfills the school‑level RCT criterion.
The study was implemented and analyzed by the same team that developed the program, so there is no independent evaluation group.
Students’ outcomes were measured over multiple grades, covering at least a full academic year of follow‑up.
The control group’s activities were far less intensive than INSIGHTS, so resource allocation was not balanced.
There is no reference to an external, independent replication of the INSIGHTS trial.
The authors measured only ELA and math; other core subjects were not assessed.
No data are reported beyond sixth grade (middle school entry), so graduation tracking is incomplete.
No pre‑registered protocol or registry reference is provided in the paper.
Social‑Emotional Learning (SEL) programs are school‑based preventive interventions that aim to improve children’s social‑emotional skills and behaviors. Although meta‑analytic research has shown that SEL programs implemented in early childhood can improve academic and behavioral outcomes in the short‑term, there is...
Student-level randomization is acceptable here because the intervention is tutoring.
Outcomes were measured with standardized assessments (DIBELS-8 and i-Ready).
Outcomes were measured at end of year, well more than one term after start.
The control group is described as business-as-usual and baseline characteristics are reported.
Randomization occurred within classrooms rather than at the school level.
The paper evaluates externally developed, district-implemented programs rather than a researcher-designed intervention.
Implementation began in November, so the study does not span a full academic year from start of year.
Extra tutoring time is the treatment being tested, so a business-as-usual control is acceptable.
No independent replication of this paper's para-tutoring implementations and findings was found.
Only literacy and math outcomes are measured, not all core subjects.
The study reports end-of-year outcomes only and does not track students through graduation.
The paper claims preregistration but provides no registry link, ID, or date that can be verified.
Using embedded paraprofessionals to provide personalized instruction is a promising model for differentiating instruction within the classroom. This study examines two randomized controlled trials of paraprofessional-led tutoring in early-grade math and literacy. However, intent-to-treat (ITT) analyses revealed no overall achievement...
Randomization is at the student level, but the intervention is an individualized AI personalized learning platform, fitting ERCT's personal teaching exception.
The study reports using a standardized test bank (LCME) with reliability and validity evidence rather than a bespoke study-created exam.
Outcomes were measured at Week 12 after the intervention began, matching a term-length (approximately 12 weeks) follow-up window.
The control condition is described in detail and baseline characteristics are reported for both groups.
Randomization occurs at the student level rather than assigning whole schools or sites.
The paper does not provide evidence that the evaluation was conducted by an independent team distinct from the intervention designer.
Despite stating a one-year study period, outcomes are reported at Week 12 rather than after a full academic year of follow-up.
The intervention intentionally adds an AI platform as the treatment, so the control remains business-as-usual by design under ERCT's resource treatment exception.
No independent replication study of this specific intervention trial was found in available sources at the time of this ERCT check.
Outcomes are limited to a single course or domain rather than standardized exams across all main subjects.
The study does not track participants until graduation and, since Year Duration is not met, Graduation Tracking cannot be met under ERCT rules.
The paper does not report a pre-registered protocol or registry entry that can be verified as registered prior to data collection.
This study aims to evaluate the comprehensive impact of an artificial intelligence (AI)-driven personalized learning platform based on the Coze platform on medical students' learning outcomes, learning satisfaction, and self-directed learning abilities. It seeks to explore its practical application value...
Randomization occurred at the school (above-class) level, satisfying the class-level RCT requirement.
Outcomes were measured using the standardized ITBS reading test.
Reading outcomes were assessed after 12–20 weeks (roughly one academic term).
Control group conditions and baseline characteristics were thoroughly described.
No entire school was solely a treatment or solely a control site; randomization was within schools.
An independent evaluation team (with an external data center) conducted the study, separate from the program’s creators.
Outcomes were measured only midyear (half-year), with no full-year follow-up.
The Reading Recovery group got extra daily tutoring that the control group did not receive, resulting in unbalanced time/resources.
No evidence was found of an independent replication of this Reading Recovery study by another team.
Only reading was tested; no standard exams in other core subjects were reported.
The study did not track participants through to graduation.
No pre-registration or registry listing was provided for the study.
Reading Recovery (RR) is a short-term, one-to-one intervention designed to help the lowest achieving readers in first grade. This article presents first-year results from the multisite randomized controlled trial (RCT) and implementation study under the $55 million Investing in Innovation...
The study randomized at the section (class) level within schools, which meets the requirement for class-level randomization.
The study used the SIMCE, which is the Chilean national standardized exam.
The intervention lasted approximately seven months, which exceeds the minimum one-term duration requirement.
The control group is clearly documented with demographic data and a description of their "business as usual" condition.
Randomization was performed at the section (class) level within schools, not at the school level.
The study was not conducted independently; the lead author developed the program and the author team managed the implementation.
The intervention duration was seven months, which is less than the full academic year (typically 9-10 months) required by the criterion.
The extra time and resources were an integral part of the "bundled" intervention explicitly being tested against business-as-usual, satisfying the exception for this criterion.
There is no evidence provided of an independent replication of this study by a different research team.
The study assessed Math and Language but did not assess Science, which is stated as a subject taught by the teachers.
The study tracked students only until the end of the intervention period (Grade 4), not until graduation.
There is no evidence in the text that the study protocol was pre-registered before data collection began.
This paper presents results from a randomized evaluation of a bundled program employing an external coordinator to aid 4th grade teachers with the integration of a math learning platform that partially replaced regular school math instruction in Chile. Students in...
Randomization includes student-level lotteries, but because the intervention is tutoring, the ERCT tutoring exception applies.
Outcomes are measured using established standardized assessments such as state tests, NWEA MAP, i-Ready, STAR, and PSAT/SAT.
At least one academic term elapses between intervention start and end-of-year outcome testing, meeting the term-duration requirement.
The BAU control condition is defined and detailed baseline balance tables document control group characteristics.
Randomization is at student, classroom, teacher, or grade level, not at the school level.
Researchers co-designed the tutoring models with partner districts, so evaluation was not fully independent of intervention design.
Several sites implemented tutoring only in spring 2024 (or for about 12 weeks), which is shorter than a full academic year.
The study explicitly tests the effect of providing additional tutoring resources relative to business-as-usual.
No independent peer-reviewed replication of this specific PLI 2023-24 study design was identified.
Primary outcomes are standardized tests in the tutored subject, not across all core subjects.
This interim report presents end-of-year outcomes and does not track students to graduation; additionally, Criterion Y is not met.
An OSF link is provided, but the pre-registration timestamp could not be verified to precede the study start.
This report summarizes the ongoing work by the Personalized Learning Initiative (PLI) research team to understand whether and how scaling high dosage tutoring (HDT) works in the post-pandemic environment. The study involved a large-scale randomized controlled trial with eight partners...
Student-level randomization is acceptable here because the intervention is delivered as small-group tutoring (2-4 students).
The study used widely recognized standardized reading assessments (TOWRE-2, TOSREC, GMRT) with standard scores and reported reliability.
Outcomes were measured after an intervention period running from October to February, which exceeds one academic term.
The Business-as-Usual control is described and the paper reports control-group demographics and details of services received.
Randomization occurred at the student level within schools rather than at the school level.
The authors developed the Engaged Learners program and the study was researcher-implemented with coaching by the first author.
The intervention and measurement window (October to February) is under a full academic year.
The intervention adds substantial instructional time and staffing, but these added resources are integral to what the study is testing against Business-as-Usual.
No peer-reviewed independent replication by other authors was found, and the paper itself calls for future replications.
The study measured reading and attention outcomes only, not standardized outcomes across all core subjects.
The study does not track students to graduation, and Criterion G cannot be met because Criterion Y is not met.
The authors explicitly state that the study was not pre-registered.
We investigate the efficacy of a reading intervention integrated with Engaged Learners, a program that applies behavioral and cognitive principles to increase student behavioral attention and reduce distractions during instruction. Using a three-arm randomized controlled trial, we randomized 159 Grade...
Randomization occurred at the classroom level, satisfying the class-level RCT requirement.
A standardized reading test (Gray Silent Reading Test) was used for outcome measurement.
Post-tests were administered after about 6–7 months of intervention, exceeding a single term.
The control group’s composition and baseline performance were clearly documented and comparable to the intervention group.
Randomization was done at the class level within schools, not at the school level (no whole-school assignment).
The researchers who developed the intervention also implemented the study (no independent evaluators were involved).
The study spanned about 7 months of one school year, with no outcomes tracked for a full year or longer.
Both groups had equal instructional time and curricular resources; ITSS replaced part of the normal class time rather than adding extra time.
The study has not been independently replicated by an unrelated research team.
Outcomes were limited to reading comprehension; no other core subjects were tested.
Participants were not tracked beyond the immediate post-test in 4th grade (no long-term follow-up through graduation).
No pre-registered study protocol was identified for this trial.
Reading comprehension is a challenge for K‑12 learners and adults. Nonfiction texts, such as expository texts that inform and explain, are particularly challenging and vital for students’ understanding because of their frequent use in formal schooling (e.g., textbooks) as well...
Randomization was performed at the school level, exceeding the class-level requirement.
The study used custom-designed tests rather than a recognized standardized exam.
Outcomes were measured about 15 months after the intervention start, exceeding one academic term.
The control group’s size and treatment condition are clearly described, fulfilling documentation requirements.
Entire schools, rather than individual classes, were randomized to treatment and control.
The evaluation was performed by an independent team (IDB and academic partners), distinct from the OLPC Foundation designers.
Outcomes were measured 15 months post-start, satisfying the full academic year requirement.
Additional resources (laptops and training) were the treatment variable being tested, so the control condition appropriately remained business-as-usual.
An independent replication in Uruguay has confirmed the results.
Only math and language outcomes were assessed, not all main subjects.
A long-term follow-up study tracked student outcomes through graduation, meeting this criterion.
No statement of pre-registration is provided.
This paper presents results from a large-scale randomized evaluation of the One Laptop per Child program, using data collected after 15 months of implementation in 318 primary schools in rural Peru. The program increased the ratio of computers per student...
The study utilized a clustered randomized controlled trial design randomizing at the school level, which satisfies the requirement for class-level or higher randomization.
The primary innovation outcomes rely on custom-developed measures rather than standardized exams, although standardized tests were used for secondary academic outcomes.
The intervention spanned two full academic years, significantly exceeding the one-term duration requirement.
The control group's condition (self-directed preparation) and baseline characteristics are clearly documented and compared to the treatment group.
The study randomized 80 schools to treatment or control conditions, satisfying the school-level randomization requirement.
The intervention was implemented by an independent NGO (Inqui-Lab), while the evaluation was conducted by an academic researcher from Stanford.
The study tracked students over two full academic years, exceeding the one-year duration requirement.
The intervention provided additional resources (kits, training) that were integral to the treatment being tested, while educational time was balanced across groups.
No independent replications of this specific intervention were found in peer-reviewed literature.
The study assesses Math and Science but does not assess other core subjects like Language Arts or Social Studies using standardized exams.
The study tracked students through Grade 9 but did not track them until graduation.
The study was pre-registered with the AEA Trial Registry prior to the start of data collection.
Innovation fuels long-run economic growth, yet education systems in developing countries often overlook the skills required for innovation. This paper provides the first experimental evidence that students can learn core innovation-related skills. I conduct a large-scale clustered randomized controlled trial...
The study randomized individual students via admissions lotteries, not classes or schools.
Outcomes are measured via course take-up and course passing, not via a standardized exam score.
Outcomes are measured across multiple grade levels (9th-11th), exceeding the one-term follow-up requirement.
The paper documents the control group’s composition and baseline characteristics using a detailed characteristics table and narrative.
Randomization is done via student admission lotteries, not by randomizing schools.
The intervention model is supported by a separate organization, while the study is conducted by researchers affiliated with research institutions.
The paper reports outcomes spanning grades 9-11, satisfying a one-year duration requirement.
The intervention is explicitly described as a comprehensive reform model whose supports and resource-intensive elements are integral to what is being tested.
Independent researchers (AIR) report a follow-up study of Early Colleges based on admission lotteries, providing replication evidence for the model.
The study reports mathematics course outcomes only and does not use standardized exam outcomes across all core subjects.
Follow-up publications by the same research program report high school graduation outcomes and longer-term outcomes after high school.
No public pre-registration record (with a registration date prior to study start) is identified in the paper or via registry searches.
This mixed methods experimental study examined the impacts of the Early College High School model on students' college readiness in mathematics measured by their success in college preparatory mathematics courses in the 9th through 11th grades, and disaggregated for academically...
The original RCT randomized at the school level, which satisfies the class-level (or stronger) RCT requirement.
The long-term follow-up cognitive outcome uses a custom rapid math test rather than a standardized exam.
The intervention ran for eight months, exceeding a full academic term.
The paper documents the control group with baseline and follow-up descriptive statistics in Table 1.
Randomization is at the school level with 34 schools as clusters.
The evaluation team is not the Kumon organization and the paper declares no conflict of interest.
Outcomes are measured in a follow-up conducted six years after the original RCT period.
The intervention adds time and materials, but these added resources are integral to the treatment being evaluated.
No independent replication by a different research team was found for this specific Kumon RCT.
Only mathematics was assessed (and not via standardized exams), so the study does not provide all-subject standardized exam outcomes.
The paper does not show systematic tracking of participants through graduation for the full cohort.
The paper cites an AEA RCT Registry ID but does not provide (and we could not verify) a registration date before the study start.
The COVID-19 pandemic and associated school closures exacerbated the global learning crisis, especially for children in developing countries. Teaching at the right level is gaining greater importance in the policy arena as a means to recover learning loss. This study...
The study randomized at the school level, satisfying the ERCT class-level-or-higher randomization requirement.
Outcomes rely on internal school grades and questionnaires, not a standardized exam-based assessment.
The interval from intervention start after the January pre-test to the May/June post-test exceeds one academic term.
The paper documents the PAU control group size, characteristics, and what support it could receive.
Schools were the unit of randomization, meeting the school-level RCT criterion.
The intervention was evaluated with substantial involvement of the intervention developers and author-led trainer training.
Outcomes were tracked from the December/January pre-test to an October/November follow-up, spanning roughly 9-10 months.
The extra time and staffing are integral to the intervention being tested (PLOS-extra), so PAU as the control is acceptable under the ERCT criterion B exception.
No independent replication of this specific PLOS-extra effectiveness trial was located.
Criterion A is not met because criterion E (standardized exam-based assessment) is not met.
The trial followed students for six months, not through graduation, and no graduation follow-up paper was located.
The paper reports prospective preregistration in REES before data collection, but the registry entry date could not be independently verified without login access.
In secondary education, many students have difficulties planning their schoolwork. These difficulties may not only lead to short-term consequences such as lower grades, but also to long-term psychosocial, professional and financial challenges. To support students with planning problems, we developed...
Randomization was at the individual child level (lottery seats), not at the class or school level, and no tutoring-style exception applies.
Outcomes were measured using widely used standardized assessments (e.g., Woodcock-Johnson, HTKS, digit span), not custom tests created for the study.
The study tracked outcomes from baseline in fall 2021 through spring 2024, far exceeding one academic term.
The control group is clearly defined as lottery non-winners, with sample sizes and alternative preschool enrollment described.
Randomization occurred within lotteries at the child level rather than by random assignment of schools to conditions.
The intervention studied was a business-as-usual Montessori program model not designed by the research team.
Outcomes were tracked from fall 2021 through spring 2024, spanning multiple academic years.
The study explicitly evaluates the real-world Montessori program package (including its resource structure) against typical alternatives, making resource differences part of the treatment definition.
The paper reports that key findings replicate across multiple Montessori preschool RCTs, including at least one independent RCT in another context.
The study does not assess effects across all core school subjects via standardized exam batteries; it reports a selected set of academic and nonacademic outcomes.
Outcomes are tracked only through the end of kindergarten, and the authors explicitly note that longer-run impacts are unknown.
The paper reports registration on REES but does not provide a registration date relative to the study start, so pre-registration before data collection cannot be verified here.
The study uses competitive admission lotteries at 24 oversubscribed U.S. public Montessori schools to estimate impacts of being offered a Montessori PK3 seat on end-of-kindergarten outcomes. The authors report positive impacts on reading and several cognitive outcomes, plus a cost...
Randomisation at the school level satisfies the requirement for a class-level RCT.
The assessments were study-designed instruments, not recognised standardized exams.
Measurement occurred after five terms, satisfying at least one full academic term of follow-up.
The paper provides detailed baseline characteristics and conditions for the control group in Table 1.
Entire schools, not just classes, were randomly assigned to treatment or control.
The same research team and ICS officers designed, implemented, and analyzed the intervention without independent oversight.
The study tracked outcomes for more than an academic year, satisfying the Year Duration requirement.
The additional teacher is the treatment variable, so business-as-usual resourcing in the control group is acceptable.
Multiple independent studies have replicated this finding.
Only math and reading outcomes were measured, failing to cover all core subjects.
No follow-up through to primary school graduation is reported.
The study was not pre-registered before data collection.
Some education policymakers focus on bringing down pupil–teacher ratios. Others argue that resources will have limited impact without systematic reforms to education governance, teacher incentives, and pedagogy. We examine a program under which school committees at randomly selected Kenyan schools...
Pupils rather than whole classes were randomised.
A validated standardised exam (PhAB‑2) was used.
Post‑test occurred four months after the December start.
Control demographics and baseline scores are clearly provided.
Randomisation took place within, not between, schools.
Researchers were not affiliated with the program’s developer.
Only six months of data – under one school year.
Extra resources constituted the treatment itself, so the imbalance between groups is acceptable.
The study’s findings were later replicated by an independent team.
The study measured literacy only, not all subjects.
Follow‑up ended two months after the block, not at graduation.
The paper provides no evidence of pre‑registration.
Background. Many school‑based interventions are delivered without evidence of effectiveness. Aims. This study evaluated the Lexia Reading Core5 program with 4‑ to 6‑year‑olds in Northern Ireland. Sample. One hundred and twenty‑six pupils were screened; ninety‑eight below‑average readers were randomised to an 8‑week block of...
Student-level randomization is acceptable because the intervention is one-to-one tutoring.
Outcomes rely on self-reported grades rather than standardized exam-based assessments.
The first follow-up occurs about six months after randomization, exceeding one academic term.
The paper clearly describes the control condition and reports baseline balance between groups.
Randomization occurred at the student level, not at the school level.
The tutoring program is run by an external nonprofit, while the evaluation is conducted by academic researchers.
Outcomes are tracked from early 2022 to late 2023, exceeding one academic year.
Any additional resources (free tutoring access) are the treatment variable, so a business-as-usual control is acceptable under ERCT.
No independent replication of this specific Lern-Fair RCT by another team was found.
Criterion E is not met, so Criterion A is automatically not met.
The study follows participants for about 18 months, not until graduation, and no follow-up graduation-tracking paper was found.
The study cites an AEA RCT Registry ID but no publicly accessible record with the registration date could be retrieved to verify pre-registration timing.
Tutoring programs for low-performing students, delivered in-person or online, effectively enhance school performance, yet their medium- and longer-term impacts on labor market outcomes remain less understood. To address this gap, we conduct a randomized controlled trial with 839 secondary school...
Student-level random assignment is clearly described.
Outcomes rely on self-reported grades rather than standardized exams.
Follow-up measurement occurs after a substantial interval following the intervention start.
Treatment and control conditions are clearly described and baseline balance is shown.
Randomization is not conducted at the school level.
Key outcomes are self-reported, not independently assessed.
Outcomes span more than one academic year, including a second follow-up in late 2023.
Extra instructional time is the intervention itself and is explicitly tested.
No independent replications were found in available sources.
Not applicable because exam-based assessment is not used and outcomes do not cover all core subjects.
No evidence of tracking outcomes until graduation for the full cohort was found.
The study reports preregistration in the AEA RCT Registry and analyzes outcomes as registered.
Tutoring programs for low-performing students, delivered in-person or online, effectively enhance school performance, yet their medium- and longer-term impacts on labor market outcomes remain less understood. To address this gap, we conduct a randomized controlled trial with 839 secondary school...
The trial randomized whole Early Years Settings, which satisfies the class-level (or stronger) randomization requirement.
The study used NRDLS, described as a standardized and validated assessment, as the primary outcome measure.
The study included follow-up about 9 weeks after an 8-week intervention, providing roughly term-long tracking from start to T3.
The study removed the treatment-as-usual control arm, leaving no business-as-usual control group.
Entire Early Years Settings were randomized, meeting the school-level RCT requirement.
Intervention developers (paper authors) trained and supervised delivery, so the evaluation was not independent.
Outcomes were tracked only to a 9-week post-test follow-up, far short of a full academic year.
The two arms were explicitly matched on dosage/delivery and both included comparable homework resources, balancing time and inputs.
No peer-reviewed independent replication of the 2025 BEST vs A-DLS trial was found.
The study measured language and communication only, not standardized outcomes across all main subjects/domains.
The study followed children only for weeks to a few months post-intervention, with no tracking to graduation.
The ISRCTN registry record shows registration before first enrolment, indicating prospective pre-registration.
Children's language abilities set the stage for their education, psychosocial development and life chances across the life course. Aims: To compare the efficacy of two preschool language interventions delivered with low dosages in early years settings (EYS): Building Early Sentences...
The study randomized assignment at the school level, which satisfies and exceeds the class-level requirement.
The study employed custom-developed 21-item assessments aligned with the curriculum rather than widely recognized standardized exams.
The study tracked outcomes from August 2022 to May 2024, covering almost two full academic years.
The control group's baseline characteristics, including demographics and test scores, are fully documented and compared in Table 1.
The study randomized 42 schools into treatment and control groups, satisfying the school-level randomization criterion.
The study was conducted by authors affiliated with the Asian Development Bank, which also funded and supported the implementation of the intervention.
The intervention and data collection spanned two full academic years, exceeding the one-year requirement.
The study explicitly tests the provision of hardware and software resources (tablets and modules) as the primary treatment, justifying the resource difference with the control group.
This is an original study and no independent replication of this specific intervention is reported.
The study measured only Math and English outcomes, excluding Science, which was part of the intervention content, and it does not meet the standardized-exam prerequisite.
The study stopped tracking students before they graduated and did not collect or report graduation outcomes.
The paper does not provide any reference to a pre-registered study protocol or registry ID.
Although Asian economies have increased access to education, students' learning often trails grade level expectations. In the Philippines, learning worsened through prolonged classroom closure during the coronavirus disease (COVID-19) pandemic. Together with the Department of Education, we conducted a 42-school...
Randomisation was conducted at the individual student level within schools rather than at the class level, leading to potential contamination across students in the same class.
The primary outcome is based on post‑intervention grade point averages from school records, not a standardized exam‑based assessment.
Outcomes (end‑of‑year GPAs) were measured at the end of the academic year, at least one full term after the intervention.
The control condition and its baseline data are clearly described, including content, fidelity, and demographics.
Randomisation occurred at the student level within schools, not at the school level.
Independent professional research companies conducted data collection and processing, separate from the intervention designers.
Outcomes were tracked through the end of ninth grade, covering a full academic year.
Both intervention and control groups received equivalent session time and attention, balancing educational inputs.
No independent replication of this national study by a different team is reported.
No standardized exam-based assessments across all core subjects; the study relies on administrative GPAs.
Participants were only tracked through ninth grade; no graduation tracking is reported.
The analysis plan and moderation hypotheses were pre-registered on OSF prior to data analysis.
A global priority for the behavioural sciences is to develop cost-effective, scalable interventions that could improve the academic outcomes of adolescents at a population level, but no such interventions have so far been evaluated in a population-generalizable sample. Here we...
Randomization was done at the Head Start site (center) level, which satisfies or exceeds class-level randomization.
The study measured child-level academic outcomes via teacher ratings, not a standardized, exam-based assessment of each child.
The paper reports that the intervention ran from fall to spring of the Head Start program year, meeting the term duration criterion.
The control group’s business-as-usual setting is clearly described, including their staffing support and how it differed from the intervention.
Whole Head Start sites (equivalent to schools) were the unit of randomization, fulfilling the school-level RCT requirement.
The study does not mention any external evaluators. The intervention appears to have been evaluated by its own designers, lacking independent oversight.
The intervention spanned an entire preschool year (approximately 9 months), satisfying the one-year duration criterion.
The intervention group received extra training and mental health consultation services, whereas the control group did not receive comparable resources or attention.
No independent replication by other researchers is reported; the study was carried out by a single team.
Academic performance was only assessed in language/literacy and math (via teacher-rated scales), rather than covering all core subjects with standardized exams.
The original CSRP participants were followed up in later years. A subsequent study by the same research team collected data on these students’ outcomes in high school, fulfilling the graduation tracking criterion.
No pre-registered analysis plan or study registration is mentioned. There is no evidence that the trial was registered before data collection.
The role of subsequent school contexts in the long-term effects of early childhood interventions has received increasing attention, but has been understudied in the literature. Using data from the Chicago School Readiness Project (CSRP), a cluster-randomized controlled trial conducted in...
The study randomized individual students within schools rather than assigning entire classes or schools to conditions.
The study relied on school-assigned GPAs and grades rather than widely recognized standardized exam-based assessments.
Outcomes were measured over the final three quarters of the school year, which is significantly longer than one academic term.
The control group's demographics, baseline data, and specific activities (neutral writing exercises) are well-documented.
Randomization occurred at the student level within schools, not at the school level.
The study was conducted by the authors themselves, who are also associated with the design of the intervention in previous studies.
Outcomes were measured across the full academic year (Terms 2, 3, and 4) following the start of the intervention in September.
The control group received a placebo activity that matched the treatment group in terms of time and resources.
The replication was conducted by the same authors as the original study, not by an independent research team.
Although the study covered core subjects, it relied on GPA rather than standardized exams, so the prerequisite Criterion E is not met.
The study tracked students only through the end of the sixth-grade year, not until graduation.
The study protocol was preregistered on OSF before the start of the study.
Recent randomized studies suggest brief social-psychological interventions can help students reappraise common social and academic worries during the difficult transition to middle school and, in turn, improve school performance. We conducted a preregistered student-level randomized controlled trial to assess the...
Randomization was conducted at the individual student level rather than by class or school, failing the class-level RCT requirement.
Outcomes were measured via course grades and credits, not through a recognized standardized examination.
The intervention courses ran for a full 16-week semester, meeting the term duration requirement.
The control group’s composition, consent rates, and baseline covariates are documented in tables and text, satisfying documentation requirements.
The study randomized individual students rather than entire schools, failing the school-level RCT requirement.
The study was conducted by researchers independent from the designers of the corequisite models, satisfying the ERCT requirement for Criterion I.
The corequisite support lasted one semester; the Year-long intervention requirement is not satisfied.
The developmental education (DE) support hours are integral to the corequisite intervention, so the control group's business-as-usual condition is appropriate.
No independent replication of this RCT is reported in the paper.
Only reading and writing outcomes were measured, failing the all-subject exam requirement.
A follow-up study by the same research team tracked the original student cohort through graduation, satisfying the ERCT requirement for Criterion G.
There is no indication that the study protocol was pre-registered before data collection.
This study provides experimental evidence on the impact of corequisite remediation for students underprepared in reading and writing. We examine the short-term impacts of three corequisite models implemented at five large urban community colleges in Texas. Results indicate that corequisite...
Randomization occurred at the classroom level, preventing contamination across students in the same class.
The study employs bespoke KA Lite assessments, not established standardized exams.
The study measures outcomes after approximately six weeks, not a full term.
The control group’s makeup and activities are thoroughly described.
The study randomized individual classrooms, not whole schools.
The same research team designed and evaluated the intervention.
The intervention and measurement occur within ~12 weeks, not a full year.
Treatment and control groups received identical time and resources.
No independent replication study is mentioned.
Only mathematics outcomes are measured, not all core subjects.
The study ends after the units and does not follow students to graduation.
The protocol was pre‑registered in the AEA registry before implementation.
This randomized experiment implemented with school children in India directly tests an input incentive designed to increase effort on learning activities against both an output incentive that rewards test performance and a control. Students in the input incentive treatment perform...
The unit of randomization is the teacher (classroom), meeting the class-level RCT requirement.
The RCT’s primary outcomes are platform usage logs, not standardized exam scores; any test-score analysis is non-experimental.
Outcomes are tracked from the May 30 start through November 2022, exceeding one academic term.
The control group is clearly defined (no messages) and baseline characteristics for treatment and control are documented.
Randomization was performed at the teacher level, not at the school level.
The lead author holds IP rights to the software being promoted, so the study is not independent of the intervention’s owner/designer.
The study tracks outcomes for less than a full academic year (March–November), falling short of year-duration tracking.
The only additional resource is the WhatsApp messaging itself, which is the treatment being tested; the control group is business-as-usual by design.
No peer-reviewed independent replication of this WhatsApp-to-teachers messaging RCT was found in a web search as of the ERCT check date.
Only mathematics outcomes are discussed, and criterion E is not met, so the all-subject standardized exam requirement is not satisfied.
Students are followed only within the 2022 school year and not until graduation; criterion Y is also not met, which implies G cannot be met.
No pre-registration identifier or registry entry could be located for this RCT in the paper or via a web search.
The use of self-led educational technologies holds significant potential for improving student learning at scale, but sustaining student engagement with these platforms remains a challenge. We present results from an experimental evaluation implemented following the scale-up of a math platform...
The study randomized whole classes within schools, satisfying the class‑level RCT requirement.
The tests were custom assemblies of items from exam books, not formal standardized exams.
Student performance was assessed at the end of the fall semester, meeting the term‑duration requirement.
The control classes’ makeup, treatment conditions, and baseline data are clearly reported.
The trial randomised classes within schools rather than entire schools.
The authors who developed the CAL were also responsible for its implementation and assessment.
Tracking ceased at the semester’s end, not over a full academic year.
The additional CAL sessions are the treatment itself, so the control group’s business‑as‑usual status is appropriate.
The paper contains no mention of independent replication by a different research team.
The study assessed only math and Chinese; other core subjects were omitted.
Student outcomes were not monitored beyond the semester, so no graduation tracking occurred.
There is no evidence the trial was pre-registered before data collection.
The education of the disadvantaged population has been a long-standing challenge to education systems in both developed and developing countries. Although computer-assisted learning (CAL) has been considered one alternative to improve learning outcomes in a cost-effective way, the empirical evidence...
Randomization was conducted at the class level with intact classes assigned to each condition.
The study employed custom-designed tests of graphing and slope problems rather than a recognized standardized exam.
The study measured outcomes after approximately three months, satisfying the term-duration requirement.
The control group’s size and baseline comparability (NAEP scores) are documented in detail.
Randomization was at the class level within one school, not across multiple schools.
The study was conducted and scored by the authors, with no independent external evaluation.
Outcomes were measured within three months, not tracked over an academic year.
Both groups received the same number of assignments, problems, and review sessions, ensuring balanced time and resources.
There is no evidence of an independent replication study by a different research team that confirms these findings.
The study only assessed mathematics graphing and slope problems, not a full range of subjects.
Follow-up ended at 30 days post-review, with no tracking until graduation.
The paper does not report a pre-registration or registry before the intervention began.
A typical mathematics assignment consists primarily of practice problems requiring the strategy introduced in the immediately preceding lesson (e.g., a dozen problems that are solved by using the Pythagorean theorem). This means that students know which strategy is needed to...
Although randomization was not at the class level, the intervention was a fully individualized, at-home, computer-based program, satisfying the personal tutoring exception in the ERCT standard.
The study employed the standardized TOWRE subtests for reading fluency, meeting the exam‑based assessment requirement.
Outcome assessments occurred after at least 16 weeks of training (and within a 6‑month participation period), exceeding the one‑term minimum requirement.
The study lacked a separate control group, relying on a within‑subject baseline, thus failing the documented control group requirement.
The study randomized individual children rather than entire schools, so the school‑level RCT requirement is not met.
The intervention was conducted and monitored by the same team that designed it, failing the independent conduct requirement.
Participants were observed for about 6 months, not a full academic year, so the year‑duration requirement is not met.
The control condition had no training or additional support, so resources were not balanced.
No independent research team has published a replication of this trial, so reproducibility is not established.
The study measured only reading skills without assessing other core subjects, so the all-subject exams requirement is not met.
Participants were followed for about 6 months, with no data collection continuing through graduation, failing the graduation tracking requirement.
The trial was prospectively registered long before participants were enrolled, satisfying the pre-registered protocol requirement.
Given the importance of effective treatments for children with reading impairment, paired with growing concern about the lack of scientific replication in psychological science, the aim of this study was to replicate a quasi‑randomised trial of sight word and phonics...
Randomisation occurred at the school level, which satisfies the class‑level RCT requirement.
The authors used a bespoke nine‑item quiz rather than a recognised standardised exam.
The intervention lasted only a few hours over several weeks, not a full term.
The control group’s composition, baseline performance, and lack of intervention are clearly documented.
Randomisation at the school level fulfills the school‑level RCT criterion.
The intervention was designed, delivered, and assessed by the authors’ team with no third‑party evaluation.
The experiment ran for less than one academic year, so the year-long criterion is unmet.
The control group received no equivalent time or resources, so resources were not balanced.
No independent replication of this intervention has been reported.
Only financial literacy was assessed, so the all-subject exam criterion is unmet.
The study tracked outcomes only up to seven weeks post‑intervention, not through graduation.
The trial was pre‑registered in the AEA RCT Registry before data collection.
This paper provides causal evidence on the effects of parental involvement on student outcomes in a financial education course based on two randomised controlled trials with a total of 2,779 grade 8 and 9 students in Flanders. Using an experimental design...
Randomisation was at the parent (individual) level, not at the class level.
The study used custom and adapted questionnaires rather than standardised exams.
A six‑month follow‑up assessment provided outcome measurement after at least one full academic term.
The wait‑list control group is described with detailed demographics and conditions.
Randomisation occurred at the individual level, with no schools assigned as units.
The authors who designed the program also delivered and assessed it without independent evaluation.
Follow‑up lasted six months, shorter than the full academic year required.
The study explicitly tests additional training resources as the intervention; the control group remained business‑as‑usual.
No independent replication by another research team is mentioned.
Academic outcomes were measured by custom questionnaires, not in all main subjects via standardised exams.
The study conducted only a six‑month follow‑up and did not track to graduation.
The study was registered after the trial began (ACTRN12613000660785), so it was not truly pre‑registered.
This study evaluated the effects of Group Triple P with Chinese parents on parenting and child outcomes as well as outcomes relating to child academic learning in Mainland China. Participants were 81 Chinese parents and their children in Shanghai, who...
The study randomizes intact sections (all students in a meeting) to flipped or lecture for each lesson, avoiding within‑session mixing and satisfying class‑level assignment.
The study used instructor‑designed course exams instead of standardized external assessments.
Learning outcomes were measured at end of semester, satisfying the term duration requirement.
The control (lecture) condition lacks detailed documentation of participant characteristics and baseline outcomes.
Randomization occurred within course sections rather than entire schools.
Authors who designed the intervention also implemented and analyzed the study.
Outcomes were measured only through one semester, not a full academic year.
Flipped lessons included mandatory pre‑class videos and in‑class exercises that were not matched in the control condition.
An independent replication of the flipped classroom experiment by another research team has been published, satisfying this criterion.
Only econometrics outcomes were assessed, with no broad subject coverage.
The study did not track participants until graduation.
No evidence of pre-registration of study protocols.
Despite recent interest in flipped classrooms, rigorous research evaluating their effectiveness is sparse. In this study, the authors implement a randomized controlled trial to evaluate the effect of a flipped classroom technique relative to a traditional lecture in an introductory...
Classes were randomly assigned to conditions, satisfying a class-level RCT.
The outcome measures were study-specific tests rather than standardized exams.
Outcomes were measured in a short pre-post design around a 90-minute intervention, not one term after the start.
The control group is described and the paper provides group composition and background characteristics.
Randomization occurred at the class level, not by random assignment of schools.
The intervention designers and university-affiliated staff were involved in implementation and study conduct.
The study did not track outcomes for a full academic year, and criterion T is not met.
The intervention did not add extra time or budget relative to control; all groups used comparable digital environments.
No independent replication study was identified in the paper or in an external literature search.
Outcomes were not assessed with standardized exams across all main subjects, and criterion E is not met.
There is no evidence of tracking participants to graduation, and criterion Y is not met.
The paper does not report pre-registration, and no matching registry record was identified in an external search.
Students' measurement estimation skills require benchmark knowledge (about measures of known objects) and estimation strategies (ways to compare with benchmarks). While students' estimation skills have been assessed and unpacked in several empirical studies (for length and area but less for...
The study randomized entire schools to treatment conditions, meeting the class-level RCT criterion.
The study used researcher-designed tests, not standardized exams, to measure financial proficiency.
The intervention consisted of four 50-minute lectures, much shorter than a full academic term.
The paper documents the control group's characteristics, size, and conditions in detail, including baseline scores in Table III.
The study randomized entire schools, fulfilling the requirement for a school-level RCT.
The authors conceptualized, designed the methodology, conducted the investigation, and wrote the paper; no independent team was involved.
The intervention involved four 50-minute sessions and a four-week follow-up, falling short of the required full academic year.
The control group did not receive the financial education program or any comparable substitute, creating an imbalance in educational time/resources.
There is no mention in the paper or readily available external evidence of an independent replication of this specific study.
The study only assessed financial proficiency (knowledge and behavior) and did not use standardized exams for other core subjects.
Follow-up was limited to approximately four weeks post-intervention; there was no tracking until graduation.
The study was registered in the AEA RCT Registry (AEARCTR-0004431), but the registration occurred after the intervention and data collection were completed.
Using a computer-based learning environment, the present paper studied the effects of adaptive instruction and elaborated feedback on the learning outcomes of secondary school students in a financial education program. We randomly assigned schools to four conditions on a crossing...
Randomization was within classes, but the intervention is parent-child one-to-one home teaching, which fits the ERCT tutoring/personal teaching exception for Criterion C.
Primary outcomes are measured with researcher-developed numeracy tasks, not with widely recognized standardized exams.
Outcomes were measured about 6 weeks after intervention start, which is shorter than a full academic term (about 3-4 months).
The control condition is described in detail and baseline characteristics are reported with descriptive tables.
Randomization occurred within classes rather than assigning whole schools to intervention vs control.
The paper does not report that an independent third-party evaluation team conducted the study; implementation and evaluation appear to be run by the research team.
The intervention and outcome measurement span only about 6 weeks, not a full academic year.
The study uses a matched active control with equivalent materials, engagement, and support, isolating numeracy content as the main difference.
No independent replication by a different research team is identified, and the paper frames the work as novel.
Criterion E is not met and the study does not measure standardized outcomes across all core subjects.
Year-long tracking is not present and the paper reports no delayed posttest; no follow-up-to-graduation publications were identified.
The paper states an OSF pre-registration link, but the ERCT check could not verify a time-stamped registration date before data collection.
The home numeracy environment is suggested to influence children's numerical development, but causal evidence for this assertion remains limited. Addressing this gap, we randomly assigned 117 predominantly White 4- to 5-year-olds (M = 4.68 years, SD = 0.2, 47% girls)...
The study randomized students in small peer-instruction groups for a tutoring-style intervention, which satisfies the ERCT tutoring exception.
The study used custom lesson-specific pre- and post-tests rather than widely recognized standardized exams.
The study ran across two consecutive weeks with immediate post-tests, so it did not measure outcomes at least one academic term after the intervention began.
The in-class active learning control condition and group characteristics are documented, including comparable demographics and baseline physics background knowledge.
Randomization occurred within one university course, not at the school level.
The AI tutor was designed by the authors and the paper does not describe independent third-party conduct of the study.
Outcomes were measured within a two-week crossover design, not after a full academic year.
The AI tutor condition substituted for the in-class lesson and did not add unmatched time or resources for the intervention group.
No independent peer-reviewed replication of this specific study by another research team was found.
Outcomes were limited to physics topics and the study did not use standardized exams across all main subjects (and criterion E is not met).
The study reports only immediate post-lesson outcomes and does not track students until graduation (and criterion Y is not met).
The paper reports IRB approval but provides no evidence of a pre-registered protocol, and no matching public registry entry was found.
Here we report a randomized, controlled trial measuring college students' learning and their perceptions when content is presented through an AI-powered tutor compared with an active learning class. The novel design of the custom AI tutor is informed by the...
Randomization was at the individual child level, but the intervention was delivered one-to-one, fitting the personal teaching exception.
Outcomes were measured with study-specific tasks on novel letters and pseudowords, not standardized exams.
Outcomes were assessed within three consecutive days, far shorter than one academic term.
The paper documents participant characteristics and reports no baseline differences across groups in control measures.
The study sampled from a single school and randomized children, not schools.
No independent evaluation organization is described in the paper.
The study duration and measurement window were days, not an academic year.
The paper states that exposure and duration were made comparable across conditions, balancing time-on-task and practice quantity.
No independent peer-reviewed replication of this specific experiment was found.
This criterion is not met because Criterion E is not met and outcomes are not standardized exams across subjects.
The study does not track participants to graduation, and the one-year prerequisite is not met.
The paper provides an OSF link for materials and data but does not state or verify preregistration with a pre-data-collection date.
Recent research has revealed that the substitution of handwriting practice for typing may hinder the initial steps of reading development. The current experiment investigated the impact of graphomotor action and output variability in letter and word learning using a variety...
The study randomized individual students within a single institution rather than randomizing at the class level.
The study utilized the California Critical Thinking Skills Test (CCTST), a widely recognized standardized assessment.
The intervention lasted only 8 weeks, which is shorter than the required full academic term (typically 3-4 months).
The study documents the control group's size, demographics, and baseline performance scores in detail.
The study was conducted within a single nursing college with randomization at the student level, not the school level.
The study was conducted by the authors themselves; only the randomization process was handled by an independent researcher.
The study duration was 8 weeks, which does not meet the requirement of one full academic year.
The study tests the integration of LLMs as the variable; both groups received equal course time (16 hours), satisfying the balance requirement.
No independent replication of this specific study was found in peer-reviewed journals.
The study measured only critical thinking skills and the specific course test score, not all main subjects taught in the school.
Outcomes were measured immediately after the course ended, with no tracking until graduation.
The paper mentions following CONSORT but does not provide a registration number or evidence of pre-registered protocol.
Background: The integration of Large Language Models (LLMs) into nursing education presents a novel approach to enhancing critical thinking skills. This study evaluated the effectiveness of LLM-assisted Problem-Based Learning (PBL) compared to traditional PBL in improving critical thinking skills among...
Randomization was performed at the school level, satisfying the requirement for class-level assignment.
The study utilized specific assessments developed by the research group (CSA and TechCheck) rather than widely recognized standardized exams.
The paper specifies the number of lessons but does not provide specific dates or a duration interval to confirm the intervention spanned at least one full academic term.
The control group's demographics are tabulated, and their "business as usual" condition is clearly described.
The study randomized entire schools rather than classes or students, satisfying the school-level RCT requirement.
The study was conducted and analyzed by the same researchers who developed the curriculum.
The study tracked outcomes only until the end of the 24-lesson curriculum, not for a full academic year.
The control group did not receive resources, time, or professional development equivalent to the treatment group.
Previous evaluations were conducted by the same authors/research group; no independent replication is cited.
The study limited assessment to coding and computational thinking, ignoring other core subjects.
Tracking ended immediately after the curriculum implementation.
The paper mentions IRB approval but does not cite a public pre-registration of the study protocol.
Background and context: Early childhood computer science (CS) education is a high-priority focus worldwide, but early childhood CS tools are primarily developed and researched within the United States and Europe. As an example, the Coding as Another Language ScratchJr (CAL-ScratchJr)...
The study uses a randomized cross-over design at the peer-group level (student level), which is acceptable under the ERCT exception for personal tutoring interventions.
Outcomes were measured using custom pre- and post-tests designed for the specific lessons, not standardized exam-based assessments.
Outcomes were measured immediately following two single-lesson interventions, falling far short of the one-term duration requirement.
The control group (in-class active learning) is well-documented, including pedagogy, student demographics, and baseline knowledge.
The study was conducted within a single university course, not randomized across multiple schools or institutions.
The study was designed, conducted, and analyzed by the authors, including the course instructors, without independent third-party conduct.
Outcomes were measured over a two-week period, not tracked for a full academic year.
The intervention replaced the control activity without adding extra time; in fact, the intervention group spent less time on task than the control group.
No independent peer-reviewed replication of this specific AI tutoring intervention was found.
The study only assessed physics content knowledge, not all main subjects.
The study tracks learning only for the duration of the lessons and does not follow students to graduation.
The study mentions IRB approval but does not provide evidence of a pre-registered protocol on a public registry.
This study reports a randomized, controlled trial measuring college students' learning and their perceptions when content is presented through an AI-powered tutor compared with an active learning class. We find that students learn significantly more in less time when using...
Randomisation was conducted at the individual‑student level rather than at the class level, so the Class‑level RCT criterion is not satisfied.
The study used researcher‑designed custom tests rather than a standardized, widely recognized exam, so the Exam‑based Assessment criterion is not satisfied.
Outcomes were measured after a 4.5‑month intervention period, which covers at least one term, satisfying the Term Duration criterion.
The study provides detailed baseline characteristics and assessment outcomes for the control group, fulfilling the Documented Control Group criterion.
Randomisation was done at the individual‑student level, not at the school level, so the School‑level RCT criterion is not satisfied.
The same team that designed Mindspark also carried out the trial and analysis, so the Independent Conduct criterion is not satisfied.
Participants were followed for only 4.5 months rather than an academic year, so the Year Duration criterion is not satisfied.
The intervention’s extra instructional time is integral to the treatment, so the Balanced Resources criterion is satisfied.
No independent replication of the study is reported, so the Reproduced criterion is not satisfied.
Only mathematics and Hindi were assessed, so the All‑subject Exams criterion is not satisfied.
Participants were only followed until the endline test, with no graduation tracking, so the Graduation Tracking criterion is not satisfied.
The trial was registered only after data collection began, so the Pre‑registered Protocol criterion is not satisfied.
We study the impact of a personalized technology‑aided after‑school instruction program in middle‑school grades in urban India using a lottery that provided winners with free access to the program. Lottery winners scored 0.37σ higher in math and 0.23σ higher in...
Students were randomized at the individual level rather than by classroom, so cross-group contamination could occur.
The learning outcomes were measured using custom-built tests, not standardized exams.
Outcomes were measured immediately after two class meetings, not after a full academic term.
The control condition is clearly described with baseline group characteristics and identical materials.
Randomisation occurred at the student level, not at the school level.
The same research team designed and conducted the intervention, with no third-party evaluator.
The study spans two sessions with no year‑long follow‑up.
Time and materials were identical for both conditions, with only active engagement toggled.
The study has been independently replicated by a different research team.
Only physics learning was assessed, not all core subjects.
The study ended after the course, without tracking students to graduation, and no follow-up by the authors provided such data.
No pre-registration or protocol registry is mentioned.
We compared students’ self-reported perception of learning with their actual learning under controlled conditions in large-enrollment introductory college physics courses taught using active instruction and passive lecture. Both groups received identical content and handouts, and students were randomly assigned without...
Randomization was conducted at the class (tutorial) level, satisfying the class‑level RCT requirement.
The study used author‑designed pre/post tests rather than a standardized exam.
Outcome measures were collected within days, not after a full academic term.
The handout control group is well described with baseline tasks and scores.
Randomisation occurred at the tutorial‑class level, not at the school level.
The authors’ own team both developed and evaluated the intervention.
Follow‑up lasted only days, not a full academic year.
The AR intervention entailed multimedia app access not matched by the handout.
Independent teams reproduced similar AR‐enhanced learning gains in other contexts.
Only sewing/textiles skills were assessed, not all core subjects.
No follow‑up beyond the immediate post‑workshop period.
No evidence of pre‑registration is provided.
This study contributes to enhancing students’ learning experience and increasing their understanding of complex issues by incorporating an augmented reality (AR) mobile application (app) into a sewing workshop in which a threading task was carried out to facilitate better learning...
The study randomized at the student level rather than at the class or school level.
Outcomes were measured with app-administered tests closely aligned to the treatment exercises rather than a standardized external exam.
The intervention and outcome window was six weeks, which is shorter than a full academic term.
The control group is clearly described, including its notification condition and baseline characteristics.
Randomization occurred at the student level rather than the school level.
The paper does not provide a clear statement that the evaluation was conducted by an independent third party separate from the intervention team.
The intervention and measurement period was six weeks, not an academic year.
The additional input (messages/notifications) is the treatment being tested, so a no-message control is the appropriate comparison.
No independent replication study by other authors was found for this specific intervention.
The study does not measure standardized outcomes across all core subjects, and criterion E is not met.
The study does not track participants to graduation and also fails the year-duration prerequisite (criterion Y).
The paper provides ethics approval and data registration statements, but no evidence of a pre-registered study protocol.
We examine whether highlighting streaks - instances of repeated and consecutive behavior when completing learning tasks - encourages 4th to 6th grade students in Peru to increase their use of an online math platform and improve learning. 60,000 students were...
The study randomized at the student level rather than the class or school level.
The study used a custom endline test administered through the app rather than a widely recognized standardized exam.
The intervention duration was six weeks, which is shorter than the required full academic term.
The control group is clearly defined as receiving no messages, and their demographics and baseline performance are well-documented.
The study utilized student-level randomization, not school-level randomization.
The study appears to be conducted by the authors who designed the intervention without a stated independent third-party evaluator.
The intervention lasted six weeks, failing the one-year duration requirement.
The intervention specifically tested the impact of "nudges" (messages) as the treatment variable, so the lack of messages for the control group is by design and balanced.
The paper states this is the first experimental study of its kind in this context, and no independent replication is cited.
The study only assessed math achievement and did not measure outcomes in other main subjects like reading or science.
The study followed up immediately after the six-week intervention; no graduation tracking was conducted.
There is no evidence of a pre-registered protocol with hypotheses and analysis plans in a public registry before data collection.
We examine whether highlighting streaks—instances of repeated and consecutive behavior when completing learning tasks—encourages 4th to 6th grade students in Peru to increase their use of an online math platform and improve learning. 60,000 students were randomly assigned to receive...
The paper is a within-subject experiment with no random assignment, so it is not a class-level RCT.
Outcomes are based on observational coding, not on standardized, exam-based assessments.
The study compares two 10-minute sessions separated by 2 weeks, far shorter than a term-long follow-up.
The sample and baseline (uninformed) condition are documented with detailed participant characteristics.
There is no school-level randomization; individual dyads were recruited and all experienced both contexts.
The study does not report being run by an external evaluation organization independent of the authors.
The study spans 2 weeks between sessions, not a full academic year.
Time and materials are essentially balanced across contexts; the main difference is the informational prime and a different toy set to reduce familiarity effects.
No independent replication study was identified for this 2025 paper.
The study does not use standardized exams in any subject, and it does not assess outcomes across all core subjects.
The study does not track participants until graduation and cannot meet G because it also fails the Year Duration criterion.
The authors state the analyses were not preregistered.
Home math interventions often incorporate informational priming (explicit prompts emphasizing parental math input). While effective in increasing math talk, its impact on child outcome is mixed. This study examined how informational priming shapes the content and dynamic of math interactions. In...
Randomization was at the individual participant level rather than at the class (or school) level.
Outcomes were scored via teacher ratings and an internal AI judge, not via a standardized externally administered exam.
Outcomes were measured within short sessions rather than at least one academic term after the intervention began.
The control condition and sample are clearly documented, including what the control group could and could not do.
The study did not randomize at the school (site) level.
The study was proposed, designed, executed, and analyzed by the author team, not by an independent evaluator.
The study does not measure outcomes over a full academic year, and criterion T is not met.
Time-on-task is held constant across groups and the tested difference is tool access, not extra instructional time or budget.
No independent replication of this specific study was identified at the time of this ERCT check.
The study does not use standardized exams across all main subjects, and criterion E is not met.
The study does not track participants to graduation and criterion Y is not met.
The paper reports IRB approval but no public preregistration record was identified.
With today's wide adoption of LLM products like ChatGPT from OpenAI, humans and businesses engage and use LLMs on a daily basis. Like any other tool, it carries its own set of advantages and limitations. This study focuses on finding...
The study randomized individual students, not entire classes or schools.
Assessments used researcher-developed quizzes and checklists, not standardized exams.
The intervention period was 6 weeks, shorter than a full academic term.
The control group's demographics, baseline characteristics, and treatment (routine FC) are documented.
Randomization was at the student level within a single university, not at the school level.
The same authors appear to have designed the intervention, conducted the study, and analyzed the data, with no mention of independent conduct.
The study duration, including data collection, was 11 weeks, which is less than a full academic year.
The intervention group received gamified activities (extra quizzes, points, badges) that constitute additional resources and engagement time compared with the control group's routine FC; this difference was neither explicitly tested as the treatment variable nor balanced.
The paper does not mention any independent replication of this specific study by another research team.
The study measured only nursing skills competency and related factors, not performance across all core academic subjects. Also, Criterion E was not met.
The study tracked students for 11 weeks, not until graduation. Also, Criterion Y was not met.
The study was registered on ClinicalTrials.gov, and based on the semester timing the registration likely preceded the start of the intervention.
Background: Flipped learning excessively boosts the conceptual understanding of students through the reversed arrangement of pre-learning and in classroom learning events and challenges students to independently achieve learning objectives. Using a gamification method in flipped classrooms can help students stay...
Individual‑level randomization within one class violates the class‑level RCT requirement.
Assessments were custom course quizzes and a final, not a standardized exam.
Outcomes were collected after only five weeks, not a full term.
Demographics and baseline performance for the control group are fully reported.
Randomization did not occur at the school level.
The same team designed, implemented, and evaluated the study.
Study duration was five weeks, not a full academic year.
Control and treatment groups received equivalent emails and incentives; no extra resources favored treatment.
No independent replication of this RCT is reported.
Outcomes are limited to a single STEM course, not all subjects.
No tracking beyond the short 5‑week course was conducted.
No evidence of prospective trial registration is provided.
Time‑management skills are an essential component of college student success, especially in online classes. Through a randomized control trial of students in a for‑credit online course at a public 4‑year university, we test the efficacy of a scheduling intervention aimed...
Random assignment occurred at the individual student level, not by entire class or school.
The study employed custom 22‑item free‑response tests rather than a standardized exam.
Outcomes were measured over a seven‑day period, not a full term.
The negative control group’s composition and baseline data are clearly documented.
Randomization was at the individual student level, not by school.
The intervention was designed, delivered, and assessed by the same team without independent oversight.
The study tracked outcomes over one week, not a full academic year.
The extra evening sessions are central to the intervention and thus the unmatched control is acceptable.
No independent replication of this RCT is reported.
Only stereochemistry outcomes were measured, and E was not met.
Tracking ended after one week; no graduation‑level follow‑up.
No pre‑registration of the study protocol is mentioned.
The use of the flipped classroom approach in higher education STEM courses has rapidly increased over the past decade, and it appears this type of learning environment will play an important role in improving student success and retention in undergraduate...
The paper does not describe any randomization at the class level.
No standardized exam-based assessment is implemented in the paper.
No term-long outcome measurement is reported in the paper.
Control group demographics and baseline data are not provided.
No school-level random assignment is executed as part of this paper.
The study was conducted by the intervention's own authors.
No outcomes tracked over a full academic year are provided.
Treatment classes had extra teacher time; controls did not.
No independent replication of the interventions is reported.
Outcomes measured only in targeted subjects, not across all.
No long-term tracking through graduation is provided.
The RCT was preregistered on OSF prior to data collection.
The effect of a reduced pupil–teacher ratio has mainly been investigated as that of reduced class size. Hence we know little about alternative methods of reducing the pupil–teacher ratio. Deploying additional teachers in selected subjects may be a more flexible...
Randomisation was at the individual learner level, not classes.
Outcomes used a course-specific final test rather than a recognised standardised exam.
Outcomes were measured within a short-duration course without term-long follow-up.
The control condition is described, but baseline and full control-group characteristics are not documented for most participants.
Randomisation was not conducted at the school (institution) level.
The paper describes the authors implementing the intervention themselves rather than an independent evaluator.
Outcomes were not tracked for a full academic year after the intervention began.
Any added time from writing responses is the treatment itself and is described as minimal.
No independent replication of this specific RCT is identified.
Because E is not met, A is automatically not met.
Because Y is not met, G is automatically not met; no evidence of graduation tracking was found (see the sketch of this prerequisite logic after this entry).
The paper provides an OSF link for data and code, but it does not report a pre-registered protocol with a pre-data-collection date.
This study investigates the effectiveness of brief reflection interventions designed to support self-regulated learning in a short, Massive Open Online Course for in-service teachers. Two types of text-based reflection prompts were tested in a randomised controlled trial with over 5,000...
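Several of the checks above apply prerequisite relationships between criteria: the all-subject exams verdict presupposes the exam-based assessment verdict (E), and graduation tracking presupposes year-long duration (Y). The following is a minimal illustrative sketch of that dependency logic in Python; the criterion letters follow the shorthand used in these evaluations, but the data structure and function names are assumptions for illustration, not part of any published ERCT tooling.

# Illustrative sketch only: prerequisite handling for ERCT-style verdicts.
# Letters follow the shorthand used above (E: exam-based assessment,
# A: all-subject exams, Y: year duration, G: graduation tracking);
# the mapping below is an assumption for the example, not an official spec.
PREREQUISITES = {
    "A": ("E",),  # all-subject exams presuppose exam-based assessment
    "G": ("Y",),  # graduation tracking presupposes year-long duration
}

def resolve(verdicts):
    """Force a criterion to 'not met' whenever any of its prerequisites is unmet."""
    resolved = dict(verdicts)
    for criterion, prereqs in PREREQUISITES.items():
        if not all(resolved.get(p, False) for p in prereqs):
            resolved[criterion] = False
    return resolved

# Example: with E unmet, A is automatically unmet, mirroring the wording above.
print(resolve({"E": False, "A": True, "Y": False, "G": True}))
# -> {'E': False, 'A': False, 'Y': False, 'G': False}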
The study is observational and did not randomize at the class level.
The study measures course grades rather than using standardized exams.
There is no intervention with outcomes measured after one academic term.
The paper does not document a distinct control group.
No school-level randomization was performed.
The study was conducted by the authors without an independent evaluator.
There is no intervention tracked for a full academic year.
No attempt to balance class time or resources.
The study's findings have been independently replicated by others.
The study does not use all-subject standardized exams.
No graduation tracking is performed.
No pre-registered protocol is referenced.
We model how class size affects the grade higher education students earn and we test the model using an ordinal logit with and without fixed effects on over 760,000 undergraduate observations from a northeastern public university. We find that class...
Have a study you'd like to submit for ERCT evaluation? Found something that could be improved? If you're an author and need to update or correct information about your study, let us know.