The Educational Randomized Controlled Trial (ERCT) Standard is a rigorous framework addressing key challenges in educational studies. With 12 criteria across 3 progressive levels, it addresses issues such as bias, limited scope, and short-term focus, enabling researchers to produce actionable results that improve education systems worldwide.
The ERCT Standard has 3 levels, each containing 4 criteria (a code sketch of this structure follows the criteria list below):
Class-level RCT: Tests interventions at the classroom level to prevent cross-group contamination
Uses standardized exams for objective and comparable results
Ensures studies last at least one academic term to measure meaningful impacts
Requires detailed control group data for proper comparisons
Expands testing to whole schools for real-world relevance
Removes bias by using third-party evaluators
Ensures studies last at least one academic year to measure meaningful impacts
Ensures equal time and resources for both groups to isolate the intervention's impact
Independently replicated study: requires the study's findings to be reproduced by a different research team
Assesses effects across all core subjects, avoiding imbalances
Tracks students until graduation to evaluate long-term impacts
Increases transparency by publishing study plans before data collection
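As a rough illustration only, the structure above and the prerequisite rules cited in the evaluations below can be sketched in Python. The criterion keys and the cumulative-level scoring are assumptions made for illustration, not official ERCT definitions.

# Minimal sketch of the ERCT Standard's structure: 12 criteria in 3
# progressive levels, plus the two prerequisite rules referenced in the
# evaluations below ("Criterion A is not met because criterion E is not
# met"; "since Year Duration is not met, Graduation Tracking cannot be
# met"). Keys and scoring are illustrative assumptions.

ERCT_LEVELS = {
    1: ["class_level_rct", "exam_based_assessment",
        "term_duration", "documented_control_group"],
    2: ["school_level_rct", "independent_evaluation",
        "year_duration", "balanced_resources"],
    3: ["independent_replication", "all_subject_assessment",
        "graduation_tracking", "pre_registration"],
}

# A criterion cannot be met unless its prerequisite criterion is met.
PREREQUISITES = {
    "all_subject_assessment": "exam_based_assessment",
    "graduation_tracking": "year_duration",
}

def apply_prerequisites(verdicts):
    """Force a criterion to 'not met' when its prerequisite is not met."""
    adjusted = dict(verdicts)
    for criterion, prerequisite in PREREQUISITES.items():
        if not adjusted.get(prerequisite, False):
            adjusted[criterion] = False
    return adjusted

def erct_level(verdicts):
    """Highest level reached, assuming levels are cumulative: all four
    criteria of a level and of every lower level must be met."""
    adjusted = apply_prerequisites(verdicts)
    reached = 0
    for level in sorted(ERCT_LEVELS):
        if all(adjusted.get(c, False) for c in ERCT_LEVELS[level]):
            reached = level
        else:
            break
    return reached

Under this sketch, a study whose outcomes are not exam-based (for example, one relying on grades or GPAs) automatically fails the all-subject assessment criterion, and a study that does not meet year duration cannot meet graduation tracking, mirroring statements such as "Criterion A is not met because criterion E (standardized exam-based assessment) is not met" in the evaluations that follow.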
The study is randomized at the school level, which satisfies class-level randomization.
Outcomes were measured using Texas STAAR, a standardized state assessment.
The study ran across two school years, exceeding the minimum one-term requirement.
The control condition is clearly described and supported by detailed baseline and implementation documentation.
Random assignment was implemented at the school level (64 schools).
RAND conducted the evaluation independently of Zearn, supported by IES funding.
Outcomes were tracked over two full academic years (2022-2023 and 2023-2024).
Although Zearn provided additional implementation supports, teachers reported similar total math instructional time across groups and the supports appear integral to the tested intervention package.
An independent, peer-reviewed randomized study by a different author compared Zearn Math to another program, serving as an external replication effort on the intervention.
The paper reports standardized outcomes only for mathematics, not for all core subjects.
The study reports outcomes through the end of the second study year and does not track students to graduation.
The paper states a REES registry ID, but the public registry entry date could not be verified to be before study start.
Zearn Math is a popular software platform for K-8 mathematics learning, designed to enable all students to successfully access grade-level content. RAND researchers collaborated with Zearn, the product's developer, to design this evaluation. Then RAND conducted the study independently, randomly...
Randomisation occurred at the school level, which meets or exceeds the class‑level RCT requirement.
The study uses the nationally standardized ENLACE exam for objective, comparable assessment.
Outcomes were collected approximately three years after the intervention began, exceeding the minimum of one academic term.
Baseline demographics and pre‑intervention scores for the control group are provided in detail.
Randomisation at the school level fulfills the School‑level RCT requirement.
The evaluation was conducted by researchers unaffiliated with SEP, ensuring independence of the study’s implementation and analysis.
Participants were followed for approximately three years, which is longer than one academic year from intervention start to final outcomes.
Any difference in time or resources between groups is trivial and integral to the intervention, and is unlikely to bias the results.
The paper does not reference any independent replication studies, and none were found in external literature.
Only math and language outcomes are measured, omitting other main subjects.
The RCT measured on-time high school completion, meaning participants were followed through the completion of that educational level.
No pre-registration statement or registry ID is provided in the paper or its references.
We use data from the randomized control trial of the Percepciones pilot to study whether providing 10th grade students with information about the average earnings associated with different educational attainments, life expectancy, and obtaining funding for higher education can contribute...
Randomization occurred at the school (above-class) level, avoiding within-class mixing and meeting the class-level RCT standard.
Student learning was measured with the TerraNova standardized test, a well‑established, nationally normed exam.
The intervention spanned a full academic year of seventh-grade instruction, exceeding the one-term minimum.
The control group’s practices, demographics, and baseline scores are clearly documented for proper comparison.
Entire schools (not just classes) were randomized, satisfying the school-level RCT requirement.
An independent research team evaluated the intervention, and no authors had a financial interest in ASSISTments.
The trial ran for a full academic year (with data from the second year cohort), meeting the one‑year duration requirement.
Both groups followed identical homework policies and content; the ASSISTments tool and training were the core treatment.
An independent replication by another research team found similar positive results, confirming the original findings.
Only mathematics achievement was assessed; other subjects were not tested.
Students were originally tracked only through 7th grade; a separate follow-up measured outcomes at the end of 8th grade, but tracking did not extend to high school graduation.
The study was not pre-registered; no registry or preregistration statement is provided.
In a randomized field trial with 2,850 seventh-grade mathematics students, we evaluated whether an educational technology intervention increased mathematics learning. Assigning homework is common yet sometimes controversial. Building on prior research on formative assessment and adaptive teaching, we predicted that...
Randomization was at the school level, which satisfies the class-level RCT requirement.
Outcomes were measured using EGRA and EGMA, which are standardized assessment systems.
Midline outcomes were measured about 1.5 years after implementation began, exceeding one academic term.
The control group is described with sample sizes and baseline characteristics, including a baseline balance table.
Treatment was assigned at the school level, meeting the school-level RCT requirement.
Data collection involved an external firm and the authors analyze secondary data rather than directly implementing the intervention.
Midline outcomes were measured about 1.5 years after implementation began, exceeding one academic year.
The intervention adds facilitated discussions, and those added resources are the treatment being tested rather than an uncontrolled add-on.
No independent reproduction of this specific study was identified in the paper, and none could be verified from accessible sources.
The study measures mathematics and reading only, not all core subjects.
Participants were followed through endline in 2016, which is not tracking the cohort to graduation.
The study reports AEA RCT Registry registration, but the registry entry timing could not be verified from accessible sources for this check.
This article evaluates the impact that facilitated discussions about girls’ education have on education outcomes for students in rural Zimbabwe. The staggered implementation of components of a randomized education project allowed for the causal analysis of a dialogue-based engagement campaign....
Centers (preschool sites) were randomly assigned, which is class-level or stronger randomization.
The outcomes use established, widely used assessments (IGDI, FastBridge, HTKS) rather than a custom test created for this study.
Outcomes are measured from fall to spring within a school year, which is at least one academic term (and in practice close to a full year).
The paper clearly defines a comparison control group (including sample sizes) and reports balance checks and baseline equivalence statements.
Randomization occurred at the preschool center level, which meets the school-level RCT requirement in this context.
The evaluation is conducted by NORC at the University of Chicago, while the program is implemented in collaboration with Kidango as the implementation partner.
The study measures outcomes from fall to spring within a school year, and the intervention is described as occurring during the 2017-2018 school year.
Any additional resources (PD and coaching) are the intervention being tested, and the comparison group is business-as-usual with delayed treatment.
No peer-reviewed, independent replication of this specific SEEDS PD RCT was found in the available literature.
The study focuses on language/literacy and executive function outcomes and does not assess all core subjects (for example, mathematics).
The study reports outcomes within preschool years (fall-to-spring) and does not track students through graduation from the educational stage.
The paper does not cite a pre-registration record or registry identifier, and no verified pre-registration entry was found.
NORC at the University of Chicago designed and implemented an impact evaluation of the SEEDS of Learning (SEEDS) professional development (PD) program on behalf of the Kenneth Rainin Foundation, in collaboration with Kidango. SEEDS of Learning is an evidence-based PD...
Whole schools were randomly assigned to intervention and control, exceeding the class-level randomisation requirement.
Outcomes were measured using the standardised NWEA MAP Growth assessment rather than a custom test.
The study followed students from Autumn 2023 through the July 2024 post-assessment, exceeding one full term.
The report documents the control group definition, baseline school characteristics, and control schools' business-as-usual practices.
Twenty schools were randomised at the school level, satisfying the school-level RCT requirement.
The report separates the developer from the evaluator, with WhatWorked conducting randomisation and evaluation activities.
Outcomes were tracked from September 2023 to July 2024, covering an academic year.
The intervention did not add unbalanced time or resources: homework frequency and duration were similar and controls used other platforms.
No independent peer-reviewed replication of this specific school-level RCT was found.
Only mathematics attainment was assessed, not standardised outcomes in all core subjects.
Follow-up stops in July 2024 for the Year 7 cohort, with no graduation tracking reported.
The report provides no trial registration or protocol ID, and no public pre-registration could be verified.
This report evaluates the impact of Eedi, a digital mathematics platform, on raising maths attainment amongst Key Stage 3 students (Year 7). The study used a randomised controlled trial design with 20 schools, where schools were randomly assigned to either...
Randomization was conducted at the school level, satisfying the class-level requirement.
The study uses the ECE, Peru's national standardized assessment for primary schools.
Outcomes were measured approximately 9 months after the intervention began, exceeding the one-term requirement.
The control group is clearly defined as non-participating schools and their baseline characteristics are extensively documented in Table 2.
Randomization occurred at the school level across 6,218 schools.
The study evaluation was conducted by independent academics, distinct from the Ministry that designed the intervention.
The study measured outcomes after 1 and 3 years of program implementation.
The intervention explicitly tests the impact of adding significant resources (coaching), making the resource imbalance integral to the study design.
The study has not been independently reproduced.
Assessments were limited to mathematics and reading, omitting other core subjects like science.
Tracking ended at Grade 4, prior to primary school graduation.
No pre-registration of the study protocol is mentioned.
We evaluate the impact of a large-scale teacher coaching program in Peru, a context with high teacher turnover, on teachers' pedagogical skills and student learning. Previous studies find that small-scale coaching programs can improve teaching of reading and science in...
The study randomized entire villages (clusters), satisfying the requirement for class-level or stronger randomization.
The study used EGRA and EGMA, which are widely recognized standardized assessments.
The study duration spanned multiple years, significantly exceeding the one-term requirement.
The control group is well-documented, including demographics and confirmation that they received no educational intervention.
Randomization occurred at the village level, which serves as the implementation unit, satisfying the school-level RCT criterion.
Data collection was conducted by an independent organization (GHTC) with blinded administrators, ensuring independent conduct.
The study assessed outcomes approximately 30 months after the intervention began, satisfying the year duration criterion.
The study explicitly tested the impact of the additional resources (para-instructor intervention) as the treatment variable.
The intervention methodology has been replicated by independent teams (e.g., J-PAL/Banerjee et al.) as cited in the paper.
The study assessed outcomes in Reading and Mathematics, covering the main subjects for the target grade levels.
The study explicitly states that no further follow-up was conducted after the intervention ended, failing the graduation tracking requirement.
The paper cites a protocol published after the trial start but does not provide a specific pre-registration registry link or date in the text.
In common with many other low- and middle-income countries (LMICs), India has witnessed a massive expansion in school enrolment over the last 20 years, and yet many students finish primary education without the foundational literacy and numeracy skills that would...
Randomization was conducted at the school (cluster) level, satisfying the class‑level RCT criterion.
The study measured outcomes using national examination scores, a standardized assessment.
Intervention and follow‑up lasted ten months, exceeding a full academic term.
The control group’s size and baseline characteristics are clearly documented in the methods and tables.
Entire schools were randomized, satisfying the school‑level RCT criterion.
The same team that designed the intervention conducted and analyzed the trial without independent oversight.
Participants were tracked from February through December—one full academic year.
HIIT was integrated into standard PE time without additional class time or resources, keeping groups balanced.
No independent replication of this intervention has been reported.
Outcomes were measured only in mathematics and Mongolian language, not all main subjects.
Participants were not followed until graduation; follow-up ended at study completion.
Trial registration was completed before the study began (registered 1st February 2018).
OBJECTIVES: Physical inactivity is an important health concern worldwide. We examined the effects of an exercise intervention on children’s academic achievement, cognitive function, physical fitness, and other health-related outcomes. METHODS: We conducted a population-based cluster RCT among 2301 fourth‑grade students...
The study randomised treatment at the grade‐by‐school level, satisfying the class‐level RCT requirement.
They used the Smarter Balanced standardized exams for Math and ELA as outcome measures.
The intervention lasted from late October through May, satisfying the full academic term requirement.
Control group characteristics and communications are clearly documented in the methods and Table 1.
Randomisation occurred at the grade‐by‐school level rather than entire schools.
The study was designed, implemented, and analyzed by the same team without external evaluation.
The study intervention covered a full academic year.
No additional instructional time or budget was provided, only low‑cost informational text messages.
A separate research team reproduced the intervention in another context and reported similar positive results.
Only Math and ELA were assessed via standardized exams, failing to cover all core subjects.
Participants were tracked only through the end of the school year, not until graduation.
The study’s analysis plan and outcomes were publicly pre-registered before data collection began.
While leveraging parents has the potential to increase student performance, programs that do so are often costly to implement or they target younger children. We partner text‐messaging technology with school information systems to automate the gathering and provision of information...
Randomization occurred at the department-grade (cohort) level, meeting the class-level RCT requirement.
The study measured outcomes using official final grades from institutional examination records.
The study ran for the full Spring 2024 semester (February-May), meeting the term-duration requirement.
The control condition is clearly described as business-as-usual with unrestricted phone access and documented baseline balance.
Randomization was within institutions (departments/cohorts), not between institutions, so it is not a school-level RCT.
The authors designed, implemented, and analyzed the study themselves, with no independent evaluator described.
The study covers one semester rather than a full academic year, so the year-duration requirement is not met.
Phone boxes were installed in all classrooms and the intervention was a policy enforcement rather than added instructional resources, so inputs are balanced.
No peer-reviewed independent replication of this specific RCT was found.
The outcome reflects overall grades across (nearly) all courses, providing an all-subject style assessment rather than a single subject test.
The paper reports only semester-end outcomes and does not track students to graduation.
The paper links to a pre-registration, and the registry date is before the Spring 2024 data collection period.
Widespread smartphone bans are being implemented in classrooms worldwide, yet their causal effects on student outcomes remain unclear. In a randomized controlled trial involving nearly 17,000 students, we find that mandatory in-class phone collection led to higher grades - particularly...
Randomisation occurred at the school level with classes equally divided into intervention and control groups, satisfying the Class‑level RCT criterion.
The study used GL Assessment Progress Tests—standardized instruments in English, mathematics, and science—scored blind by the test publisher, fulfilling the ERCT Standard’s Exam‑based Assessment criterion.
Primary outcomes were measured 20 weeks after intervention start, exceeding a single academic term.
Control cohorts are described alongside intervention cohorts, with baseline comparisons implied.
The study’s RCT was conducted at the whole-school level: entire schools were randomly assigned to either the dialogic teaching intervention or the control condition, fulfilling the School‑level RCT criterion.
The RCT was conducted by an independent evaluation team, separate from the intervention’s designers.
The study lasted 20 weeks, well short of a full academic year.
Control group teachers did not receive the intervention’s teacher induction, training, or mentoring support.
No independent replication of this dialogic teaching trial is reported in the paper or elsewhere.
Student performance was assessed in English, mathematics and science, covering the core primary curriculum.
No follow‑up tracking to graduation is described in the study or in any subsequent publications.
No pre‑registration or protocol identifier is provided in the paper (the trial was only registered retrospectively on ISRCTN, after completion).
This paper considers the development and randomised control trial (RCT) of a dialogic teaching intervention designed to maximise the power of classroom talk to enhance students’ engagement and learning. Building on the author’s earlier work, the intervention’s pedagogical strand instantiates...
The unit of randomization was the classroom (teacher), meeting the class-level RCT requirement.
The primary outcomes were measured using established, standardized assessments (for example PAT-2, Woodcock-Johnson III, and PPVT-IV).
Outcomes were measured across the school year, exceeding a single academic term.
The control condition and baseline characteristics are documented, with both demographics (Table 1) and descriptions of control instruction.
Randomization was at the classroom (teacher) level rather than at the school level.
The intervention was developed and evaluated by the same research team, so the evaluation was not independent of the intervention designers.
The intervention and measurement spanned the full academic school year.
The treatment replaced part of the normal literacy block rather than adding extra student instruction time, and the study evaluates the full implementation package (curriculum plus required teacher training and coaching) against business-as-usual.
No independent replication by a different research team was identified.
Outcomes were limited to language and literacy, with no standardized measures reported for other core subjects.
The study did not track participants through graduation or an equivalent endpoint.
The paper does not report a pre-registration record or registry identifier for the trial.
The goal of the present study was to assess the effectiveness of Foundations for Literacy for deaf and hard-of-hearing (DHH) children. Forty-eight teachers in 14 states were randomly assigned to intervention or control groups. Teachers in the intervention group used...
Randomisation was student-level, but the intervention is targeted small-group instruction outside normal English lessons, fitting the tutoring exception.
Outcomes were measured using recognised standardised digital reading tests (NGRT and ART).
The outcome measurement occurred after approximately two terms (about six months), exceeding the one-term minimum.
The control group and its business-as-usual condition were described, and the paper reports baseline equivalence plus detailed counterfactual information about interventions used with control students.
Randomisation was at the individual student level, not the school level.
The intervention is attributed to Fischer Family Trust Literacy (FFTL), while the trial is authored by university researchers and funded by an independent foundation.
Outcomes were measured after approximately two terms (about six months), which is shorter than a full academic year.
The intervention adds time, training, and materials, but these resources are integral to the intervention being tested against business as usual.
No peer-reviewed independent replication of this specific high-school FFTL Reciprocal Reading evaluation by a different research team was found.
Only reading outcomes were assessed; impacts on all main subjects were not measured.
The study reports post-test at the end of the intervention and provides no evidence of tracking participants to graduation; also, criterion Y is not met so G cannot be met under the ERCT rules.
The trial was registered after enrolment had begun, so it does not meet the requirement for a prospectively pre-registered protocol.
Targeted reciprocal reading instruction can lead to improved reading attainment. Though tested in elementary schools, the technique is less studied with older students. This paper reports results from a Phase 3 definitive trial designed to detect attainment gains previously identified...
The study randomized entire schools to treatment or control, satisfying the requirement for class‑level RCT.
The study employed official state standardized exams for ELA and math, meeting the exam‑based assessment requirement.
Outcomes were measured in subsequent grades, well after at least one full academic term had elapsed, fulfilling the term‑duration requirement.
The control group’s makeup, baseline statistics, and alternate program are clearly documented, satisfying the requirement for a documented control group.
Randomization at the school level fulfills the school‑level RCT criterion.
The study was implemented and analyzed by the same team that developed the program, so there is no independent evaluation group.
Students’ outcomes were measured over multiple grades, covering at least a full academic year of follow‑up.
The control group’s activities were far less intensive than INSIGHTS, so resource allocation was not balanced.
There is no reference to an external, independent replication of the INSIGHTS trial.
The authors measured only ELA and math; other core subjects were not assessed.
No data are reported beyond sixth grade (middle school entry), so graduation tracking is incomplete.
No pre‑registered protocol or registry reference is provided in the paper.
Social‑Emotional Learning (SEL) programs are school‑based preventive interventions that aim to improve children’s social‑emotional skills and behaviors. Although meta‑analytic research has shown that SEL programs implemented in early childhood can improve academic and behavioral outcomes in the short‑term, there is...
Student-level randomization is acceptable here because the intervention is tutoring.
Outcomes were measured with standardized assessments (DIBELS-8 and i-Ready).
Outcomes were measured at end of year, well more than one term after start.
The control group is described as business-as-usual and baseline characteristics are reported.
Randomization occurred within classrooms rather than at the school level.
The paper evaluates externally developed, district-implemented programs rather than a researcher-designed intervention.
Implementation began in November, so the study does not span a full academic year from start of year.
Extra tutoring time is the treatment being tested, so a business-as-usual control is acceptable.
No independent replication of this paper's para-tutoring implementations and findings was found.
Only literacy and math outcomes are measured, not all core subjects.
The study reports end-of-year outcomes only and does not track students through graduation.
The paper claims preregistration but provides no registry link, ID, or date that can be verified.
Using embedded paraprofessionals to provide personalized instruction is a promising model for differentiating instruction within the classroom. This study examines two randomized controlled trials of paraprofessional-led tutoring in early-grade math and literacy. However, intent-to-treat (ITT) analyses revealed no overall achievement...
Randomization is at the student level, but the intervention is an individualized AI personalized learning platform, fitting ERCT's personal teaching exception.
The study reports using a standardized test bank (LCME) with reliability and validity evidence rather than a bespoke study-created exam.
Outcomes were measured at Week 12 after the intervention began, matching a term-length (approximately 12 weeks) follow-up window.
The control condition is described in detail and baseline characteristics are reported for both groups.
Randomization occurs at the student level rather than assigning whole schools or sites.
The paper does not provide evidence that the evaluation was conducted by an independent team distinct from the intervention designer.
Despite stating a one-year study period, outcomes are reported at Week 12 rather than after a full academic year of follow-up.
The intervention intentionally adds an AI platform as the treatment, so the control remains business-as-usual by design under ERCT's resource treatment exception.
No independent replication study of this specific intervention trial was found in available sources at the time of this ERCT check.
Outcomes are limited to a single course or domain rather than standardized exams across all main subjects.
The study does not track participants until graduation and, since Year Duration is not met, Graduation Tracking cannot be met under ERCT rules.
The paper does not report a pre-registered protocol or registry entry that can be verified as registered prior to data collection.
This study aims to evaluate the comprehensive impact of an artificial intelligence (AI)-driven personalized learning platform based on the Coze platform on medical students' learning outcomes, learning satisfaction, and self-directed learning abilities. It seeks to explore its practical application value...
Randomization occurred at the school (above-class) level, satisfying the class-level RCT requirement.
Outcomes were measured using the standardized ITBS reading test.
Reading outcomes were assessed after 12–20 weeks (roughly one academic term).
Control group conditions and baseline characteristics were thoroughly described.
No entire school was solely a treatment or solely a control site; randomization was within schools.
An independent evaluation team (with an external data center) conducted the study, separate from the program’s creators.
Outcomes were measured only midyear (half-year), with no full-year follow-up.
The Reading Recovery group got extra daily tutoring that the control group did not receive, resulting in unbalanced time/resources.
No evidence was found of an independent replication of this Reading Recovery study by another team.
Only reading was tested; no standard exams in other core subjects were reported.
The study did not track participants through to graduation.
No pre-registration or registry listing was provided for the study.
Reading Recovery (RR) is a short-term, one-to-one intervention designed to help the lowest achieving readers in first grade. This article presents first-year results from the multisite randomized controlled trial (RCT) and implementation study under the $55 million Investing in Innovation...
The study randomized at the section (class) level within schools, which meets the requirement for class-level randomization.
The study used the SIMCE, which is the Chilean national standardized exam.
The intervention lasted approximately seven months, which exceeds the minimum one-term duration requirement.
The control group is clearly documented with demographic data and a description of their "business as usual" condition.
Randomization was performed at the section (class) level within schools, not at the school level.
The study was not conducted independently; the lead author developed the program and the author team managed the implementation.
The intervention duration was seven months, which is less than the full academic year (typically 9-10 months) required by the criterion.
The extra time and resources were an integral part of the "bundled" intervention explicitly being tested against business-as-usual, satisfying the exception for this criterion.
There is no evidence provided of an independent replication of this study by a different research team.
The study assessed Math and Language but did not assess Science, which is stated as a subject taught by the teachers.
The study tracked students only until the end of the intervention period (Grade 4), not until graduation.
There is no evidence in the text that the study protocol was pre-registered before data collection began.
This paper presents results from a randomized evaluation of a bundled program employing an external coordinator to aid 4th grade teachers with the integration of a math learning platform that partially replaced regular school math instruction in Chile. Students in...
Randomization includes student-level lotteries, but because the intervention is tutoring, the ERCT tutoring exception applies.
Outcomes are measured using established standardized assessments such as state tests, NWEA MAP, i-Ready, STAR, and PSAT/SAT.
At least one academic term elapses between intervention start and end-of-year outcome testing, meeting the term-duration requirement.
The BAU control condition is defined and detailed baseline balance tables document control group characteristics.
Randomization is at student, classroom, teacher, or grade level, not at the school level.
Researchers co-designed the tutoring models with partner districts, so evaluation was not fully independent of intervention design.
Several sites implemented tutoring only in spring 2024 (or for about 12 weeks), which is shorter than a full academic year.
The study explicitly tests the effect of providing additional tutoring resources relative to business-as-usual.
No independent peer-reviewed replication of this specific PLI 2023-24 study design was identified.
Primary outcomes are standardized tests in the tutored subject, not across all core subjects.
This interim report presents end-of-year outcomes and does not track students to graduation; additionally, Criterion Y is not met.
An OSF link is provided, but the pre-registration timestamp could not be verified to precede the study start.
This report summarizes the ongoing work by the Personalized Learning Initiative (PLI) research team to understand whether and how scaling high dosage tutoring (HDT) works in the post-pandemic environment. The study involved a large-scale randomized controlled trial with eight partners...
Student-level randomization is acceptable here because the intervention is delivered as small-group tutoring (2-4 students).
The study used widely recognized standardized reading assessments (TOWRE-2, TOSREC, GMRT) with standard scores and reported reliability.
Outcomes were measured after an intervention period running from October to February, which exceeds one academic term.
The Business-as-Usual control is described and the paper reports control-group demographics and details of services received.
Randomization occurred at the student level within schools rather than at the school level.
The authors developed the Engaged Learners program and the study was researcher-implemented with coaching by the first author.
The intervention and measurement window (October to February) is under a full academic year.
The intervention adds substantial instructional time and staffing, but these added resources are integral to what the study is testing against Business-as-Usual.
No peer-reviewed independent replication by other authors was found, and the paper itself calls for future replications.
The study measured reading and attention outcomes only, not standardized outcomes across all core subjects.
The study does not track students to graduation, and Criterion G cannot be met because Criterion Y is not met.
The authors explicitly state that the study was not pre-registered.
We investigate the efficacy of a reading intervention integrated with Engaged Learners, a program that applies behavioral and cognitive principles to increase student behavioral attention and reduce distractions during instruction. Using a three-arm randomized controlled trial, we randomized 159 Grade...
Randomization occurred at the classroom level, satisfying the class-level RCT requirement.
A standardized reading test (Gray Silent Reading Test) was used for outcome measurement.
Post-tests were administered after about 6–7 months of intervention, exceeding a single term.
The control group’s composition and baseline performance were clearly documented and comparable to the intervention group.
Randomization was done at the class level within schools, not at the school level (no whole-school assignment).
The researchers who developed the intervention also implemented the study (no independent evaluators were involved).
The study spanned about 7 months of one school year, with no outcomes tracked for a full year or longer.
Both groups had equal instructional time and curricular resources; ITSS replaced part of the normal class time rather than adding extra time.
The study has not been independently replicated by an unrelated research team.
Outcomes were limited to reading comprehension; no other core subjects were tested.
Participants were not tracked beyond the immediate post-test in 4th grade (no long-term follow-up through graduation).
No pre-registered study protocol was identified for this trial.
Reading comprehension is a challenge for K‑12 learners and adults. Nonfiction texts, such as expository texts that inform and explain, are particularly challenging and vital for students’ understanding because of their frequent use in formal schooling (e.g., textbooks) as well...
Randomization was performed at the school level, exceeding the class-level requirement.
The study used custom-designed tests rather than a recognized standardized exam.
Outcomes were measured about 15 months after the intervention start, exceeding one academic term.
The control group’s size and treatment condition are clearly described, fulfilling documentation requirements.
Entire schools, rather than individual classes, were randomized to treatment and control.
The evaluation was performed by an independent team (IDB and academic partners), distinct from the OLPC Foundation designers.
Outcomes were measured 15 months post-start, satisfying the full academic year requirement.
Additional resources (laptops and training) were the treatment variable being tested, so the control condition appropriately remained business-as-usual.
An independent replication in Uruguay has confirmed the results.
Only math and language outcomes were assessed, not all main subjects.
A long-term follow-up study tracked student outcomes through graduation, meeting this criterion.
No statement of pre-registration is provided.
This paper presents results from a large-scale randomized evaluation of the One Laptop per Child program, using data collected after 15 months of implementation in 318 primary schools in rural Peru. The program increased the ratio of computers per student...
The study utilized a clustered randomized controlled trial design randomizing at the school level, which satisfies the requirement for class-level or higher randomization.
The primary innovation outcomes rely on custom-developed measures rather than standardized exams, although standardized tests were used for secondary academic outcomes.
The intervention spanned two full academic years, significantly exceeding the one-term duration requirement.
The control group's condition (self-directed preparation) and baseline characteristics are clearly documented and compared to the treatment group.
The study randomized 80 schools to treatment or control conditions, satisfying the school-level randomization requirement.
The intervention was implemented by an independent NGO (Inqui-Lab), while the evaluation was conducted by an academic researcher from Stanford.
The study tracked students over two full academic years, exceeding the one-year duration requirement.
The intervention provided additional resources (kits, training) that were integral to the treatment being tested, while educational time was balanced across groups.
No independent replications of this specific intervention were found in peer-reviewed literature.
The study assesses Math and Science but does not assess other core subjects like Language Arts or Social Studies using standardized exams.
The study tracked students through Grade 9 but did not track them until graduation.
The study was pre-registered with the AEA Trial Registry prior to the start of data collection.
Innovation fuels long-run economic growth, yet education systems in developing countries often overlook the skills required for innovation. This paper provides the first experimental evidence that students can learn core innovation-related skills. I conduct a large-scale clustered randomized controlled trial...
The study randomized individual students via admissions lotteries, not classes or schools.
Outcomes are measured via course take-up and course passing, not via a standardized exam score.
Outcomes are measured across multiple grade levels (9th-11th), exceeding the one-term follow-up requirement.
The paper documents the control group’s composition and baseline characteristics using a detailed characteristics table and narrative.
Randomization is done via student admission lotteries, not by randomizing schools.
The intervention model is supported by a separate organization, while the study is conducted by researchers affiliated with research institutions.
The paper reports outcomes spanning grades 9-11, satisfying a one-year duration requirement.
The intervention is explicitly described as a comprehensive reform model whose supports and resource-intensive elements are integral to what is being tested.
Independent researchers (AIR) report a follow-up study of Early Colleges based on admission lotteries, providing replication evidence for the model.
The study reports mathematics course outcomes only and does not use standardized exam outcomes across all core subjects.
Follow-up publications by the same research program report high school graduation outcomes and longer-term outcomes after high school.
No public pre-registration record (with a registration date prior to study start) is identified in the paper or via registry searches.
This mixed methods experimental study examined the impacts of the Early College High School model on students' college readiness in mathematics measured by their success in college preparatory mathematics courses in the 9th through 11th grades, and disaggregated for academically...
The original RCT randomized at the school level, which satisfies the class-level (or stronger) RCT requirement.
The long-term follow-up cognitive outcome uses a custom rapid math test rather than a standardized exam.
The intervention ran for eight months, exceeding a full academic term.
The paper documents the control group with baseline and follow-up descriptive statistics in Table 1.
Randomization is at the school level with 34 schools as clusters.
The evaluation team is not the Kumon organization and the paper declares no conflict of interest.
Outcomes are measured in a follow-up conducted six years after the original RCT period.
The intervention adds time and materials, but these added resources are integral to the treatment being evaluated.
No independent replication by a different research team was found for this specific Kumon RCT.
Only mathematics was assessed (and not via standardized exams), so the study does not provide all-subject standardized exam outcomes.
The paper does not show systematic tracking of participants through graduation for the full cohort.
The paper cites an AEA RCT Registry ID but does not provide (and we could not verify) a registration date before the study start.
The COVID-19 pandemic and associated school closures exacerbated the global learning crisis, especially for children in developing countries. Teaching at the right level is gaining greater importance in the policy arena as a means to recover learning loss. This study...
The study randomized at the school level, satisfying the ERCT class-level-or-higher randomization requirement.
Outcomes rely on internal school grades and questionnaires, not a standardized exam-based assessment.
The interval from intervention start after the January pre-test to the May/June post-test exceeds one academic term.
The paper documents the PAU control group size, characteristics, and what support it could receive.
Schools were the unit of randomization, meeting the school-level RCT criterion.
The intervention was evaluated with substantial involvement of the intervention developers and author-led trainer training.
Outcomes were tracked from the December/January pre-test to an October/November follow-up, spanning roughly 9-10 months.
The extra time and staffing are integral to the intervention being tested (PLOS-extra), so PAU as the control is acceptable under the ERCT criterion B exception.
No independent replication of this specific PLOS-extra effectiveness trial was located.
Criterion A is not met because criterion E (standardized exam-based assessment) is not met.
The trial followed students for six months, not through graduation, and no graduation follow-up paper was located.
The paper reports prospective preregistration in REES before data collection, but the registry entry date could not be independently verified without login access.
In secondary education, many students have difficulties planning their schoolwork. These difficulties may not only lead to short-term consequences such as lower grades, but also to long-term psychosocial, professional and financial challenges. To support students with planning problems, we developed...
Randomization was at the individual child level (lottery seats), not at the class or school level, and no tutoring-style exception applies.
Outcomes were measured using widely used standardized assessments (e.g., Woodcock-Johnson, HTKS, digit span), not custom tests created for the study.
The study tracked outcomes from baseline in fall 2021 through spring 2024, far exceeding one academic term.
The control group is clearly defined as lottery non-winners, with sample sizes and alternative preschool enrollment described.
Randomization occurred within lotteries at the child level rather than by random assignment of schools to conditions.
The intervention studied was a business-as-usual Montessori program model not designed by the research team.
Outcomes were tracked from fall 2021 through spring 2024, spanning multiple academic years.
The study explicitly evaluates the real-world Montessori program package (including its resource structure) against typical alternatives, making resource differences part of the treatment definition.
The paper reports that key findings replicate across multiple Montessori preschool RCTs, including at least one independent RCT in another context.
The study does not assess effects across all core school subjects via standardized exam batteries; it reports a selected set of academic and nonacademic outcomes.
Outcomes are tracked only through the end of kindergarten, and the authors explicitly note that longer-run impacts are unknown.
The paper reports registration on REES but does not provide a registration date relative to the study start, so pre-registration before data collection cannot be verified here.
The study uses competitive admission lotteries at 24 oversubscribed U.S. public Montessori schools to estimate impacts of being offered a Montessori PK3 seat on end-of-kindergarten outcomes. The authors report positive impacts on reading and several cognitive outcomes, plus a cost...
Randomisation at the school level satisfies the requirement for a class-level RCT.
The assessments were study-designed instruments, not recognised standardized exams.
Measurement occurred after five terms, satisfying at least one full academic term of follow-up.
The paper provides detailed baseline characteristics and conditions for the control group in Table 1.
Entire schools, not just classes, were randomly assigned to treatment or control.
The same research team and ICS officers designed, implemented, and analyzed the intervention without independent oversight.
The study tracked outcomes for more than an academic year, satisfying the Year Duration requirement.
The additional teacher is the treatment variable, so business-as-usual resourcing in the control group is acceptable.
Multiple independent studies have replicated this finding.
Only math and reading outcomes were measured, failing to cover all core subjects.
No follow-up through to primary school graduation is reported.
The study was not pre-registered before data collection.
Some education policymakers focus on bringing down pupil–teacher ratios. Others argue that resources will have limited impact without systematic reforms to education governance, teacher incentives, and pedagogy. We examine a program under which school committees at randomly selected Kenyan schools...
Pupils rather than whole classes were randomised.
A validated standardised exam (PhAB‑2) was used.
Post‑test occurred four months after the December start.
Control demographics and baseline scores are clearly provided.
Randomisation took place within, not between, schools.
Researchers were not affiliated with the program’s developer.
Only six months of data – under one school year.
Extra resources constituted the treatment itself, so the imbalance between groups is acceptable.
The study’s findings were later replicated by an independent team.
The study measured literacy only, not all subjects.
Follow‑up ended two months after the block, not at graduation.
The paper provides no evidence of pre‑registration.
Background. Many school‑based interventions are delivered without evidence of effectiveness. Aims. This study evaluated the Lexia Reading Core5 program with 4‑ to 6‑year‑olds in Northern Ireland. Sample. One hundred and twenty‑six pupils were screened; ninety‑eight below‑average readers were randomised to an 8‑week block of...
Student-level randomization is acceptable because the intervention is one-to-one tutoring.
Outcomes rely on self-reported grades rather than standardized exam-based assessments.
The first follow-up occurs about six months after randomization, exceeding one academic term.
The paper clearly describes the control condition and reports baseline balance between groups.
Randomization occurred at the student level, not at the school level.
The tutoring program is run by an external nonprofit, while the evaluation is conducted by academic researchers.
Outcomes are tracked from early 2022 to late 2023, exceeding one academic year.
Any additional resources (free tutoring access) are the treatment variable, so a business-as-usual control is acceptable under ERCT.
No independent replication of this specific Lern-Fair RCT by another team was found.
Criterion E is not met, so Criterion A is automatically not met.
The study follows participants for about 18 months, not until graduation, and no follow-up graduation-tracking paper was found.
The study cites an AEA RCT Registry ID but no publicly accessible record with the registration date could be retrieved to verify pre-registration timing.
Tutoring programs for low-performing students, delivered in-person or online, effectively enhance school performance, yet their medium- and longer-term impacts on labor market outcomes remain less understood. To address this gap, we conduct a randomized controlled trial with 839 secondary school...
Student-level random assignment is clearly described.
Outcomes rely on self-reported grades rather than standardized exams.
Follow-up measurement occurs after a substantial interval following the intervention start.
Treatment and control conditions are clearly described and baseline balance is shown.
Randomization is not conducted at the school level.
Key outcomes are self-reported, not independently assessed.
Outcomes span more than one academic year, including a second follow-up in late 2023.
Extra instructional time is the intervention itself and is explicitly tested.
No independent replications were found in available sources.
Not applicable because exam-based assessment is not used and outcomes do not cover all core subjects.
No evidence of tracking outcomes until graduation for the full cohort was found.
The study reports preregistration in the AEA RCT Registry and analyzes outcomes as registered.
Tutoring programs for low-performing students, delivered in-person or online, effectively enhance school performance, yet their medium- and longer-term impacts on labor market outcomes remain less understood. To address this gap, we conduct a randomized controlled trial with 839 secondary school...
The trial randomized whole Early Years Settings, which satisfies the class-level (or stronger) randomization requirement.
The study used NRDLS, described as a standardized and validated assessment, as the primary outcome measure.
The study included follow-up about 9 weeks after an 8-week intervention, providing roughly term-long tracking from start to T3.
The study removed the treatment-as-usual control arm, leaving no business-as-usual control group.
Entire Early Years Settings were randomized, meeting the school-level RCT requirement.
Intervention developers (paper authors) trained and supervised delivery, so the evaluation was not independent.
Outcomes were tracked only to a 9-week post-test follow-up, far short of a full academic year.
The two arms were explicitly matched on dosage/delivery and both included comparable homework resources, balancing time and inputs.
No peer-reviewed independent replication of the 2025 BEST vs A-DLS trial was found.
The study measured language and communication only, not standardized outcomes across all main subjects/domains.
The study followed children only for weeks to a few months post-intervention, with no tracking to graduation.
The ISRCTN registry record shows registration before first enrolment, indicating prospective pre-registration.
Children's language abilities set the stage for their education, psychosocial development and life chances across the life course. Aims: To compare the efficacy of two preschool language interventions delivered with low dosages in early years settings (EYS): Building Early Sentences...
The study randomized assignment at the school level, which satisfies and exceeds the class-level requirement.
The study employed custom-developed 21-item assessments aligned with the curriculum rather than widely recognized standardized exams.
The study tracked outcomes from August 2022 to May 2024, covering almost two full academic years.
The control group's baseline characteristics, including demographics and test scores, are fully documented and compared in Table 1.
The study randomized 42 schools into treatment and control groups, satisfying the school-level randomization criterion.
The study was conducted by authors affiliated with the Asian Development Bank, which also funded and supported the implementation of the intervention.
The intervention and data collection spanned two full academic years, exceeding the one-year requirement.
The study explicitly tests the provision of hardware and software resources (tablets and modules) as the primary treatment, justifying the resource difference with the control group.
This is an original study and no independent replication of this specific intervention is reported.
The study measured only Math and English outcomes, excluding Science, which was part of the intervention content, and it does not meet the standardized-exam prerequisite.
The study stopped tracking students before they graduated and did not collect or report graduation outcomes.
The paper does not provide any reference to a pre-registered study protocol or registry ID.
Although Asian economies have increased access to education, students' learning often trails grade level expectations. In the Philippines, learning worsened through prolonged classroom closure during the coronavirus disease (COVID-19) pandemic. Together with the Department of Education, we conducted a 42-school...
Randomisation was conducted at the individual student level within schools rather than at the class level, leading to potential contamination across students in the same class.
The primary outcome is based on post‑intervention grade point averages from school records, not a standardized exam‑based assessment.
Outcomes (end‑of‑year GPAs) were measured at the end of the academic year, at least one full term after the intervention.
The control condition and its baseline data are clearly described, including content, fidelity, and demographics.
Randomisation occurred at the student level within schools, not at the school level.
Independent professional research companies conducted data collection and processing, separate from the intervention designers.
Outcomes were tracked through the end of ninth grade, covering a full academic year.
Both intervention and control groups received equivalent session time and attention, balancing educational inputs.
No independent replication of this national study by a different team is reported.
No standardized exam-based assessments across all core subjects; the study relies on administrative GPAs.
Participants were only tracked through ninth grade; no graduation tracking is reported.
The analysis plan and moderation hypotheses were pre-registered on OSF prior to data analysis.
A global priority for the behavioural sciences is to develop cost-effective, scalable interventions that could improve the academic outcomes of adolescents at a population level, but no such interventions have so far been evaluated in a population-generalizable sample. Here we...
Randomization was done at the Head Start site (center) level, which satisfies or exceeds class-level randomization.
The study measured child-level academic outcomes via teacher ratings, not a standardized, exam-based assessment of each child.
The paper reports that the intervention ran from fall to spring of the Head Start program year, meeting the term duration criterion.
The control group’s business-as-usual setting is clearly described, including their staffing support and how it differed from the intervention.
Whole Head Start sites (equivalent to schools) were the unit of randomization, fulfilling the school-level RCT requirement.
The study does not mention any external evaluators. The intervention appears to have been evaluated by its own designers, lacking independent oversight.
The intervention spanned an entire preschool year (approximately 9 months), satisfying the one-year duration criterion.
The intervention group received extra training and mental health consultation services, whereas the control group did not receive comparable resources or attention.
No independent replication by other researchers is reported; the study was carried out by a single team.
Academic performance was only assessed in language/literacy and math (via teacher-rated scales), rather than covering all core subjects with standardized exams.
The original CSRP participants were followed up in later years. A subsequent study by the same research team collected data on these students’ outcomes in high school, fulfilling the graduation tracking criterion.
No pre-registered analysis plan or study registration is mentioned. There is no evidence that the trial was registered before data collection.
The role of subsequent school contexts in the long-term effects of early childhood interventions has received increasing attention, but has been understudied in the literature. Using data from the Chicago School Readiness Project (CSRP), a cluster-randomized controlled trial conducted in...
The study randomized individual students within schools rather than assigning entire classes or schools to conditions.
The study relied on school-assigned GPAs and grades rather than widely recognized standardized exam-based assessments.
Outcomes were measured over the final three quarters of the school year, which is significantly longer than one academic term.
The control group's demographics, baseline data, and specific activities (neutral writing exercises) are well-documented.
Randomization occurred at the student level within schools, not at the school level.
The study was conducted by the authors themselves, who are also associated with the design of the intervention in previous studies.
Outcomes were measured across the full academic year (Terms 2, 3, and 4) following the start of the intervention in September.
The control group received a placebo activity that matched the treatment group in terms of time and resources.
The replication was conducted by the same authors as the original study, not by an independent research team.
Although the study covered core subjects, it relied on GPA rather than standardized exams, so the prerequisite Criterion E is not met.
The study tracked students only through the end of the sixth-grade year, not until graduation.
The study protocol was preregistered on OSF before the start of the study.
Recent randomized studies suggest brief social-psychological interventions can help students reappraise common social and academic worries during the difficult transition to middle school and, in turn, improve school performance. We conducted a preregistered student-level randomized controlled trial to assess the...
Randomization was conducted at the individual student level rather than by class or school, failing the class-level RCT requirement.
Outcomes were measured via course grades and credits, not through a recognized standardized examination.
The intervention courses ran for a full 16-week semester, meeting the term duration requirement.
The control group’s composition, consent rates, and baseline covariates are documented in tables and text, satisfying documentation requirements.
The study randomized individual students rather than entire schools, failing the school-level RCT requirement.
The study was conducted by researchers independent from the designers of the corequisite models, satisfying the ERCT requirement for Criterion I.
The corequisite support lasted one semester; the Year-long intervention requirement is not satisfied.
The developmental education (DE) support hours are integral to the corequisite intervention, so the control group's business-as-usual condition is appropriate.
No independent replication of this RCT is reported in the paper.
Only reading and writing outcomes were measured, failing the all-subject exam requirement.
A follow-up study by the same research team tracked the original student cohort through graduation, satisfying the ERCT requirement for Criterion G.
There is no indication that the study protocol was pre-registered before data collection.
This study provides experimental evidence on the impact of corequisite remediation for students underprepared in reading and writing. We examine the short-term impacts of three corequisite models implemented at five large urban community colleges in Texas. Results indicate that corequisite...
Randomization occurred at the classroom level, preventing contamination across students in the same class.
The study employs bespoke KA Lite assessments, not established standardized exams.
The study measures outcomes after approximately six weeks, not a full term.
The control group’s makeup and activities are thoroughly described.
The study randomized individual classrooms, not whole schools.
The same research team designed and evaluated the intervention.
The intervention and measurement occur within ~12 weeks, not a full year.
Treatment and control groups received identical time and resources.
No independent replication study is mentioned.
Only mathematics outcomes are measured, not all core subjects.
The study ends after the units and does not follow students to graduation.
The protocol was pre‑registered in the AEA registry before implementation.
This randomized experiment implemented with school children in India directly tests an input incentive designed to increase effort on learning activities against both an output incentive that rewards test performance and a control. Students in the input incentive treatment perform...
The unit of randomization is the teacher (classroom), meeting the class-level RCT requirement.
The RCT’s primary outcomes are platform usage logs, not standardized exam scores; any test-score analysis is non-experimental.
Outcomes are tracked from the May 30 start through November 2022, exceeding one academic term.
The control group is clearly defined (no messages) and baseline characteristics for treatment and control are documented.
Randomization was performed at the teacher level, not at the school level.
The lead author holds IP rights to the software being promoted, so the study is not independent of the intervention’s owner/designer.
The study tracks outcomes for less than a full academic year (March–November), falling short of year-duration tracking.
The only additional resource is the WhatsApp messaging itself, which is the treatment being tested; the control group is business-as-usual by design.
No peer-reviewed independent replication of this WhatsApp-to-teachers messaging RCT was found in a web search as of the ERCT check date.
Only mathematics outcomes are discussed, and criterion E is not met, so the all-subject standardized exam requirement is not satisfied.
Students are followed only within the 2022 school year and not until graduation; criterion Y is also not met, which implies G cannot be met.
No pre-registration identifier or registry entry could be located for this RCT in the paper or via a web search.
The use of self-led educational technologies holds significant potential for improving student learning at scale, but sustaining student engagement with these platforms remains a challenge. We present results from an experimental evaluation implemented following the scale-up of a math platform...
The study randomized whole classes within schools, satisfying the class‑level RCT requirement.
The tests were custom assemblies of items from exam books, not formal standardized exams.
Student performance was assessed at the end of the fall semester, meeting the term‑duration requirement.
The control classes’ makeup, treatment conditions, and baseline data are clearly reported.
The trial randomised classes within schools rather than entire schools.
The authors who developed the CAL were also responsible for its implementation and assessment.
Tracking ceased at the semester’s end, not over a full academic year.
The additional CAL sessions are the treatment itself, so the control group’s business‑as‑usual status is appropriate.
The paper contains no mention of independent replication by a different research team.
The study assessed only math and Chinese; other core subjects were omitted.
Student outcomes were not monitored beyond the semester, so no graduation tracking occurred.
There is no evidence the trial was pre-registered before data collection.
The education of the disadvantaged population has been a long-standing challenge to education systems in both developed and developing countries. Although computer-assisted learning (CAL) has been considered one alternative to improve learning outcomes in a cost-effective way, the empirical evidence...
Randomization was conducted at the class level with intact classes assigned to each condition.
The study employed custom-designed tests of graphing and slope problems rather than a recognized standardized exam.
The study measured outcomes after approximately three months, satisfying the term-duration requirement.
The control group’s size and baseline comparability (NAEP scores) are documented in detail.
Randomization was at the class level within one school, not across multiple schools.
The study was conducted and scored by the authors, with no independent external evaluation.
Outcomes were measured within three months, not tracked over an academic year.
Both groups received the same number of assignments, problems, and review sessions, ensuring balanced time and resources.
There is no evidence of an independent replication study by a different research team that confirms these findings.
The study only assessed mathematics graphing and slope problems, not a full range of subjects.
Follow-up ended at 30 days post-review, with no tracking until graduation.
The paper does not report a pre-registration or registry before the intervention began.
A typical mathematics assignment consists primarily of practice problems requiring the strategy introduced in the immediately preceding lesson (e.g., a dozen problems that are solved by using the Pythagorean theorem). This means that students know which strategy is needed to...
Although randomization was not at the class level, the intervention was a fully individualized, at-home, computer-based program, satisfying the personal tutoring exception in the ERCT standard.
The study employed the standardized TOWRE subtests for reading fluency, meeting the exam‑based assessment requirement.
Outcome assessments occurred after at least 16 weeks of training (and within a 6‑month participation period), exceeding the one‑term minimum requirement.
The study lacked a separate control group, relying on a within‑subject baseline, thus failing the documented control group requirement.
The study randomized individual children rather than entire schools, so the school‑level RCT requirement is not met.
The intervention was conducted and monitored by the same team that designed it, failing the independent conduct requirement.
Participants were observed for about 6 months, not a full academic year, so the year‑duration requirement is not met.
The control condition had no training or additional support, so resources were not balanced.
No independent research team has published a replication of this trial, so reproducibility is not established.
The study measured only reading skills without assessing other core subjects, so the all-subject exams requirement is not met.
Participants were followed for about 6 months, with no data collection continuing through graduation, failing the graduation tracking requirement.
The trial was prospectively registered long before participants were enrolled, satisfying the pre-registered protocol requirement.
Given the importance of effective treatments for children with reading impairment, paired with growing concern about the lack of scientific replication in psychological science, the aim of this study was to replicate a quasi‑randomised trial of sight word and phonics...
Randomisation occurred at the school level, which satisfies the class‑level RCT requirement.
The authors used a bespoke nine‑item quiz rather than a recognised standardised exam.
The intervention lasted only a few hours over several weeks, not a full term.
The control group’s composition, baseline performance, and lack of intervention are clearly documented.
Randomisation at the school level fulfills the school‑level RCT criterion.
The intervention was designed, delivered, and assessed by the authors’ team with no third‑party evaluation.
The experiment ran for less than one academic year, so the year-long criterion is unmet.
The control group received no equivalent time or resources, so resources were not balanced.
No independent replication of this intervention has been reported.
Only financial literacy was assessed, so the all-subject exam criterion is unmet.
The study tracked outcomes only up to seven weeks post‑intervention, not through graduation.
The trial was pre‑registered in the AEA RCT Registry before data collection.
This paper provides causal evidence on the effects of parental involvement on student outcomes in a financial education course based on two randomised controlled trials with a total of 2,779 grade 8 and 9 students in Flanders. Using an experimental design...
Randomisation was at the parent (individual) level, not at the class level.
The study used custom and adapted questionnaires rather than standardised exams.
A six‑month follow‑up assessment provided outcome measurement after at least one full academic term.
The wait‑list control group is described with detailed demographics and conditions.
Randomisation occurred at the individual level, with no schools assigned as units.
The authors who designed the program also delivered and assessed it without independent evaluation.
Follow‑up lasted six months, shorter than the full academic year required.
The study explicitly tests additional training resources as the intervention; the control group remained business‑as‑usual.
No independent replication by another research team is mentioned.
Academic outcomes were measured by custom questionnaires, not in all main subjects via standardised exams.
The study conducted only a six‑month follow‑up and did not track to graduation.
The study was registered after the trial began (ACTRN12613000660785), so it was not truly pre‑registered.
This study evaluated the effects of Group Triple P with Chinese parents on parenting and child outcomes as well as outcomes relating to child academic learning in Mainland China. Participants were 81 Chinese parents and their children in Shanghai, who...
The study randomizes intact sections (all students in a meeting) to flipped or lecture for each lesson, avoiding within‑session mixing and satisfying class‑level assignment.
The study used instructor‑designed course exams instead of standardized external assessments.
Learning outcomes were measured at end of semester, satisfying the term duration requirement.
The control (lecture) condition lacks detailed documentation of participant characteristics and baseline outcomes.
Randomization occurred within course sections rather than entire schools.
Authors who designed the intervention also implemented and analyzed the study.
Outcomes were measured only through one semester, not a full academic year.
Flipped lessons included mandatory pre‑class videos and in‑class exercises that were not matched in the control condition.
An independent replication of the flipped classroom experiment by another research team has been published, satisfying this criterion.
Only econometrics outcomes were assessed, with no broad subject coverage.
The study did not track participants until graduation.
No evidence of pre-registration of study protocols.
Despite recent interest in flipped classrooms, rigorous research evaluating their effectiveness is sparse. In this study, the authors implement a randomized controlled trial to evaluate the effect of a flipped classroom technique relative to a traditional lecture in an introductory...
Classes were randomly assigned to conditions, satisfying a class-level RCT.
The outcome measures were study-specific tests rather than standardized exams.
Outcomes were measured in a short pre-post design around a 90-minute intervention, not one term after the start.
The control group is described and the paper provides group composition and background characteristics.
Randomization occurred at the class level, not by random assignment of schools.
The intervention designers and university-affiliated staff were involved in implementation and study conduct.
The study did not track outcomes for a full academic year, and criterion T is not met.
The intervention did not add extra time or budget relative to control; all groups used comparable digital environments.
No independent replication study was identified in the paper or in an external literature search.
Outcomes were not assessed with standardized exams across all main subjects, and criterion E is not met.
There is no evidence of tracking participants to graduation, and criterion Y is not met.
The paper does not report pre-registration, and no matching registry record was identified in an external search.
Students' measurement estimation skills require benchmark knowledge (about measures of known objects) and estimation strategies (ways to compare with benchmarks). While students' estimation skills have been assessed and unpacked in several empirical studies (for length and area but less for...
The study randomized entire schools to treatment conditions, meeting the class-level RCT criterion.
The study used researcher-designed tests, not standardized exams, to measure financial proficiency.
The intervention consisted of four 50-minute lectures, much shorter than a full academic term.
The paper documents the control group's characteristics, size, and conditions in detail, including baseline scores in Table III.
The study randomized entire schools, fulfilling the requirement for a school-level RCT.
The authors conceptualized, designed the methodology, conducted the investigation, and wrote the paper; no independent team was involved.
The intervention involved four 50-minute sessions and a four-week follow-up, falling short of the required full academic year.
The control group did not receive the financial education program or any comparable substitute, creating an imbalance in educational time/resources.
There is no mention in the paper or readily available external evidence of an independent replication of this specific study.
The study only assessed financial proficiency (knowledge and behavior) and did not use standardized exams for other core subjects.
Follow-up was limited to approximately four weeks post-intervention; there was no tracking until graduation.
The study was registered in the AEA RCT Registry (AEARCTR-0004431), but the registration occurred after the intervention and data collection were completed.
Using a computer-based learning environment, the present paper studied the effects of adaptive instruction and elaborated feedback on the learning outcomes of secondary school students in a financial education program. We randomly assigned schools to four conditions on a crossing...
Randomization was within classes, but the intervention is parent-child one-to-one home teaching, which fits the ERCT tutoring/personal teaching exception for Criterion C.
Primary outcomes are measured with researcher-developed numeracy tasks, not with widely recognized standardized exams.
Outcomes were measured about 6 weeks after intervention start, which is shorter than a full academic term (about 3-4 months).
The control condition is described in detail and baseline characteristics are reported with descriptive tables.
Randomization occurred within classes rather than assigning whole schools to intervention vs control.
The paper does not report that an independent third-party evaluation team conducted the study; implementation and evaluation appear to be run by the research team.
The intervention and outcome measurement span only about 6 weeks, not a full academic year.
The study uses a matched active control with equivalent materials, engagement, and support, isolating numeracy content as the main difference.
No independent replication by a different research team is identified, and the paper frames the work as novel.
Criterion E is not met and the study does not measure standardized outcomes across all core subjects.
Year-long tracking is not present and the paper reports no delayed posttest; no follow-up-to-graduation publications were identified.
The paper states an OSF pre-registration link, but the ERCT check could not verify a time-stamped registration date before data collection.
The home numeracy environment is suggested to influence children's numerical development, but causal evidence for this assertion remains limited. Addressing this gap, we randomly assigned 117 predominantly White 4- to 5-year-olds (M = 4.68 years, SD = 0.2, 47% girls)...
The study randomized students in small peer-instruction groups for a tutoring-style intervention, which satisfies the ERCT tutoring exception.
The study used custom lesson-specific pre- and post-tests rather than widely recognized standardized exams.
The study ran across two consecutive weeks with immediate post-tests, so it did not measure outcomes at least one academic term after the intervention began.
The in-class active learning control condition and group characteristics are documented, including comparable demographics and baseline physics background knowledge.
Randomization occurred within one university course, not at the school level.
The AI tutor was designed by the authors and the paper does not describe independent third-party conduct of the study.
Outcomes were measured within a two-week crossover design, not after a full academic year.
The AI tutor condition substituted for the in-class lesson and did not add unmatched time or resources for the intervention group.
No independent peer-reviewed replication of this specific study by another research team was found.
Outcomes were limited to physics topics and the study did not use standardized exams across all main subjects (and criterion E is not met).
The study reports only immediate post-lesson outcomes and does not track students until graduation (and criterion Y is not met).
The paper reports IRB approval but provides no evidence of a pre-registered protocol, and no matching public registry entry was found.
Here we report a randomized, controlled trial measuring college students' learning and their perceptions when content is presented through an AI-powered tutor compared with an active learning class. The novel design of the custom AI tutor is informed by the...
Randomization was at the individual child level, but the intervention was delivered one-to-one, fitting the personal teaching exception.
Outcomes were measured with study-specific tasks on novel letters and pseudowords, not standardized exams.
Outcomes were assessed within three consecutive days, far shorter than one academic term.
The paper documents participant characteristics and reports no baseline differences across groups in control measures.
The study sampled from a single school and randomized children, not schools.
No independent evaluation organization is described in the paper.
The study duration and measurement window were days, not an academic year.
The paper states that exposure and duration were made comparable across conditions, balancing time-on-task and practice quantity.
No independent peer-reviewed replication of this specific experiment was found.
This criterion is not met because Criterion E is not met and outcomes are not standardized exams across subjects.
The study does not track participants to graduation, and the one-year prerequisite is not met.
The paper provides an OSF link for materials and data but does not state or verify preregistration with a pre-data-collection date.
Recent research has revealed that the substitution of handwriting practice for typing may hinder the initial steps of reading development. The current experiment investigated the impact of graphomotor action and output variability in letter and word learning using a variety...
The study randomized individual students within a single institution rather than randomizing at the class level.
The study utilized the California Critical Thinking Skills Test (CCTST), a widely recognized standardized assessment.
The intervention lasted only 8 weeks, which is shorter than the required full academic term (typically 3-4 months).
The study documents the control group's size, demographics, and baseline performance scores in detail.
The study was conducted within a single nursing college with randomization at the student level, not the school level.
The study was conducted by the authors themselves; only the randomization process was handled by an independent researcher.
The study duration was 8 weeks, which does not meet the requirement of one full academic year.
The study tests the integration of LLMs as the variable; both groups received equal course time (16 hours), satisfying the balance requirement.
No independent replication of this specific study was found in peer-reviewed journals.
The study measured only critical thinking skills and the specific course test score, not all main subjects taught in the school.
Outcomes were measured immediately after the course ended, with no tracking until graduation.
The paper mentions following CONSORT but does not provide a registration number or evidence of pre-registered protocol.
Background: The integration of Large Language Models (LLMs) into nursing education presents a novel approach to enhancing critical thinking skills. This study evaluated the effectiveness of LLM-assisted Problem-Based Learning (PBL) compared to traditional PBL in improving critical thinking skills among...
Randomization was performed at the school level, satisfying the requirement for class-level assignment.
The study utilized specific assessments developed by the research group (CSA and TechCheck) rather than widely recognized standardized exams.
The paper specifies the number of lessons but does not provide specific dates or a duration interval to confirm the intervention spanned at least one full academic term.
The control group's demographics are tabulated, and their "business as usual" condition is clearly described.
The study randomized entire schools rather than classes or students, satisfying the school-level RCT requirement.
The study was conducted and analyzed by the same researchers who developed the curriculum.
The study tracked outcomes only until the end of the 24-lesson curriculum, not for a full academic year.
The control group did not receive resources, time, or professional development equivalent to the treatment group.
Previous evaluations were conducted by the same authors/research group; no independent replication is cited.
The study limited assessment to coding and computational thinking, ignoring other core subjects.
Tracking ended immediately after the curriculum implementation.
The paper mentions IRB approval but does not cite a public pre-registration of the study protocol.
Background and context: Early childhood computer science (CS) education is a high-priority focus worldwide, but early childhood CS tools are primarily developed and researched within the United States and Europe. As an example, the Coding as Another Language ScratchJr (CAL-ScratchJr)...
The study uses a randomized cross-over design at the peer-group level (student level), which is acceptable under the ERCT exception for personal tutoring interventions.
Outcomes were measured using custom pre- and post-tests designed for the specific lessons, not standardized exam-based assessments.
Outcomes were measured immediately following two single-lesson interventions, falling far short of the one-term duration requirement.
The control group (in-class active learning) is well-documented, including pedagogy, student demographics, and baseline knowledge.
The study was conducted within a single university course, not randomized across multiple schools or institutions.
The study was designed, conducted, and analyzed by the authors, including the course instructors, without independent third-party conduct.
Outcomes were measured over a two-week period, not tracked for a full academic year.
The intervention replaced the control activity without adding extra time; in fact, the intervention group spent less time on task than the control group.
No independent peer-reviewed replication of this specific AI tutoring intervention was found.
The study only assessed physics content knowledge, not all main subjects.
The study tracks learning only for the duration of the lessons and does not follow students to graduation.
The study mentions IRB approval but does not provide evidence of a pre-registered protocol on a public registry.
This study reports a randomized, controlled trial measuring college students' learning and their perceptions when content is presented through an AI-powered tutor compared with an active learning class. We find that students learn significantly more in less time when using...
Randomisation was conducted at the individual‑student level rather than at the class level, so the Class‑level RCT criterion is not satisfied.
The study used researcher‑designed custom tests rather than a standardized, widely recognized exam, so the Exam‑based Assessment criterion is not satisfied.
Outcomes were measured after a 4.5‑month intervention period, which covers at least one term, satisfying the Term Duration criterion.
The study provides detailed baseline characteristics and assessment outcomes for the control group, fulfilling the Documented Control Group criterion.
Randomisation was done at the individual‑student level, not at the school level, so the School‑level RCT criterion is not satisfied.
The same team that designed Mindspark also carried out the trial and analysis, so the Independent Conduct criterion is not satisfied.
Participants were followed for only 4.5 months rather than an academic year, so the Year Duration criterion is not satisfied.
The intervention’s extra instructional time is integral to the treatment, so the Balanced Resources criterion is satisfied.
No independent replication of the study is reported, so the Reproduced criterion is not satisfied.
Only mathematics and Hindi were assessed, so the All‑subject Exams criterion is not satisfied.
Participants were only followed until the endline test, with no graduation tracking, so the Graduation Tracking criterion is not satisfied.
The trial was registered only after data collection began, so the Pre‑registered Protocol criterion is not satisfied.
We study the impact of a personalized technology‑aided after‑school instruction program in middle‑school grades in urban India using a lottery that provided winners with free access to the program. Lottery winners scored 0.37σ higher in math and 0.23σ higher in...
Students were randomized at the individual level rather than by classroom, so cross-group contamination could occur.
The learning outcomes were measured using custom-built tests, not standardized exams.
Outcomes were measured immediately after two class meetings, not after a full academic term.
The control condition is clearly described with baseline group characteristics and identical materials.
Randomisation occurred at the student level, not at the school level.
The same research team designed and conducted the intervention, with no third-party evaluator.
The study spans two sessions with no year‑long follow‑up.
Time and materials were identical for both conditions, with only active engagement toggled.
The study has been independently replicated by a different research team.
Only physics learning was assessed, not all core subjects.
The study ended after the course, without tracking students to graduation, and no follow-up by the authors provided such data.
No pre-registration or protocol registry is mentioned.
We compared students’ self-reported perception of learning with their actual learning under controlled conditions in large-enrollment introductory college physics courses taught using active instruction and passive lecture. Both groups received identical content and handouts, and students were randomly assigned without...
Randomization was conducted at the class (tutorial) level, satisfying the class‑level RCT requirement.
The study used author‑designed pre/post tests rather than a standardized exam.
Outcome measures were collected within days, not after a full academic term.
The handout control group is well described with baseline tasks and scores.
Randomisation occurred at the tutorial‑class level, not at the school level.
The authors’ own team both developed and evaluated the intervention.
Follow‑up lasted only days, not a full academic year.
The AR intervention entailed multimedia app access not matched by the handout.
Independent teams reproduced similar AR‐enhanced learning gains in other contexts.
Only sewing/textiles skills were assessed, not all core subjects.
No follow‑up beyond the immediate post‑workshop period.
No evidence of pre‑registration is provided.
This study contributes to enhancing students’ learning experience and increasing their understanding of complex issues by incorporating an augmented reality (AR) mobile application (app) into a sewing workshop in which a threading task was carried out to facilitate better learning...
The study randomized at the student level rather than at the class or school level.
Outcomes were measured with app-administered tests closely aligned to the treatment exercises rather than a standardized external exam.
The intervention and outcome window was six weeks, which is shorter than a full academic term.
The control group is clearly described, including its notification condition and baseline characteristics.
Randomization occurred at the student level rather than the school level.
The paper does not provide a clear statement that the evaluation was conducted by an independent third party separate from the intervention team.
The intervention and measurement period was six weeks, not an academic year.
The additional input (messages/notifications) is the treatment being tested, so a no-message control is the appropriate comparison.
No independent replication study by other authors was found for this specific intervention.
The study does not measure standardized outcomes across all core subjects, and criterion E is not met.
The study does not track participants to graduation and also fails the year-duration prerequisite (criterion Y).
The paper provides ethics approval and data registration statements, but no evidence of a pre-registered study protocol.
We examine whether highlighting streaks - instances of repeated and consecutive behavior when completing learning tasks - encourages 4th to 6th grade students in Peru to increase their use of an online math platform and improve learning. 60,000 students were...
The study randomized at the student level rather than the class or school level.
The study used a custom endline test administered through the app rather than a widely recognized standardized exam.
The intervention duration was six weeks, which is shorter than the required full academic term.
The control group is clearly defined as receiving no messages, and their demographics and baseline performance are well-documented.
The study utilized student-level randomization, not school-level randomization.
The study appears to be conducted by the authors who designed the intervention without a stated independent third-party evaluator.
The intervention lasted six weeks, failing the one-year duration requirement.
The intervention specifically tested the impact of "nudges" (messages) as the treatment variable, so the lack of messages for the control group is by design and balanced.
The paper states this is the first experimental study of its kind in this context, and no independent replication is cited.
The study only assessed math achievement and did not measure outcomes in other main subjects like reading or science.
The study followed up immediately after the six-week intervention; no graduation tracking was conducted.
There is no evidence of a pre-registered protocol with hypotheses and analysis plans in a public registry before data collection.
We examine whether highlighting streaks—instances of repeated and consecutive behavior when completing learning tasks—encourages 4th to 6th grade students in Peru to increase their use of an online math platform and improve learning. 60,000 students were randomly assigned to receive...
The paper is a within-subject experiment with no random assignment, so it is not a class-level RCT.
Outcomes are based on observational coding, not on standardized, exam-based assessments.
The study compares two 10-minute sessions separated by 2 weeks, far shorter than a term-long follow-up.
The sample and baseline (uninformed) condition are documented with detailed participant characteristics.
There is no school-level randomization; individual dyads were recruited and all experienced both contexts.
The study does not report being run by an external evaluation organization independent of the authors.
The study spans 2 weeks between sessions, not a full academic year.
Time and materials are essentially balanced across contexts; the main difference is the informational prime and a different toy set to reduce familiarity effects.
No independent replication study was identified for this 2025 paper.
The study does not use standardized exams in any subject, and it does not assess outcomes across all core subjects.
The study does not track participants until graduation and cannot meet G because it also fails the Year Duration criterion.
The authors state the analyses were not preregistered.
Home math interventions often incorporate informational priming (explicit prompts emphasizing parental math input). While effective in increasing math talk, its impact on child outcome is mixed. This study examined how informational priming shapes the content and dynamic of math interactions. In...
Randomization was at the individual participant level rather than at the class (or school) level.
Outcomes were scored via teacher ratings and an internal AI judge, not via a standardized externally administered exam.
Outcomes were measured within short sessions rather than at least one academic term after the intervention began.
The control condition and sample are clearly documented, including what the control group could and could not do.
The study did not randomize at the school (site) level.
The study was proposed, designed, executed, and analyzed by the author team, not by an independent evaluator.
The study does not measure outcomes over a full academic year, and criterion T is not met.
Time-on-task is held constant across groups and the tested difference is tool access, not extra instructional time or budget.
No independent replication of this specific study was identified at the time of this ERCT check.
The study does not use standardized exams across all main subjects, and criterion E is not met.
The study does not track participants to graduation and criterion Y is not met.
The paper reports IRB approval but no public preregistration record was identified.
With today's wide adoption of LLM products like ChatGPT from OpenAI, humans and businesses engage and use LLMs on a daily basis. Like any other tool, it carries its own set of advantages and limitations. This study focuses on finding...
The study randomized individual students, not entire classes or schools.
Assessments used researcher-developed quizzes and checklists, not standardized exams.
The intervention period was 6 weeks, shorter than a full academic term.
The control group's demographics, baseline characteristics, and treatment (routine FC) are documented.
Randomization was at the student level within a single university, not at the school level.
The same authors appear to have designed the intervention, conducted the study, and analyzed the data, with no mention of independent conduct.
The study duration, including data collection, was 11 weeks, which is less than a full academic year.
The intervention group received gamified activities (extra quizzes, points, badges) that constitute additional resources and engagement time compared with the control group's routine FC; this difference was neither explicitly tested as the treatment variable nor balanced.
The paper does not mention any independent replication of this specific study by another research team.
The study measured only nursing skills competency and related factors, not performance across all core academic subjects. Also, Criterion E was not met.
The study tracked students for 11 weeks, not until graduation. Also, Criterion Y was not met.
The study was registered on ClinicalTrials.gov, and based on the semester timing the registration likely preceded the start of the intervention.
Background: Flipped learning excessively boosts the conceptual understanding of students through the reversed arrangement of pre-learning and in classroom learning events and challenges students to independently achieve learning objectives. Using a gamification method in flipped classrooms can help students stay...
Individual‑level randomization within one class violates the class‑level RCT requirement.
Assessments were custom course quizzes and a final, not a standardized exam.
Outcomes were collected after only five weeks, not a full term.
Demographics and baseline performance for the control group are fully reported.
Randomization did not occur at the school level.
The same team designed, implemented, and evaluated the study.
Study duration was five weeks, not a full academic year.
Control and treatment groups received equivalent emails and incentives; no extra resources favored treatment.
No independent replication of this RCT is reported.
Outcomes are limited to a single STEM course, not all subjects.
No tracking beyond the short 5‑week course was conducted.
No evidence of prospective trial registration is provided.
Time‑management skills are an essential component of college student success, especially in online classes. Through a randomized control trial of students in a for‑credit online course at a public 4‑year university, we test the efficacy of a scheduling intervention aimed...
Random assignment occurred at the individual student level, not by entire class or school.
The study employed custom 22‑item free‑response tests rather than a standardized exam.
Outcomes were measured over a seven‑day period, not a full term.
The negative control group’s composition and baseline data are clearly documented.
Randomization was at the individual student level, not by school.
The intervention was designed, delivered, and assessed by the same team without independent oversight.
The study tracked outcomes over one week, not a full academic year.
The extra evening sessions are central to the intervention and thus the unmatched control is acceptable.
No independent replication of this RCT is reported.
Only stereochemistry outcomes were measured, and E was not met.
Tracking ended after one week; no graduation‑level follow‑up.
No pre‑registration of the study protocol is mentioned.
The use of the flipped classroom approach in higher education STEM courses has rapidly increased over the past decade, and it appears this type of learning environment will play an important role in improving student success and retention in undergraduate...
The paper does not describe any randomization at the class level.
No standardized exam-based assessment is implemented in the paper.
No term-long outcome measurement is reported in the paper.
Control group demographics and baseline data are not provided.
No school-level random assignment is executed as part of this paper.
The study was conducted by the intervention's own authors.
No outcomes tracked over a full academic year are provided.
Treatment classes had extra teacher time; controls did not.
No independent replication of the interventions is reported.
Outcomes measured only in targeted subjects, not across all.
No long-term tracking through graduation is provided.
The RCT was preregistered on OSF prior to data collection.
The effect of a reduced pupil–teacher ratio has mainly been investigated as that of reduced class size. Hence we know little about alternative methods of reducing the pupil–teacher ratio. Deploying additional teachers in selected subjects may be a more flexible...
Randomisation was at the individual learner level, not classes.
Outcomes used a course-specific final test rather than a recognised standardised exam.
Outcomes were measured within a short-duration course without term-long follow-up.
The control condition is described, but baseline and full control-group characteristics are not documented for most participants.
Randomisation was not conducted at the school (institution) level.
The paper describes the authors implementing the intervention themselves rather than an independent evaluator.
Outcomes were not tracked for a full academic year after the intervention began.
Any added time from writing responses is the treatment itself and is described as minimal.
No independent replication of this specific RCT is identified.
Because E is not met, A is automatically not met.
Because Y is not met, G is automatically not met; no evidence of graduation tracking was found (see the sketch of this prerequisite logic after this entry).
The paper provides an OSF link for data and code, but it does not report a pre-registered protocol with a pre-data-collection date.
This study investigates the effectiveness of brief reflection interventions designed to support self-regulated learning in a short, Massive Open Online Course for in-service teachers. Two types of text-based reflection prompts were tested in a randomised controlled trial with over 5,000...
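Several of the checks above apply prerequisite relationships between criteria: the all-subject exams verdict presupposes the exam-based assessment verdict (E), and graduation tracking presupposes year-long duration (Y). The following is a minimal illustrative sketch of that dependency logic in Python; the criterion letters follow the shorthand used in these evaluations, but the data structure and function names are assumptions for illustration, not part of any published ERCT tooling.

# Illustrative sketch only: prerequisite handling for ERCT-style verdicts.
# Letters follow the shorthand used above (E: exam-based assessment,
# A: all-subject exams, Y: year duration, G: graduation tracking);
# the mapping below is an assumption for the example, not an official spec.
PREREQUISITES = {
    "A": ("E",),  # all-subject exams presuppose exam-based assessment
    "G": ("Y",),  # graduation tracking presupposes year-long duration
}

def resolve(verdicts):
    """Force a criterion to 'not met' whenever any of its prerequisites is unmet."""
    resolved = dict(verdicts)
    for criterion, prereqs in PREREQUISITES.items():
        if not all(resolved.get(p, False) for p in prereqs):
            resolved[criterion] = False
    return resolved

# Example: with E unmet, A is automatically unmet, mirroring the wording above.
print(resolve({"E": False, "A": True, "Y": False, "G": True}))
# -> {'E': False, 'A': False, 'Y': False, 'G': False}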
The study is observational and did not randomize at the class level.
The study measures course grades rather than using standardized exams.
There is no intervention with outcomes measured after one academic term.
The paper does not document a distinct control group.
No school-level randomization was performed.
The study was conducted by the authors without an independent evaluator.
There is no intervention tracked for a full academic year.
No attempt to balance class time or resources.
The study's findings have been independently replicated by others.
The study does not use all-subject standardized exams.
No graduation tracking is performed.
No pre-registered protocol is referenced.
We model how class size affects the grade higher education students earn and we test the model using an ordinal logit with and without fixed effects on over 760,000 undergraduate observations from a northeastern public university. We find that class...
Have a study you'd like to submit for ERCT evaluation? Found something that could be improved? If you're an author and need to update or correct information about your study, let us know.