Abstract
Tutoring programs for low-performing students, delivered in-person or online, effectively enhance school performance, yet their medium- and longer-term impacts on labor market outcomes remain less understood. To address this gap, we conduct a randomized controlled trial with 839 secondary school students in Germany to examine the effects of an online tutoring program for low-performing students on academic performance and school-to-work transitions. The online tutoring program had a non-significant intention-to-treat effect of 0.06 standard deviations on math grades six months after program start. However, among students who had not received other tutoring services prior to the intervention, the program significantly improved math grades by 0.14 standard deviations. Moreover, students in non-academic school tracks experienced smoother school-to-work transitions, with vocational training take-up 18 months later being 5 percentage points higher, an effect that was even larger (12 percentage points) among those without prior tutoring. Overall, the results indicate that tutoring can generate lasting benefits for low-performing students that extend beyond school performance.
ERCT Criteria Breakdown
-
Level 1 Criteria
-
C
Class-level RCT
- Student-level random assignment is clearly described.
- The students were then pair-wise randomly assigned to either a treatment or a control group.
Relevant Quotes:
1) "To analyze the effectiveness of the Lern-Fair online tutoring program, we conducted a randomized field experiment." (p. 14)
2) "The students were then pair-wise randomly assigned to either a treatment or a control group." (p. 14)
Detailed Analysis:
The paper describes a randomized field experiment in which eligible students enter a randomization process and are assigned to treatment (online tutoring invitation) or control. This is an individual-level randomization within an education context, which matches the intent of criterion C (testing below the school level to reduce contamination and allow a clean comparison).
Final: Criterion C is met because students are randomly assigned to treatment or control.
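As a rough illustration of what pair-wise random assignment can look like, the sketch below splits each consecutive pair of students at random into one treated and one control student. The function name `pairwise_assign`, the seed, and the idea that the input list is already ordered into pairs are assumptions made for this example; the paper's quote does not say how pairs were formed or how an unpaired student was handled.

```python
import random

def pairwise_assign(student_ids, seed=0):
    """Assign one student per consecutive pair to treatment, the other to control.

    Hypothetical sketch: assumes neighbours in `student_ids` already form a pair.
    """
    rng = random.Random(seed)
    assignment = {}
    for i in range(0, len(student_ids) - 1, 2):
        pair = [student_ids[i], student_ids[i + 1]]
        rng.shuffle(pair)  # random order within the pair
        assignment[pair[0]] = "treatment"
        assignment[pair[1]] = "control"
    # Assumption: with an odd count, the unpaired last student gets a coin flip.
    if len(student_ids) % 2 == 1:
        assignment[student_ids[-1]] = rng.choice(["treatment", "control"])
    return assignment

# Usage: 839 students, as in the study's sample size.
groups = pairwise_assign(list(range(839)), seed=42)
n_treated = sum(1 for g in groups.values() if g == "treatment")
print(len(groups), n_treated)
```

Randomizing within pairs keeps the two arms the same size up to one student, which is one reason this kind of design is often used with moderate samples.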
-
E
Exam-based Assessment
- Outcomes rely on self-reported grades rather than standardized exams.
- All grades are self-reported by the young participants and relate to the final grades from school years 2021/2022 and 2022/2023, respectively.
Relevant Quotes:
1) "All grades are self-reported by the young participants and relate to the final grades from school years 2021/2022 and 2022/2023, respectively." (p. 25)
Detailed Analysis:
The main academic outcome is based on students' reported school grades, not standardized test or external exam scores. The ERCT criterion E requires exam-based (standardized) assessment so outcomes are objective and comparable across students and settings. Self-reported grades do not satisfy that requirement.
Final: Criterion E is not met because outcomes are based on self-reported grades, not exams.
-
T
Term Duration
- Follow-up measurement occurs after a substantial interval post-intervention start.
- The first survey (follow-up I) took place in October and November 2022, during the new school year, after the treatment group had been invited to participate in Lern-Fair.
Relevant Quotes:
1) "In response to our invitation to participate in the online tutoring program, 30% of the students did in fact take part, usually attending weekly sessions for more than three months." (p. 6)
2) "The first survey (follow-up I) took place in October and November 2022, during the new school year, after the treatment group had been invited to participate in Lern-Fair." (p. 17)
Detailed Analysis:
The intervention is an online tutoring offer that (for participants) typically runs for more than three months with weekly sessions. Outcome measurement is not immediate: the first follow-up survey occurs in October-November 2022 after students were invited to participate in the program, and the intervention period is described as spanning multiple months. This satisfies the "at least one term after intervention begins" requirement in the ERCT specification.
Final: Criterion T is met because outcomes are measured after a substantial post-start interval.
-
D
Documented Control Group
- Treatment and control conditions are clearly described and baseline balance is shown.
- Students assigned to the treatment group received an invitation to participate in the Lern-Fair online tutoring program.
Relevant Quotes:
1) "Students assigned to the treatment group received an invitation to participate in the Lern-Fair online tutoring program." (p. 14)
2) "Table 1 indicates that student characteristics were balanced between the treatment and the control group at baseline." (p. 17)
Detailed Analysis:
The paper clearly defines the treatment condition as receiving an invitation to join the Lern-Fair online tutoring program. It also reports baseline balance between treatment and control groups (Table 1), indicating that the control group is well-characterized and comparable at baseline. This level of documentation is sufficient for criterion D, which requires a clearly described and analyzable control condition.
Final: Criterion D is met because the control condition is clearly defined and baseline balance is documented.
-
Level 2 Criteria
-
S
School-level RCT
- Randomization is not conducted at the school level.
- The students were then pair-wise randomly assigned to either a treatment or a control group.
Relevant Quotes:
1) "The students were then pair-wise randomly assigned to either a treatment or a control group." (p. 14)
Detailed Analysis:
Criterion S requires randomization at the school level. Here, the randomization occurs at the student level (pair-wise assignment of students). While students and schools are geographically dispersed, the unit of randomization is not the school, so criterion S is not satisfied.
Final: Criterion S is not met because randomization is not at the school level.
-
I
Independent Conduct
- Key outcomes are self-reported, not independently assessed.
- All grades are self-reported by the young participants and relate to the final grades from school years 2021/2022 and 2022/2023, respectively.
Relevant Quotes:
1) "All grades are self-reported by the young participants and relate to the final grades from school years 2021/2022 and 2022/2023, respectively." (p. 25)
Detailed Analysis:
Criterion I requires outcomes to be assessed by third-party or independent evaluators to reduce bias. The key outcome measures rely on student self-reports of grades (and survey-based outcomes more generally). Self-reported outcome measurement does not meet the intent of independent evaluation.
Final: Criterion I is not met because key outcomes are self-reported rather than independently assessed.
-
Y
Year Duration
- Outcomes span more than one academic year, including a second follow-up in late 2023.
- The second survey (follow-up II) took place in November and December 2023.
Relevant Quotes:
1) "The second survey (follow-up II) took place in November and December 2023." (p. 17)
2) "All grades are self-reported by the young participants and relate to the final grades from school years 2021/2022 and 2022/2023, respectively." (p. 25)
Detailed Analysis:
Criterion Y requires outcomes covering at least one academic year. The paper uses final grades from two school years (2021/2022 and 2022/2023) and conducts a second follow-up in November-December 2023. Together, these details show that measurement spans more than a single academic year.
Final: Criterion Y is met because outcomes cover multiple school years and include an 18-month follow-up.
-
B
Balanced Control Group
- Extra instructional time is the intervention itself and is explicitly tested.
- In response to our invitation to participate in the online tutoring program, 30% of the students did in fact take part, usually attending weekly sessions for more than three months.
Relevant Quotes:
1) "In response to our invitation to participate in the online tutoring program, 30% of the students did in fact take part, usually attending weekly sessions for more than three months." (p. 6)
2) "Students assigned to the treatment group received an invitation to participate in the Lern-Fair online tutoring program." (p. 14)
Detailed Analysis:
The treatment is explicitly an online tutoring program, which by definition adds instructional time and support. The additional tutoring time is not an incidental add-on; it is the intervention being tested ("online tutoring"). Under the updated ERCT criterion B rules, when the extra time/resources are integral to the treatment variable, the control group can remain business-as-usual, and the criterion can still be met, provided this intent is clear. Here, the intent is to estimate the effect of offering/receiving tutoring, so the resource difference is the treatment itself.
Final: Criterion B is met because the extra instructional time is the core treatment being tested.
-
Level 3 Criteria
-
R
Reproduced
- No independent replications were found in available sources.
Relevant Quotes:
No independent replication studies were found in the sources checked.
Detailed Analysis:
Criterion R requires independent reproduction by other authors. I searched for later or parallel studies explicitly reproducing this specific RCT and its results (using the exact title, author list, and the Lern-Fair program name) across major working-paper listings and indexes where this study is hosted (EdWorkingPapers, IZA, RFBerlin, and RePEc/IDEAS). These sources only list the original study by the same author team and do not surface independent replication papers.
Final: Criterion R is not met because no independent replications were found.
-
A
All-subject Exams
- Not met because exam-based assessment is not used and outcomes do not cover all core subjects.
- To measure school performance, we preregistered two main outcomes: grades in math and grade retention.
Relevant Quotes:
1) "To measure school performance, we preregistered two main outcomes: grades in math and grade retention." (p. 25)
Detailed Analysis:
Criterion A requires exam-based assessment (criterion E) and effects measured across all core subjects/exams. Because criterion E is not met, criterion A is automatically not met under the ERCT rules. In addition, the preregistered school performance outcomes focus on math grades and grade retention, not a full set of core subject exam outcomes.
Final: Criterion A is not met because criterion E is not met and outcomes are not across all core exams/subjects.
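The dependency described above (A cannot be met unless E is met) can be written as a simple conjunction. This is only a sketch of the rule as stated in this breakdown; the function and parameter names are hypothetical and do not come from the ERCT specification.

```python
def criterion_a_met(e_met: bool, covers_all_core_subjects: bool) -> bool:
    """A is met only if E is met AND outcomes span all core subjects."""
    return e_met and covers_all_core_subjects

# Values for this study as assessed above: E is not met, and the
# preregistered school outcomes are math grades and grade retention only.
print(criterion_a_met(e_met=False, covers_all_core_subjects=False))  # False
```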
-
G
Graduation Tracking
- No evidence of tracking outcomes until graduation for the full cohort was found.
- At this point, students, particularly those on the non-academic school track (which finishes after grade 9 or 10), face crucial decisions regarding their educational and career paths.
Relevant Quotes:
1) "At this point, students, particularly those on the non-academic school track (which finishes after grade 9 or 10), face crucial decisions regarding their educational and career paths." (p. 6)
2) "The second survey (follow-up II) took place in November and December 2023." (p. 17)
Detailed Analysis:
Criterion G requires tracking students until graduation. The study's longer follow-up is about 18 months after program start and focuses on school-to-work transitions during a key decision period, especially for non-academic tracks that can end after grade 9 or 10. The paper does not report tracking the full cohort to graduation outcomes, and I did not find subsequent publications by the same author team that extend follow-up to graduation for this RCT in the sources checked.
Final: Criterion G is not met because there is no evidence of tracking the cohort until graduation.
-
P
Pre-Registered
- The study reports preregistration in the AEA RCT Registry and analyzes outcomes as registered.
- The experiment was preregistered in the AEA RCT registry, AEARCTR-0008937.
Relevant Quotes:
1) "The experiment was preregistered in the AEA RCT registry, AEARCTR-0008937." (p. 3)
2) "All outcomes are analyzed as registered." (p. 83)
Detailed Analysis:
The paper explicitly states that the experiment was preregistered in the AEA RCT Registry (with an AEARCTR identifier), and the appendix discusses deviations from a preregistered analysis plan while stating that all outcomes are analyzed as registered. The AEA RCT Registry pages themselves were not accessible for an independent check of the registration timestamp, so this assessment relies on the paper's explicit preregistration statement and its internal consistency with a preregistered plan.
Final: Criterion P is met because the study reports preregistration in the AEA RCT Registry and analyzes outcomes as registered.