Randomised Controlled Trials (RCTs) are considered the "gold standard" in educational research, but their implementation alone doesn't guarantee reliable or practical results. Many RCTs face challenges like unclear criteria, short-term focus, or limited applicability to real-world settings.
The Educational Randomised Controlled Trial (ERCT) Standard solves these issues by introducing 12 clear criteria, grouped into three levels, to ensure research is rigorous, transparent, and impactful in real-life educational contexts.
The ERCT Standard has 3 levels, each containing 4 criteria:
Level 1
Tests interventions at the classroom level to prevent cross-group contamination
Uses standardised exams for objective and comparable results
Ensures studies last at least one academic term to measure meaningful impacts
Requires detailed control group data for proper comparisons
Level 2
Expands testing to whole schools for real-world relevance
Assesses effects across all core subjects, avoiding imbalances
Ensures studies last a full academic year to capture sustained impacts
Ensures equal time and resources for both groups to isolate the intervention's impact
Level 3
Tracks students until graduation to evaluate long-term impacts
Requires independent replication of the results in a different context
Removes bias by using third-party evaluators
Increases transparency by publishing study plans before data collection
By following these criteria, researchers can conduct robust studies, and educators can confidently interpret research findings.
This standard guides high-quality educational RCTs and evaluates existing research.
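For teams that keep track of their ERCT evaluations in code, the sketch below shows one possible way to record per-criterion verdicts. It is not part of the standard; the class and field names (Verdict, CriterionEvaluation, and so on) are illustrative assumptions rather than an official schema.

```python
from dataclasses import dataclass, field
from enum import Enum

class Verdict(Enum):
    MET = "met"
    NOT_MET = "not met"

@dataclass
class CriterionEvaluation:
    """One of the 12 ERCT criteria (3 levels x 4 criteria), with its evidence and verdict."""
    level: int                      # 1, 2 or 3
    criterion: str                  # short name for the criterion (illustrative)
    supporting_quotes: list[str] = field(default_factory=list)  # verbatim quotes from the paper
    verdict: Verdict | None = None  # left empty until the evidence has been reviewed
    explanation: str = ""           # why the quotes do or do not satisfy the criterion

# Example: recording a verdict for the term-length criterion.
duration = CriterionEvaluation(
    level=1,
    criterion="Intervention lasts at least one academic term",
    supporting_quotes=["The program ran from September to December…"],
    verdict=Verdict.MET,
    explanation="The quoted dates span a full autumn term.",
)
```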
Many studies randomise individual students, so treated and control students share the same classroom. When the two groups mix every day, the intervention can spill over to the control students and contaminate the comparison. Randomising whole classes keeps the groups apart and isolates the intervention's effect.
Locate quotes describing the randomisation procedure at the class level (e.g., “Twelve classes were randomly assigned to either the intervention or control condition…”).
If you find quotes showing that randomisation was done at the individual-student level only, this criterion is not met.
Mark as “met” if a quote confirms randomisation at the class level or above. Mark as “not met” if no quote indicates class-level assignment.
Researchers often create a custom test specifically designed to measure the outcomes of their intervention. This can lead to bias, as the test may be overly aligned with the intervention, inflating its apparent effectiveness. Standardised exams provide a more objective and comparable measure of educational outcomes.
Locate any quotes from the paper describing the test or examination used to measure outcomes. For example: “We used the national standardised exam in mathematics…” or “We developed a new test for the purpose of this study…”
If the exam name or description indicates it is a widely recognised standardised test (e.g., “state-wide standardised achievement test,” “national curriculum exam”), it meets the criterion. Quote the part that confirms its standardization.
Mark as “met” if you can quote evidence that a recognised standardised exam was used. Mark as “not met” if the quotes show only a custom, study-specific test or give no clear description of the assessment.
Many studies conduct a brief, two-week intervention and immediately measure outcomes. Short-term interventions may show temporary effects that don't persist, or miss delayed effects that take time to manifest. Ensuring at least a term-long intervention allows for more reliable assessment of the intervention's impact.
Identify quotes from the paper specifying the start and end dates or the duration of the intervention (e.g., “The program ran from September to December…”).
Check that the quoted dates or stated duration cover at least one full academic term. If no quote clearly states how long the intervention lasted, this is a failure.
Mark as “met” if the quoted duration is at least one full term. Mark as “not met” if the quoted duration is shorter than a term or not clearly stated.
Many studies mention having a control group but provide no details about its composition or treatment. Without proper documentation, it's impossible to assess whether the control group was truly comparable or whether it received any unintended interventions. Detailed documentation of the control group allows for proper comparison and interpretation of results.
Find quotes from the methods section describing the control group’s demographics, baseline performance, or any conditions placed on them. For example: “The control group received standard instruction, and included 30 students with similar demographic backgrounds…”
Check if these quotes detail who the control group is, their baseline characteristics, and confirm that no special treatment was given beyond normal schooling. If no such descriptive quote is found, this is a failure.
Mark as “met” if you can quote clear documentation of the control group’s characteristics. Mark as “not met” if no adequate quote describing the control group is provided.
A class-level RCT shows positive results, but when implemented school-wide, the effects disappear. Class-level randomisation might not account for school-level factors that influence the intervention's effectiveness. School-level randomisation captures a more realistic implementation scenario, accounts for school-wide factors, and comes closest to real-life implementation.
Locate quotes describing the randomisation procedure at the school level (e.g., “Twenty schools were randomly assigned to either the intervention or control condition…”).
If you find quotes that randomisation was at class or student level only, this criterion is not met.
Mark as “met” if a quote confirms school-level randomisation. Mark as “not met” if no quote indicates school-level assignment.
For example, a maths intervention may show great improvement in maths scores while researchers don't measure performance in other subjects. The intervention might be improving maths at the expense of other subjects, leading to an imbalanced education. Measuring all subjects ensures the intervention doesn't have unintended negative consequences in non-target areas.
For highly specialised interventions in upper secondary or vocational education, measuring impact on directly related subjects might be sufficient if the rationale is clearly explained.
Locate quotes from the paper listing the subjects tested. For example: “We assessed student performance in math, science, and language arts at the end of the year…”
Verify from the quotes that all main subjects taught at that educational level were assessed. If unsure what the main subjects are, refer to the paper's curriculum description or the standard subjects in that context. Make sure the assessments are standardised exams rather than custom tests.
If the paper states a clear rationale for a specialized intervention (e.g., vocational training focused solely on welding certification) and justifies measuring only related outcomes, quote that explanation and consider this acceptable.
Mark as “met” if quoted evidence shows all main subjects (or justified exception) were assessed. Mark as “not met” if quoted evidence shows only one or a limited set of subjects without justification.
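To make the subject-coverage check concrete, the short sketch below compares the assessed subjects against an assumed core curriculum. The subject names and the justified_exception flag are placeholders, not part of the standard.

```python
# Assumed core curriculum for the level under review; replace with the actual subjects.
core_subjects = {"mathematics", "science", "language arts"}

# Subjects the paper reports assessing with standardised exams (from the quoted outcomes).
assessed_subjects = {"mathematics", "science", "language arts"}

missing = core_subjects - assessed_subjects
justified_exception = False  # set True only if the paper justifies a narrower, specialised focus

criterion_met = (not missing) or justified_exception
print(missing, criterion_met)  # set() True
```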
A term-long intervention shows promising results, but these gains fade by the end of the school year. Some educational interventions may have short-term effects that don't persist long-term. A year-long study is a reasonable practical compromise: it is long enough to give good confidence in the intervention's results while remaining practical, since schools are typically organised around academic years.
Identify quotes specifying the intervention period. For example: “The intervention was implemented from September 2020 to June 2021.”
Verify from the quotes that it covers an entire academic year (generally ~9-10 months).
Mark as “met” if the quoted duration spans a full academic year. Mark as “not met” if quotes indicate a shorter duration.
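Both duration criteria (term-length at Level 1, year-length at Level 2) come down to comparing the quoted start and end dates against a minimum length. The sketch below is one way to do that; the day thresholds are assumptions and should follow the local school calendar.

```python
from datetime import date

def covers_minimum_duration(start: date, end: date, minimum_days: int) -> bool:
    """True if the quoted intervention window is at least `minimum_days` long."""
    return (end - start).days >= minimum_days

# Hypothetical dates matching the example quote "from September 2020 to June 2021".
start, end = date(2020, 9, 1), date(2021, 6, 30)

ACADEMIC_TERM_DAYS = 70    # assumed ~10 weeks
ACADEMIC_YEAR_DAYS = 270   # assumed ~9 months, in line with the "generally ~9-10 months" note above

print(covers_minimum_duration(start, end, ACADEMIC_TERM_DAYS))   # True
print(covers_minimum_duration(start, end, ACADEMIC_YEAR_DAYS))   # True
```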
An intervention that provides extra tutoring time (or extra budget) shows positive results, but the control group received no additional educational time (or money). It's unclear whether the positive results are due to the specific intervention or simply the additional time or money spent on education. Ensuring the control group receives balanced time and resources isolates the effect of the specific intervention.
When an intervention is designed to test the impact of additional resources (such as extra tutoring time or rewards) on outcomes, the control group typically receives the standard 'business as usual' level. In this case, the absence of extra resources in the control group is by design and does not indicate an imbalance.
Assess whether the study is explicitly testing the impact of additional resources. If so, the control group should receive the baseline “business as usual” input.
Find quotes describing the intervention in the test group. Examples: “Students in the intervention group received an additional hour of tutoring each day.” “Teachers in the intervention group were provided with new tablets and training sessions.”
Based on the quotes, decide whether these interventions required extra budget, time, or resources compared to standard instruction. If uncertain, look for additional quotes clarifying the nature of the intervention. Include a detailed description of the additional resources in your explanation. If the extra resources are the treatment variable, then the control group should be documented as receiving the standard input.
If the quotes show no extra resources (e.g., “The intervention involved a new teaching method but no additional class time or materials”), mark as “met” without further checking.
Locate quotes that describe what the control group received. For example: “Control schools also received additional professional development time equivalent to the intervention group’s training hours.” Verify that the quoted resources/time for the control group matches or balances out the intervention group’s extra input.
Mark as “met” if the evaluation confirms a balanced allocation—either by matching extra resources or, if the extra resource is the treatment variable, by ensuring all groups receive the same core inputs. Mark as “not met” if no quotes indicate any effort to balance or if baseline inputs differ.
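The decision logic above can be summarised in a small helper. The boolean flags are assumptions about how an evaluator might encode their reading of the paper; they are illustrative, not part of the standard.

```python
def resources_balanced(treatment_has_extra_resources: bool,
                       extra_resources_are_the_treatment_variable: bool,
                       control_received_equivalent_resources: bool,
                       control_documented_as_business_as_usual: bool) -> bool:
    """Mirror the balanced-resources decision described above."""
    if not treatment_has_extra_resources:
        # A new teaching method with no extra time or materials: nothing to balance.
        return True
    if extra_resources_are_the_treatment_variable:
        # The extra input is itself the intervention; the control group should simply be
        # documented as receiving the standard "business as usual" inputs.
        return control_documented_as_business_as_usual
    # Otherwise the extra time or budget is incidental and must be matched in the control group.
    return control_received_equivalent_resources
```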
Interventions may show short-term benefits, but researchers often neglect to follow up on long-term outcomes. Tracking until graduation offers insight into the lasting impact on students' educational journeys without needing to track them after leaving school.
Locate quotes describing the follow-up duration. For example: “Students were tracked through to the end of their primary education, until Grade 6 graduation.”
Confirm from the quotes that the study did not stop measurement immediately after the intervention ended, but continued until the students graduated from that educational stage.
Mark as “met” if quoted evidence shows tracking continued through graduation. Mark as “not met” if quoted evidence shows tracking stopped earlier or no mention of graduation tracking is found.
A highly publicised educational intervention fails to show the same positive results when implemented in different schools or contexts. Single studies may have results influenced by specific contexts, leading to non-generalisable findings. There have been numerous cases in educational research where initial studies were promising, but replication efforts revealed little to no effect. Reproduction in different contexts ensures the intervention's effects are robust and generalisable.
Find quotes where the authors mention a previous or separate study that replicated their intervention and results. For example: “A subsequent study by Smith et al. (2022) implemented the same intervention in a different district and found similar effects.”
Confirm from the quotes that the replication was done by a different team or institution, not the same authors.
Mark as “met” if quoted references show independent replication in a different context. Mark as “not met” if no quotes mention replication or if the replication was by the same research team only.
When the researchers or authors of an intervention conduct the study themselves, there is a risk of biased reporting or analysis. For example, the authors might subconsciously or consciously influence data collection or interpretation to favour their intervention.
Look for quotes in the acknowledgments, methods, or author contribution sections. For example: “Data collection and analysis were conducted by an external evaluation team with no involvement in the intervention’s design.”
If the quotes show that the same authors developed the intervention and also carried out the study, this criterion fails unless there is a statement of third-party oversight.
Mark as “met” if quoted evidence confirms independence (e.g., an external evaluation agency). Mark as “not met” if quotes indicate the same team designed and tested the intervention without independent oversight.
Researchers often analyse their data in multiple ways and only report the analyses that show significant positive results. This p-hacking or selective reporting can lead to false positive results and an inflated sense of the intervention's effectiveness. Pre-registration of hypotheses and analysis plans prevents selective reporting and increases transparency in research.
Find quotes mentioning a registry platform (e.g., “The study was pre-registered on ClinicalTrials.gov (ID…) before data collection began.”).
Check quotes for a date of pre-registration and ensure it was before data collection started (e.g., “Pre-registration occurred in June 2020, data collection began in September 2020.”).
Mark as “met” if quoted evidence confirms a pre-registration reference and timing. Mark as “not met” if no quotes referencing pre-registration are found or if the quoted timing indicates registration occurred after data collection.
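The timing check reduces to a simple date comparison. The sketch below uses hypothetical dates taken from the illustrative quotes above, not a real registration.

```python
from datetime import date

# From the illustrative quotes: pre-registration in June 2020, data collection from September 2020.
preregistration_reference_found = True
preregistration_date = date(2020, 6, 1)
data_collection_start = date(2020, 9, 1)

criterion_met = preregistration_reference_found and preregistration_date < data_collection_start
print(criterion_met)  # True
```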
Have a study you'd like to submit for ERCT evaluation? Found something that could be improved? If you're an author and need to update or correct information about your study, let us know.
Share your research with us for review.
Provide feedback to help us make things better.
If you're the author, let us know about necessary updates or corrections.