Level 1 Criteria
-
C Class-level RCT
- The study uses a randomized cross-over design with assignment at the student level (keeping peer-instruction groups together), which is acceptable under the ERCT exception for personal tutoring interventions.
- "Students were randomly assigned to two groups, respecting the constraint that students who regularly worked together in class during peer instruction were placed in the same group..." (p. 7)
- Relevant Quotes: 1) "Students were randomly assigned to two groups, respecting the constraint that students who regularly worked together in class during peer instruction were placed in the same group in order to maximize the effectiveness of their in-class learning." (p. 7) 2) "The structure of the experimental condition differed from the control condition in that all interactions and feedback were with an AI tutor, rather than with peer-instruction followed by instructor feedback." (p. 8) 3) "Working with an expert personal tutor is generally regarded as the most efficient form of education... What if an AI tutor could mimic the learning experience one would get from an expert (human) tutor?" (p. 2) Detailed Analysis: The study randomizes students (grouped by their small peer instruction clusters) to either the AI condition or the in-class condition. While this is technically not a "Class-level" randomization (as it occurs within a single course), the ERCT standard provides an exception for interventions designed for personal teaching or tutoring. The paper explicitly frames the intervention as "AI tutoring" intended to mimic "one-on-one tutoring." Therefore, the student-level (or small group-level) randomization is acceptable under the exception. Final sentence: The criterion is met because the intervention is a personal tutoring tool, allowing for the student-level randomization exception.
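Illustrative sketch of the constrained randomization described above: students are split into two groups at random, with every peer-instruction cluster kept intact. The paper does not publish its assignment procedure, so the function name `assign_clusters`, the roster layout, the seed, and the greedy size-balancing rule are all assumptions made for illustration only.

```python
import random
from collections import defaultdict


def assign_clusters(student_to_cluster, seed=0):
    """Randomly split peer clusters into two groups, keeping each cluster whole.

    Hypothetical helper: `student_to_cluster` maps a student id to the id of
    the peer-instruction cluster that student regularly works with.
    """
    rng = random.Random(seed)

    # Collect the members of each cluster.
    clusters = defaultdict(list)
    for student, cluster in student_to_cluster.items():
        clusters[cluster].append(student)

    # Shuffle the clusters (not the individual students) so that the unit of
    # randomization is the cluster.
    cluster_ids = sorted(clusters)
    rng.shuffle(cluster_ids)

    # Assumed balancing rule: place each cluster in the currently smaller group.
    groups = {"A": [], "B": []}
    for cluster_id in cluster_ids:
        target = "A" if len(groups["A"]) <= len(groups["B"]) else "B"
        groups[target].extend(clusters[cluster_id])
    return groups


# Made-up roster: six students in four peer clusters.
roster = {"s1": "g1", "s2": "g1", "s3": "g2", "s4": "g3", "s5": "g3", "s6": "g4"}
groups = assign_clusters(roster, seed=1)
```

In a cross-over design, one group would receive the AI tutor condition for the first lesson and the in-class condition for the second, with the other group following the reverse order.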
-
E Exam-based Assessment
- Outcomes were measured using custom pre- and post-tests designed for the specific lessons, not standardized exam-based assessments.
- "Following each lesson, students completed post-tests to measure content mastery..." (p. 2)
- Relevant Quotes: 1) "To establish baseline knowledge, students from both groups completed a pre-test prior to each lesson... Following each lesson, students completed post-tests to measure content mastery" (p. 2) 2) "To prevent the specific test questions from influencing the teaching or AI tutor design, the tests were constructed by a separate team member... tests were written based on the learning goals for the lesson" (p. 8) Detailed Analysis: The study uses custom-created quizzes (pre- and post-tests) that are specific to the two lessons (surface tension and fluid flow). While the authors used the Force Concept Inventory (FCI) for baseline characterization, the FCI was not the outcome measure for the intervention. The standard requires widely recognized, standardized exams to measure the educational outcome. Custom lesson-aligned tests do not meet this requirement. Final sentence: The criterion is not met because the study relies on custom-designed post-tests rather than standardized exams to measure learning outcomes.
-
T Term Duration
- Outcomes were measured immediately following two single-lesson interventions, falling far short of the one-term duration requirement.
- "The study took place during one of the two meeting of the class during the ninth and tenth weeks of the course." (p. 7)
- Relevant Quotes: 1) "The study took place during one of the two meeting of the class during the ninth and tenth weeks of the course." (p. 7) 2) "Following each lesson, students completed post-tests..." (p. 2) Detailed Analysis: The intervention consisted of two specific lessons occurring in consecutive weeks. The measurement (post-test) occurred immediately after each lesson. The ERCT standard requires that outcomes be measured at least one full academic term after the intervention begins. Here, the measurement was immediate, and the total duration of the study interaction was only two weeks. Final sentence: The criterion is not met because the interval between the start of the intervention and the measurement of outcomes was less than one academic term.
-
D Documented Control Group
- The control group (in-class active learning) is well documented, including its pedagogy, student demographics, and baseline knowledge.
- "All in-class lessons employed research-based best practices for in-class active learning." (p. 7)
- Relevant Quotes: 1) "All in-class lessons employed research-based best practices for in-class active learning... First the instructor introduces an activity, then students work through the activity in self-selected groups..." (p. 7) 2) "The demographics of the two groups were comparable (see table S2A), as were previous measures of their physics background knowledge" (p. 7) Detailed Analysis: The paper provides a detailed description of the control condition, which is the standard "in-class active learning" format for the course. It describes the structure (intro, group work, feedback), the qualifications of the instructors, and the baseline characteristics of the students in that group (via FCI and CLASS scores). This satisfies the requirement for a documented control group. Final sentence: The criterion is met as the paper clearly documents the control group's composition, baseline characteristics, and instructional conditions.