Abstract
Brief pre-class physical exercise may enhance executive function (EF) processes proximal to mathematics learning. However, the effect is likely dependent on the alignment between the physical exercise and subsequent instructional demands. We tested a dual-process framework in authentic classrooms, predicting selective gains in mathematics-specific inhibitory control rather than broad executive function. Using a cluster-randomized trial, four Grade-5 classes (N = 182 students) were assigned to mentally passive sedentary behavior (e.g., video watching), mentally active sedentary behavior (e.g., numeracy practice), 5-minute physical exercise, or 10-minute physical exercise. EF outcomes were assessed at baseline, immediately post-intervention, and after the mathematics instruction using a domain-general flanker task and a mathematics-specific negative priming task. Both physical exercise groups outperformed mentally passive sedentary behavior on the inverse efficiency score of the negative priming task at post-instruction, with no differences between 5- and 10-minute physical exercise, and no consistent advantages on the flanker interference performance. These results support a task–context alignment mechanism in which brief physical exercise may prime motivational/alerting states that translate into improved inhibition when followed by domain-relevant instruction. Short, pre-class physical exercise routines can be integrated without displacing teaching time and may enhance readiness for mathematics learning by selectively improving mathematics-specific inhibitory control. We discuss implementation parameters for classroom use and implications for theory on how physical exercise influences psychological processes central to learning.
ERCT Criteria Breakdown
-
Level 1 Criteria
-
C
Class-level RCT
- Randomization was conducted at the class level using intact classes, satisfying Criterion C.
- "Randomization was conducted at the class level to preserve natural classroom structures." (p. 5)
Relevant Quotes:
1) "Four of twelve fifth-grade classes were randomly assigned to the four experimental conditions using a computer-generated randomization procedure in Microsoft Excel 2019 (Microsoft Corporation, Redmond, Washington, USA)." (p. 5)
2) "Randomization was conducted at the class level to preserve natural classroom structures." (p. 5)
3) "This study employed a cluster-randomized controlled trial design." (p. 8)
4) "Randomization occurred at the class level (k = 4 clusters; one class per experimental condition)." (p. 10)
Detailed Analysis:
Criterion C requires that the unit of randomization is the class (or stronger, e.g., school), unless a tutoring-style exception applies. The paper repeatedly and explicitly states that intact Grade-5 classes were randomized, and it reports k = 4 clusters with one class per condition. This is unambiguous class-level randomization.
Final Summary:
Criterion C is met because intact classes were randomized at the class level using a computer-generated procedure.
-
E
Exam-based Assessment
- Outcomes were measured with researcher-run EF tasks rather than a standardized exam-based educational assessment.
- "EF outcomes were assessed at baseline, immediately post-intervention, and after the mathematics instruction using a domain-general flanker task and a mathematics-specific negative priming task." (p. 1)
Relevant Quotes:
1) "EF outcomes were assessed at baseline, immediately post-intervention, and after the mathematics instruction using a domain-general flanker task and a mathematics-specific negative priming task." (p. 1)
2) "Flanker Task: Domain-General Inhibition The task was adapted from Simmering et al. (2022) and presented in PsychoPy (v2021.1; Peirce, 2007, 2009; Peirce et al., 2019)." (p. 6)
3) "Negative Priming Task: Domain-Specific Inhibition The negative priming (NP) task was adapted from the fraction comparison task (Jiang et al., 2020)." (p. 7)
Detailed Analysis:
Criterion E requires that educational outcomes are assessed with standardized exam-based tests (not researcher-run experimental/cognitive paradigms). This paper measures executive function using a Flanker task and a negative priming task, and it explicitly describes both tasks as adapted from prior work. These are not standardized exams of academic achievement.
Final Summary:
Criterion E is not met because outcomes are measured with adapted cognitive tasks rather than standardized exams.
-
T
Term Duration
- Outcomes were measured within the same morning rather than at least one academic term after intervention start.
- "Each group completed cognitive assessments at three time points: (1) pre-intervention, (2) post-intervention (i.e., just before class), and (3) after an instruction (i.e., a mathematics lesson)..." (p. 8)
Relevant Quotes:
1) "The experiment was conducted between May 9, 2025 and June 5, 2025..." (p. 5)
2) "Each group completed cognitive assessments at three time points: (1) pre-intervention, (2) post-intervention (i.e., just before class), and (3) after an instruction (i.e., a mathematics lesson)..." (p. 8)
3) "All participants were instructed to arrive at the classroom at 08:00 a.m. The session began at 08:10 a.m. with a 6-minute cognitive test..." (p. 8)
4) "Between 08:20 a.m. and 08:30 a.m., participants engaged in the assigned intervention activity." (p. 8)
5) "At 08:40 a.m., all participants attended a standardized 40-minute mathematics lesson..." (p. 9)
6) "At 09:20 a.m., immediately after the 40-minute mathematics lesson, participants completed the final cognitive test." (p. 9)
Detailed Analysis:
Criterion T requires that outcomes be measured at least one academic term (approximately 3 to 4 months) after the intervention begins. In this paper, the intervention is an acute pre-class activity (5 or 10 minutes) and the outcomes are measured pre-intervention, immediately post-intervention, and after a single 40-minute mathematics lesson, all within the same morning schedule.
Although the broader data-collection period spans May 9 to June 5, 2025, the outcome timing is still immediate/short-term rather than term-long follow-up.
Final Summary:
Criterion T is not met because outcomes are assessed within the same day, not after at least one academic term.
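The timing argument can be checked with simple arithmetic. The sketch below takes the clock times quoted above (intervention start at 08:20, final test at 09:20) and compares the elapsed interval with a one-term threshold. The 90-day threshold is an assumed stand-in for "approximately 3 to 4 months", the calendar date is arbitrary, and all variable names are illustrative rather than part of the ERCT standard.

```python
from datetime import datetime, timedelta

# Clock times quoted from the paper (pp. 8-9). The calendar date is arbitrary
# (first day of the reported study period), since all assessments fall within
# the same morning session.
intervention_start = datetime(2025, 5, 9, 8, 20)
final_test = datetime(2025, 5, 9, 9, 20)

# Assumption: one academic term is approximated as 90 days, the lower end of
# "approximately 3 to 4 months". This value is illustrative only.
TERM_THRESHOLD = timedelta(days=90)

# Time between the start of the intervention and the last outcome measurement.
elapsed = final_test - intervention_start
meets_term_duration = elapsed >= TERM_THRESHOLD

print(f"Elapsed: {elapsed}; meets term threshold: {meets_term_duration}")
```

Under these assumptions the gap between intervention start and the last measurement is one hour, which is far below any reasonable term-length threshold.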
-
D
Documented Control Group
- The comparison conditions and baseline assessment are described, and group/class demographics are reported in an appendix.
- "These four intact classes were then randomly assigned in a 1:1:1:1 ratio to one of the four intervention conditions..." (p. 8)
Relevant Quotes:
1) "These four intact classes were then randomly assigned in a 1:1:1:1 ratio to one of the four intervention conditions: (1) mentally passive sedentary behavior (i.e., watching a video), (2) mentally active sedentary behavior (i.e., doing arithmetic work), (3) 5-minute physical exercise, and (4) 10-minute physical exercise." (p. 8)
2) "Each group completed cognitive assessments at three time points: (1) pre-intervention, (2) post-intervention (i.e., just before class), and (3) after an instruction (i.e., a mathematics lesson)..." (p. 8)
3) "In the sitting conditions (i.e., mentally passive and mentally active sedentary behaviors), participants remained seated for 10 minutes without any interruption." (p. 9)
4) "Participants’ demographics stratified by class are reported in Appendix 1." (p. 10)
Detailed Analysis:
Criterion D requires a documented control/comparison group, including what the comparison condition received and baseline information about participants. The paper explicitly defines the two sedentary comparison conditions (mentally passive and mentally active), including what students did and for how long, and it specifies that all groups had a pre-intervention assessment. It also states that demographics stratified by class are reported in an appendix, indicating that group characteristics are documented.
Final Summary:
Criterion D is met because the paper clearly documents the comparison conditions and baseline and reports group demographics in an appendix.
-
Level 2 Criteria
-
S
School-level RCT
- Randomization occurred within one school at the class level, not by assigning multiple schools to conditions.
- "The research team collaborated with an elementary school in Shenzhen, China." (p. 5)
Relevant Quotes:
1) "The research team collaborated with an elementary school in Shenzhen, China." (p. 5)
2) "Four of twelve fifth-grade classes were randomly assigned to the four experimental conditions..." (p. 5)
3) "Randomization was conducted at the class level to preserve natural classroom structures." (p. 5)
Detailed Analysis:
Criterion S requires school-level randomization (i.e., multiple schools randomized to different conditions). This study was conducted in a single school and randomized four Grade-5 classes within that school. Therefore, the randomization unit is not the school.
Final Summary:
Criterion S is not met because only classes within one school were randomized, not multiple schools.
-
I
Independent Conduct
- The paper provides no explicit statement that the study was conducted by an independent external evaluation team.
- "They were instructed to stay quietly at their desks and refrain from talking or engaging in any other activities under supervision by the experimenters." (p. 9)
Relevant Quotes:
1) "The research team collaborated with an elementary school in Shenzhen, China." (p. 5)
2) "The session began at 08:10 a.m. with a 6-minute cognitive test administered by a trained teacher using a standardized script..." (p. 8)
3) "They were instructed to stay quietly at their desks and refrain from talking or engaging in any other activities under supervision by the experimenters." (p. 9)
Detailed Analysis:
Criterion I requires quoted evidence that the evaluation is conducted independently from the intervention designers (e.g., an external evaluator or explicit independence statement). The paper describes a collaboration between the research team and the school and indicates experimenter supervision and teacher-administered tests. It does not include an explicit statement that implementation, data collection, or analysis was performed by an independent third party.
Final Summary:
Criterion I is not met because independence from the study team is not explicitly documented.
-
Y
Year Duration
- Outcomes were assessed acutely within a single morning, which is far shorter than 75% of an academic year.
- "Between 08:20 a.m. and 08:30 a.m., participants engaged in the assigned intervention activity." (p. 8)
Relevant Quotes:
1) "Between 08:20 a.m. and 08:30 a.m., participants engaged in the assigned intervention activity." (p. 8)
2) "At 09:20 a.m., immediately after the 40-minute mathematics lesson, participants completed the final cognitive test." (p. 9)
3) "Each group completed cognitive assessments at three time points: (1) pre-intervention, (2) post-intervention (i.e., just before class), and (3) after an instruction (i.e., a mathematics lesson)..." (p. 8)
Detailed Analysis:
Criterion Y requires tracking outcomes for at least 75% of an academic year after the intervention begins. This study is explicitly acute: the intervention occurs in a short pre-class time block and outcomes are assessed immediately and after a single lesson, within one morning session. That is far shorter than any plausible academic-year threshold.
Additionally, under the ERCT rule provided for this check, if Criterion T is not met then Criterion Y is also not met; here, T is not met.
Final Summary:
Criterion Y is not met because the study measures only immediate effects within one day (and T is not met).
-
B
Balanced Control Group
- The intervention does not add extra instructional time or budget, and all conditions are scheduled within the same pre-class time window.
- "Between 08:20 a.m. and 08:30 a.m., participants engaged in the assigned intervention activity." (p. 8)
Relevant Quotes:
1) "Between 08:20 a.m. and 08:30 a.m., participants engaged in the assigned intervention activity." (p. 8)
2) "In the sitting conditions (i.e., mentally passive and mentally active sedentary behaviors), participants remained seated for 10 minutes without any interruption." (p. 9)
3) "Those in the 5-minute condition began physical exercise at 08:25 a.m., whereas the 10-minute group started at 08:20 a.m." (p. 9)
4) "At 08:40 a.m., all participants attended a standardized 40-minute mathematics lesson delivered by the original teachers using identical instructional slides and materials." (p. 9)
5) "Short, pre-class physical exercise routines can be integrated without displacing teaching time..." (p. 1)
Detailed Analysis:
Criterion B compares the nature, quantity, and quality of resources (time, materials, budget, staffing) between intervention and control conditions, and asks whether extra resources are balanced unless extra resources are the explicit treatment variable.
All conditions occur within the same scheduled pre-class window (08:20 to 08:30). The two control conditions are structured seated activities during that same period, and the exercise conditions are also delivered during this period. The paper explicitly frames the approach as being feasible "without displacing teaching time." There is no indication that the exercise condition added extra instructional minutes or a distinct budgetary input versus the control conditions.
Final Summary:
Criterion B is met because the intervention fits within the same time block as controls and does not add extra instructional time or budget beyond the comparison inputs.
-
Level 3 Criteria
-
R
Reproduced
- No independent peer-reviewed replication of this specific trial was found in the paper or via external searching.
Relevant Quotes:
1) (No statements describing an independent replication of this specific trial were found in the provided paper content.)
Detailed Analysis:
Criterion R requires an independent replication of this specific study (or a clearly identified replication of its core design and claim) by a different research team, in a different context, in a peer-reviewed outlet, supported by quote evidence from the replication paper(s).
The paper does not claim to be a replication, and it does not cite an external replication of this specific four-arm, class-cluster randomized pre-class exercise versus sedentary-behavior design with the same EF outcome setup. External searching (by DOI, title, and key design terms) did not identify a peer-reviewed independent replication report that explicitly replicated this study.
Final Summary:
Criterion R is not met because no independent, peer-reviewed replication of this specific trial was found.
-
A
All-subject Exams
- Criterion A is not met because Criterion E is not met and the study does not use standardized exams across core subjects.
- "EF outcomes were assessed at baseline, immediately post-intervention, and after the mathematics instruction using a domain-general flanker task and a mathematics-specific negative priming task." (p. 1)
Relevant Quotes:
1) "EF outcomes were assessed at baseline, immediately post-intervention, and after the mathematics instruction using a domain-general flanker task and a mathematics-specific negative priming task." (p. 1)
Detailed Analysis:
Criterion A requires standardized exam-based assessment across all main subjects and is explicitly dependent on Criterion E being met. Because this study does not use a standardized educational exam (it uses cognitive EF tasks), Criterion E is not met and therefore Criterion A is not met. Independently of that dependency, the outcomes are not standardized tests across subjects such as math, reading, science, etc.
Final Summary:
Criterion A is not met because E is not met and the study does not assess standardized exams across core subjects.
-
G
Graduation Tracking
- The study does not track students to graduation and, since Criterion Y is not met, Criterion G is not met.
- "The persistence of benefits beyond a single lesson was not examined, leaving the duration of potential effects for educational practice unclear." (p. 17)
Relevant Quotes:
1) "The persistence of benefits beyond a single lesson was not examined, leaving the duration of potential effects for educational practice unclear." (p. 17)
Detailed Analysis:
Criterion G requires tracking participants until graduation from the relevant educational stage. This paper explicitly acknowledges that persistence beyond a single lesson was not examined, which is incompatible with graduation tracking.
Per the ERCT dependency rule provided, if Criterion Y is not met then Criterion G is not met. Here, Y is not met because outcomes are assessed acutely within one morning.
External searching for follow-up papers by the same author team (using the DOI, title, and key phrases about pre-class physical exercise and mathematics-specific inhibitory control) did not identify any cohort follow-up paper that tracks these Grade-5 students to graduation.
Final Summary:
Criterion G is not met because there is no long-term follow-up to graduation (and Y is not met).
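The dependency rules used in this breakdown (Y depends on T, G depends on Y, and A depends on E) can be sketched as a small lookup. This is a minimal illustration of the chain as described here, not an official ERCT implementation; the function name `is_met` and the dictionary layout are hypothetical.

```python
# Prerequisite for each dependent criterion, as described in this breakdown:
# Y depends on T, G depends on Y, and A depends on E.
PREREQUISITE = {"Y": "T", "G": "Y", "A": "E"}


def is_met(criterion, own_evidence):
    """Return True only if the criterion's own evidence is satisfied and
    every criterion in its prerequisite chain is also satisfied."""
    if not own_evidence.get(criterion, False):
        return False
    parent = PREREQUISITE.get(criterion)
    return parent is None or is_met(parent, own_evidence)


# Own-evidence verdicts reported in this breakdown for the chained criteria.
own_evidence = {"T": False, "E": False, "Y": False, "A": False, "G": False}
results = {name: is_met(name, own_evidence) for name in own_evidence}
print(results)

# Hypothetical case: even with direct evidence for Y and G, a failed T
# propagates down the chain and G is still not met.
hypothetical = is_met("G", {"T": False, "Y": True, "G": True})
print(hypothetical)
```

The second call shows why the dependency matters: a failed prerequisite overrides any direct evidence further down the chain.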
-
P
Pre-Registered
- No pre-registration is reported in the paper, and external registry searching did not identify a registration clearly dated before the study period.
Relevant Quotes:
1) "The experiment was conducted between May 9, 2025 and June 5, 2025, and the Ethics Committee of the School of Shenzhen University (SZU_PSY_2024_171) approved the study protocol..." (p. 5)
2) "Study First Posted 2025-11-17" (MedPath page for NCT07231536)
3) "Study Start 2025-12-01" (MedPath page for NCT07231536)
Detailed Analysis:
Criterion P requires a publicly accessible pre-registered protocol (registry/link plus a date) demonstrating registration before data collection began.
The paper includes ethics approval information but does not provide a registry name, registration ID, or a pre-registration date.
External registry searching identified an entry describing a very similar design (NCT07231536). However, that entry shows "Study First Posted 2025-11-17" and "Study Start 2025-12-01", which are after the paper's reported study period (May 9, 2025 to June 5, 2025). Therefore, this does not support that the published study was pre-registered before it started.
Final Summary:
Criterion P is not met because the paper provides no pre-registration ID/date and no verified pre-study registration was found for this published trial.
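The date comparison behind this verdict can be sketched as follows. The dates are the ones quoted above from the paper and from the MedPath page for NCT07231536; the variable names are illustrative, and whether that registry entry corresponds to the published trial remains unverified.

```python
from datetime import date

# Study period reported in the paper (p. 5).
study_start = date(2025, 5, 9)
study_end = date(2025, 6, 5)

# Dates shown on the MedPath page for NCT07231536 (a similar design; its link
# to the published trial is not confirmed).
registry_first_posted = date(2025, 11, 17)
registry_study_start = date(2025, 12, 1)

# Pre-registration requires the registry posting to precede data collection.
registered_before_start = registry_first_posted < study_start

# How long after the end of data collection the entry was first posted.
days_after_end = (registry_first_posted - study_end).days

print(f"Registered before study start: {registered_before_start}")
print(f"Days between study end and first posting: {days_after_end}")
```

Because the registry posting date falls after the end of the reported study period, the entry cannot serve as evidence of pre-registration for this trial.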