Abstract
The home numeracy environment is suggested to influence children's numerical development, but causal evidence for this assertion remains limited. Addressing this gap, we randomly assigned 117 predominantly White 4- to 5-year-olds (M = 4.68 years, SD = 0.2, 47% girls) attending preschool in Flanders (Belgium) to either an experimental (numeracy) or an active control (language) condition. The 6-week intervention (pretest-March 2023; posttest-May 2023) consisted of an ecologically valid implementation, with parents integrating flexible activities into their routines. Pre- and post-intervention assessments measured children’s numerical skills. Medium effects were observed on transcoding and ordering skills, providing causal evidence for the impact of the home numeracy environment on children’s numeracy. This study highlights the potential of ecologically valid interventions to support early numeracy in daily life.
ERCT Criteria Breakdown
-
Level 1 Criteria
-
C
Class-level RCT
- Randomization was within classes, but the intervention is parent-child one-to-one home teaching, which fits the ERCT tutoring/personal teaching exception for Criterion C.
- "Within a class (and a school), children were randomly assigned to either the experimental (Numeracy, n = 59) or the control condition (Language, n = 58) (Figure 1)." (p. 13)
Relevant Quotes:
1) "Within a class (and a school), children were randomly assigned to either the experimental (Numeracy, n = 59) or the control condition (Language, n = 58) (Figure 1)." (p. 13)
2) "The 6-week intervention (pretest-March 2023; posttest-May 2023) consisted of an ecologically valid implementation, with parents integrating flexible activities into their routines." (p. 2)
3) "Eighth, the classroom-level randomization in our study was implemented to control for potential confounding effects related to preschool experiences." (p. 40)
4) "However, this approach could have introduced the possibility that parents from different groups (experimental vs. control) within the same classroom shared some of the activities and it was not possible to actively monitor or prevent such sharing." (p. 40)
Detailed Analysis:
Children were randomized individually within classes, which would normally fail the ERCT Class-level RCT requirement because of potential contamination between conditions. However, the intervention is implemented at home by parents with their own child, and the ERCT standard explicitly allows a student-level RCT when the intervention is personal teaching/tutoring. The paper’s description makes clear that this is a parent-child home activity intervention, not a teacher-delivered classroom intervention. The authors also acknowledge the main contamination risk in this design (parents in different conditions within the same classroom sharing activities), but that risk does not negate the personal-teaching nature of the intervention.
Final sentence: Criterion C is met because this is a one-to-one parent-child home teaching intervention, which fits the ERCT exception.
-
E
Exam-based Assessment
- Primary outcomes are measured with researcher-developed numeracy tasks, not with widely recognized standardized exams.
- "Children’s numerical skills were measured with five tasks previously developed by our research team and validated in studies with children of the same age range (Bakker et al., 2023; Rathé et al., 2022; Turan & De Smedt, 2023; Vandecruys et al., 2024)." (p. 17)
Relevant Quotes:
1) "Children’s numerical skills were measured with five tasks previously developed by our research team and validated in studies with children of the same age range (Bakker et al., 2023; Rathé et al., 2022; Turan & De Smedt, 2023; Vandecruys et al., 2024)." (p. 17)
2) "Mathematical language. Children’s mathematical language was assessed using the Preschool Assessment of the Language of Mathematics (PALM) from Purpura and Logan (2015) adapted by Turan and De Smedt (2023)." (p. 18)
Detailed Analysis:
ERCT Criterion E requires standardized, widely recognized exam-based assessments for the main educational outcomes. The paper explicitly says its main numeracy outcomes are measured with tasks "previously developed by our research team", which are research instruments rather than standardized exams. The PALM measure is an assessment instrument, but it is presented as a task used in research (and adapted for this context), not as a system-wide standardized exam that would satisfy ERCT’s intent.
Final sentence: Criterion E is not met because outcomes are assessed with researcher-developed tasks rather than standardized exams.
-
T
Term Duration
- Outcomes were measured about 6 weeks after intervention start, which is shorter than a full academic term (about 3-4 months).
- "The 6-week intervention (pretest-March 2023; posttest-May 2023) consisted of an ecologically valid implementation, with parents integrating flexible activities into their routines." (p. 2)
Relevant Quotes:
1) "The 6-week intervention (pretest-March 2023; posttest-May 2023) consisted of an ecologically valid implementation, with parents integrating flexible activities into their routines." (p. 2)
2) "We measured children’s once before and once after the 6-week intervention (in March 2023 and in May 2023, respectively) with the same set of tests." (p. 13)
Detailed Analysis:
ERCT Criterion T requires outcome measurement at least one full academic term after the intervention begins. The paper explicitly describes a 6-week intervention and states pretest in March 2023 with posttest in May 2023, which is substantially shorter than a typical term-length follow-up window.
Final sentence: Criterion T is not met because the measurement interval is only about 6 weeks, not a full academic term.
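The duration comparison above can be illustrated with a minimal sketch. It assumes the lower bound of the "about 3-4 months" term length mentioned in this breakdown, approximated as 90 days; the constant `TERM_MIN` and the function `meets_term_duration` are hypothetical names introduced for illustration and are not part of the ERCT standard.

```python
from datetime import timedelta

# Assumption: lower bound of the "about 3-4 months" term length cited above,
# approximated here as 3 x 30 days. This is an illustrative threshold only.
TERM_MIN = timedelta(days=90)


def meets_term_duration(interval: timedelta, threshold: timedelta = TERM_MIN) -> bool:
    """Return True if the pretest-to-posttest interval spans at least one term."""
    return interval >= threshold


# The study reports a 6-week intervention between pretest and posttest.
study_interval = timedelta(weeks=6)

print(study_interval.days)                  # 42
print(meets_term_duration(study_interval))  # False
```

Under this assumed threshold, the 42-day interval covers less than half of the shortest term length considered.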
-
D
Documented Control Group
- The control condition is described in detail and baseline characteristics are reported with descriptive tables.
- "In the control condition, participants received equivalent materials - such as a verbal memory game, a visual memory game (pictures, letters, rhymes), a bingo game, and a storybook - without numerical content." (p. 16)
Relevant Quotes:
1) "In the control condition, participants received equivalent materials - such as a verbal memory game, a visual memory game (pictures, letters, rhymes), a bingo game, and a storybook - without numerical content." (p. 16)
2) "Descriptive statistics for children’s and parents’ characteristics at pretest are presented in Table 1 and Table 2, respectively." (p. 23)
Detailed Analysis:
ERCT Criterion D requires a well-documented control group, including what the control group received and baseline comparability information. The paper provides a concrete description of the active control materials and explicitly reports descriptive baseline characteristics for children and parents at pretest in Tables 1 and 2, supporting interpretability of comparisons.
Final sentence: Criterion D is met because the control condition and baseline characteristics are documented in sufficient detail.
-
Level 2 Criteria
-
S
School-level RCT
- Randomization occurred within classes rather than assigning whole schools to intervention vs control.
- "Within a class (and a school), children were randomly assigned to either the experimental (Numeracy, n = 59) or the control condition (Language, n = 58) (Figure 1)." (p. 13)
Relevant Quotes:
1) "Within a class (and a school), children were randomly assigned to either the experimental (Numeracy, n = 59) or the control condition (Language, n = 58) (Figure 1)." (p. 13)
2) "The children came from 26 different schools (including 42 different classrooms) in the Flemish part of Belgium." (p. 11)
Detailed Analysis:
ERCT Criterion S requires school-level randomization (schools assigned to conditions). The paper states that randomization was within class (and within school), which indicates student-level assignment rather than school-level assignment, even though many schools participated.
Final sentence: Criterion S is not met because the unit of randomization is not the school.
-
I
Independent Conduct
- The paper does not report that an independent third-party evaluation team conducted the study; implementation and evaluation appear to be run by the research team.
- "Children were assessed at their school, individually in a quiet room by an experimenter blind to the condition." (p. 13)
Relevant Quotes:
1) "Children were assessed at their school, individually in a quiet room by an experimenter blind to the condition." (p. 13)
2) "In the experimental condition, participants were provided with a set of materials adapted from previous intervention studies, designed to elicit interactions about numeracy." (p. 15)
3) "Following our pre-registered analysis plan (https://osf.io/k5ea9), we first assessed the quality of our randomization by testing whether the conditions did not differ at pretest." (p. 21)
Detailed Analysis:
ERCT Criterion I requires the study to be conducted independently from the intervention designers to reduce bias. The paper describes the intervention materials and the evaluation procedures using first-person "we" statements and indicates the intervention materials were selected and adapted as part of the study design. While the assessor was blind to condition, the paper does not state that an external agency or independent evaluation team conducted implementation, data collection, or analysis.
Final sentence: Criterion I is not met because independence from the intervention design team is not demonstrated.
-
Y
Year Duration
- The intervention and outcome measurement span only about 6 weeks, not a full academic year.
- "We measured children’s once before and once after the 6-week intervention (in March 2023 and in May 2023, respectively) with the same set of tests." (p. 13)
Relevant Quotes:
1) "The 6-week intervention (pretest-March 2023; posttest-May 2023) consisted of an ecologically valid implementation, with parents integrating flexible activities into their routines." (p. 2)
2) "We measured children’s once before and once after the 6-week intervention (in March 2023 and in May 2023, respectively) with the same set of tests." (p. 13)
Detailed Analysis:
ERCT Criterion Y requires outcomes to be measured at least one full academic year after the intervention begins (even if the intervention itself is shorter). The paper clearly reports only a 6-week window between pretest and posttest.
Final sentence: Criterion Y is not met because the study does not track outcomes for a full academic year.
-
B
Balanced Control Group
- The study uses a matched active control with equivalent materials, engagement, and support, isolating numeracy content as the main difference.
- "The primary and only difference between the two conditions was the inclusion of numerical content in the experimental condition and its absence in the control condition." (p. 16)
Relevant Quotes:
1) "In the control condition, participants received equivalent materials - such as a verbal memory game, a visual memory game (pictures, letters, rhymes), a bingo game, and a storybook - without numerical content." (p. 16)
2) "Three activities per week were suggested in the booklet for both conditions, but families were free to decide which activities to play, how often, and how challenging to make them, thereby preserving ecological validity." (p. 16)
3) "Weekly videos (three per condition) were sent via email or text message, providing additional guidance and motivation." (p. 16)
4) "The primary and only difference between the two conditions was the inclusion of numerical content in the experimental condition and its absence in the control condition." (p. 16)
5) "This letter emphasized that participation entailed exploring a set of materials with their child over a six-week period, with activities designed to take no more than 10 minutes per day." (p. 11)
Detailed Analysis:
ERCT Criterion B requires that time and resources are balanced between intervention and control unless extra resources are explicitly the treatment variable. Here, the paper describes an active control condition that receives equivalent materials, a comparable suggested activity schedule, and equivalent ongoing support (weekly videos and surveys). The authors explicitly state that the "primary and only difference" is numeracy content, implying that time, contact, and materials are matched. The described per-day activity time is part of participation for both conditions, and no unbalanced additional tutoring time, budget, or staffing is described for only one arm.
Final sentence: Criterion B is met because resources and engagement are explicitly matched across conditions, with numeracy content as the only stated difference.
-
Level 3 Criteria
-
R
Reproduced
- No independent replication by a different research team is identified, and the paper frames the work as novel.
- "To the best of our knowledge, our study is the first to demonstrate positive effects of an HNE intervention on preschoolers’ numeracy skills under realistic, ecologically valid conditions – combining a variety of playful activities with minimal supervision." (p. 34)
Relevant Quotes:
1) "To the best of our knowledge, our study is the first to demonstrate positive effects of an HNE intervention on preschoolers’ numeracy skills under realistic, ecologically valid conditions – combining a variety of playful activities with minimal supervision." (p. 34)
Detailed Analysis:
ERCT Criterion R requires independent replication by other authors in a peer-reviewed publication. The paper itself characterizes the study as the first of its kind, and an external literature search performed for this ERCT check did not identify an independent replication of this specific intervention as of the ERCT check date.
Final sentence: Criterion R is not met because no independent replication study is found.
-
A
All-subject Exams
- Criterion E is not met and the study does not measure standardized outcomes across all core subjects.
- "Children’s numerical skills were measured with five tasks previously developed by our research team and validated in studies with children of the same age range (Bakker et al., 2023; Rathé et al., 2022; Turan & De Smedt, 2023; Vandecruys et al., 2024)." (p. 17)
Relevant Quotes:
1) "Children’s numerical skills were measured with five tasks previously developed by our research team and validated in studies with children of the same age range (Bakker et al., 2023; Rathé et al., 2022; Turan & De Smedt, 2023; Vandecruys et al., 2024)." (p. 17)
2) "As a control measure, we included the Matrix Reasoning subtest of the Dutch Wechsler Intelligence Scale for Children, Third Edition (Wechsler, 2011)." (p. 19)
Detailed Analysis:
ERCT Criterion A requires standardized exam-based assessment across all main subjects, and ERCT explicitly states that if Criterion E is not met then Criterion A is not met. In addition, the outcomes reported here are numeracy tasks (plus a non-verbal reasoning control), not a full set of core-subject standardized exams.
Final sentence: Criterion A is not met because Criterion E is not met and there is no all-subject standardized assessment.
-
G
Graduation Tracking
- Year-long tracking is not present and the paper reports no delayed posttest; no follow-up-to-graduation publications were identified.
- "First, our study did not include a delayed posttest measure. As a result, we cannot determine whether the observed intervention effects were sustained over time." (p. 37)
Relevant Quotes:
1) "First, our study did not include a delayed posttest measure. As a result, we cannot determine whether the observed intervention effects were sustained over time." (p. 37)
2) "The 6-week intervention (pretest-March 2023; posttest-May 2023) consisted of an ecologically valid implementation, with parents integrating flexible activities into their routines." (p. 2)
Detailed Analysis:
ERCT Criterion G requires tracking participants until graduation, and the ERCT rules specify that if Criterion Y (year duration) is not met, then Criterion G is not met. This study reports only a short pretest-posttest window and explicitly states it did not include even a delayed posttest. A targeted external search for later follow-up papers by these authors reporting longer-term tracking for this cohort did not identify any such graduation-tracking publication as of the ERCT check date.
Final sentence: Criterion G is not met because the study lacks year-long follow-up (and explicitly lacks delayed posttesting), so graduation tracking is absent.
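The two dependency rules cited in this breakdown (Criterion A cannot be met unless Criterion E is met; Criterion G cannot be met unless Criterion Y is met) can be sketched as a small lookup. This is an illustrative sketch only: the names `PREREQUISITES` and `apply_prerequisites` are hypothetical, and the code covers just the two rules stated here, not the full ERCT standard.

```python
# Prerequisite rules as described in this breakdown:
# A (All-subject Exams) depends on E (Exam-based Assessment);
# G (Graduation Tracking) depends on Y (Year Duration).
PREREQUISITES = {"A": "E", "G": "Y"}


def apply_prerequisites(own_evidence: dict[str, bool]) -> dict[str, bool]:
    """Return final verdicts: a criterion fails whenever its prerequisite fails."""
    final = dict(own_evidence)
    for criterion, prerequisite in PREREQUISITES.items():
        # A missing prerequisite verdict is treated as "not met" (assumption).
        if criterion in final and not final.get(prerequisite, False):
            final[criterion] = False
    return final


# Hypothetical input: even if A and G had supporting evidence of their own,
# the failed prerequisites E and Y would still force them to "not met".
verdicts = apply_prerequisites({"E": False, "A": True, "Y": False, "G": True})
print(verdicts)  # {'E': False, 'A': False, 'Y': False, 'G': False}
```

In this study both dependent criteria also fail on their own evidence, so the prerequisite rules confirm the verdicts rather than change them.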
-
P
Pre-Registered
- The paper provides an OSF pre-registration link, but the ERCT check could not verify a time-stamped registration date before data collection.
- "The sample size, materials, and the planned analyses of this study were pre-registered on the Open Science Framework (https://osf.io/k5ea9)." (p. 12)
Relevant Quotes:
1) "The sample size, materials, and the planned analyses of this study were pre-registered on the Open Science Framework (https://osf.io/k5ea9)." (p. 12)
2) "The study was conducted between February and May 2023." (p. 11)
Detailed Analysis:
ERCT Criterion P requires a time-stamped pre-registration that can be verified to have occurred before data collection began. The paper provides an OSF link and describes the study timing (February to May 2023). However, the OSF record could not be retrieved during this ERCT check, so the registration timestamp (date registered) could not be confirmed as preceding the start of data collection.
Final sentence: Criterion P is not met because the preregistration timing cannot be independently verified from the registry record.