Abstract
Background and context: Early childhood computer science (CS) education is a high-priority focus worldwide, but early childhood CS tools are primarily developed and researched within the United States and Europe. As an example, the Coding as Another Language ScratchJr (CAL-ScratchJr) curriculum is used in multiple countries but was developed and evaluated in the United States. Objectives: This paper describes the implementation and evaluation of an adapted CAL-ScratchJr curriculum in Mendoza and Corrientes, Argentina. Method: We used mixed methods and a cluster-randomized control trial to evaluate curriculum success, measured by validated assessments and teacher interviews. Findings: The curriculum significantly improved student coding knowledge and computational thinking. Teachers saw the curriculum as successful across multiple domains. Implications: These findings reinforce that the success of CS education programs should be evaluated not only with internationally validated assessments but also by local understandings of success.
ERCT Criteria Breakdown
-
Level 1 Criteria
-
C
Class-level RCT
- Randomization was performed at the school level, satisfying the requirement for class-level assignment.
- "Schools were randomly assigned to teach the CAL-ScratchJr curriculum or the control condition."
Relevant Quotes:
1) "We used mixed methods and a cluster-randomized control trial to evaluate curriculum success" (p. 2)
2) "This study was designed as a cluster-randomized control trial, so to answer this research question, we used a three-level multilevel growth model with students' assessment scores as the independent variable, nested at the student level and finally at the school level." (p. 11)
3) "Seventeen schools across the Corrientes and Mendoza provinces took part in the CAL-ScratchJr Argentina Project. Schools were randomly assigned to teach the CAL-ScratchJr curriculum or the control condition." (p. 8)
Detailed Analysis:
The ERCT standard requires the study to be a Randomized Controlled Trial (RCT) conducted at the class level or higher to prevent contamination. The authors explicitly state that randomization occurred at the school level ("Schools were randomly assigned"), which is a higher level of clustering than the class. The requirement for class-level randomization is therefore satisfied.
Criterion C is met because the study used a cluster-randomized design in which randomization occurred at the school level.
-
E
Exam-based Assessment
- The study utilized specific assessments developed by the research group (CSA and TechCheck) rather than widely recognized standardized exams.
- "Coding knowledge... was evaluated using the Coding Stages Assessment (CSA)... The TechCheck assessments evaluated computational thinking"
Relevant Quotes:
1) "Coding knowledge, defined as knowledge of the ScratchJr coding language and its' syntax, was evaluated using the Coding Stages Assessment (CSA) (de Ruiter & Bers, 2021)." (p. 10)
2) "The TechCheck assessments evaluated computational thinking, as defined by the cognitive skills necessary for coding (e.g. sequencing, abstraction) divorced from any one coding language (Relkin & Bers, 2021; Relkin et al., 2020, 2023)." (p. 10)
3) "In Argentina, testing begins in third grade..." (p. 16)
Detailed Analysis:
The ERCT standard requires the use of standardized, widely recognized exam-based assessments (e.g., state or national exams) and explicitly excludes assessments specially designed for the study or specific to the intervention. The study uses the Coding Stages Assessment (CSA) and TechCheck. While the paper notes these have been "previously validated," they are instruments developed by the authors' research group (DevTech/Bers et al.) to assess coding and computational thinking in early childhood, not widely recognized standardized exams. Furthermore, the authors note that testing in Argentina begins only in third grade, which implies that these students (K-2) were not subject to standard exams.
Criterion E is not met because the assessments used were research instruments developed by the authors' lab rather than widely recognized standardized exams.
-
T
Term Duration
- The paper specifies the number of lessons but does not provide specific dates or a duration interval to confirm the intervention spanned at least one full academic term.
- "The curricula... consist of 24 45-minute lesson plans"
Relevant Quotes:
1) "The curricula, freely available online for kindergarten, first, and second grade, consist of 24 45-minute lesson plans..." (p. 4)
2) "Teachers then implemented the 24-lesson coding curriculum in the classroom. During the curriculum, teachers were given a mid-curriculum survey... Finally, following the curriculum... students completed post-curriculum assessments." (p. 8)
Detailed Analysis:
The ERCT standard requires that outcomes be measured at least one full academic term (approx. 3-4 months) after the intervention begins. The text specifies the curriculum consists of "24 45-minute lesson plans." While 24 lessons could theoretically span a term (e.g., if taught twice a week for 12 weeks), the paper does not explicitly state the start and end dates of the intervention or the total duration in weeks/months. Without a quoted time interval confirming the duration was at least one term, the criterion cannot be confirmed as met according to the strict documentation requirements.
Criterion T is not met because the paper states only the number of lesson plans and does not explicitly state the calendar duration of the intervention.
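The pacing arithmetic behind the term-duration question can be sketched as follows. This is a minimal Python illustration under assumed lessons-per-week schedules; the paper does not report the actual pace, and `weeks_needed` and the pacing values are hypothetical names introduced here only for illustration.

```python
import math

LESSONS = 24          # from the paper: 24 lesson plans
LESSON_MINUTES = 45   # from the paper: 45 minutes each


def weeks_needed(lessons: int, lessons_per_week: int) -> int:
    """Whole instructional weeks needed to teach all lessons at a fixed pace."""
    return math.ceil(lessons / lessons_per_week)


# Total instructional time implied by the lesson plans alone.
total_hours = LESSONS * LESSON_MINUTES / 60

# Hypothetical pacing scenarios (assumptions, not reported in the paper).
schedule = {pace: weeks_needed(LESSONS, pace) for pace in (1, 2, 3, 5)}

print(f"Total instructional time: {total_hours} hours")
for pace, weeks in schedule.items():
    print(f"{pace} lesson(s) per week -> {weeks} weeks")
```

Depending on the assumed pace, the same 24 lessons fit into anywhere from about 5 to 24 weeks, which is why the lesson count alone cannot confirm a full-term duration.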
-
D
Documented Control Group
- The control group's demographics are tabulated, and their "business as usual" condition is clearly described.
- "Schools in the control condition participated in their daily activities... This did not include a standardized CS program"
Relevant Quotes:
1) "Schools in the control condition participated in their daily activities, with teachers receiving only their standard professional development and teaching their standard curriculum. This did not include a standardized CS program..." (p. 8)
2) "Table 1. Distribution of students by gender, condition, and grade." (p. 9) [Table lists Control totals: Girls 56+41+41, Boys 70+52+56, Total 126+93+97]
3) "17 schools, 9 in Corrientes and 8 in Mendoza." (p. 9)
Detailed Analysis:
The ERCT standard requires detailed documentation of the control group, including demographics and conditions. The paper provides a clear description of the control condition ("business as usual," no standardized CS program) and provides a specific breakdown of the control group's participants by grade and gender in Table 1.
Criterion D is met because the authors provided demographic data and a clear description of the business-as-usual conditions for the control group.
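As a quick consistency check on the control-group counts quoted from Table 1, the per-column sums can be verified with a short Python sketch. The three columns are taken in the order listed above; their mapping to specific grades is an assumption, and the variable names are illustrative.

```python
# Control-group counts as listed in the Table 1 summary above,
# in table order (the mapping of columns to grades is assumed).
girls = [56, 41, 41]
boys = [70, 52, 56]
totals = [126, 93, 97]

# Each column total should equal girls + boys for that column.
per_column = [g + b for g, b in zip(girls, boys)]
consistent = per_column == totals

control_n = sum(totals)
print(consistent, sum(girls), sum(boys), control_n)
# prints: True 138 178 316
```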
-
Level 2 Criteria
-
S
School-level RCT
- The study randomized entire schools rather than classes or students, satisfying the school-level RCT requirement.
- "Schools were randomly assigned to teach the CAL-ScratchJr curriculum or the control condition."
Relevant Quotes:
1) "The schools served as the clustering level, as teachers work within school communities." (p. 8)
2) "Seventeen schools across the Corrientes and Mendoza provinces took part in the CAL-ScratchJr Argentina Project." (p. 8)
3) "Schools were randomly assigned to teach the CAL-ScratchJr curriculum or the control condition." (p. 8)
Detailed Analysis:
The ERCT standard requires randomization at the school level. The paper explicitly states that the school was the unit of clustering and randomization ("Schools were randomly assigned").
Criterion S is met because the randomization was conducted at the school level.
-
I
Independent Conduct
- The study was conducted and analyzed by the same researchers who developed the curriculum.
- "The Coding as Another Language for ScratchJr (CAL-ScratchJr) curricula were developed by the DevTech research group"
Relevant Quotes:
1) "The Coding as Another Language for ScratchJr (CAL-ScratchJr) curricula were developed by the DevTech research group..." (p. 4)
2) "Coding as another language: an early childhood programming curriculum in Argentina... Tess Levinson, Francisca Carocca P & Marina Bers" (p. 2)
3) "Our process included first identification and open coding by two authors of teacher quotes... One author is a white American woman... The other author is a Latin American woman..." (p. 12)
Detailed Analysis:
The ERCT standard requires the study to be conducted independently from the authors who designed the intervention. The intervention (CAL-ScratchJr) was developed by the DevTech research group, led by author Marina Bers. The authors of this paper (Levinson, Carocca, Bers) are directly involved in the research, data coding, and analysis. While they partnered with the Varkey Foundation for implementation, the analysis was conducted by the authors themselves.
Criterion I is not met because the authors of the paper are the developers of the curriculum and conducted the analysis.
-
Y
Year Duration
- The study tracked outcomes only until the end of the 24-lesson curriculum, not for a full academic year.
- "Teachers then implemented the 24-lesson coding curriculum... following the curriculum... students completed post-curriculum assessments."
Relevant Quotes:
1) "Teachers then implemented the 24-lesson coding curriculum... following the curriculum... students completed post-curriculum assessments." (p. 8)
2) "The curricula... consist of 24 45-minute lesson plans" (p. 4)
Detailed Analysis:
The ERCT standard requires outcomes to be measured at least one full academic year after the intervention begins. As established in the analysis for criterion T, the intervention consists of only 24 lessons. There is no mention of a follow-up period extending to a full academic year (approx. 9-10 months). The assessments were administered "following the curriculum."
Criterion Y is not met because the study duration was limited to the implementation of 24 lessons and did not span a full academic year.
-
B
Balanced Control Group
- The control group did not receive resources, time, or professional development equivalent to the treatment group.
- "Schools in the control condition participated in their daily activities... receiving only their standard professional development"
Relevant Quotes:
1) "Schools in the control condition participated in their daily activities, with teachers receiving only their standard professional development and teaching their standard curriculum. This did not include a standardized CS program..." (p. 8)
2) "Before teaching the CAL-ScratchJr curriculum, teachers received six sessions of an hour-long professional development..." (p. 8)
3) "The CAL-ScratchJr curriculum was implemented at the classroom level either by the classroom teacher or by an enrichment teacher who received prior training." (p. 8)
Detailed Analysis:
The ERCT standard requires the control group to receive balanced resources (time/budget) unless the extra resources are the explicit variable being tested. The treatment group received a new curriculum, six hour-long professional development sessions, and, in some classrooms, delivery by an enrichment teacher with prior training. The control group received "only their standard professional development" and "participated in their daily activities." The control group was given no compensatory time, attention, or alternative intervention to balance the additional resources and training given to the treatment group. The study measures the efficacy of the curriculum itself, but without resource balance the effects could be partly attributable to the extra training and attention.
Criterion B is not met because the treatment group received additional professional development and resources that were not matched in the control group.
-
Level 3 Criteria
-
R
Reproduced
- Previous evaluations were conducted by the same authors/research group; no independent replication is cited.
- "The curricula have been implemented and evaluated... (Bers, Blake-West, et al., 2023, Yang et al., 2023)."
Relevant Quotes:
1) "The curricula have been implemented and evaluated in the United States as part of a pilot program and a multi-state cluster-randomized control trial (Bers, Blake-West, et al., 2023, Yang et al., 2023)." (p. 5)
2) "This is the first paper to evaluate the CAL-ScratchJr curriculum and pedagogy outside of the United States..." (p. 16)
Detailed Analysis:
The ERCT standard requires independent replication by a different research team. The previous studies cited (Bers et al., Yang et al.) involve the same primary author (Bers) and research group (DevTech). There is no evidence provided of an independent replication by a completely different research team in a peer-reviewed journal.
Criterion R is not met because all cited previous evaluations were conducted by the same research group, not independent teams.
-
A
All-subject Exams
- The study limited assessment to coding and computational thinking, ignoring other core subjects.
- "Students... completed pre-curriculum assessments of coding knowledge and computational thinking."
Relevant Quotes:
1) "Coding knowledge... was evaluated using the Coding Stages Assessment (CSA)" (p. 10)
2) "The TechCheck assessments evaluated computational thinking..." (p. 10)
3) "Students... completed pre-curriculum assessments of coding knowledge and computational thinking." (p. 9)
Detailed Analysis:
The ERCT standard requires measuring impact on all main subjects taught in school. This study only assessed "coding knowledge" and "computational thinking." It did not assess impact on math, reading, science, or other core subjects. Additionally, since Criterion E (Exam-based Assessment) was not met, this criterion is automatically not met.
Criterion A is not met because the study only assessed specific domain knowledge (coding/CT) and did not use standardized exams for all main school subjects.
-
G
Graduation Tracking
- Tracking ended immediately after the curriculum implementation.
- "Finally, following the curriculum... students completed post-curriculum assessments."
Relevant Quotes:
1) "Finally, following the curriculum... students completed post-curriculum assessments." (p. 8)
Detailed Analysis:
The ERCT standard requires tracking participants until graduation. The study collected data only immediately following the curriculum intervention. This paper does not mention long-term tracking or follow-up until graduation, and no subsequent papers by the same authors report it (the study is from 2025 and its participants are in K-2).
Criterion G is not met because there was no long-term follow-up or tracking of students until graduation.
-
P
Pre-Registered
- The paper mentions IRB approval but does not cite a public pre-registration of the study protocol.
Relevant Quotes:
1) "This study was approved by [AUTHORS' INSTITUTIONAL IRB] under protocol code [#########]." (p. 9)
Detailed Analysis:
The ERCT standard requires a pre-registered protocol (e.g., on OSF or ClinicalTrials.gov) before data collection begins. The paper mentions IRB approval, which is an ethical requirement, but does not cite a public pre-registration of the study design and hypotheses on a registry platform.
Criterion P is not met because there is no evidence of a public pre-registered protocol.
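The twelve verdicts reached in this breakdown can be tallied per level with a small Python sketch. The dictionary layout and the `summarize` helper are illustrative assumptions; the verdicts themselves are those stated above, and no rule for awarding an overall ERCT level is implied.

```python
# Verdicts as stated in this breakdown (True = met, False = not met).
verdicts = {
    1: {"C": True, "E": False, "T": False, "D": True},
    2: {"S": True, "I": False, "Y": False, "B": False},
    3: {"R": False, "A": False, "G": False, "P": False},
}


def summarize(level_verdicts):
    """Split one level's criteria into (met, not met) lists."""
    met = [c for c, ok in level_verdicts.items() if ok]
    unmet = [c for c, ok in level_verdicts.items() if not ok]
    return met, unmet


summary = {level: summarize(v) for level, v in verdicts.items()}
total_met = sum(len(met) for met, _ in summary.values())

for level, (met, unmet) in summary.items():
    print(f"Level {level}: met {met or 'none'}, not met {unmet or 'none'}")
print(f"Criteria met overall: {total_met} of 12")
```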