Boosting Coding Confidence in Elementary Students: The Impact of ELA-Integrated Computational Thinking Curriculum

Leiny Y. Garcia, Yvonne Kao, Sharin Jacob, Clare Baek, Dana Saito-Stehberger, Diana Franklin, and Mark Warschauer

Published: Feb 18, 2026

ERCT Check Date: Mar 13, 2026

DOI: 10.1145/3770762.3772634

language arts
K12
US
EdTech platform

C

Teachers (i.e., classroom/teacher clusters) were randomly assigned to treatment vs. control, meeting the class-level RCT requirement.

"Within each cluster of teachers, we randomly assigned half the participating teachers to the treatment condition and half to the control condition."
E

Outcomes are measured using surveys of coding attitudes (ESCAS), not standardized exam-based achievement assessments.

"Pre- and post-surveys measured five coding attitude constructs: confidence, interest, utility, perceived coding values of social circles, and perceptions of young coders."
T

Outcomes were measured from the start to the end of the 2023-2024 school year, exceeding the one-term minimum.

"All participating teachers completed the weekly logs throughout the school year and administered the post-ESCAS to their students at the end of the school year."
D

The control condition is described as business-as-usual and the paper reports control-group counts, demographics, and baseline differences.

"As in similar studies, we compare the CAIforALL Curriculum to a business-as-usual control, meaning that comparison teachers used curriculum or lesson plans they would typically have used in the absence of the study, which often included ad-hoc CS lessons from Code.org."
S

Randomization was at the teacher/class level rather than by assigning entire schools to treatment vs. control.

"We block-randomized teachers to treatment and control conditions..."
I

The paper describes research-team-delivered PD and support, but it does not clearly state that evaluation/data collection and analysis were conducted independently from the intervention designers.

"Throughout the school year, teachers receive additional support through monthly meetings with the research team to discuss successes and challenges in implementing the curriculum."
Y

Outcomes were measured from the start to the end of the school year, satisfying the year-duration requirement.

"All participating teachers completed the weekly logs throughout the school year and administered the post-ESCAS to their students at the end of the school year."
B

Treatment received substantially more CS/CT instructional time (and teacher support) than control, and the paper notes it cannot separate curriculum effects from increased instruction time.

"As expected, control teachers reported spending significantly less time on programming, CT, or CS instruction compared to treatment teachers, reporting an average of 4.7 (SD = 9.4) hours of instruction for the entire school year, t(47.59) = -4.92, p = < 0.001."
R

No independent peer-reviewed replication of this specific RCT was found in the paper or via external search.
A

Because standardized exam-based outcomes are not used (criterion E is not met), the all-subject standardized-exams criterion is also not met.

"Pre- and post-surveys measured five coding attitude constructs: confidence, interest, utility, perceived coding values of social circles, and perceptions of young coders."
G

The study reports outcomes only through the end of one school year, and no follow-up publication was found that tracks the cohort through graduation.

"Future work should determine whether observed confidence gains persist and whether delayed effects emerge for other constructs."
P

The paper reports pre-registration on REES prior to collecting any outcome data, but the registry record could not be independently accessed without login.

"We pre-registered the planned analytic model with the Registry of Efficacy and Effectiveness Studies (REES) before the collection of any outcome data [24]."

Abstract

Integrating literacy and computational thinking (CT) can broaden computer science education participation, especially for multilingual learners. This study examined how the Computing and AI for All (CAIforALL) Act 1 Curriculum, an ELA-integrated Scratch-based CT curriculum, impacts coding attitudes among elementary students in predominantly Latine and multilingual districts. The curriculum integrates literacy strategies into CT instruction as proposed by the National Academies of Sciences, Engineering, and Medicine (NASEM) to support multilingual learners. We conducted a cluster randomized controlled trial with 1,325 students in grades 3–5 across 23 schools in two suburban districts. The treatment group used an ELA-integrated CT curriculum for a school year while controls continued with business-as-usual instruction. Pre- and post-surveys measured five coding attitude constructs: confidence, interest, utility, perceived coding values of social circles, and perceptions of young coders. We estimated treatment effects using a two-level hierarchical linear model, controlling for student and classroom characteristics. Findings show that no statistically significant differences emerged in overall coding attitudes between groups. However, students exposed to a year of ELA-integrated CT curriculum showed significant increases in coding confidence. The curriculum did not significantly affect other attitude dimensions. Findings suggest that an ELA-integrated CT curriculum can enhance coding confidence among elementary students, demonstrating the value of early computing exposure and integrated approaches.

ERCT Criteria Breakdown

Level 1 Criteria
- C
  Class-level RCT
  - Teachers (i.e., classroom/teacher clusters) were randomly assigned to treatment vs. control, meeting the class-level RCT requirement.
  - "Within each cluster of teachers, we randomly assigned half the participating teachers to the treatment condition and half to the control condition."
  - Relevant Quotes:
    1) "We designed this study as a cluster-randomized trial, following procedures and standards outlined by the What Works Clearinghouse [47]." (p. 382)
    2) "We block-randomized teachers to treatment and control conditions after the recruitment period ended and before the summer professional development (PD)." (p. 382)
    3) "Within each cluster of teachers, we randomly assigned half the participating teachers to the treatment condition and half to the control condition." (p. 382)
    4) "Consistent with the study pre-registration, we estimated program effects using a 2-level hierarchical linear model (HLM) with students nested within teachers/classes." (p. 383)
  - Detailed Analysis: Criterion C requires random assignment at the class level (or a stronger unit). The paper describes a cluster-randomized design and then specifies that teachers were block-randomized to conditions and that students are nested within teachers/classes. This indicates the treatment assignment happens at the teacher/classroom cluster level (not within a classroom at the individual-student level), which aligns with the purpose of Criterion C to reduce within-class contamination.
  - Final summary sentence: Criterion C is met because the unit of randomization is the teacher/classroom cluster rather than individual students within the same class.
- E
  Exam-based Assessment
  - Outcomes are measured using surveys of coding attitudes (ESCAS), not standardized exam-based achievement assessments.
  - "Pre- and post-surveys measured five coding attitude constructs: confidence, interest, utility, perceived coding values of social circles, and perceptions of young coders."
  - Relevant Quotes:
    1) "Pre- and post-surveys measured five coding attitude constructs: confidence, interest, utility, perceived coding values of social circles, and perceptions of young coders." (p. 379)
    2) "The Elementary Student Coding Attitudes Survey (ESCAS) is an instrument designed to measure elementary students’ attitudes toward coding." (p. 383)
    3) "We added ten items to the measure that are relevant to the demographics of our study population." (p. 383)
  - Detailed Analysis: Criterion E requires standardized exam-based assessments (e.g., state/national standardized achievement tests). This study’s outcomes are attitudinal and measured via a survey instrument (ESCAS). Although ESCAS is described as an instrument validated for certain grades, it is not an exam-based standardized academic assessment, and the authors also modified it by adding ten items for this study. Therefore, the outcome measurement does not satisfy the ERCT requirement for standardized exams.
  - Final summary sentence: Criterion E is not met because the study uses (modified) attitude surveys rather than standardized exam-based achievement assessments.
- T
  Term Duration
  - Outcomes were measured from the start to the end of the 2023-2024 school year, exceeding the one-term minimum.
  - "All participating teachers completed the weekly logs throughout the school year and administered the post-ESCAS to their students at the end of the school year."
  - Relevant Quotes:
    1) "The treatment group used an ELA-integrated CT curriculum for a school year while controls continued with business-as[-usual] instruction." (p. 379)
    2) "We enrolled students into the study at the start of the 2023-2024 school year after teachers completed summer professional development (PD) and submitted class rosters..." (p. 382)
    3) "All participating teachers submitted class rosters to the evaluation team at the start of the school year and administered the pre-ESCAS to their students." (p. 383)
    4) "All participating teachers completed the weekly logs throughout the school year and administered the post-ESCAS to their students at the end of the school year." (p. 383)
  - Detailed Analysis: Criterion T requires outcomes to be measured at least one academic term after the intervention begins. The paper places baseline measurement at the start of the 2023-2024 school year and post measurement at the end of that school year, with the curriculum implemented during the year. A full school year necessarily exceeds one academic term, so the minimum follow-up window is satisfied.
  - Final summary sentence: Criterion T is met because outcomes are measured from the start to the end of a full school year, which is longer than a single academic term.
- D
  Documented Control Group
  - The control condition is described as business-as-usual and the paper reports control-group counts, demographics, and baseline differences.
  - "As in similar studies, we compare the CAIforALL Curriculum to a business-as-usual control, meaning that comparison teachers used curriculum or lesson plans they would typically have used in the absence of the study, which often included ad-hoc CS lessons from Code.org."
  - Relevant Quotes:
    1) "As in similar studies, we compare the CAIforALL Curriculum to a business-as-usual control, meaning that comparison teachers used curriculum or lesson plans they would typically have used in the absence of the study, which often included ad-hoc CS lessons from Code.org." (p. 382)
    2) "Given that participating districts lacked any adopted elementary curriculum in programming, computational thinking, or computer science prior to the study, we anticipated minimal and inconsistent exposure to CT instruction for control group students." (p. 382)
    3) "Table 1 summarizes key attributes of the student sample in the study." (p. 382)
    4) "In the analytic samples, the baseline differences between treatment and control group pre-ESCAS responses were between 0.07 and 0.24 standard deviations." (p. 383)
  - Detailed Analysis: Criterion D requires that the control group be well documented, including what they experienced and baseline characteristics. The paper defines the control as business-as-usual and provides concrete context for what that meant (often ad-hoc CS lessons such as Code.org, and likely minimal exposure). It also reports control vs. treatment counts and demographic attributes in Table 1 and discusses baseline differences and adjustment, which further documents comparability.
  - Final summary sentence: Criterion D is met because the paper clearly defines the business-as-usual control condition and reports control-group characteristics and baseline information.
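
The block randomization quoted under Criterion C (half of the teachers in each cluster assigned to treatment, half to control) can be illustrated with a minimal sketch. This is not the authors' procedure: the block names, teacher IDs, seed, and the rule that an odd teacher out goes to control are all hypothetical assumptions made for illustration.

```python
import random


def block_randomize(blocks, seed=0):
    """Assign half of the teachers in each block to treatment, the rest to control.

    `blocks` maps a block label to a list of teacher IDs (both hypothetical).
    """
    rng = random.Random(seed)  # seeded so the assignment can be reproduced
    assignment = {}
    for block_name in sorted(blocks):
        teachers = list(blocks[block_name])
        rng.shuffle(teachers)
        # Assumption: with an odd block size, the extra teacher goes to control.
        n_treat = len(teachers) // 2
        for i, teacher in enumerate(teachers):
            assignment[teacher] = "treatment" if i < n_treat else "control"
    return assignment


# Hypothetical blocks; the paper does not list its blocking variables here.
blocks = {
    "block_1": ["t01", "t02", "t03", "t04"],
    "block_2": ["t05", "t06"],
}
assignment = block_randomize(blocks, seed=42)
```

Because assignment happens per teacher, every student on a teacher's roster shares one condition, which is the property Criterion C looks for.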
Level 2 Criteria
- S
  School-level RCT
  - Randomization was at the teacher/class level rather than by assigning entire schools to treatment vs. control.
  - "We block-randomized teachers to treatment and control conditions..."
  - Relevant Quotes:
    1) "We conducted a cluster randomized controlled trial with 1,325 students in grades 3–5 across 23 schools in two suburban districts." (p. 379)
    2) "We block-randomized teachers to treatment and control conditions after the recruitment period ended and before the summer professional development (PD)." (p. 382)
    3) "Within each cluster of teachers, we randomly assigned half the participating teachers to the treatment condition and half to the control condition." (p. 382)
  - Detailed Analysis: Criterion S requires school-level randomization (schools assigned to treatment vs. control). While 23 schools participated, the randomization described is explicitly at the teacher level. No quoted text indicates that entire schools were randomized to conditions.
  - Final summary sentence: Criterion S is not met because assignment occurred at the teacher/classroom level rather than at the school level.
- I
  Independent Conduct
  - The paper describes research-team-delivered PD and support, but it does not clearly state that evaluation/data collection and analysis were conducted independently from the intervention designers.
  - "Throughout the school year, teachers receive additional support through monthly meetings with the research team to discuss successes and challenges in implementing the curriculum."
  - Relevant Quotes:
    1) "Teachers learn to implement the CAIforALL Curriculum through a 25-hour in-person professional development (PD) session." (p. 381)
    2) "Throughout the school year, teachers receive additional support through monthly meetings with the research team to discuss successes and challenges in implementing the curriculum." (p. 381)
    3) "The project team recruited participants from two large suburban public school districts in Southern California in Spring 2023, representing 23 schools across both districts." (p. 382)
    4) "All participating teachers submitted class rosters to the evaluation team at the start of the school year..." (p. 383)
  - Detailed Analysis: Criterion I requires clear documentation that the study’s conduct (especially evaluation and analysis) is independent from the intervention designers/providers. The paper indicates the research/project team delivered extensive implementation support (PD and monthly meetings). While it mentions an "evaluation team," the paper does not provide an explicit statement that the evaluators and analysts were independent of the curriculum designers, nor does it describe an external third-party evaluation organization or independence safeguards.
  - Final summary sentence: Criterion I is not met because the paper does not clearly document independent third-party conduct of the evaluation separate from the intervention design/support team.
- Y
  Year Duration
  - Outcomes were measured from the start to the end of the school year, satisfying the year-duration requirement.
  - "All participating teachers completed the weekly logs throughout the school year and administered the post-ESCAS to their students at the end of the school year."
  - Relevant Quotes:
    1) "The treatment group used an ELA-integrated CT curriculum for a school year..." (p. 379)
    2) "We enrolled students into the study at the start of the 2023-2024 school year..." (p. 382)
    3) "All participating teachers ... administered the pre-ESCAS to their students." (p. 383)
    4) "All participating teachers ... administered the post-ESCAS to their students at the end of the school year." (p. 383)
  - Detailed Analysis: Criterion Y requires outcome measurement at least 75% of one academic year after the intervention begins. The paper describes a school-year intervention with pre measurement at the start of the 2023-2024 school year and post measurement at the end of the school year, which corresponds to essentially the full academic year.
  - Final summary sentence: Criterion Y is met because the study measures outcomes across an entire academic year.
- B
  Balanced Control Group
  - Treatment received substantially more CS/CT instructional time (and teacher support) than control, and the paper notes it cannot separate curriculum effects from increased instruction time.
  - "As expected, control teachers reported spending significantly less time on programming, CT, or CS instruction compared to treatment teachers, reporting an average of 4.7 (SD = 9.4) hours of instruction for the entire school year, t(47.59) = -4.92, p = < 0.001."
  - Relevant Quotes:
    1) "Teachers learn to implement the CAIforALL Curriculum through a 25-hour in-person professional development (PD) session." (p. 381)
    2) "Throughout the school year, teachers receive additional support through monthly meetings with the research team..." (p. 381)
    3) "Treatment teachers reported teaching an average of 72.8% of the curriculum (SD = 25.2%), spending a mean of 17.1 hours (SD = 8.7) of instructional time teaching the CAIforALL curriculum over the year." (p. 383)
    4) "As expected, control teachers reported spending significantly less time on programming, CT, or CS instruction compared to treatment teachers, reporting an average of 4.7 (SD = 9.4) hours of instruction for the entire school year, t(47.59) = -4.92, p = < 0.001." (p. 383)
    5) "Ten control teachers reported not teaching any programming, CT, or CS at all." (p. 383)
    6) "Finally, the analyses presented here do not distinguish between effects of the specific curriculum content and the effects of increased CS instruction time." (p. 384)
  - Detailed Analysis: Criterion B compares the nature, quantity, and quality of resources (time, training/support, materials) across conditions, unless extra resources are explicitly the treatment variable. The treatment condition includes substantial added resources: 25-hour PD and monthly meetings with the research team, and it results in much more CS/CT instructional time delivered (mean 17.1 hours) than in the control condition (mean 4.7 hours, with some control teachers providing none). The paper frames the intent as estimating the effect of "changing the curriculum" relative to business-as-usual, not as explicitly testing "more CS time" as the treatment variable. The paper also explicitly acknowledges the confound by stating the analyses do not distinguish curriculum-content effects from increased CS instruction time. Under the Criterion B decision logic, this is a non-negligible resource imbalance without a matched control substitute and without being explicitly positioned as the core treatment variable.
  - Final summary sentence: Criterion B is not met because treatment received substantially more time and support than control, and the paper states increased instruction time is not separated from curriculum effects.
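
The instructional-time comparison quoted under Criterion B reports fractional degrees of freedom, t(47.59), which is consistent with a Welch (unequal-variance) t-test, although the quoted text does not name the test. The sketch below shows how such a statistic is computed from summary values. The group sizes of 25 teachers each are hypothetical, and treating the 17.1-hour CAIforALL figure as the treatment group's comparison value is also an assumption, so the result is illustrative and is not a reproduction of the paper's analysis.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    var_a = sd_a ** 2 / n_a  # squared standard error of group A's mean
    var_b = sd_b ** 2 / n_b  # squared standard error of group B's mean
    t_stat = (mean_a - mean_b) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1)
    )
    return t_stat, df


# Means and SDs are the hours quoted in the review (control first, then
# treatment); n = 25 per group is a hypothetical value, not from the paper.
t_stat, df = welch_t(4.7, 9.4, 25, 17.1, 8.7, 25)
hours_gap = 17.1 - 4.7  # mean difference in reported instructional hours
```

The negative sign of the statistic reflects the ordering used here (control minus treatment), matching the direction of the quoted result.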
Level 3 Criteria
- R
  Reproduced
  - No independent peer-reviewed replication of this specific RCT was found in the paper or via external search.
  - Relevant Quotes:
    1) "We designed this study as a cluster-randomized trial..." (p. 382)
  - Detailed Analysis: Criterion R requires an independent replication by a different research team, in a different context, published in a peer-reviewed scientific outlet. The paper itself does not claim that this specific CAIforALL Act 1 curriculum RCT has been replicated by an independent team.
  - External search notes (performed 2026-03-13): Searching for papers that explicitly replicate this CAIforALL Act 1 school-year cluster RCT (same intervention and core design) did not identify a clearly independent replication study by other authors. Given the paper’s 2026 publication timing, the absence of identified replications is not surprising.
  - Final summary sentence: Criterion R is not met because no independent peer-reviewed replication of this specific study was found.
- A
  All-subject Exams
  - Because standardized exam-based outcomes are not used (criterion E is not met), the all-subject standardized-exams criterion is also not met.
  - "Pre- and post-surveys measured five coding attitude constructs: confidence, interest, utility, perceived coding values of social circles, and perceptions of young coders."
  - Relevant Quotes:
    1) "Pre- and post-surveys measured five coding attitude constructs: confidence, interest, utility, perceived coding values of social circles, and perceptions of young coders." (p. 379)
    2) "The Elementary Student Coding Attitudes Survey (ESCAS) is an instrument designed to measure elementary students’ attitudes toward coding." (p. 383)
  - Detailed Analysis: Criterion A requires standardized exam-based assessment across all main subjects, and per the ERCT rule, Criterion A cannot be met if Criterion E is not met. This study uses a coding-attitudes survey rather than standardized achievement exams in any subject. Therefore it does not meet the all-subject standardized exam requirement.
  - Final summary sentence: Criterion A is not met because the study does not use standardized exam-based assessments (so E, and thus A, fail).
- G
  Graduation Tracking
  - The study reports outcomes only through the end of one school year, and no follow-up publication was found that tracks the cohort through graduation.
  - "Future work should determine whether observed confidence gains persist and whether delayed effects emerge for other constructs."
  - Relevant Quotes:
    1) "All participating teachers completed the weekly logs throughout the school year and administered the post-ESCAS to their students at the end of the school year." (p. 383)
    2) "Future work should determine whether observed confidence gains persist and whether delayed effects emerge for other constructs." (p. 384)
  - Detailed Analysis: Criterion G requires tracking participants until graduation from the relevant educational stage. In this paper, outcomes are measured at the end of the school year, and the discussion explicitly frames persistence as an open question for "future work," which indicates the current study does not include longer-term follow-up to an educational graduation milestone.
  - External search notes (performed 2026-03-13): No subsequent peer-reviewed follow-up paper by the same author team was located that reports tracking this cohort through graduation.
  - Final summary sentence: Criterion G is not met because the study ends at end-of-year measurement and no located follow-up tracks students through graduation.
- P
  Pre-Registered
  - The paper reports pre-registration on REES prior to collecting any outcome data, but the registry record could not be independently accessed without login.
  - "We pre-registered the planned analytic model with the Registry of Efficacy and Effectiveness Studies (REES) before the collection of any outcome data [24]."
  - Relevant Quotes:
    1) "We pre-registered the planned analytic model with the Registry of Efficacy and Effectiveness Studies (REES) before the collection of any outcome data [24]." (p. 383)
    2) "Yvonne Kao. 2023. Project IMPACT (Early50). Technical Report Registry ID: 17120.1v1. Registry of Efficacy and Effectiveness Studies." (p. 384)
    3) "We enrolled students into the study at the start of the 2023-2024 school year..." (p. 382)
  - Detailed Analysis: Criterion P requires that the study protocol be pre-registered before the study begins (i.e., before data collection). The paper explicitly states that the planned analytic model was pre-registered on REES before collecting any outcome data, and it provides a registry identifier (17120.1v1).
  - External verification note (performed 2026-03-13): REES search and registry entry details require an ICPSR Researcher Passport login, and the specific registry record could not be accessed publicly to confirm the registration date. Therefore, this criterion is marked as met based on the paper’s explicit statement and provided registry ID, but without independent date confirmation from the REES record.
  - Final summary sentence: Criterion P is met because the paper explicitly reports REES pre-registration before outcome data collection and provides a registry ID.
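
The twelve verdicts above can be collected into a small data structure for a quick consistency check. The sketch below records each criterion's outcome as stated in this breakdown and enforces the one dependency the breakdown names (Criterion A cannot be met if Criterion E is not met). The function names are hypothetical, and the sketch deliberately does not compute an overall ERCT level, since the rule for that is not stated here.

```python
# Verdicts transcribed from the breakdown above: True = met, False = not met.
VERDICTS = {
    "Level 1": {"C": True, "E": False, "T": True, "D": True},
    "Level 2": {"S": False, "I": False, "Y": True, "B": False},
    "Level 3": {"R": False, "A": False, "G": False, "P": True},
}


def dependency_holds(verdicts):
    """Check the stated rule: A may only be met when E is met."""
    flat = {code: met for level in verdicts.values() for code, met in level.items()}
    return flat["E"] or not flat["A"]


def met_per_level(verdicts):
    """Count how many criteria are met within each level."""
    return {level: sum(items.values()) for level, items in verdicts.items()}


def met_criteria(verdicts):
    """List the codes of all criteria marked as met, in document order."""
    return [code for level in verdicts.values() for code, met in level.items() if met]


counts = met_per_level(VERDICTS)
met = met_criteria(VERDICTS)
consistent = dependency_holds(VERDICTS)
```

Keeping the verdicts in one place makes it easy to re-run the check if an author update changes any single criterion.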
