Effects of a professional development program on teacher beliefs about mathematics teaching and learning after one and two years: An experimental study

Robert C. Schoen, Christopher Rhoads, Bri Oshiro, and Wendy Bray

Published: Nov 15, 2025

ERCT Check Date: Apr 15, 2026

DOI: 10.1016/j.tate.2025.105297

mathematics
K12
US

C

Schools (a stronger unit than classes) were randomized, meeting the class-level-or-stronger RCT requirement.

"Within each matched pair, one school was randomly assigned to the CGI PD condition, with the other school assigned to the comparison condition." (p. 4)
E

The outcomes are teacher belief measures (questionnaires), not standardized exam-based student achievement assessments.

"Teacher beliefs about mathematics teaching and learning were measured using the B-MTL questionnaire (Schoen & LaVenia, 2019)." (p. 4)
T

The first post-intervention measurement in spring 2014 occurs well after the summer 2013 PD began, exceeding one academic term.

"We measured teacher beliefs again in spring 2014 after the first year of the PD program and again in spring 2015 after the second year of the PD program had been offered to teachers assigned to the CGI condition." (p. 4)
D

The comparison condition is described and both groups’ baseline characteristics and sizes are reported in tables.

"Teachers in schools assigned to the comparison condition engaged in practice-as-usual instruction and participation in PD opportunities in mathematics." (p. 4)
S

Schools were randomized to the CGI or comparison condition, satisfying the school-level RCT requirement.

"School-level randomization occurred within these matched pairs in the spring of 2013." (p. 4)
I

The authors state they neither developed nor delivered the CGI PD program and acted as a third-party evaluation team.

"The researchers did not develop or deliver the CGI PD program and thus represent a third-party evaluation team." (p. 3)
Y

Measurement in spring 2014 after a summer 2013 start spans most of an academic year, meeting the 75% year-duration threshold.

"We measured teacher beliefs again in spring 2014 after the first year of the PD program..." (p. 4)
B

The added PD time and supports are integral to the CGI PD intervention being tested against business-as-usual.

"Each of the two years of the multi-year CGI PD program implemented in this study included a 4-day workshop in the summer and two 2-day follow-up workshops held in school settings during the school year." (p. 4)
R

No independent replication by a different author team of this specific school-randomized CGI-PD-on-beliefs trial was found.

"Further study is needed to know whether these results can be replicated elsewhere and under different conditions." (p. 9)
A

Because standardized exam-based outcomes are not used (E is not met), the all-subject standardized exam requirement is not met.
G

Neither this paper nor identified follow-up materials track any participant cohort through a graduation endpoint.

"We followed the teachers for two years during this study." (p. 4)
P

No prospective pre-registration is documented in the paper, and the identified registry entry is explicitly retrospective.

Abstract

Teacher beliefs about mathematics teaching and learning are thought to exert a major influence on instructional practice and student learning. More than 200 teachers in 22 schools participated in a randomized controlled trial of the first two years of a three-year professional development program for elementary mathematics teachers called Cognitively Guided Instruction. We found that the program had large effects on teacher beliefs, with treatment-group teacher beliefs becoming more consistent with the principles undergirding the CGI program. Moderation analyses suggest that the effects were relatively homogeneous in relation to teachers’ baseline years of teaching experience and mathematical knowledge for teaching.

ERCT Criteria Breakdown

Level 1 Criteria
- C
  Class-level RCT
  - Schools (a stronger unit than classes) were randomized, meeting the class-level-or-stronger RCT requirement.
  - "Within each matched pair, one school was randomly assigned to the CGI PD condition, with the other school assigned to the comparison condition." (p. 4)
  - Relevant Quotes:
    1) "The study used a matched-pairs design where, within a district, schools were matched based on a measure of the average socioeconomic status of students in the school. Within each matched pair, one school was randomly assigned to the CGI PD condition, with the other school assigned to the comparison condition." (p. 4)
    2) "Thus, each of the 22 schools in the sample had an equal probability of assignment to the CGI or the comparison condition." (p. 4)
    3) "School-level randomization occurred within these matched pairs in the spring of 2013." (p. 4)
  - Detailed Analysis: Criterion C requires an RCT with randomization at the class level (or stronger). The paper explicitly states that randomization occurred at the school level (22 schools), which is stronger than class-level randomization and also reduces contamination risk within schools. Criterion C is met because randomization occurred at the school level, which is stronger than class-level randomization.
- E
  Exam-based Assessment
  - The outcomes are teacher belief measures (questionnaires), not standardized exam-based student achievement assessments.
  - "Teacher beliefs about mathematics teaching and learning were measured using the B-MTL questionnaire (Schoen & LaVenia, 2019)." (p. 4)
  - Relevant Quotes:
    1) "Teacher beliefs about mathematics teaching and learning were measured using the B-MTL questionnaire (Schoen & LaVenia, 2019)." (p. 4)
    2) "Teacher beliefs were measured through a self-report questionnaire." (p. 9)
  - Detailed Analysis: Criterion E requires standardized exam-based assessments for the educational outcome measures. In this paper, the focal outcomes are teacher beliefs measured with the B-MTL questionnaire, which is a self-report instrument rather than a standardized exam-based achievement test. Criterion E is not met because the study’s outcomes are measured using a self-report beliefs questionnaire rather than standardized exams.
- T
  Term Duration
  - The first post-intervention measurement in spring 2014 occurs well after the summer 2013 PD began, exceeding one academic term.
  - "We measured teacher beliefs again in spring 2014 after the first year of the PD program and again in spring 2015 after the second year of the PD program had been offered to teachers assigned to the CGI condition." (p. 4)
  - Relevant Quotes:
    1) "The first wave occurred in summer 2013, after random assignment and before the PD occurred." (p. 4)
    2) "We measured teacher beliefs again in spring 2014 after the first year of the PD program and again in spring 2015 after the second year of the PD program had been offered to teachers assigned to the CGI condition." (p. 4)
    3) "Each of the two years of the multi-year CGI PD program implemented in this study included a 4-day workshop in the summer and two 2-day follow-up workshops held in school settings during the school year." (p. 4)
  - Detailed Analysis: Criterion T requires outcomes be measured at least one academic term (~3–4 months) after the intervention begins. The PD begins with a summer workshop (summer 2013) and continues during the school year; the first posttest is in spring 2014, which is well beyond one term after the start. Criterion T is met because outcomes were measured in spring 2014 after PD began in summer 2013, exceeding one academic term.
- D
  Documented Control Group
  - The comparison condition is described and both groups’ baseline characteristics and sizes are reported in tables.
  - "Teachers in schools assigned to the comparison condition engaged in practice-as-usual instruction and participation in PD opportunities in mathematics." (p. 4)
  - Relevant Quotes:
    1) "Teachers in schools assigned to the comparison condition engaged in practice-as-usual instruction and participation in PD opportunities in mathematics." (p. 4)
    2) "A total of 206 teachers completed the 2013 pretest B-MTL administration, 101 of whom were in schools randomized to the CGI condition and 105 of whom were in schools randomized to the comparison condition." (p. 6)
    3) "Table 1 Teacher demographics for the year 1 analytic sample. Factor Comparison (n = 103) CGI (n = 94) Overall (n = 197)" (p. 5)
    4) "Table 4 Baseline and outcome descriptives for the analytic sample in year 1, split by treatment group." (p. 6)
  - Detailed Analysis: Criterion D requires that the control/comparison group be well documented (what it received) and described with information that allows comparison (e.g., counts and baseline characteristics). The paper describes the comparison condition as practice-as-usual and provides group sizes and multiple tables with demographics and baseline/outcome descriptives split by condition. Criterion D is met because the comparison condition and its sample characteristics are described and tabulated.
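The matched-pairs school randomization quoted under criterion C can be illustrated with a short sketch. This is not the authors' procedure or code: the function name `assign_matched_pairs`, the seed, and the school labels are hypothetical, and the pairs are assumed to have been formed already (the paper matched schools within district on average student socioeconomic status). Within each pair, one school is drawn at random for the CGI condition and the other goes to comparison, so every school has an equal probability of either assignment.

```python
import random


def assign_matched_pairs(pairs, seed=None):
    """Assign one school per matched pair to CGI and the other to comparison.

    `pairs` is a list of 2-tuples of school labels (hypothetical here).
    Returns a dict mapping each school label to its condition.
    """
    rng = random.Random(seed)  # seeded generator so the draw is reproducible
    assignment = {}
    for school_a, school_b in pairs:
        # A fair coin flip within the pair decides which school gets CGI PD.
        treated = rng.choice([school_a, school_b])
        control = school_b if treated == school_a else school_a
        assignment[treated] = "CGI"
        assignment[control] = "comparison"
    return assignment


# Hypothetical, already-matched pairs (not the study's actual schools).
example_pairs = [("school_A", "school_B"), ("school_C", "school_D"), ("school_E", "school_F")]
example_assignment = assign_matched_pairs(example_pairs, seed=2013)
for school, condition in sorted(example_assignment.items()):
    print(school, condition)
```

Because the draw happens inside each pair, the two conditions always end up with the same number of schools per pair, which is what keeps the arms balanced on the matching variable.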
Level 2 Criteria
- S
  School-level RCT
  - Schools were randomized to the CGI or comparison condition, satisfying the school-level RCT requirement.
  - "School-level randomization occurred within these matched pairs in the spring of 2013." (p. 4)
  - Relevant Quotes:
    1) "Within each matched pair, one school was randomly assigned to the CGI PD condition, with the other school assigned to the comparison condition." (p. 4)
    2) "School-level randomization occurred within these matched pairs in the spring of 2013." (p. 4)
    3) "A total of 15 schools in the first school district and 7 schools in the second school district met the eligibility criteria for enrollment." (p. 4)
  - Detailed Analysis: Criterion S requires randomization among schools. The paper states that 22 schools across two districts were enrolled and that randomization occurred at the school level within matched pairs. Criterion S is met because schools (not classes or individuals) were the randomized unit.
- I
  Independent Conduct
  - The authors state they neither developed nor delivered the CGI PD program and acted as a third-party evaluation team.
  - "The researchers did not develop or deliver the CGI PD program and thus represent a third-party evaluation team." (p. 3)
  - Relevant Quotes:
    1) "The present study used a randomized controlled trial research design to estimate the impact of a multi-year CGI program on teacher beliefs after one and two years of teacher participation in the program. The researchers did not develop or deliver the CGI PD program and thus represent a third-party evaluation team." (p. 3)
    2) "This amounted to approximately 52 contact hours per year for each teacher in a workshop setting facilitated by a CGI workshop leader from Teacher Development Group, the CGI PD provider." (p. 4)
  - Detailed Analysis: Criterion I requires that the study be conducted independently from the intervention designer/provider. The paper explicitly states that the researchers did not develop or deliver the PD and characterizes them as a third-party evaluation team. It also states the PD was facilitated by a workshop leader from Teacher Development Group (the provider), consistent with evaluator vs. provider separation. Criterion I is met because the paper explicitly states the researchers did not develop or deliver the PD and served as a third-party evaluation team.
- Y
  Year Duration
  - Measurement in spring 2014 after a summer 2013 start spans most of an academic year, meeting the 75% year-duration threshold.
  - "We measured teacher beliefs again in spring 2014 after the first year of the PD program..." (p. 4)
  - Relevant Quotes:
    1) "The first wave occurred in summer 2013, after random assignment and before the PD occurred." (p. 4)
    2) "We measured teacher beliefs again in spring 2014 after the first year of the PD program..." (p. 4)
    3) "Each of the two years of the multi-year CGI PD program implemented in this study included a 4-day workshop in the summer and two 2-day follow-up workshops held in school settings during the school year." (p. 4)
  - Detailed Analysis: Criterion Y requires outcomes measured at least 75% of an academic year after the intervention begins. The intervention begins with summer 2013 PD activities and follow-up workshops during the school year, and the first posttest occurs in spring 2014, which spans most of the academic year. Criterion Y is met because outcomes were measured in spring 2014 after the intervention began in summer 2013, covering most of an academic year.
- B
  Balanced Control Group
  - The added PD time and supports are integral to the CGI PD intervention being tested against business-as-usual.
  - "Each of the two years of the multi-year CGI PD program implemented in this study included a 4-day workshop in the summer and two 2-day follow-up workshops held in school settings during the school year." (p. 4)
  - Relevant Quotes:
    1) "Each of the two years of the multi-year CGI PD program implemented in this study included a 4-day workshop in the summer and two 2-day follow-up workshops held in school settings during the school year. This amounted to approximately 52 contact hours per year for each teacher in a workshop setting facilitated by a CGI workshop leader from Teacher Development Group, the CGI PD provider." (p. 4)
    2) "Teachers in schools assigned to the comparison condition engaged in practice-as-usual instruction and participation in PD opportunities in mathematics." (p. 4)
    3) "In addition, they had the opportunity to participate in a science PD program paid for by the research grant and selected by their respective school districts." (p. 4)
  - Detailed Analysis: Criterion B compares the nature, quantity, and quality of resources provided to intervention and control conditions, unless additional resources are explicitly the treatment variable or are integral to the intervention definition. Here, the intervention is a professional-development program (CGI PD) whose defining feature is teacher participation in structured PD (workshops and follow-ups), i.e., additional contact hours and associated supports are not an incidental add-on but the intervention itself. The comparison condition is practice-as-usual (with typical math PD opportunities) and an opportunity for a district-selected science PD program. Because the treatment being evaluated is the provision of CGI PD (including its time and support package) versus business-as-usual, the presence of additional PD contact hours in the treatment group is an integral part of the intended treatment contrast, consistent with the ERCT Criterion B exception for resources that are the treatment variable. Criterion B is met because the extra time/support is integral to the PD intervention being tested rather than an unintended confound.
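The size of the treatment contrast discussed under criterion B follows from simple arithmetic on the figures quoted from p. 4. The sketch below only restates those figures; the per-day average is a derived illustration that assumes contact hours were spread evenly across workshop days, which the paper does not state, and the variable names are illustrative.

```python
# Figures quoted from the paper (p. 4).
SUMMER_WORKSHOP_DAYS = 4          # one 4-day summer workshop per year
FOLLOW_UP_WORKSHOPS = 2           # two follow-up workshops per year
DAYS_PER_FOLLOW_UP = 2            # each follow-up lasts 2 days
CONTACT_HOURS_PER_YEAR = 52       # "approximately 52 contact hours per year"
YEARS_EVALUATED = 2               # the trial covers the first two program years

# Workshop days per program year: 4 + 2 * 2 = 8.
workshop_days_per_year = SUMMER_WORKSHOP_DAYS + FOLLOW_UP_WORKSHOPS * DAYS_PER_FOLLOW_UP

# Approximate total workshop contact hours over the evaluated period.
total_contact_hours = CONTACT_HOURS_PER_YEAR * YEARS_EVALUATED

# Assumption: hours are evenly distributed over workshop days.
average_hours_per_day = CONTACT_HOURS_PER_YEAR / workshop_days_per_year

print(f"Workshop days per year: {workshop_days_per_year}")
print(f"Approximate contact hours over two years: {total_contact_hours}")
print(f"Average hours per workshop day (assumed even split): {average_hours_per_day}")
```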
Level 3 Criteria
- R
  Reproduced
  - No independent replication by a different author team of this specific school-randomized CGI-PD-on-beliefs trial was found.
  - "Further study is needed to know whether these results can be replicated elsewhere and under different conditions." (p. 9)
  - Relevant Quotes:
    1) "Consequently, we do not know whether these results would be reproduced in a different sample or context (e.g., for teachers teaching in a different time or place, accountability systems, grade levels, using different textbooks, participating in a different CGI PD program, experiencing different amounts and types of support in their school building, or at different phases of their career, such as prospective (rather than practicing) teachers)." (p. 9)
    2) "Further study is needed to know whether these results can be replicated elsewhere and under different conditions." (p. 9)
  - Detailed Analysis: Criterion R requires an independent replication by a different research team in a different context, published in a peer-reviewed outlet, that reproduces this study’s central experimental claim (here: the causal effect of CGI PD on in-service teachers’ beliefs measured after one and two years in a school-randomized design). The paper itself emphasizes uncertainty about whether results would be reproduced and explicitly calls for replication. Internet searching identified related CGI publications and follow-ups by the same research program, but no clear peer-reviewed, independent replication study by a different author team that reproduces this specific RCT on teacher beliefs. Criterion R is not met because no independent replication of this specific trial’s teacher-beliefs findings was identified.
- A
  All-subject Exams
  - Because standardized exam-based outcomes are not used (E is not met), the all-subject standardized exam requirement is not met.
  - Relevant Quotes:
    1) "Teacher beliefs about mathematics teaching and learning were measured using the B-MTL questionnaire (Schoen & LaVenia, 2019)." (p. 4)
    2) "Teacher beliefs were measured through a self-report questionnaire." (p. 9)
  - Detailed Analysis: Criterion A requires standardized exam-based assessment across all main subjects and depends on criterion E being met. This study’s outcomes are not standardized exam-based achievement outcomes, so E is not met and A cannot be met. Criterion A is not met because criterion E is not met (no standardized exam-based outcomes).
- G
  Graduation Tracking
  - Neither this paper nor identified follow-up materials track any participant cohort through a graduation endpoint.
  - "We followed the teachers for two years during this study." (p. 4)
  - Relevant Quotes (this paper):
    1) "We followed the teachers for two years during this study." (p. 4)
    2) "We measured teacher beliefs again in spring 2014 after the first year of the PD program and again in spring 2015 after the second year of the PD program had been offered to teachers assigned to the CGI condition." (p. 4)
  - Relevant Quotes (follow-up / related registration material):
    3) "Three cohorts of students will be followed through third-grade." (Registry of Efficacy and Effectiveness Studies entry 453.1v1, p. 2)
  - Detailed Analysis: Criterion G requires tracking participants until graduation from the relevant educational stage. This paper follows teachers across two years (summer 2013 to spring 2015) and does not describe any graduation endpoint tracking. Internet searching for follow-up publications/materials connected to this RCT did not identify any document that tracks the relevant student cohort(s) through a graduation milestone (e.g., end of elementary school, high school graduation). The related registry material indicates planned follow-up only "through third-grade," which is far short of any graduation endpoint. Criterion G is not met because graduation tracking is not documented for any participant cohort.
- P
  Pre-Registered
  - No prospective pre-registration is documented in the paper, and the identified registry entry is explicitly retrospective.
  - Relevant Quotes (this paper):
    1) "De-identified data and replication code will be made available through Open Science Framework (osf.io) after acceptance and before publication." (p. 11)
  - Relevant Quotes (registration record located via internet search):
    2) "The first version of this entry was published on December 28, 2018 8:49:45 PM EST" (Registry of Efficacy and Effectiveness Studies entry 453.1v1, p. 1)
    3) "Timing of entry: Retrospective registration" (Registry of Efficacy and Effectiveness Studies entry 453.1v1, p. 2)
  - Detailed Analysis: Criterion P requires a publicly pre-registered protocol before data collection begins, with evidence of registry and timing. This paper includes a statement about data and replication code being made available on OSF after acceptance, but it does not provide a registration ID or a pre-data-collection registration date. Internet searching identified a Registry of Efficacy and Effectiveness Studies entry connected to the CGI experiment; however, it explicitly states "Retrospective registration" and shows a first-publication date in 2018, which is after the study’s 2013 data collection described in this paper. Criterion P is not met because no prospective pre-registration before data collection is documented.
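The twelve verdicts in this breakdown can be tallied per level with a small sketch. It only records the met/not-met outcomes stated above and the one dependency the text names (A cannot be met unless E is met). The names `LEVELS`, `VERDICTS`, and `met_by_level` are illustrative, and the sketch does not implement any official ERCT scoring rule.

```python
# Criteria grouped by level, in the order used in the breakdown.
LEVELS = {
    1: ["C", "E", "T", "D"],
    2: ["S", "I", "Y", "B"],
    3: ["R", "A", "G", "P"],
}

# Verdicts as stated in the breakdown (True = met, False = not met).
VERDICTS = {
    "C": True, "E": False, "T": True, "D": True,
    "S": True, "I": True, "Y": True, "B": True,
    "R": False, "A": False, "G": False, "P": False,
}


def met_by_level(verdicts):
    """Return, for each level, the list of criteria recorded as met."""
    return {
        level: [criterion for criterion in criteria if verdicts[criterion]]
        for level, criteria in LEVELS.items()
    }


# Dependency named in the text: A can only be met when E is met.
assert not (VERDICTS["A"] and not VERDICTS["E"])

tally = met_by_level(VERDICTS)
for level, met in tally.items():
    print(f"Level {level}: {len(met)} of {len(LEVELS[level])} met ({', '.join(met) or 'none'})")
print("Total met:", sum(len(met) for met in tally.values()), "of", len(VERDICTS))
```

For this study the tally gives three of four Level 1 criteria (C, T, D), all four Level 2 criteria, and none of the Level 3 criteria, for seven of twelve overall.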
