Dissecting the Flipped Classroom: Using a Randomized Controlled Trial Experiment to Determine When Student Learning Occurs

Matthew D. Casselman, Kinnari Atit, Grace Henbest, Cybill Guregyan, Kiana Mortezaei, and Jack F. Eichler

DOI: 10.1021/acs.jchemed.9b00767
  • science
  • higher education
  • US
  • flipped classroom
  • blended learning
  • EdTech platform
  • digital assessment

Abstract

The use of the flipped classroom approach in higher education STEM courses has rapidly increased over the past decade, and it appears this type of learning environment will play an important role in improving student success and retention in undergraduate chemistry “gatekeeper” courses. Many adopters of the flipped classroom structure see the greatest benefit originating from the additional time this format provides for the implementation of student-centered learning activities during the classroom period. However, results from recent quasi-experiments suggest that improved course performance for students in flipped classroom environments has a significant contribution from the online preclass activities. In order to compare the impact of the preclass online learning environment to the in-class collaborative activities typically done in a flipped classroom, a randomized controlled trial (RCT) was conducted with student volunteers. A two-day organic chemistry stereochemistry unit was delivered to students who were randomly assigned to “flipped classroom” and “traditional lecture” treatment groups. Performance gains were measured after each phase of the instructional intervention for both treatment groups, and these gains were compared to students from a randomly assigned negative control group. A mixed-methods ANOVA indicates that under these experimental conditions the online learning component appears to account for most of the improvement in posttest scores observed in the flipped classroom treatment. These results suggest optimizing the design of the asynchronous online learning environment will positively impact student performance outcomes. Therefore, this component of the flipped classroom deserves more attention from instructional designers and classroom practitioners.
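
For readers who want to see how the analysis described above is typically set up, below is a minimal sketch (not the authors' code) of a mixed between-within ANOVA run on hypothetical pre/posttest scores for three randomly assigned groups, mirroring the design summarized in the abstract. The pingouin library, the column names, the group sizes, and the simulated score distributions are all illustrative assumptions, not values taken from the study.

```python
# Minimal sketch (not the authors' code): a mixed between-within ANOVA on
# hypothetical pre/posttest scores for three randomly assigned groups.
# Group sizes, means, and spreads below are invented for illustration only.
import numpy as np
import pandas as pd
import pingouin as pg  # assumed dependency: pip install pingouin

rng = np.random.default_rng(0)

def simulate_group(name, n, pre_mean, post_mean, sd=3.0):
    """Return long-format rows of simulated pre- and posttest scores for one group."""
    pre = rng.normal(pre_mean, sd, n)
    post = rng.normal(post_mean, sd, n)
    subjects = [f"{name}_{i}" for i in range(n)]
    return pd.DataFrame({
        "subject": subjects * 2,                 # each subject appears at both time points
        "group": name,
        "time": ["pre"] * n + ["post"] * n,
        "score": np.concatenate([pre, post]),
    })

# Hypothetical data: the negative control shows little gain, both treatments improve.
df = pd.concat([
    simulate_group("negative_control", 16, pre_mean=8, post_mean=9),
    simulate_group("traditional_lecture", 20, pre_mean=8, post_mean=14),
    simulate_group("flipped_classroom", 20, pre_mean=8, post_mean=15),
], ignore_index=True)

# Mixed ANOVA: 'group' is the between-subjects factor, 'time' (pre vs. post)
# is the within-subjects factor; a significant group x time interaction
# would indicate that the groups' performance gains differ.
aov = pg.mixed_anova(data=df, dv="score", within="time",
                     subject="subject", between="group")
print(aov.round(4))
```

In the study itself, measured test scores would replace the simulated ones; the sketch only illustrates the long-format data layout and the roles of the between- and within-subjects factors.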


ERCT Criteria Breakdown

  • Level 1 Criteria

    • C

      Class-level RCT

      • Random assignment occurred at the individual student level, not by entire class or school.
      • “The available participants were then randomly assigned to one of the three study groups prior to commencing the study.” (p. 3)
      • Relevant Quotes:
        1) “The available participants were then randomly assigned to one of the three study groups prior to commencing the study (see Table 1)” (p. 3)
        2) “Students randomly assigned to the negative control group only completed the day 0 and day 7 pre- and posttests and did not receive any learning interventions.” (p. 3)
      • Detailed Analysis: Both quotes make clear that individual student volunteers, not intact classes or schools, were randomized to conditions. The ERCT standard’s Class‑level RCT (C) criterion requires randomization at the class or school level to avoid contamination across students in the same classroom. No exception for one‑on‑one tutoring applies here, as this is a group intervention. Therefore, criterion C is not met because the study randomized at the student level rather than class level.
    • E

      Exam-based Assessment

      • The study employed custom 22‑item free‑response tests rather than a standardized exam.
      • “The pre‑ and posttest assessments are 22‑item free response measures that probe student understanding of stereochemical properties of organic molecules.” (p. 3)
      • Relevant Quotes:
        1) “The pre- and posttest assessments are 22‑item free response measures that probe student understanding of stereochemical properties of organic molecules (see Supporting Information…).” (p. 3)
        2) “Because customized assessments had to be created to test the distinct set of learning objectives covered in the treatment group interventions, item‑analyses were carried out…” (p. 4)
      • Detailed Analysis: Both quotes confirm the outcome measures were bespoke, researcher‑designed free‑response instruments aligned to the intervention’s content, not a recognized standardized exam. This fails the ERCT Exam‑based Assessment (E) requirement to use standardized tests. Therefore, criterion E is not met because the study used custom instruments rather than a standardized exam.
    • T

      Term Duration

      • Outcomes were measured over a seven‑day period, not a full term.
      • “two learning intervention sessions during the seven‑day study period during the first and second weeks of the term” (Table 1)
      • Relevant Quotes:
        1) “… two learning intervention sessions during the seven‑day study period during the first and second weeks of the term …” (Table 1, p. 3)
        2) “Students randomly assigned to the negative control group only completed the day 0 and day 7 pre‑ and posttests…” (p. 3)
      • Detailed Analysis: The primary outcomes were measured over a one‑week period, far shorter than an academic term (~3–4 months). Thus, the ERCT Term Duration (T) criterion is not satisfied. Therefore, criterion T is not met because outcome measurement occurred within one week, not a full term.
    • D

      Documented Control Group

      • The negative control group’s composition and baseline data are clearly documented.
      • “Students randomly assigned to the negative control group only completed the day 0 and day 7 pre‑ and posttests and did not receive any learning interventions.” (p. 3)
      • Relevant Quotes:
        1) “Students randomly assigned to the negative control group only completed the day 0 and day 7 pre‑ and posttests and did not receive any learning interventions. These students provided a baseline comparison for the performance gains measured in the two treatment groups.” (p. 3)
        2) Table 2 (p. 5) lists “n = 16” for the negative control group under “Number of participants (n)”.
      • Detailed Analysis: The control group’s size, assessment schedule, and baseline role are clearly described in the methods and Table 2. This fulfills the ERCT Documented Control Group (D) requirement. Therefore, criterion D is met because the control group is fully documented.
  • Level 2 Criteria

    • S

      School-level RCT

      • Randomization was at the individual student level, not by school.
      • “The available participants were then randomly assigned to one of the three study groups prior to commencing the study.” (p. 3)
      • Relevant Quotes:
        1) “The available participants were then randomly assigned to one of the three study groups…” (p. 3)
        2) “Students randomly assigned to the negative control group only completed the day 0 and day 7 pre‑ and posttests…” (p. 3)
      • Detailed Analysis: Randomization occurred at the student level within a single university course, not at the school level. Hence, the ERCT School‑level RCT (S) criterion is not met. Therefore, criterion S is not met because randomization was not conducted at the school level.
    • I

      Independent Conduct

      • The intervention was designed, delivered, and assessed by the same team without independent oversight.
      • “Tests were scored by author M.D.C. using an objective answer key.” (p. 3)
      • Relevant Quotes:
        1) “Tests were scored by author M.D.C. using an objective answer key.” (p. 3)
        2) “The Playposit questions were embedded within the video … featuring the same instructor as the traditional lecture treatment.” (p. 4)
      • Detailed Analysis: The same research team designed, delivered, and assessed the intervention without third‑party oversight, failing the ERCT Independent Conduct (I) requirement. Therefore, criterion I is not met because no external evaluator or independent party was involved in conducting or assessing the study.
    • Y

      Year Duration

      • The study tracked outcomes over one week, not a full academic year.
      • “two learning intervention sessions during the seven‑day study period during the first and second weeks of the term” (Table 1)
      • Relevant Quotes:
        1) “two learning intervention sessions during the seven‑day study period during the first and second weeks of the term” (Table 1)
        2) “Students randomly assigned to the negative control group only completed the day 0 and day 7 pre‑ and posttests…” (p. 3)
      • Detailed Analysis: Follow‑up lasted only one week, far shorter than a full academic year. This fails the ERCT Year Duration (Y) requirement. Therefore, criterion Y is not met because the study did not span a full academic year.
    • B

      Balanced Resources

      • The extra evening sessions are central to the intervention and thus the unmatched control is acceptable.
      • “… two learning intervention sessions … held in the evenings outside of the time frame in which most classes on campus are scheduled.” (p. 3)
      • Relevant Quotes:
        1) “The flipped classroom and traditional lecture treatment groups attended two learning intervention sessions … held in the evenings outside of the time frame in which most classes on campus are scheduled.” (p. 3)
        2) “Students randomly assigned to the negative control group only completed the day 0 and day 7 pre‑ and posttests and did not receive any learning interventions.” (p. 3)
      • Detailed Analysis: The additional evening sessions are integral to the intervention itself, which tests the effect of extra instructional time in a flipped format. Under ERCT’s balanced‑resources exception for treatments explicitly testing extra resources, the control group’s “business as usual” condition is appropriate. Therefore, criterion B is met because the extra evening sessions were an intentional part of the flipped intervention.
  • Level 3 Criteria

    • R

      Reproduced Results

      • No independent replication of this RCT is reported.
      • Relevant Quotes: None; no mention of any independent replication study appears.
      • Detailed Analysis: The original paper and its supporting information do not reference any replication by a separate research team in a different context. Therefore, criterion R is not met because no independent replication by other researchers is reported.
    • A

      All Exams

      • Only stereochemistry outcomes were measured, and E was not met.
      • “The pre‑ and posttest assessments are 22‑item free response measures that probe student understanding of stereochemical properties of organic molecules.” (p. 3)
      • Relevant Quotes:
        1) “The pre‑ and posttest assessments are 22‑item free response measures that probe student understanding of stereochemical properties of organic molecules…” (p. 3)
      • Detailed Analysis: Only stereochemistry was assessed; no core subjects (e.g., mathematics, reading, science) were measured. In addition, because criterion E is not met, A cannot be met under ERCT rules. Therefore, criterion A is not met because assessments focused exclusively on stereochemistry rather than all core subjects.
    • G

      Graduation Tracking

      • Tracking ended after one week; no graduation‑level follow‑up.
      • Relevant Quotes:
        1) “Students randomly assigned to the negative control group only completed the day 0 and day 7 pre‑ and posttests and did not receive any learning interventions.” (p. 3)
      • Detailed Analysis: No follow‑up or tracking until graduation is reported (only a one‑week assessment window), so the ERCT Graduation Tracking (G) criterion is not satisfied. Therefore, criterion G is not met because tracking ended at posttest 2 with no further follow-up toward graduation.
    • P

      Pre-Registered Protocol

      • No pre‑registration of the study protocol is mentioned.
      • Relevant Quotes: None; no statement of pre‑registration or registry ID is provided.
      • Detailed Analysis: The paper does not mention any pre‑registered protocol or public registry entry prior to data collection, failing the ERCT Pre‑registered Protocol (P) criterion. Therefore, criterion P is not met because no pre‑registration of the study was reported.
