The role of working memory for learning with context-personalized tasks in elementary school

Ann-Kathrin Laufs, André Meyer, Maleika Krüger, and Sebastian Kempert

Published:
ERCT Check Date: 2026-03-14
DOI: 10.3389/fpsyg.2026.1671810
  • science
  • K12
  • EU
0
  • C (Class-level RCT): not met

    Randomization and delivery were organized at the small-group level rather than at the class (or school) level.

    "we treated the group as the unit of randomization." (p. 9)

  • E (Exam-based Assessment): not met

    The primary educational outcome (CVS comprehension) was measured with a research instrument rather than a widely recognized standardized exam.

    "we employed an instrument developed by Edelsbrunner et al. (2018)" (p. 7)

  • T (Term Duration): not met

    The study timeline (eight weeks of sessions, with a post-test two weeks later) is shorter than an academic term.

    "The study consisted of six sessions, each lasting 50 min, conducted over an eight-week period." (p. 5)

  • D (Documented Control Group): met

    The control condition, control sample size, and baseline descriptive statistics by condition are clearly documented.

    "In the control condition, magnetism served as the task context" (p. 6)

  • S (School-level RCT): not met

    Multiple schools participated, but assignment was not randomized at the school level.

    "we treated the group as the unit of randomization." (p. 9)

  • I (Independent Conduct): not met

    The paper does not provide a clear statement that an independent third party conducted the evaluation independently of the intervention designers.

    "Trained test administrators, who were teacher trainees or psychology students, facilitated the sessions" (p. 5)

  • Y (Year Duration): not met

    Outcomes were not measured at least 75% of an academic year after the intervention began, and under ERCT rules Y cannot be met when T is not met.

    "conducted over an eight-week period." (p. 5)

  • B (Balanced Control Group): met

    The two conditions were designed to be equivalent in time and structure, differing primarily by contextual embedding.

    "differed only with respect to their thematic embedding" (p. 6)

  • R (Reproduced): not met

    No independent replication of this specific trial was found in the paper or via targeted online searches as of the ERCT check date.

  • A (All-subject Exams): not met

    The study does not report standardized exam outcomes across all core subjects, and ERCT rules make A not met when E is not met.

    "we employed an instrument developed by Edelsbrunner et al. (2018)" (p. 7)

  • G (Graduation Tracking): not met

    The study does not track students until graduation, and ERCT rules also make G not met when Y is not met.

  • P (Pre-Registered): not met

    No pre-registration registry, ID, or pre-registration date is reported, and no public pre-registration record was found in targeted online searches.

Abstract

Context personalization is an instructional approach aimed at enhancing students’ engagement and cognitive processing by embedding learning content in familiar contexts. Numerous studies explore the benefits of personalized tasks for learning, but few empirically examine cognitive mechanisms underlying the effects of context personalization. In a cluster-randomized control trial with N = 156 elementary school students, we investigated (1) whether context personalization leads to an increased interest in the learning content. Furthermore, we examined (2) the role of working memory for learning and (3) whether the assumed effect of working memory on students’ learning performance was moderated by the use of context-personalized tasks. The results indicate that context personalization elicits interest in the learning content. In addition, working memory was a significant predictor of student performance across conditions. However, the hypothesized moderating effect of context personalization on the relationship between working memory and student performance was not supported. These results contribute to a more nuanced understanding of the cognitive and motivational effects of context-personalized tasks in elementary science education.

Full Article

ERCT Criteria Breakdown

  • Level 1 Criteria

    • C

      Class-level RCT

      • Randomization and delivery were organized at the small-group level rather than at the class (or school) level.
      • "we treated the group as the unit of randomization." (p. 9)
      • Relevant Quotes:
        1) "Based on their interest ratings, we then assigned students to small groups of four on average." (p. 6)
        2) "Although we tried to assign students to the conditions randomly, rigorous randomization was not possible" (p. 6)
        3) "we treated the group as the unit of randomization." (p. 9)
      • Detailed Analysis: Criterion C requires random assignment at the class level (or a stronger unit such as the school), unless the intervention is one-to-one tutoring/personal teaching, which is not the case here. The paper states that students were organized into "small groups of four" and that the "group" was treated as the randomization unit, which is neither class-level nor school-level randomization. The paper also notes that "rigorous randomization was not possible," further weakening allocation rigor under the ERCT C criterion. Criterion C is not met because the unit of randomization was small groups rather than classes (or schools).
    • E

      Exam-based Assessment

      • The primary educational outcome (CVS comprehension) was measured with a research instrument rather than a widely recognized standardized exam.
      • "we employed an instrument developed by Edelsbrunner et al. (2018)" (p. 7)
      • Relevant Quotes:
        1) "we employed an instrument developed by Edelsbrunner et al. (2018)" (p. 7)
        2) "we selected nine out of the total of 14 test items." (p. 7)
      • Detailed Analysis: Criterion E requires standardized exam-based assessments (i.e., externally standardized and widely recognized tests) rather than researcher-selected or researcher-assembled instruments. The study’s main learning outcome is CVS comprehension, assessed with an instrument from Edelsbrunner et al. (2018) and a study-specific selection of items ("nine out of ... 14"). This is a research measurement instrument, not a standardized exam-based assessment system (e.g., state or national exams). Criterion E is not met because the main educational outcome is not measured with a standardized exam-based assessment.
    • T

      Term Duration

      • The study timeline (eight weeks of sessions, with a post-test two weeks later) is shorter than an academic term.
      • "The study consisted of six sessions, each lasting 50 min, conducted over an eight-week period." (p. 5)
      • Relevant Quotes:
        1) "The study consisted of six sessions, each lasting 50 min, conducted over an eight-week period." (p. 5)
        2) "We assessed CVS comprehension in a post-test 2 weeks later" (p. 6)
      • Detailed Analysis: Criterion T requires that outcomes be measured at least one full academic term (~3 to 4 months) after the intervention begins. The paper reports that the study ran over an "eight-week period" and that CVS comprehension was assessed in a "post-test 2 weeks later." Even interpreted generously as ~10 weeks from start to post-test, this is shorter than a typical academic term. Criterion T is not met because the start-to-measurement window spans weeks, not a full academic term.
    • D

      Documented Control Group

      • The control condition, control sample size, and baseline descriptive statistics by condition are clearly documented.
      • "In the control condition, magnetism served as the task context" (p. 6)
      • Relevant Quotes:
        1) "In the control condition, magnetism served as the task context" (p. 6)
        2) "The result was a total of 28 groups in the personalized condition" (p. 6)
        3) "13 groups in the control condition with magnetism as a task context." (p. 6)
        4) "(personalized condition: 100, control condition: 56)." (p. 6)
        5) "Table 4 shows the mean scores and standard deviations" (p. 9)
      • Detailed Analysis: Criterion D requires that the control group be well documented, including what it received and sufficient descriptive information to support comparisons. The paper clearly describes the control condition (CVS instruction embedded in a magnetism context) and reports group counts and student sample sizes by condition. It also reports baseline descriptive statistics by condition in Table 4 (e.g., pre-test CVS and other measured variables), which supports comparability assessment. Criterion D is met because the control condition and its key characteristics are clearly documented.
  • Level 2 Criteria

    • S

      School-level RCT

      • Multiple schools participated, but assignment was randomized at the group level, not the school level.
      • "we treated the group as the unit of randomization." (p. 9)
      • Relevant Quotes:
        1) "at six elementary schools" (p. 5)
        2) "we treated the group as the unit of randomization." (p. 9)
      • Detailed Analysis: Criterion S requires randomization among schools (or equivalent sites). Although the study took place "at six elementary schools," the paper explicitly states that randomization was done at the group level, not the school level. Criterion S is not met because randomization was not conducted at the school level.
    • I

      Independent Conduct

      • The paper does not provide a clear statement that an independent third party conducted the evaluation independently of the intervention designers.
      • "Trained test administrators, who were teacher trainees or psychology students, facilitated the sessions" (p. 5)
      • Relevant Quotes:
        1) "Trained test administrators, who were teacher trainees or psychology students, facilitated the sessions" (p. 5)
        2) "we administered an adaptive, researcher-developed computer-based questionnaire" (p. 6)
      • Detailed Analysis: Criterion I requires evidence that the study was conducted independently of the intervention designers (e.g., external evaluators handling implementation and/or data collection and analysis). The paper describes trained administrators facilitating sessions and explicitly describes at least one instrument as "researcher-developed." However, it contains no explicit statement that an independent evaluation team (a separate institution or agency) handled the implementation, data collection, or analysis independently of the intervention designers. Criterion I is not met because independence of the evaluation is not clearly documented.
    • Y

      Year Duration

      • Outcomes were not measured at least 75% of an academic year after the intervention began, and under ERCT rules Y cannot be met when T is not met.
      • "conducted over an eight-week period." (p. 5)
      • Relevant Quotes:
        1) "conducted over an eight-week period." (p. 5)
        2) "post-test 2 weeks later" (p. 6)
      • Detailed Analysis: Criterion Y requires outcomes to be measured at least 75% of an academic year after the intervention begins. The reported duration is an "eight-week period" with a "post-test 2 weeks later," far shorter than an academic year. In addition, under the ERCT rules, Y cannot be met when criterion T is not met; since T is not met here, Y fails on that ground as well. Criterion Y is not met because the study duration falls far short of 75% of an academic year (and T is not met).
    • B

      Balanced Control Group

      • The two conditions were designed to be equivalent in time and structure, differing primarily by contextual embedding.
      • "differed only with respect to their thematic embedding" (p. 6)
      • Relevant Quotes:
        1) "differed only with respect to their thematic embedding" (p. 6)
        2) "the underlying task and sentence structure were identical:" (p. 6)
        3) "The study consisted of six sessions, each lasting 50 min" (p. 5)
        4) "for the personalized condition, there was a short conversation" (p. 6)
      • Detailed Analysis: Criterion B asks whether the control condition provides a comparable substitute for the intervention’s time and resources, unless extra resources are themselves the treatment variable. The paper states that the conditions "differed only" in thematic embedding, that the "underlying task and sentence structure were identical," and that both conditions received the same overall dosage (six 50-minute sessions). This strongly supports balanced time and instructional structure. The personalized condition additionally included "a short conversation" about students’ interests, but this appears to be a minor component of the personalization manipulation itself rather than a meaningful, separable extra resource (e.g., additional instructional time or budget) that would confound interpretation. Criterion B is met because the control condition is structurally matched in time and resources, with differences limited primarily to contextual personalization.
  • Level 3 Criteria

    • R

      Reproduced

      • No independent replication of this specific trial was found in the paper or via targeted online searches as of the ERCT check date.
      • Relevant Quotes: (No relevant quote: the paper does not claim replication, and no replication paper was identified during the online search.)
      • Detailed Analysis: Criterion R requires an independent replication by a different research team, published in a peer-reviewed venue. The paper does not state that it is a replication, nor does it cite a later independent replication of this specific trial. Targeted online searches on 2026-03-14 using the DOI, full title, and author names did not identify any peer-reviewed independent replication of this exact study (likely because the paper was published very recently). Criterion R is not met because no independent replication evidence was found.
    • A

      All-subject Exams

      • The study does not report standardized exam outcomes across all core subjects, and ERCT rules make A not met when E is not met.
      • "we employed an instrument developed by Edelsbrunner et al. (2018)" (p. 7)
      • Relevant Quotes:
        1) "we employed an instrument developed by Edelsbrunner et al. (2018)" (p. 7)
        2) "We conducted a standardized test of reading comprehension" (p. 8)
      • Detailed Analysis: Criterion A requires standardized exam-based assessment across all main subjects, with the prerequisite that if criterion E is not met, criterion A cannot be met. Here, the main outcome is CVS comprehension measured with a research instrument, so E is not met. The study does include a standardized reading comprehension test, but it does not present standardized exam outcomes across the core school subjects (e.g., language arts, mathematics, science) needed to evaluate broad, cross-subject trade-offs. Criterion A is not met because E is not met and all-subject standardized exams are not reported.
    • G

      Graduation Tracking

      • The study does not track students until graduation, and ERCT rules also make G not met when Y is not met.
      • Relevant Quotes: (No relevant quote: the paper does not describe tracking to graduation.)
      • Detailed Analysis: Criterion G requires follow-up tracking until graduation. The paper reports only short-term measurements around the intervention period and describes no graduation tracking or administrative linkage to later graduation outcomes. In addition, under the ERCT rules, G cannot be met when criterion Y is not met; since Y is not met, G fails on that ground as well. Criterion G is not met because there is no graduation tracking and Y is not met.
    • P

      Pre-Registered

      • No pre-registration registry, ID, or pre-registration date is reported, and no public pre-registration record was found in targeted online searches.
      • Relevant Quotes: (No relevant quote: the paper does not mention pre-registration, a registry platform, or a registration ID.)
      • Detailed Analysis: Criterion P requires that the study protocol be pre-registered before data collection begins, with an identifiable registry record. The paper contains no mention of protocol pre-registration (no registry, ID, or date). Targeted online searches on 2026-03-14 using the DOI, full title, and author names did not identify a public pre-registration record for this study. Criterion P is not met because there is no evidence of pre-registration.
