Digital personalised learning to improve literacy and numeracy outcomes: a randomised controlled trial in Kenyan pre-primary classrooms

Louis Major, Rebecca Daltry, Mary Otieno, Kevin Otieno, Annette Zhao, Chen Sun, Jessica Hinks, and Aidan Friedberg

Published:
ERCT Check Date:
DOI: 10.1080/02671522.2025.2605645
  • mathematics
  • reading
  • pre-K
  • kindergarten
  • Africa
  • blended learning
  • EdTech app
  • mobile learning
ERCT Level: 2
  • C: Class-level RCT

    The study randomised at the school (cluster) level, which is class-level or stronger and reduces contamination risks.

    "During this research, a stratified, two-arm, cluster-RCT with one treatment and one control group compared learning gains following the implementation of DPL over four school terms."

  • E: Exam-based Assessment

    The outcomes were measured using IDELA, a widely used and validated assessment tool rather than a study-created exam.

    "To evaluate the primary study outcomes, the free-to-access IDELA tool was used to assess emergent numeracy and literacy skills."

  • T: Term Duration

    Outcomes were tracked from baseline (October 2022) to endline (October 2023), which exceeds one academic term.

    "Assessment points were baseline (October 2022), midline (May 2023), and endline (October 2023)."

  • D: Documented Control Group

    The control group is clearly described as business-as-usual and baseline characteristics and scores are reported for both groups.

    "Control schools did not receive the DPL tool and continued to teach as usual, following the Kenyan pre-primary national curriculum."

  • S: School-level RCT

    Schools were the cluster unit and were randomly allocated to treatment or control, satisfying school-level randomisation.

    "The random allocation of all schools to treatment or control groups was carried out in August 2022."

  • I: Independent Conduct

    The paper explicitly states independence from the DPL provider for data collection, analysis, and conclusions, with independent enumerators collecting assessment data.

    "It is important to note that the research team maintained full independence in their collaboration with the DPL provider, who did not participate in data collection, analysis, or the formulation of conclusions."

  • Y: Year Duration

    The study assessed outcomes from October 2022 to October 2023 (about 13 months), exceeding 75% of an academic year.

    "Assessment points were baseline (October 2022), midline (May 2023), and endline (October 2023)."

  • B: Balanced Control Group

    The treatment received devices and implementation support, but these additional resources are integral to the DPL intervention being tested against business-as-usual schooling.

    "Control schools did not receive the DPL tool and continued to teach as usual, following the Kenyan pre-primary national curriculum."

  • R: Reproduced

    No independent peer-reviewed replication of this specific RCT was found in the paper or through internet searching.

    "This study investigates, for the first time, a DPL programme aligned with national curricula and teaching practices."

  • A: All-subject Exams

    Although IDELA is used, the study assesses literacy and numeracy only and does not measure impacts across all main subject areas.

    "To evaluate the primary study outcomes, the free-to-access IDELA tool was used to assess emergent numeracy and literacy skills."

  • G: Graduation Tracking

    No evidence was found that the study tracked the same learners through a graduation milestone beyond the endline in October 2023, and no follow-up paper reporting graduation tracking was identified.

    "Assessment points were baseline (October 2022), midline (May 2023), and endline (October 2023)."

  • P: Pre-Registered

    The study mentions protocol registration prior to analysis, but external checking indicates the publicly available protocol was published in October 2023 (after baseline began in October 2022), so it does not meet ERCT pre-registration timing requirements.

    "Details of the study’s design, methodology, and analysis plan were documented in a comprehensive research protocol, independently reviewed following peer-feedback, and registered on an open-access research repository prior to analysis to promote transparency and safeguard trial integrity (Major et al. 2023)."

Abstract

Research on digital personalised learning (DPL) alongside classroom teaching is limited in low- and middle-income countries. This study investigates, for the first time, a DPL programme aligned with national curricula and teaching practices. A randomised trial evaluated the impact of ‘classroom-integrated’ DPL on pre-primary literacy and numeracy in Kenya, involving 1955 learners aged 4 to 6 across 291 government schools. Learners engaged with DPL via a smartphone. DPL personalised the sequence of digital learning units based on learners’ device usage and teachers’ progression through digitised structured pedagogy lesson plans, while teachers could use their judgement to align DPL with teaching. Numeracy and literacy were assessed over 13 months. The findings show an effect of 0.534 SD from baseline to endline, with similar numeracy (0.450 SD) and literacy (0.449 SD) impacts.
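The abstract reports impacts in standard-deviation units (e.g. 0.534 SD). As an illustration only, the sketch below shows how a standardised effect is conventionally computed using a simple pooled-SD, Cohen's-d-style calculation; the trial's actual estimates almost certainly come from a baseline-adjusted, cluster-aware model, and the function name here is an assumption, not the authors' code.

```python
from statistics import mean, stdev

def standardised_effect(treatment_scores, control_scores):
    """Difference in group means divided by the pooled standard deviation.

    A textbook Cohen's-d-style effect size; real cluster-RCT analyses
    typically adjust for baseline scores and school-level clustering.
    """
    n_t, n_c = len(treatment_scores), len(control_scores)
    s_t, s_c = stdev(treatment_scores), stdev(control_scores)
    # Pooled SD weights each group's variance by its degrees of freedom.
    pooled_sd = (((n_t - 1) * s_t**2 + (n_c - 1) * s_c**2) / (n_t + n_c - 2)) ** 0.5
    return (mean(treatment_scores) - mean(control_scores)) / pooled_sd
```

An effect of roughly 0.5 SD, as reported here, means the average treated learner scored about half a pooled standard deviation above the average control learner on the outcome scale.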


ERCT Criteria Breakdown

  • Level 1 Criteria

    • C

      Class-level RCT

      • The study randomised at the school (cluster) level, which is class-level or stronger and reduces contamination risks.
      • "During this research, a stratified, two-arm, cluster-RCT with one treatment and one control group compared learning gains following the implementation of DPL over four school terms."
      • Relevant Quotes:
        1) "During this research, a stratified, two-arm, cluster-RCT with one treatment and one control group compared learning gains following the implementation of DPL over four school terms." (page number not available in provided text)
        2) "Randomisation consisted of two stages: (1) Cluster-randomisation to assign schools to treatment and control groups (2) Random selection of 10 learners within each school, stratified by gender" (page number not available in provided text)
      • Detailed Analysis: Criterion C requires that randomisation is at the class level (or stronger), rather than assigning students within the same class to different conditions (unless the intervention is one-to-one tutoring). The paper explicitly describes a "cluster-RCT" and clarifies that the clusters are schools, via "Cluster-randomisation to assign schools to treatment and control groups". Because the unit of assignment is the school, this is stronger than class-level randomisation and is consistent with ERCT's goal of reducing within-school contamination.
      • Final Summary: Criterion C is met because schools (clusters) were randomised to treatment or control rather than students within the same class.
    • E

      Exam-based Assessment

      • The outcomes were measured using IDELA, a widely used and validated assessment tool rather than a study-created exam.
      • "To evaluate the primary study outcomes, the free-to-access IDELA tool was used to assess emergent numeracy and literacy skills."
      • Relevant Quotes:
        1) "To evaluate the primary study outcomes, the free-to-access IDELA tool was used to assess emergent numeracy and literacy skills." (page number not available in provided text)
        2) "Developed by Save the Children, IDELA was rigorously piloted with 5300 learners across 11 countries over three years." (page number not available in provided text)
        3) "Validated for children aged 3–6 years, internal consistency, construct validity, and inter-rater and test–retest reliability are established (Pisani, Borisova, and Dowd 2015, 2018; Wolf et al. 2017)." (page number not available in provided text)
      • Detailed Analysis: Criterion E requires exam-based outcomes measured with a standardised, widely recognised assessment (not one created primarily for this specific study). The paper states that it used IDELA, names its developer (Save the Children), and provides evidence of multi-country piloting and established validity and reliability. While IDELA is not described as a national high-stakes exam, it is presented as a standardised early learning assessment used across contexts with established psychometric properties, which aligns with the intent of Criterion E (reducing bias from bespoke, intervention-aligned tests).
      • Final Summary: Criterion E is met because the study used the validated IDELA tool rather than a custom-made assessment.
    • T

      Term Duration

      • Outcomes were tracked from baseline (October 2022) to endline (October 2023), which exceeds one academic term.
      • "Assessment points were baseline (October 2022), midline (May 2023), and endline (October 2023)."
      • Relevant Quotes:
        1) "Assessment points were baseline (October 2022), midline (May 2023), and endline (October 2023)." (page number not available in provided text)
        2) "Numeracy and literacy were assessed over 13 months." (page number not available in provided text)
        3) "School years begin in January and consist of three terms ending late-October/early-November." (page number not available in provided text)
      • Detailed Analysis: Criterion T requires that outcomes are measured at least one full academic term after the intervention begins (typically 3–4 months). The paper provides explicit assessment dates (October 2022 baseline; May 2023 midline; October 2023 endline), implying follow-up well beyond a single term. The paper also states that outcomes were assessed over "13 months", which is substantially longer than a term.
      • Final Summary: Criterion T is met because the baseline-to-outcome measurement period exceeds one academic term by a large margin.
    • D

      Documented Control Group

      • The control group is clearly described as business-as-usual and baseline characteristics and scores are reported for both groups.
      • "Control schools did not receive the DPL tool and continued to teach as usual, following the Kenyan pre-primary national curriculum."
      • Relevant Quotes:
        1) "Control schools did not receive the DPL tool and continued to teach as usual, following the Kenyan pre-primary national curriculum." (page number not available in provided text)
        2) "Table 3. Comparison of learner characteristics and assessment scores in treatment and control groups at baseline (post-attrition)." (page number not available in provided text)
        3) "Overall IDELA score (mean and SD) 0.284 (0.156 SD) 0.280 (0.151 SD)" (page number not available in provided text)
      • Detailed Analysis: Criterion D requires that the control group is well documented, including who is in it, what it received, and baseline characteristics (demographics and baseline performance). The paper explicitly defines the control condition as not receiving the DPL tool and continuing standard instruction. It also reports comparative baseline information (sample sizes, gender distribution, and baseline IDELA, numeracy, and literacy scores) in a dedicated table comparing treatment and control groups.
      • Final Summary: Criterion D is met because the control condition and baseline comparability are described with explicit group definitions and baseline data.
  • Level 2 Criteria

    • S

      School-level RCT

      • Schools were the cluster unit and were randomly allocated to treatment or control, satisfying school-level randomisation.
      • "The random allocation of all schools to treatment or control groups was carried out in August 2022."
      • Relevant Quotes:
        1) "School-level randomisation The random allocation of all schools to treatment or control groups was carried out in August 2022." (page number not available in provided text)
        2) "Schools (n = 316) in each sub-county were randomly assigned to ‘treatment’ or ‘control’ in Microsoft Excel following methodological guidance (J-PAL, no date)." (page number not available in provided text)
      • Detailed Analysis: Criterion S requires randomisation at the school (institution/site) level, not merely at class level within schools. The paper has a "School-level randomisation" subsection and describes schools being assigned to treatment or control. Stratification by sub-county is an implementation detail that does not undermine school-level assignment; it clarifies how randomisation was performed given operational constraints.
      • Final Summary: Criterion S is met because the unit of randomisation was the school.
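The paper describes a two-stage procedure (sub-county-stratified assignment of schools in Microsoft Excel, then random selection of 10 learners per school, stratified by gender). As a minimal sketch only: the function names, the fixed seed, and the even 5/5 gender split are assumptions for illustration, not details taken from the study.

```python
import random

def assign_schools(schools_by_subcounty, seed=42):
    """Stage 1 (sketch): within each sub-county stratum, shuffle the
    schools and split them evenly into treatment and control."""
    rng = random.Random(seed)
    assignment = {}
    for subcounty, schools in schools_by_subcounty.items():
        shuffled = list(schools)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        for school in shuffled[:half]:
            assignment[school] = "treatment"
        for school in shuffled[half:]:
            assignment[school] = "control"
    return assignment

def sample_learners(boys, girls, n_per_school=10, seed=42):
    """Stage 2 (sketch): gender-stratified random selection of learners
    within a school, taking up to half the quota from each gender."""
    rng = random.Random(seed)
    k = n_per_school // 2
    return rng.sample(boys, min(k, len(boys))) + rng.sample(girls, min(k, len(girls)))
```

Stratifying before shuffling guarantees balanced treatment/control counts within each sub-county, which is the property the study's stratified design is after.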
    • I

      Independent Conduct

      • The paper explicitly states independence from the DPL provider for data collection, analysis, and conclusions, with independent enumerators collecting assessment data.
      • "It is important to note that the research team maintained full independence in their collaboration with the DPL provider, who did not participate in data collection, analysis, or the formulation of conclusions."
      • Relevant Quotes:
        1) "It is important to note that the research team maintained full independence in their collaboration with the DPL provider, who did not participate in data collection, analysis, or the formulation of conclusions." (page number not available in provided text)
        2) "While enumerators may potentially have been aware of the treatment status due to the visibility of DPL technology in schools, they were entirely independent of the treatment, ensuring no conflict of interest." (page number not available in provided text)
        3) "One author (AF) was contracted by the DPL tool provider during the research period; however, they did not participate in undertaking data collection, analysis, or the formulation of conclusions." (page number not available in provided text)
      • Detailed Analysis: Criterion I requires that the evaluation is conducted independently from the intervention designer/provider to reduce bias in implementation, measurement, and analysis. The paper includes a direct statement that the provider did not participate in data collection, analysis, or forming conclusions. It also explicitly states enumerators were "entirely independent of the treatment" and discloses a potential conflict (one author contracted by the provider) while narrowing that author's role away from data collection, analysis, and conclusions.
      • Final Summary: Criterion I is met because the paper explicitly documents independence from the provider for data collection, analysis, and conclusions.
    • Y

      Year Duration

      • The study assessed outcomes from October 2022 to October 2023 (about 13 months), exceeding 75% of an academic year.
      • "Assessment points were baseline (October 2022), midline (May 2023), and endline (October 2023)."
      • Relevant Quotes:
        1) "Assessment points were baseline (October 2022), midline (May 2023), and endline (October 2023)." (page number not available in provided text)
        2) "Numeracy and literacy were assessed over 13 months." (page number not available in provided text)
        3) "School years begin in January and consist of three terms ending late-October/early-November." (page number not available in provided text)
      • Detailed Analysis: Criterion Y requires outcomes be measured at least 75% of an academic year after the intervention begins. The paper provides baseline-to-endline dates (October 2022 to October 2023) and explicitly summarises the overall assessment window as "13 months". Given the paper's description of the Kenyan school year ending in late-October/early-November, an October endline is effectively at the end of the school year.
      • Final Summary: Criterion Y is met because the baseline-to-endline tracking period spans approximately 13 months.
    • B

      Balanced Control Group

      • The treatment received devices and implementation support, but these additional resources are integral to the DPL intervention being tested against business-as-usual schooling.
      • "Control schools did not receive the DPL tool and continued to teach as usual, following the Kenyan pre-primary national curriculum."
      • Relevant Quotes:
        1) "Learners engaged with DPL via a smartphone." (page number not available in provided text)
        2) "Two Android devices are provided per classroom in a phased rollout (Section 4.4.1)." (page number not available in provided text)
        3) "ECDOs overseeing treatment schools were trained by EIDU to deliver ongoing support to teachers regarding the delivery of the Tayari lesson plans and deployment of the DPL tool in classrooms." (page number not available in provided text)
        4) "Control schools did not receive the DPL tool and continued to teach as usual, following the Kenyan pre-primary national curriculum." (page number not available in provided text)
      • Detailed Analysis: Criterion B requires comparing the nature, quantity, and quality of resources (time, budget, materials, adult support) provided to both intervention and control conditions, and asking whether the control provides a comparable substitute, unless the extra resources are explicitly the treatment being tested. Here, extra resources are clearly present in the treatment condition (smartphones/Android devices, plus trained support via ECDOs). The control condition explicitly did not receive the DPL tool and continued as usual. However, the intervention being evaluated is inherently a technology intervention delivered "via a smartphone" with associated deployment and support. These additional resources are integral to the treatment package (they are not a separable, optional add-on to a different educational intervention). Under the ERCT Criterion B decision rule, a business-as-usual control is acceptable when the additional resources are the treatment variable itself.
      • Final Summary: Criterion B is met because the added devices and implementation support are integral components of the DPL intervention being tested against business-as-usual instruction.
  • Level 3 Criteria

    • R

      Reproduced

      • No independent peer-reviewed replication of this specific RCT was found in the paper or through internet searching.
      • "This study investigates, for the first time, a DPL programme aligned with national curricula and teaching practices."
      • Relevant Quotes:
        1) "This study investigates, for the first time, a DPL programme aligned with national curricula and teaching practices." (page number not available in provided text)
      • Detailed Analysis: Criterion R requires independent replication of the study by a different research team in a different context, published in a peer-reviewed outlet. The paper does not report that this specific RCT has been replicated. The abstract frames the work as "for the first time", which is consistent with the intervention/model not yet having an established replication literature for this exact trial. An internet search (conducted as part of this ERCT update) did not identify any peer-reviewed independent replication that (a) explicitly seeks to reproduce this RCT's design and (b) reports results for the same intervention model in a different setting by an independent author team. This is plausible given the publication date (07 Jan 2026).
      • Final Summary: Criterion R is not met because no independent replication of this specific RCT was identified.
    • A

      All-subject Exams

      • Although IDELA is used, the study assesses literacy and numeracy only and does not measure impacts across all main subject areas.
      • "To evaluate the primary study outcomes, the free-to-access IDELA tool was used to assess emergent numeracy and literacy skills."
      • Relevant Quotes:
        1) "To evaluate the primary study outcomes, the free-to-access IDELA tool was used to assess emergent numeracy and literacy skills." (page number not available in provided text)
        2) "IDELA’s seven numeracy and seven literacy assessment items were utilised during the study (Appendix C)." (page number not available in provided text)
      • Detailed Analysis: Criterion A requires measuring impact on all main subjects taught at the relevant educational level, using standardised exam-based assessments, and it depends on Criterion E being met. Criterion E is met here via IDELA. However, the paper's measured domains are explicitly limited to numeracy and literacy. The paper does not describe standardised assessment coverage of other core curricular areas (for example, other pre-primary learning areas beyond early numeracy/literacy), nor does it provide a justification that numeracy and literacy constitute "all main subjects" in the relevant setting.
      • Final Summary: Criterion A is not met because the standardised outcome assessment is limited to numeracy and literacy rather than all main subjects.
    • G

      Graduation Tracking

      • No evidence was found that the study tracked the same learners through a graduation milestone beyond the endline in October 2023, and no follow-up paper reporting graduation tracking was identified.
      • "Assessment points were baseline (October 2022), midline (May 2023), and endline (October 2023)."
      • Relevant Quotes:
        1) "Assessment points were baseline (October 2022), midline (May 2023), and endline (October 2023)." (page number not available in provided text)
        2) "Although 7% of learners remained in pre-primary 1 (n = 143), most progressed to pre-primary 2 following baseline (1812 of 1955)..." (page number not available in provided text)
      • Detailed Analysis: Criterion G requires tracking participants until graduation (a defined completion point) to evaluate long-term impacts. The paper documents three assessment points ending at "endline (October 2023)". It also notes that most learners progressed from pre-primary 1 to pre-primary 2 during the study period, but it does not describe tracking beyond the October 2023 endline to a later graduation milestone (for example, into subsequent primary grades or another terminal stage). Internet searching for follow-up publications by the same author group reporting graduation tracking for this RCT cohort did not identify any such paper at the time of this ERCT update.
      • Final Summary: Criterion G is not met because the study does not report tracking the cohort through a graduation milestone, and no follow-up reporting such tracking was found.
    • P

      Pre-Registered

      • The study mentions protocol registration prior to analysis, but external checking indicates the publicly available protocol was published in October 2023 (after baseline began in October 2022), so it does not meet ERCT pre-registration timing requirements.
      • "Details of the study’s design, methodology, and analysis plan were documented in a comprehensive research protocol, independently reviewed following peer-feedback, and registered on an open-access research repository prior to analysis to promote transparency and safeguard trial integrity (Major et al. 2023)."
      • Relevant Quotes:
        1) "Details of the study’s design, methodology, and analysis plan were documented in a comprehensive research protocol, independently reviewed following peer-feedback, and registered on an open-access research repository prior to analysis to promote transparency and safeguard trial integrity (Major et al. 2023)." (page number not available in provided text)
        2) "Assessment points were baseline (October 2022), midline (May 2023), and endline (October 2023)." (page number not available in provided text)
        3) "Published on 14 October 2023" (page number not available; quote is from the publicly available protocol landing page)
        4) "The protocol is being disseminated prior to endline data analysis to promote transparency and a comprehensive understanding of the RCT approach and design." (page number not available; quote is from the publicly available protocol landing page)
      • Detailed Analysis: Criterion P requires a publicly pre-registered protocol before the study begins (i.e., before data collection starts), with evidence of the registration date being prior to study start. The paper states registration occurred "prior to analysis", which is not sufficient for ERCT pre-registration requirements if data collection had already started. External checking found a publicly available protocol resource for this RCT with the publication date "14 October 2023" and describing itself as being disseminated "prior to endline data analysis". The paper reports the baseline assessment occurred in "October 2022", which is earlier than October 2023. Therefore, the available evidence indicates the protocol was not publicly registered before data collection began.
      • Final Summary: Criterion P is not met because the publicly available protocol appears to have been published in October 2023, after baseline data collection began in October 2022.
