Abstract
Background: Foundational knowledge of anesthesia techniques is essential for medical students. Team-based learning (TBL) improves engagement. Web-based virtual environments (WBVEs) allow many learners to join the same session in real time while being guided by an instructor.
Objective: This study aimed to compare a WBVE with face-to-face (F2F) delivery of the same TBL curriculum in terms of postclass knowledge and learner satisfaction.
Methods: We conducted a randomized, controlled, assessor-blinded trial at a Thai medical school from August 2024 to January 2025. Eligible participants were fifth-year medical students from the Faculty of Medicine, Khon Kaen University, who attended the anesthesiology course at the department of anesthesiology. Students who had previously completed the anesthesiology course or were unable to comply with the study protocol were excluded. Participants were allocated to WBVE (on the Spatial platform) or F2F sessions using a computer-generated sequence with allocation concealment. Both groups received identical 10-section content in a standardized TBL sequence lasting 130 minutes. Only the delivery mode differed (Spatial WBVE vs classroom F2F). The primary outcome was the postclass multiple-choice questionnaire score. The secondary outcome was learner satisfaction. Individual knowledge was assessed before and after the session using a 15-item questionnaire containing multiple-choice questions via Google Forms. Satisfaction was measured immediately after class on a 5-point Likert scale. Outcome scoring and data analysis were blinded to group assignment. Participants and instructors were not blinded.
Results: In total, 79 students were randomized in this study (F2F: n=38, 48%; WBVE: n=41, 52%). We excluded 2% (1/41) of the students in the WBVE group due to incomplete data. Complete data were available for analysis for 78 participants (F2F: n=38, 49%; WBVE: n=40, 51%). Preclass scores were similar between groups (F2F: mean 6.03, SD 2.05; WBVE: mean 6.20, SD 2.04). Postclass knowledge did not differ significantly (F2F: mean 11.24, SD 1.93; WBVE: mean 10.40, SD 2.62; mean difference 0.88, 95% CI –0.18 to 1.94; P=.12). Learner satisfaction favored F2F learning across multiple domains, including overall course satisfaction (mean difference 0.42, 95% CI 0.07-0.77; P=.01). Sessions in both groups ran as planned. No adverse events were reported. No technical failures occurred in the WBVE group.
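The participant percentages in the Results paragraph can be re-derived from the reported counts. The following is a minimal sketch of that arithmetic; the variable and function names are illustrative, and rounding to the nearest integer is an assumption about how the percentages were reported.

```python
# Counts taken from the Results paragraph above.
randomized = {"F2F": 38, "WBVE": 41}
excluded = {"F2F": 0, "WBVE": 1}  # 1 WBVE student with incomplete data


def pct(part: int, whole: int) -> int:
    """Percentage rounded to the nearest integer (assumed reporting style)."""
    return round(100 * part / whole)


total_randomized = sum(randomized.values())
analyzed = {g: randomized[g] - excluded[g] for g in randomized}
total_analyzed = sum(analyzed.values())

# Share of each group among randomized and among analyzed participants.
randomized_pct = {g: pct(n, total_randomized) for g, n in randomized.items()}
analyzed_pct = {g: pct(n, total_analyzed) for g, n in analyzed.items()}

# Share of the WBVE group that was excluded.
excluded_pct = pct(excluded["WBVE"], randomized["WBVE"])

print(total_randomized, randomized_pct)
print(total_analyzed, analyzed_pct)
print(excluded_pct)
```

Under these assumptions the recomputed values agree with the reported 48%/52% split at randomization, the 49%/51% split in the analysis set, and the 2% exclusion rate.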
Conclusions: In this trial, WBVE-delivered TBL produced similar short-term knowledge gains to F2F delivery, but learner satisfaction was lower in the WBVE group. Unlike many previous studies, this trial compared WBVE and F2F delivery while keeping the TBL curriculum and prespecified outcomes identical across groups. These findings support WBVEs as a scalable option when physical space, learner volume, or other constraints are present. However, lower satisfaction in the WBVE highlights the real-world need for improved facilitation, user experience design, and technical readiness before broader implementation.
ERCT Criteria Breakdown
-
Level 1 Criteria
-
C
Class-level RCT
-
E
Exam-based Assessment
- Knowledge was measured with a study-specific 15-item MCQ rather than a standardized exam.
- "Individual knowledge was assessed before and after the session using a 15-item questionnaire containing multiple-choice questions via Google Forms."
Relevant Quotes:
1) "Individual knowledge was assessed before and after the session using a 15-item questionnaire containing multiple-choice questions via Google Forms." (p. 1)
2) "Pre- and postintervention knowledge evaluations were assessed using a 15-item MCQ test mapped to a test specification table (Multimedia Appendix 1)." (p. 4)
Detailed Analysis:
Criterion E requires a standardized, widely recognized exam-based assessment (eg, national/state tests or established standardized instruments). Researcher-created quizzes or course-specific MCQs do not satisfy this criterion because they can be tailored to the intervention and are not broadly comparable across settings.
The paper describes a 15-item questionnaire/MCQ administered via Google Forms and mapped to a test specification table in an appendix. No standardized exam name (eg, national board exam or validated standardized achievement test) is provided, and the described instrument is specific to the course content.
Criterion E is not met because the outcome assessment is a study-specific MCQ rather than a standardized exam.
-
T
Term Duration
- Outcomes were measured immediately after a single session rather than at least one academic term after intervention start.
- "Immediately after the teaching session, 15 minutes were allocated for outcome measurement."
Relevant Quotes:
1) "Both groups participated in a single, standardized 130-minute TBL session." (p. 3)
2) "Immediately after the teaching session, 15 minutes were allocated for outcome measurement." (p. 4)
3) "Student satisfaction was evaluated immediately after the intervention using a 21-item survey, with responses on a 5-point Likert scale (1=strongly disagree, 2=disagree, 3=neither agree nor disagree, 4=agree, and 5=strongly agree)." (p. 4)
Detailed Analysis:
Criterion T requires outcomes to be measured at least one academic term (roughly 3-4 months) after the intervention begins, to support persistence and reduce short-term-only effects.
The paper states the intervention was a single 130-minute session, and it states outcomes were measured immediately after the teaching session (with only a 15-minute allocation for outcome measurement). This is same-day testing rather than term-long follow-up.
Criterion T is not met because outcomes were assessed immediately after a single session rather than after at least one term.
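The timing comparison behind this verdict can be written out as a small check. This is only a sketch: the 90-day threshold is an assumed stand-in for "roughly 3-4 months", and the names are illustrative rather than part of the ERCT standard.

```python
from datetime import timedelta

# From the paper: one 130-minute session, with outcomes measured
# immediately afterwards, so the elapsed time from intervention start
# to outcome measurement is about the session length.
elapsed = timedelta(minutes=130)

# ASSUMPTION: 90 days approximates the lower end of "roughly 3-4 months".
TERM_MINIMUM = timedelta(days=90)

meets_term_duration = elapsed >= TERM_MINIMUM
print(meets_term_duration)
```

Any reasonable choice of threshold gives the same result here, because the follow-up is measured in minutes rather than months.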
-
D
Documented Control Group
- The comparator condition and baseline group characteristics are documented, including a baseline characteristics table by group.
- "Baseline characteristics by group in a cluster-randomized trial of a face-to-face (F2F) group versus a web-based virtual environment (WBVE) group for team-based learning among fifth-year medical students (N=78)."
Relevant Quotes:
1) "F2F Group" (p. 3)
2) "The 10 modules described earlier were delivered F2F using a standard TBL sequence." (p. 3)
3) "Table 1. Baseline characteristics by group in a cluster-randomized trial of a face-to-face (F2F) group versus a web-based virtual environment (WBVE) group for team-based learning among fifth-year medical students (N=78)." (p. 5)
Detailed Analysis:
Criterion D requires that the control/comparator group is clearly described, including what the comparator received and basic baseline characteristics (or equivalent documentation) to support comparison.
The paper provides an explicit description of the F2F comparator under a labeled "F2F Group" section and describes how modules were delivered. It also provides a baseline characteristics table (Table 1) with group labels and participant counts.
This is sufficient documentation of the comparator condition and baseline composition for ERCT purposes.
Criterion D is met because the F2F comparator and baseline characteristics are explicitly documented.
-
Level 2 Criteria
-
S
School-level RCT
- Randomization occurred at the class (cluster) level rather than the school/site level.
- "We used cluster randomization at the class level to minimize contamination between teaching groups."
Relevant Quotes:
1) "We used cluster randomization at the class level to minimize contamination between teaching groups." (p. 2)
2) "Students were grouped into clusters based on their scheduled learning sessions." (p. 2)
Detailed Analysis:
Criterion S requires that schools (or equivalent sites) are the unit of randomization, which increases real-world relevance and captures school-level implementation factors.
The paper is explicit that randomization was at the "class level" using clusters based on scheduled sessions. It does not describe randomization across multiple schools/sites; it is conducted within a single medical school context.
Criterion S is not met because allocation was at the class/cluster level rather than at the school/site level.
-
I
Independent Conduct
- The paper does not document that an external independent team conducted the study separate from the intervention team.
- "The randomization sequence was generated and implemented by investigators who were not involved in teaching or assessment to reduce the risk of allocation bias."
Relevant Quotes:
1) "The randomization sequence was generated and implemented by investigators who were not involved in teaching or assessment to reduce the risk of allocation bias." (p. 2)
2) "The instructor monitored team rooms, offered real-time guidance, provided immediate feedback, and answered questions as they arose." (p. 3)
Detailed Analysis:
Criterion I requires evidence that the evaluation was conducted independently from the intervention designers/implementers (eg, an external evaluation team), reducing the risk of biased delivery, measurement, and interpretation.
The paper documents internal separation of roles for randomization, which helps reduce allocation bias. However, the intervention delivery is clearly instructor-led, and the paper does not state that an external, independent evaluator (separate institution or agency) conducted the data collection and/or analysis.
Therefore, while some bias-reduction steps are present (role separation; blinding), independence in the ERCT sense is not documented.
Criterion I is not met because an independent external evaluation team is not documented.
-
Y
Year Duration
- Outcomes were measured immediately after a single session, so the study does not meet the year-duration requirement.
- "Both groups participated in a single, standardized 130-minute TBL session."
Relevant Quotes:
1) "Both groups participated in a single, standardized 130-minute TBL session." (p. 3)
2) "Immediately after the teaching session, 15 minutes were allocated for outcome measurement." (p. 4)
3) "The study was conducted from August 26, 2024, to January 7, 2025." (p. 2)
Detailed Analysis:
Criterion Y requires outcomes to be measured at least 75% of an academic year after the intervention begins. In addition, per the ERCT dependency rule provided, if Criterion T is not met, then Criterion Y is not met.
The intervention is explicitly a single session, and outcomes are measured immediately after that session. This fails the minimum term-follow-up requirement, and it also clearly does not represent measurement after most of an academic year.
Criterion Y is not met because there is no year-long tracking from intervention start to outcome measurement.
-
B
Balanced Control Group
- Time and instructional content are matched across groups; the delivery-mode differences (including technology) are integral to the treatment being tested.
- "Both groups participated in a single, standardized 130-minute TBL session. The content was identical, organized into 10 sections on anesthesia techniques (listed in Multimedia Appendix 1)."
Relevant Quotes:
1) "Both groups participated in a single, standardized 130-minute TBL session. The content was identical, organized into 10 sections on anesthesia techniques (listed in Multimedia Appendix 1)." (p. 3)
2) "Materials were identical to those in the other group (handouts mirroring the slides and identical question stems)." (p. 3)
3) "Sessions ran on the Spatial platform (Spatial Systems, Inc; Thai localization) [19] using desktop computers on the university’s secure local network." (p. 3)
Detailed Analysis:
Criterion B requires comparing the nature, quantity, and quality of resources provided to intervention and control groups, and determining whether any additional resources are either balanced or are explicitly integral to the treatment being tested.
Here, the study is designed to isolate delivery mode while matching instructional time and content: both groups have a single 130-minute TBL session with identical content, and the paper explicitly states that materials are identical across groups.
The WBVE condition necessarily uses extra technological inputs (Spatial platform, desktop computers, and a 3D virtual environment), but those resources are not a separable add-on; they are the delivery-mode treatment itself. The control group receives an equivalent instructional substitute (the same modules delivered face-to-face) with matched time and matched materials.
Criterion B is met because instructional time and content are matched, and the technology inputs are integral to the delivery mode being tested rather than a confounding extra resource.
-
Level 3 Criteria
-
R
Reproduced
-
A
All-subject Exams
- The study does not use standardized exams, so it cannot meet the all-subject standardized exam requirement.
- "Individual knowledge was assessed before and after the session using a 15-item questionnaire containing multiple-choice questions via Google Forms."
Relevant Quotes:
1) "Individual knowledge was assessed before and after the session using a 15-item questionnaire containing multiple-choice questions via Google Forms." (p. 1)
2) "Pre- and postintervention knowledge evaluations were assessed using a 15-item MCQ test mapped to a test specification table (Multimedia Appendix 1)." (p. 4)
Detailed Analysis:
Criterion A requires standardized exam-based assessment across all main subjects. The ERCT dependency rule provided also states that if Criterion E is not met, then Criterion A is not met.
This study uses a course-specific 15-item MCQ rather than a standardized exam. It also assesses only anesthesia-techniques knowledge and satisfaction, not all main academic subjects.
Criterion A is not met because the study does not use standardized exams (fails E) and does not assess all core subjects with standardized tests.
-
G
Graduation Tracking
- The study does not track participants to graduation, and year-long tracking is not present.
- "Immediately after the teaching session, 15 minutes were allocated for outcome measurement."
Relevant Quotes:
1) "Immediately after the teaching session, 15 minutes were allocated for outcome measurement." (p. 4)
2) "Both groups participated in a single, standardized 130-minute TBL session." (p. 3)
Detailed Analysis:
Criterion G requires tracking participants until graduation from the relevant educational stage. Per the ERCT dependency rule provided, if Criterion Y is not met, then Criterion G is not met.
The paper describes immediate post-session measurement and does not describe any later follow-up (eg, end-of-year outcomes, course completion outcomes, or graduation outcomes).
A targeted internet search for subsequent publications by the same author team reporting longer-term follow-up (including graduation tracking) for this cohort did not identify any such follow-up paper.
Criterion G is not met because the study does not include (and no follow-up publication was found reporting) graduation tracking.
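The dependency rules cited in the analyses for criteria Y, A, and G (Y depends on T, A depends on E, G depends on Y) form a simple prerequisite chain. The sketch below is one way to express it; the dictionary layout and the `resolve` helper are illustrative assumptions, not an official ERCT implementation.

```python
# Direct assessments as stated in this breakdown (True = met).
# Only the criteria involved in the dependency rules are listed.
direct = {"E": False, "T": False, "Y": False, "A": False, "G": False}

# Dependency rules quoted in the analyses: a criterion cannot be met
# unless its prerequisite is met.
PREREQUISITE = {"A": "E", "Y": "T", "G": "Y"}


def resolve(criterion: str, assessments: dict) -> bool:
    """Return the final status after following the prerequisite chain."""
    if not assessments[criterion]:
        return False
    parent = PREREQUISITE.get(criterion)
    return True if parent is None else resolve(parent, assessments)


final = {c: resolve(c, direct) for c in direct}
print(final)
```

In this study every criterion in the chain already fails on its own evidence, so the rules do not change any verdict. They would matter in a hypothetical case where, for example, graduation tracking was reported but term-length follow-up was not.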
-
P
Pre-Registered
- The paper reports registration in the Thai Clinical Trials Registry before participant enrollment, but the registry entry posting date could not be independently verified here.
- "It was also registered with the Thai Clinical Trials Registry before participant enrollment (TCTR20240708012)."
Relevant Quotes:
1) "Trial Registration: Thai Clinical Trials Registry TCTR20240708012; https://www.thaiclinicaltrials.org/show/TCTR20240708012" (p. 1)
2) "It was also registered with the Thai Clinical Trials Registry before participant enrollment (TCTR20240708012)." (p. 4)
3) "The study was conducted from August 26, 2024, to January 7, 2025." (p. 2)
Detailed Analysis:
Criterion P requires that the protocol be pre-registered before data collection/enrollment begins, with evidence of a registry identifier and timing.
The paper provides a registry identifier and URL, and it explicitly states the study was registered "before participant enrollment," which directly addresses the timing requirement. The stated study conduct period begins later (August 26, 2024).
Attempted independent verification of the registry entry’s posting date within the registry itself was not possible in this review (the registry page content was not retrievable via the browsing tool). Therefore, this assessment relies on the article’s explicit statement about pre-enrollment registration.
Criterion P is met because the paper explicitly documents trial registration and states it occurred before participant enrollment.
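As a rough plausibility check on the timing, the registry identifier can be compared with the stated study start date. This sketch assumes that the eight digits after "TCTR" encode a date in YYYYMMDD form, which the article does not state; it does not replace checking the posting date in the registry entry itself.

```python
from datetime import date, datetime

REGISTRY_ID = "TCTR20240708012"   # identifier quoted in the article
STUDY_START = date(2024, 8, 26)   # "conducted from August 26, 2024"


def date_from_tctr_id(registry_id: str) -> date:
    """ASSUMPTION: the 8 digits after the 'TCTR' prefix are a YYYYMMDD date."""
    return datetime.strptime(registry_id[4:12], "%Y%m%d").date()


id_date = date_from_tctr_id(REGISTRY_ID)
registered_before_start = id_date < STUDY_START
days_before_start = (STUDY_START - id_date).days

print(id_date, registered_before_start, days_before_start)
```

If the assumption about the identifier format holds, the embedded date falls several weeks before the study period begins, which is consistent with the article's statement about pre-enrollment registration.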