Comparing skill transfer and cognitive style effects across three laparoscopic training modalities: a prospective randomized study in medical students

L. Vradelis; N. Müller; F. Huettl; L. I. Hanke; A. Nedwed; H. Lang; C. Boedecker; T. Huber

Published: Jan 13, 2026

ERCT Check Date: Feb 25, 2026

DOI: 10.1007/s00464-025-12511-9

science
higher education
EU
gamification
digital assessment

C

Randomization was at the individual student level (not class- or school-level), and the paper does not describe a tutoring/personal teaching exception.

"After providing informed consent for participation and pseudonymized data storage, the students were assigned to one of four groups: VR simulator (VR), Box trainer (BT), Serious Game (SG), or Control group (CG) (Fig. 1)." (PDF p. 2)
E

Outcomes are simulator-based performance tasks rather than a widely recognized standardized exam-based assessment.

"Pre-, mid-, and post-tests on the VR simulator consisted of a single standardized repetition of each of the three assessment tasks (“Grasping,” “Fine Dissection,” and “Clip Applying”), in accordance with established VR simulator assessment protocols." (PDF p. 3)
T

The study reports four one-hour training sessions with testing at the beginning/midpoint/end of the study, without documenting a term-long (3–4 month) follow-up window.

"The training protocol consisted of four one-hour sessions, two before and two after the midterm assessment, following a distributed practice structure." (PDF p. 2)
D

The control group condition is explicitly described and baseline characteristics are documented and reported as balanced.

"The control group did not train and only participated in the testing on the VR simulator." (PDF p. 3)
S

The study was conducted within a single institution and assigned individual students, not multiple schools/sites randomized to conditions.

"This prospective preclinical study was conducted at the Department of General, Visceral, and Transplant Surgery at the University Medical Center of Johannes Gutenberg-University Mainz." (PDF p. 2)
I

The evaluated training modalities are third-party tools and the authors disclose no conflicts of interest or financial ties.

"Lukas Vradelis, Natascha Müller, Florentine Huettl, Laura Isabel Hanke, Annekatrin Nedwed, Christian Boedecker, Hauke Lang and Tobias Huber have no conflicts of interest or financial ties to disclose." (PDF p. 12)
Y

The paper does not document outcome measurement at least 75% of an academic year after intervention start, and T is not met.

"Each group received four one-hour training sessions." (PDF p. 1)
B

The intervention adds structured training time by design, and that additional time is the treatment variable being tested against a no-training control group.

"The control group did not train and only participated in the testing on the VR simulator." (PDF p. 3)
R

No independent replication of this specific 2026 randomized study by a different author team was found.
A

Criterion A is not met because criterion E is not met and outcomes are limited to simulator tasks rather than all-subject exams.

"Pre-, mid-, and post-tests on the VR simulator consisted of a single standardized repetition of each of the three assessment tasks (“Grasping,” “Fine Dissection,” and “Clip Applying”), in accordance with established VR simulator assessment protocols." (PDF p. 3)
G

The study does not track participants to graduation, and G is not met under the ERCT rule because Y is not met.

"Therefore, future studies should incorporate longitudinal follow-ups and evaluate surgical performance in actual clinical environments." (PDF p. 11)
P

The paper provides ethics approval information but no dated public pre-registration record (registry, ID, and registration date).

Abstract

Background: Simulation-based training is an important component of modern surgical education. While virtual reality (VR) simulators, box trainers, and serious games are all used in laparoscopic training, comparative data on their effectiveness, transferability, and the role of individual cognitive learning styles remain limited.

Methods: In this prospective, randomized study, 80 medical students without prior laparoscopic experience were assigned to one of four groups: VR simulator, box trainer, serious game, or control. Each group received four one-hour training sessions. Cognitive style was assessed using the Object-Spatial Imagery and Verbal Questionnaire (OSIVQ), and all participants completed standardized pre- and post-tests on a VR simulator. Performance was evaluated using z-scores for efficiency, accuracy, and task completion.

Results: All intervention groups demonstrated significant performance improvements. The VR group showed the greatest gains, particularly in complex tasks such as “Fine Dissection” and “Clip Applying”. Box trainer training led to marked reductions in error rates. The serious game group primarily improved in basic skills but showed limited transfer to complex tasks. Spatial learners outperformed other cognitive styles across all modalities, whereas verbal learners improved significantly only in the VR group.

Conclusion: All three simulation modalities support laparoscopic skill acquisition, but their effectiveness varies by task complexity and cognitive profile. VR training appears to be the most inclusive and effective, particularly for learners with non-spatial cognitive styles. These findings support the integration of cognitive profiling and task-specific modality selection into surgical training.
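
As a rough illustration of the z-score evaluation mentioned in the abstract, the sketch below standardizes a set of raw simulator measurements against their own sample mean and standard deviation. The abstract does not say which reference distribution the authors used, so the pooling choice here is an assumption, and the function name `z_scores` and the sample values are hypothetical rather than taken from the study.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values against their own sample mean and standard deviation.

    Assumption: the reference distribution is the supplied sample itself.
    The paper may have standardized against a different reference.
    """
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (n - 1 denominator)
    if sigma == 0:
        raise ValueError("z-scores are undefined when all values are identical")
    return [(v - mu) / sigma for v in values]


# Hypothetical task completion times in seconds (not data from the study).
completion_times = [120.0, 95.0, 150.0, 110.0, 125.0]
scores = z_scores(completion_times)
```

A value equal to the sample mean maps to a z-score of zero, and values below the mean map to negative scores. Whether a lower score counts as better depends on the metric: for a completion time it would, for an accuracy measure it would not.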

ERCT Criteria Breakdown

Level 1 Criteria
- C
  Class-level RCT
  - Randomization was at the individual student level (not class- or school-level), and the paper does not describe a tutoring/personal teaching exception.
  - "After providing informed consent for participation and pseudonymized data storage, the students were assigned to one of four groups: VR simulator (VR), Box trainer (BT), Serious Game (SG), or Control group (CG) (Fig. 1)." (PDF p. 2)
  - Relevant Quotes:
    1) "In this prospective, randomized study, 80 medical students without prior laparoscopic experience were assigned to one of four groups: VR simulator, box trainer, serious game, or control." (PDF p. 1)
    2) "After providing informed consent for participation and pseudonymized data storage, the students were assigned to one of four groups: VR simulator (VR), Box trainer (BT), Serious Game (SG), or Control group (CG) (Fig. 1)." (PDF p. 2)
  - Detailed Analysis: Criterion C requires randomization at the class level (or stronger), with an explicit exception only when the intervention is personal teaching/tutoring where class-level randomization is not applicable. The paper describes individual medical students being assigned to one of four groups. No part of the paper describes randomizing intact classes, cohorts, or sites, and it does not present the intervention as tutoring or one-to-one personal teaching (it is independent simulation training). Therefore, the study does not meet the ERCT requirement for class-level (or stronger) randomization, and it also does not document the tutoring/personal-teaching exception. Criterion C is not met because assignment is at the individual learner level and no ERCT exception is documented.
- E
  Exam-based Assessment
  - Outcomes are simulator-based performance tasks rather than a widely recognized standardized exam-based assessment.
  - "Pre-, mid-, and post-tests on the VR simulator consisted of a single standardized repetition of each of the three assessment tasks (“Grasping,” “Fine Dissection,” and “Clip Applying”), in accordance with established VR simulator assessment protocols." (PDF p. 3)
  - Relevant Quotes:
    1) "Pre-, mid-, and post-tests on the VR simulator consisted of a single standardized repetition of each of the three assessment tasks (“Grasping,” “Fine Dissection,” and “Clip Applying”), in accordance with established VR simulator assessment protocols." (PDF p. 3)
    2) "Performance data from the pre- and post-tests were analyzed using z-scores." (PDF p. 4)
  - Detailed Analysis: Criterion E requires exam-based assessment using a standardized, widely recognized exam (i.e., a test used for external comparability beyond the specific study context). The paper describes standardized simulator tasks on a VR simulator and analyzes performance using z-scores derived from simulator outputs. While these tasks are structured and objective, they are not a broadly recognized standardized exam in the ERCT sense; they are platform-specific performance assessments. Criterion E is not met because the outcomes are VR-simulator task measures rather than standardized exam-based assessments.
- T
  Term Duration
  - The study reports four one-hour training sessions with testing at the beginning/midpoint/end of the study, without documenting a term-long (3–4 month) follow-up window.
  - "The training protocol consisted of four one-hour sessions, two before and two after the midterm assessment, following a distributed practice structure." (PDF p. 2)
  - Relevant Quotes:
    1) "Each group received four one-hour training sessions." (PDF p. 1)
    2) "The training protocol consisted of four one-hour sessions, two before and two after the midterm assessment, following a distributed practice structure." (PDF p. 2)
    3) "All participants completed the standardized tests on the VR simulator at the beginning, midpoint, and end of the study." (PDF p. 3)
  - Detailed Analysis: Criterion T requires outcomes to be measured at least one full academic term after the intervention begins (approximately 3–4 months), or at least that the paper documents a term-long follow-up interval from intervention start to outcome measurement. The paper specifies a total training dose of four hours (four one-hour sessions) and indicates that testing occurred at the beginning, midpoint, and end of the study. The paper does not provide dates or an interval demonstrating that the post-test was collected at least a term after training began. Criterion T is not met because the paper does not document a term-long elapsed time from intervention start to outcome measurement.
- D
  Documented Control Group
  - The control group condition is explicitly described and baseline characteristics are documented and reported as balanced.
  - "The control group did not train and only participated in the testing on the VR simulator." (PDF p. 3)
  - Relevant Quotes:
    1) "The control group did not train and only participated in the testing on the VR simulator." (PDF p. 3)
    2) "The detailed demographic data of all participants are presented in Table 1. The distribution of relevant baseline characteristics was approximately balanced across all intervention and control groups." (PDF p. 4)
  - Detailed Analysis: Criterion D requires that the control group is well documented, including what it received and sufficient baseline information to support comparability. The paper clearly defines the control condition as no training, with participation only in VR simulator testing. It also points to Table 1 for demographic characteristics and explicitly states that baseline characteristics were approximately balanced across groups. Criterion D is met because the control condition and baseline characteristics are clearly documented.
Level 2 Criteria
- S
  School-level RCT
  - The study was conducted within a single institution and assigned individual students, not multiple schools/sites randomized to conditions.
  - "This prospective preclinical study was conducted at the Department of General, Visceral, and Transplant Surgery at the University Medical Center of Johannes Gutenberg-University Mainz." (PDF p. 2)
  - Relevant Quotes:
    1) "This prospective preclinical study was conducted at the Department of General, Visceral, and Transplant Surgery at the University Medical Center of Johannes Gutenberg-University Mainz." (PDF p. 2)
    2) "After providing informed consent for participation and pseudonymized data storage, the students were assigned to one of four groups: VR simulator (VR), Box trainer (BT), Serious Game (SG), or Control group (CG) (Fig. 1)." (PDF p. 2)
  - Detailed Analysis: Criterion S requires school-/site-level randomization (i.e., multiple implementing institutions or comparable sites randomized to treatment/control). The paper describes a single study location (one university medical center department) and indicates that individual students were assigned to groups. There is no randomization of multiple schools or sites. Criterion S is not met because randomization is not at the school/site level.
- I
  Independent Conduct
  - The evaluated training modalities are third-party tools and the authors disclose no conflicts of interest or financial ties.
  - "Lukas Vradelis, Natascha Müller, Florentine Huettl, Laura Isabel Hanke, Annekatrin Nedwed, Christian Boedecker, Hauke Lang and Tobias Huber have no conflicts of interest or financial ties to disclose." (PDF p. 12)
  - Relevant Quotes:
    1) "The VR simulator used was the “LapSim” (Surgical Science Sweden AB, Gothenburg, Sweden)..." (PDF p. 3)
    2) "The box trainer, the “Lübecker Toolbox” (LTB Germany Ltd., Lübeck, Germany)..." (PDF p. 3)
    3) "The serious game used was "Underground" (Grendel Games, Leeuwarden, Netherlands) for the Nintendo Wii U..." (PDF p. 3)
    4) "Lukas Vradelis, Natascha Müller, Florentine Huettl, Laura Isabel Hanke, Annekatrin Nedwed, Christian Boedecker, Hauke Lang and Tobias Huber have no conflicts of interest or financial ties to disclose." (PDF p. 12)
  - Detailed Analysis: Criterion I requires that the evaluation be conducted independently from the intervention designers/providers (or otherwise clearly document independence that reduces bias). The interventions evaluated are commercially provided tools from external organizations (LapSim, Lübecker Toolbox, and Underground). The disclosures explicitly state there are no conflicts of interest or financial ties to disclose, reducing concerns that the study was conducted by a financially interested intervention provider. Criterion I is met because the interventions are third-party tools and the paper discloses no conflicts/financial ties.
- Y
  Year Duration
  - The paper does not document outcome measurement at least 75% of an academic year after intervention start, and T is not met.
  - "Each group received four one-hour training sessions." (PDF p. 1)
  - Relevant Quotes:
    1) "Each group received four one-hour training sessions." (PDF p. 1)
    2) "All participants completed the standardized tests on the VR simulator at the beginning, midpoint, and end of the study." (PDF p. 3)
  - Detailed Analysis: Criterion Y requires outcomes to be measured at least 75% of an academic year after the intervention begins; additionally, per the ERCT instruction, if criterion T is not met then criterion Y is not met. The paper describes only four hours of training and testing at the beginning/midpoint/end of the study, without dates establishing a year-long (or near year-long) follow-up window. Since T is not met, Y must also be not met. Criterion Y is not met because the study does not document a year-duration follow-up and T is not met.
- B
  Balanced Control Group
  - The intervention adds structured training time by design, and that additional time is the treatment variable being tested against a no-training control group.
  - "The control group did not train and only participated in the testing on the VR simulator." (PDF p. 3)
  - Relevant Quotes:
    1) "Each group received four one-hour training sessions." (PDF p. 1)
    2) "The training protocol consisted of four one-hour sessions, two before and two after the midterm assessment, following a distributed practice structure." (PDF p. 2)
    3) "The control group did not train and only participated in the testing on the VR simulator." (PDF p. 3)
    4) "Facilitators were present during all training sessions to ensure correct technical setup and adherence to the standardized procedure, but they did not provide individualized coaching or corrective feedback." (PDF p. 3)
  - Detailed Analysis: Criterion B evaluates whether the control group offers a comparable substitute for the intervention inputs (time/resources), unless the added resources are explicitly integral to the treatment being tested. Extra resources are present: the intervention groups receive four one-hour training sessions (plus facilitator presence for setup and adherence), while the control group does not train. However, this imbalance is not a confound here because the central treatment contrast explicitly includes structured simulation training time versus no training; the additional time/resources are integral to what is being tested. Criterion B is met because the added training time/resources are the intended treatment variable compared against a no-training control condition.
Level 3 Criteria
- R
  Reproduced
  - No independent replication of this specific 2026 randomized study by a different author team was found.
  - Relevant Quotes:
    1) (No statements describing an independent replication of this specific study were found in the paper.) (N/A)
  - Detailed Analysis: Criterion R requires an independently conducted replication of the same study (or clearly the same intervention/design) by other authors, published in a peer-reviewed journal. The paper presents itself as an original prospective randomized study. Internet searching by DOI and full title (as of the ERCT check date) did not identify an independent replication of this specific trial by a different research team. Criterion R is not met because no independent replication of this specific study was found.
- A
  All-subject Exams
  - Criterion A is not met because criterion E is not met and outcomes are limited to simulator tasks rather than all-subject exams.
  - "Pre-, mid-, and post-tests on the VR simulator consisted of a single standardized repetition of each of the three assessment tasks (“Grasping,” “Fine Dissection,” and “Clip Applying”), in accordance with established VR simulator assessment protocols." (PDF p. 3)
  - Relevant Quotes:
    1) "Pre-, mid-, and post-tests on the VR simulator consisted of a single standardized repetition of each of the three assessment tasks (“Grasping,” “Fine Dissection,” and “Clip Applying”), in accordance with established VR simulator assessment protocols." (PDF p. 3)
  - Detailed Analysis: Criterion A requires standardized exam-based assessment across all main subjects, and per ERCT instruction, if criterion E is not met, criterion A is not met. This study assesses laparoscopic simulator performance tasks and does not use standardized exams. Therefore E is not met, and A cannot be met. Criterion A is not met because criterion E is not met and the study does not assess all subjects via standardized exams.
- G
  Graduation Tracking
  - The study does not track participants to graduation, and G is not met under the ERCT rule because Y is not met.
  - "Therefore, future studies should incorporate longitudinal follow-ups and evaluate surgical performance in actual clinical environments." (PDF p. 11)
  - Relevant Quotes:
    1) "First, the total training duration was limited to four hours. This restricts conclusions regarding long-term retention and skill transferability to real-world surgical settings." (PDF p. 11)
    2) "Therefore, future studies should incorporate longitudinal follow-ups and evaluate surgical performance in actual clinical environments." (PDF p. 11)
  - Detailed Analysis: Criterion G requires tracking participants until graduation, and per ERCT instruction, if criterion Y is not met then criterion G is not met. The paper explicitly frames its results as short-term and calls for future longitudinal follow-ups, indicating that long-term tracking (including to graduation) was not conducted as part of this study. Additionally, criterion Y is not met, which makes criterion G not met by rule. Internet searching did not identify subsequent follow-up papers by the same author team reporting tracking of this cohort through graduation. Criterion G is not met because the study ends at immediate post-testing, does not track to graduation, and Y is not met.
- P
  Pre-Registered
  - The paper provides ethics approval information but no dated public pre-registration record (registry, ID, and registration date).
  - Relevant Quotes:
    1) "Ethical approval and consent to participate This study was approved by the Ethik-Kommission der Landesärztekammer Rheinland-Pfalz, approval number 2020–15126." (PDF p. 12)
    2) (No statements indicating pre-registration on a public registry, including an ID and registration date, were found in the paper.) (N/A)
  - Detailed Analysis: Criterion P requires a publicly accessible pre-registered protocol with a registration date that is before data collection started. The paper reports ethics approval (including an approval number) but does not provide a trial registry/platform (e.g., ClinicalTrials.gov, DRKS, ISRCTN, OSF), an identifier, or a registration date. Internet searching did not locate a public registration that is clearly linked to this specific published trial. Criterion P is not met because no public pre-registration record (platform, ID, and date) is documented for this study.
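
The prerequisite rules cited in the analyses above (Y cannot be met unless T is met, A cannot be met unless E is met, and G cannot be met unless Y is met) can be sketched as a small consistency check. This is a minimal illustration, not an official ERCT tool: the names `REPORTED`, `PREREQUISITES`, `apply_prerequisites`, and `met_criteria` are hypothetical, and the outcome values simply restate the verdicts given in this breakdown.

```python
# Criterion outcomes as stated in the breakdown above (True = met).
REPORTED = {
    "C": False, "E": False, "T": False, "D": True,   # Level 1
    "S": False, "I": True,  "Y": False, "B": True,   # Level 2
    "R": False, "A": False, "G": False, "P": False,  # Level 3
}

# Dependency rules quoted in the analyses: the key criterion cannot be met
# unless the value criterion (its prerequisite) is met.
PREREQUISITES = {"Y": "T", "A": "E", "G": "Y"}


def apply_prerequisites(raw):
    """Return a copy of `raw` in which unmet prerequisites force dependents to False."""
    resolved = dict(raw)
    changed = True
    # Repeat until stable so that chains such as T -> Y -> G propagate fully.
    while changed:
        changed = False
        for dependent, prerequisite in PREREQUISITES.items():
            if resolved[dependent] and not resolved[prerequisite]:
                resolved[dependent] = False
                changed = True
    return resolved


def met_criteria(outcomes):
    """List the letters of all criteria marked as met, in alphabetical order."""
    return sorted(letter for letter, met in outcomes.items() if met)


resolved = apply_prerequisites(REPORTED)
```

With the verdicts above, `apply_prerequisites` changes nothing, which indicates that the reported outcomes already respect the three rules, and `met_criteria(resolved)` yields B, D, and I. If a hypothetical input marked Y or G as met while T remained unmet, the function would reset both to not met.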
