Effects of virtual reality OSCE on nursing students’ education: a study protocol for a systematic review and meta-analysis


The objective structured clinical examination (OSCE) provides an objective, orderly and organised assessment framework, in which medical schools, hospitals, or medical or examination institutions can add corresponding assessment contents and methods according to their teaching and examination syllabus.1 This method tests the clinical abilities of nurses or nursing students by simulating clinical scenarios.2 It is also a clinical ability assessment method that emphasises knowledge, skills and attitude.3 Candidates conduct practical tests through a series of predesigned exam stations, including standardised patients (SP), practical operations on medical simulators, collection of clinical data and document retrieval.4 The exam station is divided into long and short stations, with a duration ranging from 5 min to 20 min; candidates are evaluated by the examiner or SP.5

However, the OSCE requires in-person, offline operation, and several factors now work against that format: the restrictions imposed during the COVID-19 pandemic,6–8 the development of virtual reality (VR) technology in the field of nursing education,9–11 and the increasingly popular cross-regional and multicentre joint training.12 13 As a result, the traditional offline OSCE cannot satisfy the requirements of modern nursing education.

VR OSCE refers to the implementation of traditional OSCE on VR devices and the use of VR technology.14 More applications are developed in the fields of medical and nursing education.15–17 Compared with traditional OSCE, VR OSCE has some significant advantages. It can be carried out without physical distance limitations, thereby allowing participation from long distances, multiple locations and simultaneous engagement, making it highly accessible.18 Modern VR devices are more popular among young people.19 Meanwhile, research has shown that these devices can increase participants’ confidence, allowing them to perform better in exams.20 With the rapid development of VR OSCE, its application in the field of nursing education is gradually increasing, playing an important and irreplaceable role in the assessment of nursing students.21 22

Nevertheless, the effects of VR OSCE as an assessment method for nursing students are controversial. Some studies showed that VR OSCE can improve confidence and competence among nursing students,23 24 whereas others presented no significant growth.17 Although VR OSCE demonstrates potential in improving the assessment of nursing students, adequate evidence confirming the effects of VR OSCE as an assessment method for nursing students is lacking.

To the best of our knowledge, a meta-analysis of the effects of VR OSCE on the educational performance of nursing students has not yet been carried out. One systematic review23 reported that the implementation of VR OSCE strengthened confidence in the virtual environment. However, its study population included health professionals in general, rather than nursing students exclusively. Another systematic review25 reported that the traditional OSCE is a more credible assessment format than the virtual style in evaluating the clinical competence of nursing students. Therefore, assessing the effects of VR OSCE for nursing students is urgently necessary. In this study, we aim to systematically evaluate the effectiveness of VR OSCE as an assessment method, particularly in terms of students’ competence, stress, anxiety, confidence and satisfaction with VR OSCE, as well as examiners’ satisfaction.



This study aims to assess the effects of VR OSCE on nursing students’ education.


This protocol is reported in accordance with the Preferred Reporting Items for Systematic Review and Meta-Analysis Protocols (PRISMA-P) guidelines26 and has been registered in the International Prospective Register of Systematic Reviews (PROSPERO) with registration number CRD42023437685.

Search strategy

Electronic data search is carried out on PubMed, Web of Science, CINAHL, EBSCO, EMBASE and the Cochrane Library. In addition, references of included papers are searched to identify additional eligible studies. For studies without full text or with missing original data, the original authors are contacted. Finally, the research that contains sufficient information to assess eligibility for inclusion criteria is also included.

We establish the search strategy through a preliminary search of PubMed. Search terms related to “virtual” are as follows: “virtual reality” OR “virtual” OR “online” OR “digital” OR “remote” OR “electronic” OR “video” OR “web”. The Boolean operator “OR” is used to combine these terms, with the syntax adapted to each database.

The keywords used to capture the concept of “OSCE” include “OSCE” OR “objective structured clinical examination” OR “clinical simulation in nursing” OR “high fidelity simulation training” OR “clinical examination” OR “clinical assessment” OR “clinical skill assessment” OR “clinical competence” OR “clinical performance”. We use the Boolean operator “OR” to combine these search terms with different syntaxes adapted to each database.

Search terms related to “nursing students” are “students, nursing” OR “nursing student*” OR “pupil nurse*” OR “nurse intern” OR “nursing staff” OR “nurse education”. Similarly, the Boolean operator “OR” is used to combine the search terms, with the syntax adapted to each database.

We use the Boolean operator “AND” to combine the three groups of search terms, namely, “virtual”, “OSCE” and “nursing students”. The search covers the period from the inception of each database to 30 June 2023, with no language restriction. References of included studies are searched to identify additional studies. For studies without original data, we attempt to contact the original authors to obtain the required information. The search algorithm will be developed by an experienced librarian to ensure the comprehensiveness of the literature retrieval and processing. The search strategy is shown in the online supplemental appendix.
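The combination logic described above can be sketched in a few lines of Python. This is only an illustration of how the three OR groups are joined with AND; the function names (`or_group`, `build_query`) and the generic quoting are our assumptions, and the actual syntax must still be adapted to each database.

```python
# Term groups copied from the protocol text. The generic quoting and joining
# below is an assumption; each database needs its own adapted syntax.
VIRTUAL_TERMS = ["virtual reality", "virtual", "online", "digital",
                 "remote", "electronic", "video", "web"]
OSCE_TERMS = ["OSCE", "objective structured clinical examination",
              "clinical simulation in nursing",
              "high fidelity simulation training", "clinical examination",
              "clinical assessment", "clinical skill assessment",
              "clinical competence", "clinical performance"]
STUDENT_TERMS = ["students, nursing", "nursing student*", "pupil nurse*",
                 "nurse intern", "nursing staff", "nurse education"]


def or_group(terms):
    """Join the terms of one concept with OR and wrap them in parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(*groups):
    """Combine the concept groups with AND."""
    return " AND ".join(or_group(group) for group in groups)


query = build_query(VIRTUAL_TERMS, OSCE_TERMS, STUDENT_TERMS)
print(query)
```

Keeping each concept as its own list makes it straightforward to add or remove a synonym without touching the combination step.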

Supplemental material

Eligibility criteria


Nursing students comprising those studying in school and engaging in hospital internships are included.


Studies on the use of VR OSCE as an assessment tool.


Studies on the use of traditional clinical examinations, such as in-person OSCE as assessment tools.


We assess the following outcomes: (1) competence, (2) stress, (3) anxiety, (4) confidence, (5) students’ satisfaction with VR OSCE and (6) examiners’ satisfaction. Competence refers to the ability of an individual to complete a task appropriately.27 It can be assessed using different instruments, such as the nurse competency scale.28 Stress is a cognitive and behavioural experience composed of a psychological stress source and a psychological stress response.29 It can be assessed using instruments such as the Perceived Stress Scale.30 Anxiety is a restless emotion caused by excessive concerns about the safety of family members or one’s own life, future and destiny.31 It can be assessed using different instruments, such as the State–Trait Anxiety Inventory32 and the Self-rating Anxiety Scale.33 Confidence refers to a psychological characteristic that reflects an individual’s level of trust in their ability to successfully complete a certain activity; it is a positive and effective expression of self-worth, self-respect and self-awareness, as well as a psychological state.34 It can be assessed using the Student Satisfaction and Self-Confidence in Learning Scale.35 Satisfaction is a psychological state that refers to a person’s subjective evaluation of the quality of a relationship.36 It can be assessed using the Simulated Clinical Experience Satisfaction Scale,37 the Clinical Learning Environment, Supervision and Nurse Teacher Scale38 and some self-made scales.39 Data measured with other scales may also be included in this study.

Study design

This study includes randomised controlled trials (RCTs) and quasi-experimental studies, focusing on VR OSCE groups versus traditional clinical examination groups.

Exclusion criteria

The exclusion criteria are as follows: (1) outcome measures are inappropriate and relevant data cannot be obtained from original authors; (2) animal experiments, reviews, notes, editorials or errata articles and (3) duplicate published literature.

Study selection and data extraction

Study selection

Preliminary search results are imported into EndNote V.X9. First, we use the software’s ‘Find duplicates’ function to remove duplicate articles by comparing titles and authors. Second, in the manual screening stage, we read the titles and abstracts of the remaining articles and remove those that do not satisfy the eligibility criteria. Third, we obtain and read the full texts of the remaining articles and again remove those that do not satisfy the criteria. Fourth, for articles with missing full texts or original data, we attempt to contact the authors to obtain the information. If the attempt is unsuccessful, we exclude the articles and record the reasons. Finally, the reference lists of the included articles are reviewed for additional studies that may satisfy the inclusion criteria. Two authors (PL and XD) independently conduct the literature retrieval. Disputes are resolved through discussion, and unresolved issues are decided by consulting the research director (HF). The selection process is reported using the PRISMA flow chart.

Data extraction

Data for extraction include the following information: (1) Basic information of each study, including author, publication year and country (or region); (2) Participant characteristics: total sample size, sample size of each group, mean age and gender; (3) Intervention characteristics: study design, specific intervention and control methods, VR intervention duration and comparator; (4) Research results: outcome measurement method, data type, statistical data and results. Outcome data are expressed as mean±SD (M±SD). If data are provided in other formats, such as median and range or median and IQR, then M±SD values are calculated following the recommendations of the Cochrane Handbook for Systematic Reviews of Interventions.40 (5) Other information, including support from funding institutions and potential conflicts of interest. Two of the authors (PL and XD) conduct data extraction independently, and any dispute is settled by discussion. Unresolved disputes are decided by consulting the third author (HF). Data are extracted manually into an Excel table, and the extracted data are then entered into RevMan V.5.3 for meta-analysis.
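As an illustration of the conversion step, the sketch below approximates M±SD from a median and IQR. It assumes the outcome is roughly normally distributed, in which case the median approximates the mean and the width of the IQR is about 1.35 SDs, as noted in the Cochrane Handbook. The function name is hypothetical, and other conversion formulas may be preferred for skewed data or for data reported as a range.

```python
def mean_sd_from_median_iqr(median, q1, q3):
    """Approximate (mean, SD) from a median and IQR.

    Assumption: the data are roughly normal, so mean ~ median and
    SD ~ (Q3 - Q1) / 1.35.
    """
    if q3 < q1:
        raise ValueError("q3 must not be smaller than q1")
    return median, (q3 - q1) / 1.35


# Hypothetical example values, not taken from any included study.
mean, sd = mean_sd_from_median_iqr(median=20.0, q1=15.0, q3=25.0)
```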

Quality assessment of included studies

For randomised trials, we use the Cochrane risk-of-bias (ROB) tool41 to evaluate the ROB of each RCT. The tool covers seven criteria, namely, random sequence generation, allocation concealment, blinding of participants and personnel, blinding of outcome assessment, incomplete outcome data, selective outcome reporting and other biases. The ROB for each criterion is classified as high, unclear or low. For non-randomised studies, we use the Risk Of Bias In Non-randomised Studies of Interventions (ROBINS-I) tool.42 This tool covers bias due to confounding, participant selection, intervention classification, deviation from intended interventions, missing data, outcome measurement and selection of the reported result, together with an overall judgement of bias.

The quality of the evidence will be assessed using the GRADE (Grading of Recommendations Assessment, Development and Evaluation) approach,43 and agreement between the two raters will be quantified with the kappa coefficient. The kappa coefficients will be interpreted according to Landis and Koch44 as follows: 0.00–0.20=slight agreement, 0.21–0.40=fair agreement, 0.41–0.60=moderate agreement, 0.61–0.80=substantial agreement and 0.81–1.00=almost perfect agreement.45
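A minimal sketch of the agreement calculation follows, assuming that agreement is measured with Cohen’s kappa on the two raters’ paired categorical judgements (the protocol does not name the kappa variant). The function names and the example judgements are hypothetical. The ‘poor’ label for negative values comes from Landis and Koch and is not listed in the text above.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' paired categorical judgements."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal counts.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch agreement category."""
    if kappa < 0:
        return "poor"  # category from Landis and Koch, not listed above
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical ROB judgements from the two raters for six studies.
pl = ["low", "low", "high", "unclear", "low", "high"]
xd = ["low", "high", "high", "unclear", "low", "low"]
kappa = cohen_kappa(pl, xd)
label = landis_koch_label(kappa)
```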

The two authors (PL and XD) independently evaluate the ROB for each included study and evidence quality. Any dispute is resolved through discussion. Unresolved disputes are decided on by consulting the third author (HF).

Data synthesis and statistical analysis

Data synthesis

SPSS V.22.0 and RevMan V.5.3 will be used for statistical analysis. For continuous data, if all studies measure an outcome with the same instrument, we pool the weighted mean difference; otherwise, the standardised mean difference is used. For dichotomous data, the odds ratio is calculated. All effect sizes are reported with 95% CIs. A p<0.05 indicates a statistically significant difference.
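For a single study, the two continuous effect measures can be sketched as below. This is an illustration only: it uses a common large-sample variance approximation and the Hedges small-sample correction for the standardised mean difference, which may differ slightly from the formulas implemented in RevMan. The function names and the example numbers are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% CI


def mean_difference(m1, sd1, n1, m2, sd2, n2):
    """Mean difference between two groups with its 95% CI."""
    md = m1 - m2
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return md, md - Z_95 * se, md + Z_95 * se


def standardised_mean_difference(m1, sd1, n1, m2, sd2, n2):
    """Hedges' g (bias-corrected SMD) with an approximate 95% CI."""
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (m1 - m2) / pooled_sd
    j = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    g = j * d
    # Large-sample approximation to the standard error of g.
    se = j * math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    return g, g - Z_95 * se, g + Z_95 * se


# Hypothetical VR OSCE group versus traditional OSCE group.
md, md_low, md_high = mean_difference(10.0, 2.0, 50, 9.0, 2.0, 50)
g, g_low, g_high = standardised_mean_difference(10.0, 2.0, 50, 9.0, 2.0, 50)
```

The mean difference keeps the original scale of the instrument, whereas the standardised mean difference divides by the pooled SD so that studies using different scales can be combined.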

Heterogeneity assessment

The I² statistic is used to assess the level of heterogeneity. According to the Cochrane Handbook, substantial heterogeneity exists when I²>50%. If the p value of the χ² (Q) heterogeneity test is >0.1 and I²<50%, then a fixed-effect model is used; if p<0.1 and I²>50%, then the random-effects model is applied. If conditions permit, we pool quantitative data in a meta-analysis; otherwise, data are presented in narrative form. Sensitivity and subgroup analyses are performed to explore possible sources of heterogeneity.
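The heterogeneity calculation and the model-selection rule can be sketched as follows. The sketch assumes inverse-variance weights and takes the p value of the heterogeneity test as an input, since in practice it is reported by the meta-analysis software. The function names are hypothetical, and the rule returns `None` for combinations that the text above does not cover.

```python
def cochran_q_i2(effects, standard_errors):
    """Return Cochran's Q, its degrees of freedom and I² (in percent)."""
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


def choose_model(p_value, i2):
    """Apply the protocol's rule for choosing the pooling model."""
    if p_value > 0.1 and i2 < 50:
        return "fixed"
    if p_value < 0.1 and i2 > 50:
        return "random"
    return None  # combination not specified in the protocol text


# Hypothetical effect sizes and standard errors for three studies.
q, df, i2 = cochran_q_i2([0.2, 0.5, 0.8], [0.1, 0.1, 0.1])
```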

Subgroup analysis

If significant heterogeneity (I²>50%) is found and the heterogeneity source cannot be detected through sensitivity analysis, then a subgroup analysis is conducted. Subgroups may be defined by basic study information, participant characteristics, intervention methods, intervention duration, sample size or other aspects.

Sensitivity analysis

When heterogeneity is large, the leave-one-out method is used to determine whether it is driven by a single study: we remove one study at a time, repeat the meta-analysis and check whether heterogeneity decreases. Each study is tested in turn to find possible sources of heterogeneity.
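The leave-one-out procedure can be sketched as a simple loop that recomputes I² with each study removed in turn. The sketch assumes inverse-variance weights; the function names and example data are hypothetical.

```python
def i_squared(effects, standard_errors):
    """I² (in percent) from inverse-variance weighted effects."""
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    return max(0.0, (q - df) / q) * 100 if q > 0 else 0.0


def leave_one_out(effects, standard_errors):
    """Return (omitted index, I² of the remaining studies) for each study."""
    effects, standard_errors = list(effects), list(standard_errors)
    results = []
    for i in range(len(effects)):
        rest_effects = effects[:i] + effects[i + 1:]
        rest_ses = standard_errors[:i] + standard_errors[i + 1:]
        results.append((i, i_squared(rest_effects, rest_ses)))
    return results


# Hypothetical data: the fourth study is an outlier.
effects = [0.20, 0.22, 0.18, 0.90]
ses = [0.1, 0.1, 0.1, 0.1]
full_i2 = i_squared(effects, ses)
loo = leave_one_out(effects, ses)
most_influential = min(loo, key=lambda item: item[1])[0]
```

In this example, omitting the fourth study brings I² down to zero, which would flag it as the likely source of heterogeneity.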

Publication bias assessment

When 10 or more studies are available for a meta-analysis, we use a funnel plot to assess publication bias. Specifically, funnel plot symmetry is evaluated through visual inspection and Egger’s test, with a significance level of 5%.46 If fewer than 10 studies are available, then we judge whether publication bias exists according to the characteristics of the included studies.
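Egger’s test can be sketched as an ordinary least-squares regression of the standardised effect (effect divided by its standard error) on precision (the reciprocal of the standard error); an intercept that differs from zero suggests funnel plot asymmetry. The sketch below returns the intercept, its standard error and the t statistic. The p value, obtained by comparing t with a t distribution on n−2 degrees of freedom, is left to statistical software. The function name and example data are hypothetical.

```python
import math


def egger_regression(effects, standard_errors):
    """Return (intercept, SE of intercept, t statistic) for Egger's test."""
    n = len(effects)
    if n < 3:
        raise ValueError("at least three studies are required")
    x = [1 / se for se in standard_errors]                    # precision
    z = [y / se for y, se in zip(effects, standard_errors)]   # standardised effect
    x_bar, z_bar = sum(x) / n, sum(z) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    slope = sxz / sxx
    intercept = z_bar - slope * x_bar
    rss = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    residual_variance = rss / (n - 2)
    se_intercept = math.sqrt(residual_variance * (1 / n + x_bar ** 2 / sxx))
    t = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t


# Hypothetical data in which smaller studies (larger SE) show larger effects.
ses = [0.1, 0.2, 0.25, 0.5]
effects = [0.5 + 2 * se for se in ses]
intercept, se_intercept, t = egger_regression(effects, ses)
```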

Evidence quality

The quality of evidence for each outcome is assessed using the GRADE approach.47 We classify the quality as high, moderate, low or very low according to the consideration of ROB, inconsistency (heterogeneity), indirectness, imprecision and publication bias.48 Each body of evidence starts at ‘high’ quality and is then downgraded according to the problems in each domain. It can also be upgraded when all plausible confounding factors and other biases would increase confidence in the estimated effect. The two authors (PL and XD) rate each domain for each comparison and resolve differences through consensus. Unresolved disputes are decided by consulting the third author (HF).
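The downgrading logic can be sketched as follows. This is a simplified illustration that assumes each domain is downgraded by 0, 1 (serious concern) or 2 (very serious concern) levels; it does not model upgrading, and the names are hypothetical. The actual judgement in each domain remains a matter for the two raters.

```python
LEVELS = ["very low", "low", "moderate", "high"]
DOMAINS = ("risk of bias", "inconsistency", "indirectness",
           "imprecision", "publication bias")


def grade_rating(downgrades, start="high"):
    """Return the evidence quality after applying per-domain downgrades.

    `downgrades` maps a domain name to 0, 1 or 2 levels (an assumption
    of this sketch); the rating cannot fall below 'very low'.
    """
    for domain, steps in downgrades.items():
        if domain not in DOMAINS:
            raise ValueError(f"unknown domain: {domain}")
        if steps not in (0, 1, 2):
            raise ValueError("each domain is downgraded by 0, 1 or 2 levels")
    index = LEVELS.index(start) - sum(downgrades.values())
    return LEVELS[max(index, 0)]


# Hypothetical outcome with serious ROB and serious inconsistency.
rating = grade_rating({"risk of bias": 1, "inconsistency": 1})
```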

Expected dates for research

The literature search is from 30 June 2023 to 31 January 2024, data extraction is from 1 February 2024 to 31 March 2024, quality evaluation is from 1 April 2024 to 30 April 2024, the meta-analysis is from 1 May 2024 to 30 June 2024 and evidence quality evaluation is from 1 July 2024 to 31 July 2024.

Patient and public involvement

Patients and the public were not and will not be involved in the design, conduct, reporting or dissemination of this study.

Future directions and clinical implications

With the continuous improvement and development of VR technology, it has been applied in clinical research and has achieved satisfactory treatment results.49 Previous studies showed that VR plays a certain role in the treatment of psychological conditions in ICU patients, but the specific efficacy remains controversial.39 50 We further analyse which aspects of VR have positive therapeutic effects on the psychological conditions of ICU patients, which aspects have no therapeutic effects, and which have adverse effects. We also explore the possible causes and reasons. How to maximise the advantages of VR in clinical intervention will become the future development direction. This concept has clinical significance for providing more scientific intervention plans and a theoretical basis for the application of VR in the treatment of psychological disorders in ICU patients.

Ethics and dissemination

This protocol study does not carry out clinical research and thus does not require ethical approval. Research findings will be published in a peer-reviewed journal.


This post was originally published on https://bmjopen.bmj.com