If recognised and treated early, patients whose conditions are deteriorating have a lower risk of developing adverse events, such as sepsis and acute kidney injury.1 To ensure these patients receive interventions early, escalation protocols that include evaluation criteria for the patients are commonly established in hospitals.1 2 However, by some accounts, bedside staff only follow protocols in 8% of all hospital adverse events.2–4
Several scoring systems, such as the Early Warning Score (EWS) and the Modified Early Warning Score, have been developed and adopted widely to help clinicians identify patients whose conditions may deteriorate in the hours to come. However, clinical outcomes from the use of EWS have been mixed.5–7 Bedside warnings reported by these scores are often not acknowledged or acted on because bedside staff encounter high false-positive rates and find the warnings to have low actionable value. The perception that warnings are not actionable could be due to the timing of the warning.8 To achieve better predictive performance, researchers have turned to artificial intelligence (AI) and machine learning (ML) algorithms to predict adverse events. However, little is known about how the user interface (UI) design of these systems impacts clinician workload and clinical outcomes.
Since the 1990s, the development and continuous refinement of scoring systems to predict patient deterioration have garnered many reviews of their effectiveness. Lagadec et al found anecdotal evidence that various EWSs are beneficial to clinical staff when implemented.9 However, clinical outcomes also depend on factors other than the EWSs’ predictive performance and incorporated escalation protocols. McNeill et al reviewed studies that included early detection tools in the activation of rapid response teams.1 They concluded that the lack of appropriate integration into clinical workflows and UI design shortcomings might have curtailed these systems’ performance.1 In studying how nurses activate rapid response teams, Wood et al’s review found that mistrust, over-reliance, miscalculation and the lack of understanding of the EWSs contribute to the failure of escalation. In some cases, such failures may place patients at risk.10 With broader adoption of AI and ML algorithms, Muralitharan et al found that, generally, ML algorithms have greater accuracy in predicting clinical deterioration when developed and evaluated retrospectively. However, few studies assess the clinical benefits of these algorithms in the real world.11
There are scoping reviews covering issues surrounding the development and implementation of ML decision support tools in general. However, those reviews have objectives that differ from those of this protocol. Schwartz et al reviewed the level of clinicians’ involvement in developing and implementing any decision support tools used in hospitals.12 Their inclusion criteria were broad, and their analysis did not include design features of the decision support tools. Similarly, Lee et al focused their review on implementation issues of decision support tools without feature analysis.13 To our knowledge, this is the first scoping review that focuses on design features of the UI tools specifically for the early prediction of patient deterioration and adverse events.
Methods and analysis
We will conduct our scoping review under the guidance of the latest version of the JBI Manual for Evidence Synthesis and organise the protocol on the framework of five stages proposed by Arksey and O’Malley: (1) identifying the research question, (2) identifying relevant studies, (3) study selection, (4) extracting the collected data and (5) reporting the results.14–16 For transparency and reproducibility, we will adhere to the reporting guidelines defined in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for scoping reviews.17 Details regarding electronic sources of data, date ranges, and inclusion and exclusion criteria are outlined in the ‘Stage 2 Identifying relevant studies’ section.
We will use Covidence (Veritas Health Innovation), an online systematic reviewing platform, to screen and select studies. Citation management and duplicate detection and removal will be accomplished with EndNote (Clarivate Analytics). We will use a spreadsheet programme to extract and chart our data.
A search for existing reviews was conducted in PubMed (pubmed.gov), Epistemonikos (www.epistemonikos.org), PROSPERO (www.crd.york.ac.uk/PROSPERO) and Open Science Framework (osf.io). None were identified as focusing on the UI of the surveillance tool directed to the clinician.
Stage 1: identifying research questions
We seek to address the following research question constructed with JBI’s ‘PCC’ mnemonic: what approaches, at what frequency, have designers and developers used to present patient deterioration risk information to clinicians?15 Participants in the studies include clinicians who use or represent intended users of automated surveillance tools that supply computed deterioration risk information in clinical decision-making. The key concept we are exploring is evaluations of automated surveillance tools that support the prediction of patient deterioration by measuring user experience, human–system clinical performance, workflow processes or clinical outcomes. The relevant contexts include automated patient surveillance tools in hospital settings in any country.
Stage 2: identifying relevant studies
The second stage of Arksey and O’Malley’s framework is identifying relevant studies. While many studies evaluate algorithms that provide predictions of patient deterioration, this scoping review focuses only on studies that operationalise these algorithms into usable tools with relevant clinician UIs. Settings should be live or simulated clinical settings that incorporate realistic patient data.
An information specialist (MMM) will develop the search string for our primary database (Medline) and translate it to the other preselected databases by database subject terms and keywords. Library colleagues will peer review the strategy using PRESS guidelines.18 An example of the search string is included as an online supplemental appendix.
Before the incentives under the EHR Meaningful Use program in 2009, EHR adoption was low.19 Tools that predict patient deteriorations became technically feasible for design and development only after clinical data were made available electronically. While there may have been decision support tools using automated surveillance before 2009, the potential for the implementation of such tools was limited. Accordingly, we will search for articles from 1 January 2009 to the present.
Electronic sources will include Medline (Ovid), Embase (embase.com), CINAHL Complete (Ebscohost), Cochrane Library (wiley.com), CENTRAL (wiley.com) and IEEE Xplore (IEEE.org). No methodological or language filters will be applied.
We will check the reference lists of included studies for additional relevant studies. Grey literature will not be searched.
The queries will include the following general concepts. Table 1 shows an example of the concepts and example search terms used in the query strings. Medline is our primary database, and our Medline search is designed to be highly sensitive to our research question. Search strategies for the other four databases will be more precise and less sensitive. The exact preliminary search strategy for Medline is included in online supplemental appendix 1.
We will include studies that engage clinicians who use or represent intended users of surveillance tools that supply computed deterioration risk information in clinical decision-making as participants. As a minimum criterion, studies must include participants recruited from outside of the investigating team.
We will include studies that evaluate the UI or user experience of automated surveillance tools that support the prediction, classification or identification of patient deterioration by measuring user experience, human–system clinical performance, workflow processes or clinical outcomes.
Automated surveillance tools are defined as tools that: (1) leverage and aggregate multiple data types that are already being collected within standard care practices, (2) analyse these data dynamically, and (3) provide information to support patient monitoring or clinician decision-making. We limit our review to tools that leverage some form of computational, algorithmic, AI, or ML approach to predict or to classify the risk of patient deterioration in advance of a relevant, clearly defined clinical outcome. Relevant outcomes may include the following: cardiac arrest, stroke, sepsis, acute kidney injury, acute lung injury, haemorrhage, ventilator-associated pneumonia, thrombosis, seizures, syncope, loss of consciousness, or death. Prediction or risk assessment of surrogate outcomes for clinical deterioration will also be included. Examples include transfer to a higher level of care or activation of a rapid response or code team. Emergent treatments such as mechanical ventilation or rescue medication delivery are also relevant outcomes for inclusion.
For the user experience and subsequent outcomes of automated surveillance tools, we are limiting our review to evaluations that engage clinicians in evaluating any part of the system, including:
the UI: the device used for conveying the information such as a phone, pager, or monitor; details of the interface such as display design, message content, risk scoring approach; and integration of information into existing clinical systems such as an EHR or patient monitor.
clinical workflow processes: to whom the information is provided and in what clinical situations.
We will include all English-language articles. Non-English studies appearing to meet inclusion criteria via English abstract will be noted as non-English in our data charting form (and no further data abstracted). Funding for translation services has not been allocated.
We will include evaluations in the context of automated patient surveillance tools in hospital settings in any country.
Any study that engages users in an evaluation of the relevant tool will be included. For example, original studies including observational, cohort, case-control, clinical trial, usability test and qualitative evaluation designs will be included.
In sum, the following inclusion criteria will be applied:
Must include descriptions of tools that are used for the surveillance, prediction and detection of patient deterioration events.
Algorithms must automatically synthesise multiple types of information.
The articles must contain formal evaluation involving human subjects.
Intended end-users must be hospital clinicians.
Naturally, any articles that do not meet the inclusion criteria will be excluded. However, to ensure consistency and agreement among evaluators, the exclusion criteria are outlined as follows:
Studies that only include analysis of algorithm performance without a clinical use.
Studies that describe the UI or architecture designs without an evaluation.
Simple monitors that only trigger on preset thresholds for a single parameter.
Systems that are only intended for epidemiology studies.
Calculators that require manual entry.
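The criteria above can be read as a simple decision rule: a study is included only if it satisfies every inclusion criterion and triggers no exclusion criterion. The following Python sketch is one possible way to record that rule alongside the screening form. It is illustrative only; the class name, field names and the `screen` function are hypothetical and are not part of Covidence or of the protocol itself.

```python
from dataclasses import dataclass


@dataclass
class ScreeningRecord:
    """One reviewer's answers for one article (all field names are hypothetical)."""
    # Inclusion criteria
    describes_deterioration_tool: bool
    synthesises_multiple_data_types: bool
    has_formal_human_evaluation: bool
    end_users_are_hospital_clinicians: bool
    # Exclusion criteria
    algorithm_performance_only: bool = False
    design_description_only: bool = False
    single_parameter_threshold_monitor: bool = False
    epidemiology_only: bool = False
    manual_entry_calculator: bool = False


INCLUSION_FIELDS = (
    "describes_deterioration_tool",
    "synthesises_multiple_data_types",
    "has_formal_human_evaluation",
    "end_users_are_hospital_clinicians",
)
EXCLUSION_FIELDS = (
    "algorithm_performance_only",
    "design_description_only",
    "single_parameter_threshold_monitor",
    "epidemiology_only",
    "manual_entry_calculator",
)


def screen(record):
    """Return (included, reasons); reasons lists every criterion that failed."""
    reasons = [f"inclusion not met: {name}"
               for name in INCLUSION_FIELDS if not getattr(record, name)]
    reasons += [f"exclusion triggered: {name}"
                for name in EXCLUSION_FIELDS if getattr(record, name)]
    return (not reasons, reasons)


# Hypothetical example: an algorithm-only paper with no human evaluation.
decision, why = screen(ScreeningRecord(True, True, False, True,
                                       algorithm_performance_only=True))
print(decision, why)
```

Recording the reasons, rather than only the decision, makes it easier for paired reviewers to see where they disagree when discrepancies are discussed.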
Stage 3: study selection
Pairs of evaluators will screen the title and abstract from the first 20 randomised entries of the queries’ result set for inclusion based on defined criteria. Discrepancies will be resolved through discussions. After resolution, the following 20 studies will be evaluated. This cycle will be repeated until an acceptable kappa agreement of 0.8 is achieved between the reviewers. All titles and abstracts will then be reviewed to identify studies to include for full-text review.
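As a minimal sketch of how the agreement check for each calibration batch could be computed, the Python function below calculates Cohen's kappa for two reviewers. This assumes that the kappa statistic referred to above is Cohen's kappa for two raters; the function name and the example decisions are hypothetical and serve only to illustrate the arithmetic.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who labelled the same items in the same order."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Both raters must label the same non-empty set of items.")
    n = len(ratings_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: from each rater's marginal label frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label for every item
    return (observed - expected) / (1 - expected)


# Hypothetical batch of 20 title/abstract decisions with one disagreement.
reviewer_1 = ["include"] * 8 + ["exclude"] * 12
reviewer_2 = ["include"] * 7 + ["exclude"] * 13
kappa = cohen_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}; calibration target met: {kappa >= 0.8}")
```

In this made-up batch, observed agreement is 19/20 and chance agreement is 0.53, giving a kappa of about 0.89, which would meet the 0.8 target. With batches of only 20 entries, a single disagreement can move kappa substantially, so the value for each batch should be interpreted with caution.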
The subset included for full-text review will be evaluated by two reviewers for inclusion. Discrepancies will be resolved through discussions. If discussions fail to resolve differences, a third reviewer will adjudicate.
As is common for scoping review methodology, we do not plan to conduct a quality assessment of included studies. Our goal is to map the literature rapidly to understand the scope of approaches that have been implemented and evaluated.
Stage 4: data extraction
Electronic spreadsheets will be used in the data extraction process. Three researchers will develop an initial data extraction form and present it to a panel of experts for review and revision. Using the revised form, two researchers will independently perform data extraction on a small sample of articles to evaluate the form’s reliability and clarity by calculating inter-rater agreement. Discrepancies in the extracted data will be resolved by discussion. If new categories are found during the review, they will be added to the extraction form. Redundant categories will be removed, and ambiguous categories will be clarified. The extraction form will be fine-tuned iteratively until good agreement on the extracted data is reached. Core data elements of the data extraction form have been submitted as a supplement to this protocol.
Pairs of researchers will review the included articles and extract data using the finalised extraction form. Differences will be resolved by discussion. A third researcher will adjudicate any unresolved differences.
The following data should be collected:
Definition of patient deterioration.
The clinical workflow and the targeted patient population.
Demographics of the targeted end-users and their professional roles.
The users that are included in the evaluation process, along with their demographics and professional roles.
The design process/method that was used in developing the tool.
Display data: what and how data are displayed in the tool.
Contextual data supporting the prediction or risk assessment.
Evaluation metrics being used to measure the effects of the tool.
The subject focus of the journals.
The extracted data will be classified into categories such as design approach, problem predicted and definitions used to define relevant outcomes. Once classified, the frequency of each of the categories will be counted. We will use descriptive statistics to analyse their frequencies. If available, descriptive statistics will be applied to the sample sizes of the included manuscript. Correlations among related categories will also be analysed.
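The counting step described above can be sketched in a few lines of Python. The example below is a sketch under stated assumptions: the field names (`design_approach`, `outcome`, `sample_size`), the helper functions and the four rows are all hypothetical placeholders, not data or categories from the review. The cross-tabulation is offered only as one simple starting point for examining relationships among categories.

```python
from collections import Counter
from statistics import median

# Hypothetical charted rows; real rows would come from the extraction spreadsheet.
charted = [
    {"design_approach": "user-centred design", "outcome": "sepsis", "sample_size": 12},
    {"design_approach": "not reported", "outcome": "sepsis", "sample_size": 30},
    {"design_approach": "user-centred design", "outcome": "cardiac arrest", "sample_size": 8},
    {"design_approach": "participatory design", "outcome": "acute kidney injury", "sample_size": None},
]


def category_frequencies(records, field):
    """Map each category of `field` to (count, proportion of records with a value)."""
    counts = Counter(r[field] for r in records if r.get(field) is not None)
    total = sum(counts.values())
    return {category: (n, n / total) for category, n in counts.most_common()}


def cross_tabulate(records, row_field, col_field):
    """Count how often each pair of categories occurs together."""
    return Counter((r[row_field], r[col_field]) for r in records)


outcome_freq = category_frequencies(charted, "outcome")
pairs = cross_tabulate(charted, "design_approach", "outcome")
# Sample sizes are summarised only for studies that report them.
reported_sizes = [r["sample_size"] for r in charted if r["sample_size"] is not None]

print(outcome_freq)
print(pairs.most_common())
print("median sample size:", median(reported_sizes))
```

Keeping missing values as `None` rather than zero prevents unreported sample sizes from distorting the summary statistics.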
Stage 5: data reporting
Along with a narrative description of results, frequency counts of each category identified will be reported in tabular formats. Categories, such as defined patient deterioration outcomes, methods of users’ interaction with the systems, and types of information displayed, will be displayed as bar charts or other figure formats for comparison. For example, the types of information displayed in the UI and correlation with definitions of patient deterioration may be displayed as bubble charts.
Change(s) in scoping protocol methodology will be acknowledged and defined in the manuscript.
The queries for other databases are under development, and an initial version of the extraction form has been drafted. We have begun title and abstract screening for articles retrieved with the Medline search. Depending on the size of the result set, the entire project is expected to be completed by April 2022.
Patient and public involvement
Due to the limited scope of our research support, patient and public involvement has not been included as part of the protocol.
Ethics and dissemination
Ethics review is not required for this scoping review. Findings will be disseminated through peer-reviewed publications.