Appropriate use of blood cultures in the emergency department through machine learning (ABC): study protocol for a randomised controlled non-inferiority trial

Introduction

Over 20% of adult emergency department (ED) visits occur due to infections.1 Often, physicians will request blood cultures in these patients due to fear of a bloodstream infection (BSI).2 Because of the liberal use of blood cultures, the yield tends to be low. Depending on the setting, only about 1%–15% of blood cultures grow pathogens, of which 30%–55% turn out to be false-positive, contaminated results.3–5 Those contaminated cultures are associated with prolonged hospital stays, unnecessary use of antibiotics and even hospital mortality.3 6 7 To optimise blood culture use, it is crucial to better target which patients would benefit from blood culture analysis.8

A machine learning model based on extreme gradient boosting (XGBoost) that predicts the outcome of blood cultures in the ED has been developed to address this clinical dilemma. It aims to reduce unnecessary tests in low-risk patients and to help avoid the harms associated with false-positive results due to blood culture contamination.9 10 The performance of the model has been internally and externally validated in various settings, including a real-time background evaluation in the electronic health record (EHR) system of the Amsterdam University Medical Centers (Amsterdam UMC, locations VUmc and AMC), where it maintained an area under the receiver operating characteristic curve of 0.76 and showed a potential reduction of approximately 30% in blood culture analyses.9 10 Additionally, the model has shown consistent performance over time despite changes in clinical practice and patient characteristics.11

The aim of this study is to investigate the safety and potential clinical benefits of using the blood culture prediction tool as decision support in reducing unnecessary blood cultures and thereby the negative side effects of false-positive results for patients. This will be the first randomised controlled trial testing such a machine learning prediction tool in clinical practice.

Methods and analysis

This study protocol adheres to the Standard Protocol Items: Recommendations for Interventional Trials statement.

Model development and validation

This section summarises the model development; more detailed descriptions have been published previously.9 10 For the development of the model, data from the VUmc were split into a training set (80%) and a test set (20%), stratified by blood culture outcome. Missing data were imputed with medians. The model was trained on the training set and subsequently validated in the VUmc test set and in three other hospital datasets (Amsterdam UMC location AMC, Zaans Medisch Centrum (ZMC, Zaandam, The Netherlands) and Beth Israel Deaconess Medical Center (BIDMC, Boston, Massachusetts, USA)). These hospitals differ in size and population, comprising two tertiary teaching hospitals and one secondary care hospital in the Netherlands, along with one hospital dataset from the USA. Part of the data was gathered during the COVID-19 pandemic, but this did not affect the model's performance. The model contains 49 variables, including age, sex, 6 vital sign measurements and 18 laboratory tests. Additionally, indicator variables of the laboratory tests were included. Figure 1 shows the 20 most important variables. The area under the receiver operating characteristic curve (AUROC) in the various validation cohorts ranged between 0.75 and 0.81. A prospective validation was also performed by integrating the model into the EHR of VUmc, showing an AUROC of 0.76. With the threshold for blood culture analysis set at 5%, approximately 30% of analyses could be avoided.
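The preprocessing steps summarised above (an outcome-stratified 80/20 split and median imputation) can be illustrated with a minimal Python sketch. It is not the published pipeline: the feature names `crp` and `lactate` are hypothetical, the indicator variables are assumed here to flag missing laboratory values, and the XGBoost fit itself is omitted.

```python
import random
from statistics import median

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with a similar outcome ratio in both sets."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for value in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == value]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)

def fit_medians(rows, features):
    """Median of each feature, computed from non-missing training values only."""
    return {f: median(r[f] for r in rows if r[f] is not None) for f in features}

def impute_with_indicators(rows, medians):
    """Fill missing values with training medians and add a 0/1 missingness flag.

    Assumption: the protocol's 'indicator variables' mark missing lab values.
    """
    out = []
    for r in rows:
        new = dict(r)  # copy, so the input rows are left unchanged
        for f, m in medians.items():
            new[f + "_missing"] = 1 if r[f] is None else 0
            if r[f] is None:
                new[f] = m
        out.append(new)
    return out

# Illustrative data only: 10 positive and 90 negative cultures.
labels = [1] * 10 + [0] * 90
train_idx, test_idx = stratified_split(labels)

# Hypothetical laboratory features; None marks a missing measurement.
train = [{"crp": 12.0, "lactate": 1.1},
         {"crp": None, "lactate": 2.3},
         {"crp": 88.0, "lactate": None}]
test = [{"crp": None, "lactate": 4.0}]
medians = fit_medians(train, ["crp", "lactate"])
imputed_test = impute_with_indicators(test, medians)
```

Computing the medians on the training set and reusing them for the test set keeps information from the held-out data out of the preprocessing step.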

Figure 1

SHAP values for model variables. This figure depicts SHAP (SHapley Additive exPlanation) values, illustrating the individual contributions of each variable to the model’s prediction. In the left panel (blue bars), we see the average impact of the 20 most important features on the prediction. Meanwhile, the right panel not only shows the feature’s contribution but also its direction. Contributions to the left of 0 on the x-axis correspond to negative predictions for blood cultures, while those to the right correspond with positive predictions. Additionally, the colour indicates the predictor’s actual value, with blue indicating low values and red indicating high values.

Study design

This study will be a multicentre randomised controlled non-inferiority trial. Participants will be randomised into an intervention arm and a control arm. In the intervention group, the blood culture prediction model is used to predict the probability of a positive blood culture in the patient. The model will start predicting when sufficient variables are available in the EHR. A new prediction will be made every 20 min until 3 hours have passed since check-in of the patient in the ED. The predictions made by the model will be shown to the physician after randomisation in a dashboard integrated into the EHRs. The blood culture analysis will be cancelled if the probability of a positive culture is less than 5%. This will be done manually by the study team. If the probability is 5% or higher, the blood culture will be analysed as per standard practice. In the control group, all patients will have their blood cultures analysed. The study design is summarised in figures 2 and 3. Patients will be randomised within the EHR system to ensure that physicians cannot be influenced in their treatment decisions by seeing the score before the patient is included. The first patient was included in February 2024. Follow-up of the last patients is expected in February 2027.
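The decision rule described above can be written down as a small Python sketch. The function names are hypothetical, and two details are assumptions rather than protocol statements: that the prediction at exactly 180 min is still made, and that a single probability is passed in (the protocol does not say here which of the repeated predictions drives the cancellation).

```python
THRESHOLD = 0.05  # protocol: cancel analysis below a 5% predicted probability

def prediction_times(first_available_min, interval_min=20, window_min=180):
    """Minutes since ED check-in at which a new prediction would be made.

    Starts once sufficient variables are available and repeats every 20 min
    until 3 hours after check-in (inclusive end is an assumption).
    """
    return list(range(first_available_min, window_min + 1, interval_min))

def culture_action(arm, probability):
    """Return 'cancel' or 'analyse' for a blood culture that was already drawn."""
    if arm == "control":
        return "analyse"  # control arm: all cultures are analysed
    return "cancel" if probability < THRESHOLD else "analyse"

# Example: variables complete 10 min after check-in.
times = prediction_times(10)
low_risk = culture_action("intervention", 0.03)
borderline = culture_action("intervention", 0.05)
control = culture_action("control", 0.01)
```

A probability of exactly 5% leads to analysis, matching the "5% or higher" wording of the rule.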

Figure 2

Consolidated Standards of Reporting Trials flow chart.

Figure 3

Study design flow chart.

Population

All adult patients (aged 18 years or older) presenting to the EDs of multiple hospitals in the Netherlands in whom blood cultures are ordered during their ED stay will be assessed for eligibility. This process is automated within the EHR. The treating physician determines whether there is an indication for performing blood cultures according to the current local standards, after which both the patient and study team are informed about possible study recruitment. Inclusion and exclusion criteria are summarised in table 1. Some patients will be excluded due to the higher possibility of BSIs with a pathogen generally considered a contaminant (eg, a central line in situ) or because of the possibility of severe clinical implications of omitting blood cultures (eg, severe neutropenia). Since the study’s goal is to reduce unnecessary testing, patients for whom blood culture analysis is deemed imperative are excluded (eg, patients with a suspected diagnosis of endocarditis, spondylodiscitis or infected prosthetic material). Furthermore, patients unable to give informed consent and pregnant or breastfeeding patients will be excluded: the ethics review board considers these to be high-risk groups and did not give permission to include them.

Table 1

Overview of inclusion and exclusion criteria

Sample size calculation

An analysis of 5907 unique ED visits with blood culture sampling in Amsterdam UMC showed a 30-day mortality rate of 7.6%, a hospital admission rate of 67.4% and an average length of stay in the hospital of 6.7 days. Based on these numbers, the sample sizes with a relative non-inferiority margin of 1.25 for the rates and 1 day for the length of stay were calculated. The sample sizes were calculated by a statistician based on a one-sided alpha of 5% and a target power of at least 80%. For 30-day mortality, the calculated sample size was 6066 (3033 per arm) to test non-inferiority. After inflation for a dropout rate of 20%, the final sample size is 7584 (3792 per arm). For the secondary outcomes, described in table 2, fewer participants were needed for adequate power.
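To make the arithmetic above easier to follow, the sketch below shows one common normal-approximation formula for a non-inferiority test on the log risk ratio, followed by the dropout inflation. The protocol does not state which formula or software the statistician used, so the first function is an assumption; it lands near, but does not exactly reproduce, the reported 3033 per arm. The dropout step does reproduce the reported 3792 per arm.

```python
from math import ceil, log
from statistics import NormalDist

def n_per_arm_ratio(p, margin, alpha=0.05, power=0.80):
    """Approximate sample size per arm for non-inferiority on a risk ratio.

    Assumes equal true event rates p in both arms, 1:1 allocation and a
    one-sided test on the log risk ratio (an assumed method, not the
    protocol's stated one).
    """
    z = NormalDist().inv_cdf(1 - alpha) + NormalDist().inv_cdf(power)
    variance_term = 2 * (1 - p) / p  # per-subject variance of the log ratio
    return ceil(z ** 2 * variance_term / log(margin) ** 2)

def inflate_for_dropout(n, dropout):
    """Increase n so that n evaluable participants remain after dropout."""
    return ceil(n / (1 - dropout))

# 30-day mortality of 7.6% and a relative margin of 1.25, as in the protocol.
approx_n = n_per_arm_ratio(0.076, 1.25)

# Inflating the protocol's 3033 per arm for 20% dropout.
per_arm_final = inflate_for_dropout(3033, 0.20)
total_final = 2 * per_arm_final
```

Dividing by (1 - dropout) rather than multiplying by (1 + dropout) is what yields 3792 per arm and 7584 in total.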

Table 2

Overview of outcome measures

Recruitment

In order to achieve adequate participant enrolment, an announcement regarding the study is built into the EHR system. When the physician initiates a blood culture request in the EHR for an adult patient in the ED, a pop-up notification will automatically appear. This pop-up briefly outlines the study and presents both inclusion and exclusion criteria. If the patient provides permission to receive information about the study, the study team will be alerted and will provide the patient with additional information and obtain informed consent.

Outcomes

The primary objective is to investigate whether using a machine learning-based blood culture prediction tool is non-inferior to current practice regarding 30-day all-cause mortality. All secondary outcomes are described in table 2. Model performance will be evaluated using the AUROC, the area under the precision-recall curve (AUPRC) and calibration, both in the complete study population and in subgroups based on comorbidities. This will be assessed every 3 months so that model performance can be monitored over time. The model will also continue to undergo prospective validation in the background of the EHR. Additionally, patient-reported outcome measures, such as side effects from antibiotics and quality of life, will be included to determine whether use of the model also affects these. For assessment of quality of life, the EuroQol 5-Dimension 5-level (EQ-5D-5L) questionnaire12 will be filled out at baseline and at 3 months. Lastly, a cost-effectiveness analysis will be performed using cost questionnaires based on the iMTA Medical Consumption Questionnaire (iMCQ) and iMTA Productivity Cost Questionnaire (iPCQ).13 A timeline visualising data collection is shown in table 3.
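As a reminder of what the AUROC used for this monitoring measures, the sketch below computes it directly from its definition: the probability that a randomly chosen positive culture receives a higher predicted score than a randomly chosen negative one, with ties counted as one half. The data are invented for illustration, and the trial's actual evaluation software is not described in the protocol.

```python
def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly.

    Quadratic in the number of samples, which is fine for a small
    illustration but not intended for large datasets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented example: two negative and two positive cultures.
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
example_auc = auroc(example_labels, example_scores)  # 3 of 4 pairs ranked correctly
```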

Table 3

Data collection timeline

Data collection and management

The data underlying this study, including outcome measures such as mortality and admission rates, will be automatically extracted from the EHR system into an electronic data capture system (Castor EDC). The questionnaires regarding patient-reported outcomes and the cost-effectiveness analysis will subsequently be sent out digitally through the EDC system. The data will be stored for 15 years, in agreement with local regulations and with consent. Data management and monitoring will be performed by the study team. Additionally, a monitor from the local independent clinical monitoring centre will monitor the data and perform data validation checks. Participant data underlying the results of this study can be shared with researchers on reasonable request following publication of this work, as allowed under local privacy regulations. Proposals should be directed to the corresponding author, and requestors will need to sign a data access agreement.

Safety, adverse events and monitoring

Adverse events will be monitored by the investigators, including device-related adverse events. Serious adverse events will be reported within a maximum of 15 days to the local authority, and an annual safety report will be submitted to the local authorities. Examples of possible adverse events include delayed appropriate antibiotic treatment, hospital readmission or prolonging of hospital stay. The study risk was assessed using the Nederlandse Federatie van Universitair Medische Centra (NFU) risk classification and classified as moderate risk.14 The study has its own data safety monitoring board (DSMB) consisting of clinicians, a statistician and an epidemiologist, independent of the study team. The DSMB will meet before the first year of the study ends and once a year during the study period. During the study period, the DSMB will be supplied with an interim analysis and will advise the study team based on these results. A meeting will be held to evaluate mortality after 2850 patients have completed the trial. Additional meetings may be requested based on trial events. At least 2 weeks before each meeting, the DSMB will receive reports regarding data quality, recruitment and patients’ safety. The DSMB will report back to the principal investigator regarding any advice they may have. Additionally, the study will be monitored by an independent clinical trial monitoring committee, which performs quality checks on the data and regular on-site auditing.

Allocation and randomisation

Participants will be randomly assigned to either the intervention or control group with a 1:1 allocation through a validated computer-generated random number within the EHR system. This will be a simple randomisation without stratification. Randomisation will occur automatically after consent and after blood cultures have been ordered. Physicians caring for a patient in the control group will not see any algorithm predictions. Physicians caring for a patient in the intervention group will be able to see the algorithm prediction. Due to the nature of the intervention, neither participant nor staff can be blinded to the allocation.
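Simple 1:1 randomisation without stratification amounts to an independent fair draw per participant. The sketch below illustrates this with Python's standard generator; the generator actually used inside the EHR is not described in the protocol, and the seed is only there to make the example repeatable.

```python
import random

def allocate(rng):
    """Assign one participant with probability 0.5 to each arm, independent of all others."""
    return "intervention" if rng.random() < 0.5 else "control"

rng = random.Random(2024)  # fixed seed for a reproducible illustration only
arms = [allocate(rng) for _ in range(1000)]
n_intervention = arms.count("intervention")
n_control = arms.count("control")
```

Because each draw is independent, the two arms will usually be close to, but not exactly, equal in size; this differs from block randomisation, which forces balance.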

Statistical analysis

We will carry out a per-protocol analysis. The primary outcome, the 30-day mortality rate, will be compared between the groups using a non-inferiority test for the ratio of two proportions. This will be a one-sided test with a significance level of 0.05. Additionally, we will perform a multivariable analysis of the primary and key secondary outcomes to adjust for the potential effects of age, Charlson Comorbidity Index, use of immunosuppressive medication, Modified Early Warning Score and resuscitation status. Similar tests will be performed for the in-hospital mortality rate and the hospital admission rate. The hospital length of stay will be analysed as a continuous variable using a non-inferiority test for the difference between the two means. This will be a one-sided t-test with a significance level of 0.05. The model-related outcomes will be assessed using the area under the curve and calibration plots. Baseline characteristics will be presented using descriptive statistics, without statistical testing for differences, since this is a randomised trial.
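One way to carry out a one-sided non-inferiority test for the ratio of two proportions is a Wald test on the log risk ratio, sketched below. The protocol does not specify the exact test statistic, so this choice is an assumption, and the event counts are invented for illustration rather than taken from the trial.

```python
from math import log, sqrt
from statistics import NormalDist

def noninferiority_ratio_test(events_int, n_int, events_ctl, n_ctl,
                              margin=1.25, alpha=0.05):
    """One-sided Wald test of H0: RR >= margin against H1: RR < margin.

    RR is the intervention risk divided by the control risk. Requires at
    least one event in each arm.
    """
    p_int = events_int / n_int
    p_ctl = events_ctl / n_ctl
    log_rr = log(p_int / p_ctl)
    # Standard error of the log risk ratio.
    se = sqrt((1 - p_int) / events_int + (1 - p_ctl) / events_ctl)
    z = (log_rr - log(margin)) / se
    p_value = NormalDist().cdf(z)  # small when RR lies well below the margin
    return {"risk_ratio": p_int / p_ctl, "z": z,
            "p_value": p_value, "non_inferior": p_value < alpha}

# Invented counts: identical mortality in two large arms.
large = noninferiority_ratio_test(230, 3033, 230, 3033)

# Same rates in much smaller arms: too little information to conclude.
small = noninferiority_ratio_test(10, 100, 10, 100)
```

The two examples show that identical observed rates demonstrate non-inferiority only when the arms are large enough for the confidence bound on the ratio to fall below the 1.25 margin.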

Patient and public involvement

Patient participation is important in both the clinical trial and the implementation of artificial intelligence (AI) in clinical practice. The Client Advisory Board of Amsterdam UMC and a former sepsis patient evaluated the study question as relevant and the study design as achievable from the patients’ perspective. They will stay involved during the study as well. To disseminate the results of the study, we will also collaborate with relevant patient organisations. Furthermore, a patient safety and implementation expert was involved in the development of the study protocol and will remain involved during the study period. They will continue to stay involved if the study is deemed successful and the model can be implemented into clinical practice.

Ethics and dissemination

Obtaining informed consent is of the utmost importance when working with study participants. To ensure that participants are well informed about the study before consenting, all potential participants will have the trial clearly explained to them by a trained member of the study team. In addition to the verbal information, a paper information brochure will be provided to each participant (see online supplemental material 1). We inform patients about the procedures and what the consequences of doing or withholding blood cultures can be, as this is the most relevant information for them. Each participant will be made aware of their right to withdraw consent at any given time. If a patient withdraws consent, their data will no longer be used in the study and the study personnel will not follow them up. If the patient was in the intervention group and blood culture analysis was already cancelled, we cannot reinstate the analysis, since the material has been destroyed. Due to the nature of the study, the potential participant will get 1 hour to consider joining the study. If any change to the protocol is made, the ethics review board will be notified and will have to give permission. If the changes are relevant to the participants, they will receive additional information regarding the changes made.

Research results will be published in peer-reviewed journals approximately 1 year after the trial has ended. This will give the research team sufficient time for statistical analyses.

This study will be conducted according to the principles of the Declaration of Helsinki and in accordance with the Medical Research Involving Human Subjects Act, the General Data Protection Regulation and the Medical Device Regulation. The study was approved by the medical ethics review committee of the Amsterdam University Medical Centers (reference number 22.0567). The trial was registered at ClinicalTrials.gov (NCT06163781).

Discussion

Blood cultures are a commonly used diagnostic tool in the ED. Unfortunately, low yields and high rates of contamination lead to unnecessary use of antibiotics, prolonged length of stay in the hospital and even higher mortality. A machine learning model has been validated retrospectively and prospectively and can identify patients at low risk for positive blood cultures. The study described in this protocol is the first randomised controlled trial using machine learning to predict blood culture outcomes. A possible limitation of this trial is that future application may be restricted by the exclusion of severely immunocompromised patients. Future studies may need to focus on this specific subgroup as well. Nevertheless, this study will include a heterogeneous group presenting to the ED and may greatly influence current standards of practice. Used in practice, the model could substantially reduce the number of blood culture analyses and prevent the undesirable effects of false-positive blood culture results.

This post was originally published on https://bmjopen.bmj.com