Multivariate time series approaches to extract predictive asthma biomarkers from prospectively patient-collected diary data: a systematic review

Studies identified

The literature search yielded 1930 results across the four databases, of which 377 were excluded as duplicates. The remaining titles and abstracts were screened and narrowed down to 65 results for which full-text articles were sought. Using the predefined selection criteria, 48 of these results were excluded. Reasons for exclusion were: conference abstract with no related full-text publication (n=4); conference abstract with the full text included elsewhere in the literature search (n=4); data collection that was too infrequent (ie, less often than once daily over the course of at least 2 weeks) (n=24); publication did not include any data (n=7); variables beyond the scope of the review (eg, studies that analysed forced expiratory volume in one second (FEV1), immunological markers or airway impedance) (n=8); and publication in a language for which translation services were not available (n=1). This left 17 studies for inclusion in the review. Their bibliographies were also searched for relevant papers, from which an additional six studies were identified for inclusion. Additionally, the bibliography of a systematic review that summarised the use of artificial intelligence (AI) in asthma18 was searched for potentially relevant studies, and yielded one additional study. Overall, 24 studies were included in the review. A Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow diagram19 is shown in figure 1. Online supplemental table 1 summarises the 24 included studies.
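The selection counts reported above can be tallied as a quick consistency check. The sketch below only restates the numbers given in the text; the variable names and dictionary keys are illustrative labels, not terms from the review.

```python
# Counts as reported in the text; names are illustrative only.
records_identified = 1930
duplicates_removed = 377
full_text_sought = 65

full_text_exclusions = {
    "abstract_without_full_text": 4,
    "abstract_with_full_text_elsewhere": 4,
    "data_collection_too_infrequent": 24,
    "no_data": 7,
    "variables_out_of_scope": 8,
    "no_translation_available": 1,
}

from_included_bibliographies = 6
from_ai_review_bibliography = 1

# Records left for title and abstract screening after de-duplication.
records_screened = records_identified - duplicates_removed

# Studies retained from the database search after full-text assessment.
included_from_search = full_text_sought - sum(full_text_exclusions.values())

total_included = (
    included_from_search
    + from_included_bibliographies
    + from_ai_review_bibliography
)
```

The exclusion reasons sum to the 48 excluded full texts, which leaves the 17 studies from the search and the 24 studies overall.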

Figure 1

Preferred Reporting Items for Systematic Reviews and Meta-Analyses flow chart illustrating study selection (modified from Page et al).19

Diary variable usage

The usage of diary variables in the studies is shown in figure 2A. Among the included studies, PEF was the most used diary variable, with 18 studies including it in their analyses. Conversely, night-time awakenings were the least used, with only eight studies including them in their analyses. Symptom scores and short-acting bronchodilator reliever use were used in 14 and 11 of the studies, respectively. Nine of the included studies used only one diary variable in their analyses, and six used all four. FeNO was included in four of the studies, three of which used it as a diary variable.

Figure 2

Breakdowns of the included studies by (A) diary variable usage and (B) biomarker extraction methods.

Biomarker extraction methods

The methods used to extract biomarkers from diary variables are summarised in figure 2B. Several studies used simple summary measures to quantify the behaviour of the diary variables throughout the observation periods. These include moving averages,20 21 diurnal variability,22 seasonal/periodic averages,23–27 coefficient of variation26 28 29 and autocorrelation.29 Overall, these studies showed that increased variability in the diary variables is associated with more adverse outcomes, namely exacerbation risk, loss of asthma control and treatment failure with inhaled steroids. Additionally, higher symptom scores, reliever use and night-time awakenings were associated with increased exacerbation risk or occurrence, and with poorer asthma control. Conversely, decreases in PEF were associated with increased exacerbation risk or occurrence. The cross-correlation between daily FeNO and symptom scores was also associated with moderate exacerbation risk, where stronger correlations between the two variables were associated with increased risk.29
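To make the summary measures concrete, the following is a minimal sketch of how a moving average, coefficient of variation, lagged autocorrelation and diurnal variability might be computed from a daily diary series. The exact definitions used by the individual studies are not given here, so the formulas below (population SD, a single-lag autocorrelation, and diurnal variability as the morning-evening difference over their mean) are assumptions, and all function names are hypothetical.

```python
from statistics import mean, pstdev


def moving_average(values, window):
    """Trailing moving average; one value per complete window."""
    return [
        mean(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]


def coefficient_of_variation(values):
    """SD divided by the mean (population SD assumed here)."""
    return pstdev(values) / mean(values)


def autocorrelation(values, lag=1):
    """Sample autocorrelation of the series at the given lag."""
    m = mean(values)
    denominator = sum((v - m) ** 2 for v in values)
    numerator = sum(
        (values[i] - m) * (values[i + lag] - m)
        for i in range(len(values) - lag)
    )
    return numerator / denominator


def diurnal_variability(morning, evening):
    """Daily percentage difference between morning and evening readings,
    relative to their mean (one of several possible definitions)."""
    return [
        abs(e - m) / ((e + m) / 2) * 100
        for m, e in zip(morning, evening)
    ]
```

Each function takes a plain list of daily values, so the same helpers could be applied to PEF, symptom scores or reliever use.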

DFA, a non-parametric approach, was used in six of the studies, five of which applied it to time series of PEF recordings and one to time series of FeNO measurements. DFA quantifies the strength of long-range correlations in a time series through the resulting long-range scaling coefficient, denoted by α. Four of the PEF studies28 30–32 used DFA to extract biomarkers as potential predictors of asthma PROs, and the other33 used DFA solely to simulate additional PEF time series. These studies show that α is related to asthma PROs, specifically the risk of exacerbations and airway obstruction. Some studies report that a lower α is indicative of increased risk of airway obstruction,30 but some found that higher values may indicate a risk of treatment failure with inhaled steroids when coupled with an increase in the coefficient of variation of PEF.28 Lower α values were also found in patients with uncontrolled asthma, but α values did not differ significantly between asthma severity groups.32 The DFA coefficient α from PEF during the placebo period was also shown to predict treatment response to salmeterol but, notably, not to salbutamol: higher values of α during the placebo period were associated with improved treatment response. DFA was also applied to time series of FeNO data, and one study found significantly increased α in patients who had experienced an exacerbation.34
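As a rough illustration of how a scaling coefficient α can be obtained from a time series, the sketch below implements a basic first-order DFA: integrate the mean-centred series, remove a linear trend within non-overlapping windows, and take the slope of log fluctuation against log window size. The choice of linear detrending, the window sizes and the function names are assumptions made for illustration; the reviewed studies may have used different variants.

```python
import math
import random


def _linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dfa_alpha(series, window_sizes):
    """Estimate the DFA scaling coefficient (first-order detrending).

    Assumes a non-constant series and window sizes of at least 3 that
    are no longer than the series.
    """
    series_mean = sum(series) / len(series)

    # Step 1: integrate the mean-centred series to obtain the profile.
    profile, running = [], 0.0
    for value in series:
        running += value - series_mean
        profile.append(running)

    log_sizes, log_fluctuations = [], []
    for size in window_sizes:
        n_windows = len(profile) // size
        positions = list(range(size))
        squared_residuals = 0.0
        # Step 2: detrend each non-overlapping window with a straight line.
        for w in range(n_windows):
            segment = profile[w * size : (w + 1) * size]
            slope, intercept = _linear_fit(positions, segment)
            squared_residuals += sum(
                (y - (slope * t + intercept)) ** 2
                for t, y in zip(positions, segment)
            )
        # Step 3: root-mean-square fluctuation at this window size.
        fluctuation = math.sqrt(squared_residuals / (n_windows * size))
        log_sizes.append(math.log(size))
        log_fluctuations.append(math.log(fluctuation))

    # Step 4: alpha is the slope of log F(n) against log n.
    alpha, _ = _linear_fit(log_sizes, log_fluctuations)
    return alpha


# Usage on synthetic data: uncorrelated noise is expected to give
# a value near 0.5.
random.seed(0)
white_noise = [random.gauss(0, 1) for _ in range(2000)]
alpha_noise = dfa_alpha(white_noise, [8, 16, 32, 64, 128])
```

For reference, uncorrelated noise is expected to give α near 0.5 and a random walk near 1.5, so values between these indicate persistent long-range correlations.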

Several studies used prespecified threshold changes in diary variables over prespecified windows of time to develop markers and surrogate or early endpoints of asthma PROs. Fuhlbrigge et al aimed to develop an intermediate endpoint for asthma exacerbations using diary variables.35 The endpoint was defined based on prespecified threshold changes, or worsening (slope) greater than some prespecified magnitude, over at least 2 or 5 days, respectively. These thresholds were amalgamated with the American Thoracic Society (ATS)/European Respiratory Society (ERS) definition of asthma exacerbations, defined by oral steroid treatment utilisation,3 yielding a composite score. The final endpoint, denoted CompEx, included only PEF, reliever use and symptom scores (CompEx-PRS). CompEx-PRS increased the frequency of identified exacerbation events 2.8-fold, while preserving the treatment effect sizes observed on exacerbations.

Kupczyk et al also used multiple diary variables and aimed to find a proxy for exacerbations.36 A 20% decrease in PEF or a 20% increase in day symptoms on two consecutive days was able to detect severe exacerbations with a sensitivity of 65% and a specificity of 95%, where combining the two improved the overall predictive performance.
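A threshold rule of this kind can be sketched as follows. The text does not state the reference value against which the 20% change is measured, nor whether the same criterion must be met on both days, so the sketch assumes caller-supplied baselines and treats a day as "worse" if either criterion is met. The function and parameter names are hypothetical.

```python
def flag_consecutive_worsening(
    pef,
    symptoms,
    pef_baseline,
    symptom_baseline,
    relative_change=0.20,
    run_length=2,
):
    """Flag each day that completes a run of `run_length` consecutive
    days on which PEF has fallen by at least `relative_change` or the
    symptom score has risen by at least `relative_change`, both
    relative to the supplied (assumed) baselines."""
    pef_limit = pef_baseline * (1 - relative_change)
    symptom_limit = symptom_baseline * (1 + relative_change)

    flags = []
    streak = 0
    for daily_pef, daily_symptoms in zip(pef, symptoms):
        worse = daily_pef <= pef_limit or daily_symptoms >= symptom_limit
        streak = streak + 1 if worse else 0
        flags.append(streak >= run_length)
    return flags


# Usage with made-up values: PEF drops below 80% of baseline on days 2 and 3.
example_flags = flag_consecutive_worsening(
    pef=[400, 310, 300, 400],
    symptoms=[1, 1, 1, 1],
    pef_baseline=400,
    symptom_baseline=1,
)
```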

Honkoop et al aimed to validate optimal action points of PEF and symptoms to aid with early detection of exacerbations.37 The optimal combination (PEF and symptoms) action point comprised an increase of more than 2SD of the symptom score from the run-in mean, and a decrease of PEF to <70% of their personal best. This action point detected exacerbations 1.4 days before their occurrence with 80.5% sensitivity and 98.3% specificity.
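The combined action point described above, together with the sensitivity and specificity used to evaluate it, can be illustrated with a short sketch. It assumes the sample SD of the run-in symptom scores and requires both conditions to hold on the same day; these details and all names are assumptions rather than the authors' implementation.

```python
from statistics import mean, stdev


def action_point_days(symptoms, pef, run_in_symptoms, personal_best_pef):
    """Mark days on which the symptom score exceeds the run-in mean by
    more than 2 SD and PEF is below 70% of the personal best."""
    symptom_threshold = mean(run_in_symptoms) + 2 * stdev(run_in_symptoms)
    pef_threshold = 0.70 * personal_best_pef
    return [
        s > symptom_threshold and p < pef_threshold
        for s, p in zip(symptoms, pef)
    ]


def sensitivity_specificity(predicted, actual):
    """Sensitivity and specificity from paired boolean sequences."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    true_neg = sum(1 for p, a in pairs if not p and not a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    return (
        true_pos / (true_pos + false_neg),
        true_neg / (true_neg + false_pos),
    )
```

Because the symptom threshold is derived from each patient's own run-in period and the PEF threshold from their personal best, the rule is individualised rather than based on population cut-offs.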

Wu et al used simple thresholds to aggregate daily diary card scores into a symptom score for each 4-month block and evaluated its associations with severe exacerbation occurrence.27 Symptom scores were associated with severe exacerbations: patients with more blocks of persistent symptoms were more likely to experience more exacerbations during the 4-year study.

Spencer et al validated a composite measure of asthma control.38 The measure comprised daytime symptom score, rescue beta2-agonist use, morning PEF, night-time awakening, asthma exacerbations, emergency visits and treatment-related adverse events, and used simple prespecified thresholds to determine asthma control level. The resulting measure discriminated well between levels of other measures of asthma control, both cross-sectionally and longitudinally.

Van Vliet et al compared two methods for assessing asthma control, namely prospective symptom and lung function monitoring versus retrospective recall using the Asthma Control Questionnaire (ACQ).39 Prospective assessment used daily symptom questionnaires and FEV1 values, with thresholds based on Global Initiative for Asthma (GINA) control criteria40 applied to classify the level of asthma control on a weekly basis. Conversely, retrospective assessment of asthma control was conducted using the ACQ during routine clinic visits. There was low concordance between the two methods, but prospective monitoring appears to provide a more realistic picture of patient health, potentially because it minimises the recall bias inherent in retrospective assessment.

Frey et al30 and Thamrin et al33 both used threshold changes of PEF to calculate the conditional probability, denoted by π, of an airway obstruction occurring within a certain time period given a patient’s current PEF value. Airway obstruction was defined as PEF <80% (moderate) or PEF <60% (severe) of the age-predicted and height-predicted normal values.40 As previously mentioned, Frey et al found that airway obstruction risk was associated with increased variability and loss of deterministic behaviour of PEF. Thamrin et al found that π was associated with actual occurrences of airway obstructions. Additionally, π was shown to be associated with future exacerbation risk, where an increase in this probability was associated with an increase in the OR of having a future exacerbation.
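One simple way to picture a conditional probability of this kind is as an empirical frequency: among the days on which PEF lies in a given range, the fraction that are followed by a reading below the obstruction threshold within a fixed horizon. The sketch below takes that frequency-counting view purely for illustration; it is not the estimation method used by Frey et al or Thamrin et al, and the function name, the binning of the current PEF value and the horizon length are all assumptions.

```python
def conditional_obstruction_probability(
    pef_percent_predicted,
    current_low,
    current_high,
    threshold=80.0,
    horizon=7,
):
    """Empirical estimate of the probability that PEF (% predicted) falls
    below `threshold` within the next `horizon` days, given that the
    current value lies in [current_low, current_high).

    Returns None if no day falls in the requested range."""
    eligible_days = 0
    days_followed_by_obstruction = 0
    for day in range(len(pef_percent_predicted) - horizon):
        current = pef_percent_predicted[day]
        if current_low <= current < current_high:
            eligible_days += 1
            following = pef_percent_predicted[day + 1 : day + 1 + horizon]
            if any(value < threshold for value in following):
                days_followed_by_obstruction += 1
    if eligible_days == 0:
        return None
    return days_followed_by_obstruction / eligible_days
```

Setting `threshold=60.0` would correspond to the severe obstruction definition given above.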

Greenberg et al used a threshold-based approach to develop a composite score, named ADAS-6, comprising rescue beta-agonist use (daily use and diurnal variability), PEF diurnal variability and night-time awakenings, as well as FEV1 % predicted and AQLQ (symptom domain score), to determine the level of disease activity in patients.41 The authors defined disease activity based on high and/or low cut-offs for the following variables: daytime symptom score, night-time awakenings, average rescue beta-agonist use, AQLQ score (activity domain), FEV1 % predicted, and asthma attacks. ADAS-6 was discriminative of disease activity and demonstrated content and convergent validity. Judging by their standardised coefficients, each of the six included variables contributed to the regression models in a relatively balanced manner.

Of the studies in the review, four used ML and AI to analyse data that included the diary variables of interest. These studies aimed to build predictive models. Several algorithms were used, namely ensemble learning, naïve Bayes, support vector machines (SVMs), adaptive Bayesian networks, XGBoost, one class SVM, logistic regression, decision trees and perceptrons. ML models demonstrated good predictive performance in the studies.

Khasha et al used an ensemble model, which combined numerous disease-related variables and medical knowledge to detect asthma control level, which was determined using a rule-based classifier derived from the physicians’ knowledge.42 The resulting classifier performed well, with an accuracy of over 91%. Interestingly, among the large number of variables used in the algorithm, morning and evening PEF, along with ACT score as a measure of daily symptoms, were the most important features.

Finkelstein and Jeong used diary data collected through telemonitoring and evaluated three ML methods to build a predictive model for early prediction of asthma exacerbations.43 The authors used a naïve Bayesian classifier, adaptive Bayesian network and SVM, of which the adaptive Bayesian network performed best, resulting in a perfect classification in terms of sensitivity, specificity and accuracy when a 7-day window was used to predict an exacerbation on the eighth day.
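The windowing scheme described here, in which the preceding 7 days of diary data are used to predict an exacerbation on the eighth day, can be sketched as a simple sliding-window construction of training examples. The data layout (one record per day, one flag per day) and the function name are assumptions for illustration, and the sketch covers only the data preparation, not the classifiers themselves.

```python
def build_windowed_samples(daily_records, exacerbation_flags, window=7):
    """Pair each run of `window` consecutive daily records with the
    exacerbation flag of the day immediately after the run."""
    samples = []
    for target_day in range(window, len(daily_records)):
        features = daily_records[target_day - window : target_day]
        label = exacerbation_flags[target_day]
        samples.append((features, label))
    return samples


# Usage with placeholder data: 10 days of records, exacerbation on the last.
records = list(range(10))
flags = [0] * 9 + [1]
samples = build_windowed_samples(records, flags, window=7)
```

Each resulting sample could then be flattened into a feature vector and passed to whichever classifier is being evaluated.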

Zhang et al were also interested in predicting exacerbation occurrence using daily diary data, specifically whether an exacerbation occurred on the same day or up to 3 days in the future.44 The authors evaluated the performance of several ML methods, namely logistic regression, decision tree, naïve Bayes classifier and perceptron algorithms. The best performing model was logistic regression applied to data processed using principal component analysis, which achieved ROC=0.85, sensitivity=90% and specificity=83% for detecting severe asthma exacerbations.

De Hond et al45 developed and compared predictive models for the early detection of severe asthma exacerbations, using a 2-day prediction horizon. The authors compared the performances of two ML models (XGBoost and one class SVM), a logistic regression model and a simple asthma action plan. The logistic regression model (AUC=0.88) outperformed the XGBoost model (AUC=0.81), as well as the one class SVM model. Notably, both the XGBoost and logistic regression models reached higher discriminative performance compared with the simple clinical rule.
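Since the models here are compared by AUC, a brief sketch of what that quantity measures may be useful: it is the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The pairwise implementation below, with ties counted as one half, is a generic illustration with a hypothetical function name and is not taken from any of the reviewed studies.

```python
def area_under_roc_curve(scores, labels):
    """AUC computed by comparing every positive score with every
    negative score; ties contribute one half."""
    positives = [s for s, label in zip(scores, labels) if label]
    negatives = [s for s, label in zip(scores, labels) if not label]
    if not positives or not negatives:
        raise ValueError("Both classes must be present to compute AUC.")

    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes.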

Regression was the most utilised class of methods for assessing the associations between the extracted biomarkers and the PROs. Of the included studies, 15 used a regression method in their analyses. These spanned many different classes of models, including linear, multinomial, random effects and Cox models. A few studies used more complex regression models, such as repeated time-to-event analysis21 and generalised estimating equations.27

This post was originally published on https://bmjopen.bmj.com