Missed opportunities in hospital quality measurement during the COVID-19 pandemic: a retrospective investigation of US hospitals’ CMS Star Ratings and 30-day mortality during the early pandemic


  • This study used a nationally representative inpatient data set, encompassing nearly 80% of US hospitals.

  • This study relied on the Centres for Medicare and Medicaid Services (CMS) Star Rating, which is a well-known, publicly reported, and nationally calculated quality benchmark for US hospitals.

  • The results of this study might not be generalisable to the roughly 20% of US hospitals that do not receive a CMS Star Rating, which are typically smaller, rural or critical access hospitals.

  • A small amount of correlation between hospitals’ CMS Star Ratings and their non-COVID 30-day mortality rates may be impacting these results given that 22% of the weight of CMS Star Rating is composed of cardiorespiratory-specific mortality rates.

  • The CMS Star Rating does not have a comparable or well-correlated international measure to allow for the inclusion of non-US hospitals in the analysis.


Patients expect and deserve safe, high-quality care, as well as reliable and valid hospital quality reporting, irrespective of the challenges faced by health systems. At the onset of the COVID-19 pandemic in early 2020, there was widespread understanding among hospital quality stakeholders that focusing on the delivery of high-quality hospital care is even more important during times of health system crisis.1 As the US healthcare system passes the 3-year mark of the COVID-19 pandemic and the public health emergency (PHE) status is lifted, there remain surprisingly few publicly reported analyses exploring the relationships between hospitals’ quality performance, structures and processes leading up to the pandemic and their subsequent pandemic-era quality outcomes. In fact, many hospital quality reporting entities, such as US News & World Report, the Centres for Medicare and Medicaid Services (CMS) and the United Kingdom’s Quality and Outcomes Framework, have limited the pandemic-era outcome data in their rankings, ratings and pay-for-performance programmes by excluding months or years of patient data for both COVID and non-COVID-related hospital encounters.2–5 These decisions were presumably made pragmatically in response to anticipated staffing and resource constraints across both health systems and public/private quality rating stakeholders. We have previously published data questioning the necessity and statistical implications of these decisions,6 but failing to learn from our collective experience represents a missed opportunity for our healthcare system and for our patients. High reliability organisations are defined as organisations with systems in place that are exceptionally consistent in accomplishing their goals and avoiding potentially catastrophic errors.7 However, the dearth of meticulous data analyses during the PHE has hindered our comprehension of the defining traits of hospitals that persistently attained commendable quality outcomes.
Simply put, the well-intentioned decisions to exclude these data were made without evidence to support what amounted to a paradigm shift away from the use of risk adjustment to benchmark hospital outcomes. This effectively set a new precedent that hospitals need not be held accountable for the quality of non-COVID care delivered during certain periods of the pandemic and could have implications for quality reporting in future times of crisis.

In a letter to CMS in 2021, we advocated for providing relevant information to assist patients and consumers to make informed decisions and to allow stakeholders to identify and learn from those hospitals that were able to provide high-quality clinical outcomes in the face of difficult circumstances. We hypothesised that high-quality outcomes during the pandemic may reflect resiliency and a high reliability mindset, supported by pre-existing excellence in hospital processes and structures.8 9 Analyses of hospital-level variation in pandemic-era outcomes could lead to the development of ‘lessons learnt’ or ‘best practices’ documentation, which could better prepare hospitals to maintain high-quality care delivery in the next pandemic or health system crisis. Here, we performed a focused analysis of the relationship between US hospitals’ pre-pandemic quality ratings and early pandemic 30-day mortality among both COVID and non-COVID encounters.


Study design and data source

We assessed risk-adjusted 30-day mortality for both COVID and non-COVID encounters among US Medicare beneficiaries during the early pandemic, stratified by hospitals’ CMS Overall Hospital Star Ratings, which were released in January 2020. For the purposes of this research, we defined the early pandemic as April 2020 through November 2020, a period during which there were no widely available vaccinations. The 2020 CMS Star Rating provided a baseline measurement of hospitals’ publicly reported quality performance that closely aligned with the onset of the pandemic, including outcome data through calendar year 2018. Briefly, the CMS Star Ratings are a well-known general assessment of overall hospital quality, which is scored based on the domains of mortality, readmissions, patient safety, patient experience and timely and effective care.10 To conduct this analysis, we used the Inpatient Standard Analytic File and Medicare Beneficiary Summary File 100% US national samples from 2020. The study team has significant experience analysing these data sets.6 Specifically, we included all Medicare inpatient encounters from 1 April 2020 through 30 November 2020 and linked them to hospitals’ CMS Star Ratings using each hospital’s unique 6-digit CMS provider ID.
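The inclusion and linkage step described above can be illustrated with a minimal sketch. It is not the study's actual pipeline: the field names (`provider_id`, `admit_date`), the use of admission date to define the window, and all records below are assumptions made up for illustration.

```python
from datetime import date

# Study window stated in the text: 1 April 2020 through 30 November 2020.
WINDOW_START = date(2020, 4, 1)
WINDOW_END = date(2020, 11, 30)


def link_encounters(encounters, star_ratings):
    """Keep encounters inside the study window whose hospital has a Star
    Rating, and attach that rating to each retained encounter.

    Assumption: the window is applied to a hypothetical 'admit_date' field.
    """
    linked = []
    for enc in encounters:
        in_window = WINDOW_START <= enc["admit_date"] <= WINDOW_END
        stars = star_ratings.get(enc["provider_id"])
        if in_window and stars is not None:
            linked.append({**enc, "stars": stars})
    return linked


# Made-up records; the 6-character provider IDs are placeholders, not real IDs.
star_ratings = {"000001": 5, "000002": 3}
encounters = [
    {"provider_id": "000001", "admit_date": date(2020, 4, 15)},
    {"provider_id": "000002", "admit_date": date(2020, 12, 2)},  # outside window
    {"provider_id": "000003", "admit_date": date(2020, 6, 1)},   # unrated hospital
]

linked = link_encounters(encounters, star_ratings)
print(len(linked), linked[0]["stars"])
```

In this sketch, encounters at hospitals without a Star Rating drop out at the linkage step, which mirrors how unrated hospitals are absent from the analysis.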

Statistical analysis

We used multivariate logistic regression, with 30-day mortality as the outcome (dependent variable), adjusting for the independent covariates of age (as a continuous variable), sex, Elixhauser mortality index (as a continuous variable), US Census Region (a proxy for region-specific COVID burdens), month (April through November as a categorical variable), hospital-specific January 2020 CMS Star rating10 (1, 2, 3, 4 or 5 stars), COVID diagnosis (binary, indicated by presence of the U07.1 ICD-10 diagnosis code) and COVID diagnosis×CMS Star Rating interaction. This modelling approach using an interaction term for CMS Star Rating and COVID diagnosis is analogous to stratifying by COVID versus non-COVID encounters for the purposes of determining whether pre-pandemic quality was differentially associated with pandemic era mortality for patients hospitalised with, versus without, COVID-19 diagnoses. From this analysis, we reported risk-adjusted 30-day mortality ORs using 5-star hospitals as the reference group among both COVID and non-COVID beneficiaries. ORs above 1.0 indicate worse relative 30-day mortality performance compared with 5-star hospitals, and ORs below 1.0 indicate better relative 30-day mortality compared with 5-star hospitals. As a post hoc sensitivity analysis, we examined the hospital-level correlation between the CMS Star Rating summary score (as a continuous variable) and the CMS Star mortality domain score, which contributes 22% of the weight of the overall Star Rating, by calculating the coefficient of determination (r²) between these two scores.
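To make the role of the interaction term concrete, the sketch below shows how stratum-specific ORs follow from the coefficients of a logistic model of this general form. With 5-star hospitals as the reference, the non-COVID OR for a star tier is the exponential of that tier's main-effect coefficient, and the COVID OR is the exponential of the main effect plus the interaction coefficient. The function name and the coefficient values are hypothetical and chosen only to illustrate the arithmetic; they are not the fitted values from this study.

```python
import math


def odds_ratios(beta_star, beta_interaction):
    """Return (non-COVID OR, COVID OR) for one star tier versus 5-star hospitals.

    beta_star: log-odds main effect of the star tier (5 stars is the reference).
    beta_interaction: log-odds coefficient of the COVID x star-tier product term.
    """
    or_non_covid = math.exp(beta_star)
    or_covid = math.exp(beta_star + beta_interaction)
    return or_non_covid, or_covid


# Hypothetical coefficients, NOT estimates from the study.
hypothetical_coefficients = {4: (0.10, 0.02), 1: (0.20, 0.10)}

for stars, (b_star, b_int) in hypothetical_coefficients.items():
    or_non, or_cov = odds_ratios(b_star, b_int)
    print(f"{stars}-star vs 5-star: non-COVID OR={or_non:.2f}, COVID OR={or_cov:.2f}")
```

An interaction coefficient of zero would make the two ORs identical, which is why a significant interaction indicates that the Star Rating association differs between COVID and non-COVID encounters.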

Patient and public involvement

There was no patient or public involvement. All patient records are deidentified by CMS, so study participants cannot be reached for dissemination.


During this early pandemic time frame (April 2020–November 2020), prior to the development of COVID-specific vaccines, we included 4 473 390 Medicare fee-for-service encounters across 2533 US hospitals. Of these hospitals, 293 (11.6%) received 5 stars, 677 (26.7%) 4 stars, 746 (29.5%) 3 stars, 601 (23.7%) 2 stars and 216 (8.5%) 1 star (table 1). 5-star hospitals had 38 215/626 772 (6.1%) COVID encounters, 4-star hospitals had 75 595/1 116 159 (6.8%) COVID encounters, 3-star hospitals had 92 693/1 219 995 (7.6%) COVID encounters, 2-star hospitals had 84 846/1 087 131 (7.8%) COVID encounters and 1-star hospitals had 38 754/423 333 (9.2%) COVID encounters. There were 92 896 (28.2%) 30-day mortalities among COVID-19 hospitalisations and 387 029 (9.3%) 30-day mortalities among non-COVID hospitalisations. Risk-adjusted ORs for 30-day mortality showed a clear dose–response relationship (figure 1), with significantly greater odds of 30-day mortality among both COVID and non-COVID encounters as CMS Star Ratings decreased. Compared with 5-star hospitals, 4-star, 3-star, 2-star and 1-star hospitals had 18% (95% CI 15% to 22%; p<0.0001), 33% (95% CI 30% to 37%; p<0.0001), 38% (95% CI 34% to 42%; p<0.0001) and 60% (95% CI 55% to 66%; p<0.0001) greater odds of COVID mortality, respectively. Among non-COVID encounters, there were 17% (95% CI 16% to 19%; p<0.0001), 24% (95% CI 23% to 26%; p<0.0001), 32% (95% CI 30% to 3%; p<0.0001) and 40% (95% CI 38% to 42%; p<0.0001) greater odds of mortality at 4-star, 3-star, 2-star and 1-star hospitals (respectively) as compared with 5-star hospitals. The interaction between COVID-19 diagnosis and CMS Star Rating was significant (p<0.0001), and the model performed well with c-statistic=0.77. The coefficient of determination (r²) between the CMS Star Rating summary score and the CMS Star Mortality Domain score was 0.10.
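As a quick arithmetic check, the share of COVID encounters in each Star Rating tier can be recomputed from the encounter counts given above. The sketch below uses only those counts; the variable names are illustrative.

```python
# (COVID encounters, all encounters) per Star Rating tier, copied from the text.
encounters_by_stars = {
    5: (38_215, 626_772),
    4: (75_595, 1_116_159),
    3: (92_693, 1_219_995),
    2: (84_846, 1_087_131),
    1: (38_754, 423_333),
}

# Tier totals should add up to the overall number of included encounters.
total_all = sum(total for _, total in encounters_by_stars.values())

# Percentage of encounters with a COVID diagnosis in each tier, to 1 decimal.
covid_share = {
    stars: round(100 * covid / total, 1)
    for stars, (covid, total) in encounters_by_stars.items()
}

print(total_all)
for stars in sorted(covid_share, reverse=True):
    print(f"{stars}-star: {covid_share[stars]}% COVID encounters")
```

The tier totals sum to the 4 473 390 encounters reported, and the recomputed shares rise steadily from 5-star to 1-star hospitals, consistent with the higher relative COVID burden at lower rated hospitals.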

Table 1

Characteristics of Medicare inpatient encounters from 1 April 2020 to 30 November 2020 by hospital-level CMS Star Rating

Figure 1

ORs and 95% CIs for risk-adjusted 30-day mortality among COVID and non-COVID encounters during the early pandemic by CMS Hospital Overall Star Rating. Blue circles are non-COVID encounters (defined as absence of a U07.1 diagnosis code on the encounter claim); red lines are COVID encounters (defined as presence of a U07.1 diagnosis code on the encounter claim); the reference line at 1.0 indicates the reference group (5-star hospitals). CMS, Centers for Medicare and Medicaid Services.


Our results indicated a significant and clear dose–response increase in early pandemic 30-day mortality among both non-COVID and COVID encounters corresponding with decreasing CMS Hospital Star Ratings. The extent of this increase can be seen even at 4-star hospitals, which had nearly 20% greater odds of 30-day mortality for both COVID and non-COVID patients as compared with 5-star hospitals. These results lend credence to our assertion that we can analyse and learn from high-performing and low-performing hospitals during the pandemic, especially in relation to their pre-existing structures, processes and outcomes related to quality that may have allowed for greater pandemic era resiliency. Such learning will not be possible if data continue to be partially or wholly excluded from hospital quality assessments, which are normally performed in the US, UK and Europe.

There are several limitations of our analysis of CMS Star Ratings. First, there are likely several hospital characteristics that may confound this relationship which we did not adjust for, including bed size, teaching status and socioeconomic factors such as uncompensated care and disproportionate share patient percentages, hospital or regional COVID burdens (as seen in the higher relative proportion of COVID encounters with decreasing Star Rating), ICU utilisation, census, coding or documentation of COVID and comorbidities and other community or social factors.11 Likewise, potential residual confounding including unmeasured differences in patient populations in the early pandemic may be presenting an incomplete picture of the true results. However, our risk model discrimination performed comparably to those from hospital quality ratings and federal pay-for-performance programmes, which likewise do not adjust for the community factors or period-specific population changes named above.2 Second, while 30-day mortality does constitute 22% of a hospital’s CMS Star Rating, the CMS Star Rating only considers mortalities for a narrow set of patients: acute myocardial infarction, coronary artery bypass grafting, congestive heart failure, chronic obstructive pulmonary disease, stroke and pneumonia, which make up a fraction of all inpatient encounters. As such, there is unlikely to be significant collinearity between the CMS Star Rating and 30-day mortality, and, therefore, a hospital’s CMS Star Rating and hospital-wide 30-day mortality performance for all patients are not inherently statistically intertwined measurements. Nevertheless, future analyses may seek to identify specific domains of quality such as patient experience, patient safety or readmissions, which may be correlated with pandemic resiliency. As a post hoc sensitivity analysis, we examined the hospital-level correlation between the CMS Star Rating summary score and the CMS Star Mortality Domain score. 
The coefficient of determination (r²) between these scores was 0.10, implying that the mortality domain explains only approximately 10% of the variance in the overall CMS Star summary score, supporting the notion that collinearity is not unduly impacting the results of our main analysis. Third, this analysis may have some level of bias due to the exclusion of the approximately 20% of US hospitals that were not eligible to receive CMS Star Ratings and for which, therefore, no Star Rating data are publicly available. However, excluded hospitals are typically very small, rural and/or critical access hospitals. Thus, the bias is implicit to the Star Ratings rather than our study design, and our results are likely relevant across all large, academic medical centres in the USA, the vast majority of which qualify for CMS Star Ratings. Finally, there are likely important differences in the inpatient, non-COVID Medicare population in the early pandemic as compared with the pre-pandemic era. Namely, it is likely that the hospitalised non-COVID population was at higher risk of mortality due to fewer elective procedures occurring as hospitals prioritised bed space for the most comorbid patients. Thus, higher rated hospitals may excel in caring for patients with complex conditions and the strength of the associations seen here may be attenuated for routine cases.
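For readers less familiar with the coefficient of determination, the sketch below computes it for two paired score lists as the square of the Pearson correlation, which is what it reduces to when one score is compared with one other score. This is an assumption about the form of the calculation rather than a description of the study's code, and the hospital scores are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def r_squared(xs, ys):
    """Coefficient of determination for a simple pairwise comparison."""
    return pearson_r(xs, ys) ** 2


# Made-up hospital-level scores (hypothetical, not CMS data).
summary_scores = [0.2, -0.5, 1.1, 0.4, -0.9, 0.7]
mortality_domain_scores = [0.1, 0.3, 0.2, -0.4, -0.1, 0.5]
print(round(r_squared(summary_scores, mortality_domain_scores), 3))

# An r-squared of 0.10 corresponds to a correlation of magnitude sqrt(0.10).
implied_abs_r = math.sqrt(0.10)
print(round(implied_abs_r, 3))
```

Read this way, the reported value of 0.10 corresponds to a correlation of roughly 0.32 in magnitude between the two scores.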

Despite these limitations, our results were consistent with what might have been expected intuitively: hospitals’ pre-pandemic CMS Star Ratings were significantly associated with their subsequent risk-adjusted 30-day mortality among both COVID and non-COVID-hospitalised Medicare beneficiaries during the initial 8 months of the pandemic. 5-star hospitals had the lowest risk-adjusted 30-day mortality among both non-COVID and COVID encounters, with a clear and significant dose–response relationship of increasing risk-adjusted mortality associated with decreasing star ratings, such that 1-star hospitals had 40% greater odds of mortality among non-COVID patients and 60% greater odds of mortality among patients with COVID than 5-star hospitals.

Does this prove that all highly rated hospitals performed well during the pandemic, and all low-rated hospitals performed poorly? No, and therein lies the point—through variation, at least some of which is likely attributable to reliable systems in place for resiliency in maintaining high-quality care delivery, some hospitals outperformed their peers, whereas some under-performed. On average, it appears from this initial analysis that higher quality pre-pandemic hospitals were more likely to have better 30-day mortality performance in the early pandemic. Regardless, it is in exploring this variation in hospital characteristics and performance that we can gain insight into what works, what does not work and what may be out of our control entirely during future health system crises. In light of the evidence provided here regarding the better early pandemic 30-day mortality performance of higher rated hospitals, what can be done to compensate for the years of absent pandemic-era quality reporting, and how should quality be measured at the time of future systemic strains on our healthcare system? To date, we have neglected to ask or answer this question, but it is not too late to begin. We propose three next steps to reprioritise and assess hospital quality during the pandemic:

  1. Pandemic era hospital outcomes should be investigated and reported rather than a priori exclusion of months or years of data. Regardless of whether analyses suggest that exclusions are unwarranted,6 or whether results suggest that decisions made by stakeholders over the last several years to exclude pandemic era data2–5 may indeed be justified, the results of both sets of analyses should be publicly reported to allow stakeholders and health service researchers to assess both approaches. This will prevent the unintentional masking of critical successes or gaps in hospital quality during the pandemic that we may be able to learn from moving forward.

  2. New approaches to fairly and adequately risk-adjust for patient-level COVID-19 status and hospital-level COVID-19 pandemic burden should be prioritised, as is already being done for measures such as the Agency for Healthcare Research and Quality’s Patient Safety Indicators.12 We are not the first to suggest that quality measurement methodologies could maintain the traditional risk-adjustment paradigm through the pandemic.13

  3. The above steps will facilitate the identification of hospitals that performed in the top or bottom percentiles of any given risk-adjusted quality outcome during the pandemic. Such analyses could support intentional interinstitution discussion or formation of an expert panel to develop ‘resilience in quality’ best practices for future pandemics. Learning how leaders managed threats to reliability may inform quality initiatives in the aftermath of the pandemic.14 We owe it to our patients to learn from successes and failures to provide the highest quality care, regardless of the adversities faced by our healthcare system during crises.

The results from the analyses suggested in the three steps above need not involve financial punishment or quality reporting embarrassment for low performers, or accolades for high performers. We must rather examine these data in order to promote sensible and intentional discussions among all healthcare leaders who possess vested interests in enhancing quality and resiliency. This is especially poignant given emerging evidence of worsening patient safety event rates for some event types during the pandemic.14 15 We continue to advocate that quality reporting stakeholders should transparently report risk-adjusted pandemic era hospital quality outcomes as opposed to permitting potentially valuable insights to elude society’s grasp and repeating the same mistakes in future pandemics. Health services researchers have made significant strides over the past several decades to cultivate a meaningful, risk-adjusted hospital quality reporting framework. This journey must continue in order to provide timely, valid, reliable hospital quality data, which best support the needs of our patients and the communities we are dedicated to serving.

Data availability statement

No data are available. The study data (Inpatient SAF, MBSF) are not available for sharing via the study investigators per terms of the Data Use Agreement (DUA) with Medicare. However, these same CMS data files may be purchased separately by researchers through an institution-specific DUA with CMS.

Ethics statements

Patient consent for publication

Ethics approval

This study is approved through Mayo Clinic (IRB 19-001210).

This post was originally published on https://bmjopen.bmj.com