Multicentre external validation of the prognostic model kidney failure risk equation in patients with CKD stages 3 and 4 in Peru: a retrospective cohort study

Introduction

Chronic kidney disease (CKD) represents a significant global public health challenge, imposing a substantial economic burden on healthcare systems and exhibiting an ever-increasing disease prevalence.1 2 The global prevalence of CKD is estimated to exceed 10%.2–4 However, the impact of CKD is particularly pronounced in low and middle-income countries, where limited resources and fragmented healthcare systems exacerbate the issue.2 5 6 In Peru, a middle-income country in Latin America, over 2.5 million adults are afflicted with some degree of CKD, with an estimated national prevalence ranging between 16% and 20%.7–10

Timely referral of CKD patients to nephrologists can significantly mitigate healthcare costs as CKD progresses towards kidney failure and, ultimately, death. In healthcare systems with limited established health networks and a scarcity of nephrology specialists, early referral is of paramount importance.11 CKD progression risk prediction models serve as valuable tools in clinical decision-making, guiding the timing of nephrologist referrals, offering counsel on kidney replacement therapy (KRT) options, and aiding in the planning of vascular access to prevent abrupt and unplanned emergency admissions.12 13 Accurate short-term predictions are particularly crucial in situations where establishing precise individualised risk is critical.14–16 Conversely, long-term predictions may be more informative in identifying patients who should remain under primary care for secondary prevention, treatment, and follow-up.14–16

International guidelines recommend using individualised risk prediction models to inform the appropriate time for nephrologist referral and KRT planning.17–20 However, in Peru, referral recommendations are primarily based on the isolated or combined use of estimated glomerular filtration rate (eGFR), albuminuria and urine albumin:creatinine ratio (uACR) thresholds.21 These criteria do not incorporate individualised risks, potentially leading to the unnecessary referral of low-risk patients and the failure to refer high-risk patients.17 22

The kidney failure risk equation (KFRE) is an individualised risk prediction equation for CKD patients’ progression to kidney failure.23 Some international guidelines have recommended KFRE for CKD management.14 18 19 24 The KFRE exists in an eight-variable and a four-variable version. The latter, requiring age, gender, eGFR and uACR, is an attractive alternative in resource-limited settings like Peru, as these variables are relatively easily accessible. The eight-variable version, while offering some predictive improvements, includes serum calcium, phosphate, bicarbonate and albumin, which are not widely available tests, thus limiting its usability. However, the predictive accuracy of these models can significantly vary between populations, underlining the necessity of validating their predictive performance in the population where they are to be applied.25–27 Regrettably, the lack of external validation studies of KFRE in the Latin American population, including Peru, has delayed its adoption in local clinical practice.

Therefore, this study aims to conduct an independent external validation of the predictive performance of the four-variable KFRE model in a large and diverse sample of insured patients assigned to various health establishments of the Social Health Insurance (EsSalud) in Lima, Peru, for predicting the risk of kidney failure at 2 and 5 years.

Methods

Design, study population and data source

We conducted a retrospective cohort study following the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis guidelines (see TRIPOD checklist in online supplemental material)27 28 to validate the KFRE model for predicting the risk of kidney failure at 2-year and 5-year horizons in patients with CKD.

As an initiative of the EsSalud National Kidney Health Plan, the Rebagliati Healthcare Network established an electronic registry for patients with CKD receiving treatment across its health facilities, ranging from primary to tertiary care. We extracted demographic and clinical data from the electronic medical records of the EsSalud Hospital Management System and the Chronic Kidney Disease Management Unit (UMERC) of the Kidney Health Surveillance Subsystem application.29

Our study population comprised patients aged 18 years or older diagnosed with CKD between 1 January 2013 and 31 December 2017, who received treatment in 17 primary care healthcare centres within the Rebagliati Healthcare Network in Lima, Peru. As the capital city, Lima hosts the majority of insured patients at the national level. We included patients with an eGFR between 15 mL/min/1.73 m² and 60 mL/min/1.73 m², corresponding to categories 3a, 3b and 4 of the Kidney Disease Improving Global Outcomes classification20 and those with a recorded quantifiable uACR measurement taken concurrently with eGFR. The date of ACR measurement marked the commencement of the follow-up period and the point for predicting the risk of kidney failure using KFRE.
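The eGFR eligibility window described above can be illustrated with a minimal sketch. The function names are hypothetical, the category cut-offs follow the usual KDIGO GFR categories, and treating the lower bound as inclusive and the upper bound as exclusive is an assumption, since the text only says "between".

```python
def gfr_category(egfr: float) -> str:
    """Return the KDIGO GFR category for an eGFR in mL/min/1.73 m²."""
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


def is_eligible(egfr: float, has_quantifiable_uacr: bool) -> bool:
    """Eligibility sketch: categories G3a, G3b or G4 plus a concurrent uACR.

    Assumption: 15 <= eGFR < 60 (boundary handling is not stated in the text).
    """
    return gfr_category(egfr) in {"G3a", "G3b", "G4"} and has_quantifiable_uacr


print(gfr_category(38.0))        # G3b
print(is_eligible(38.0, True))   # True
print(is_eligible(72.0, True))   # False
```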

We validated the KFRE models in two populations of interest: (1) a broad population of patients with CKD stages 3a, 3b, and 4 (3a–4), as originally validated and (2) a more specific population with more advanced CKD stages (3b–4).

Patient and public involvement

There was no direct patient and public involvement in the design, conduct, or reporting of this study.

Sample size

Given that we had full access to all the electronic data from the electronic medical records of the EsSalud Hospital Management System, we did not perform a sample size calculation. However, we were mindful of the potential unreliability of performance assessment with inadequate sample sizes, especially when the number of events is low. To mitigate this, we focused our analysis on clinically relevant subpopulations, ensuring a minimum of 100 events and non-events in each group.27 Given these considerations, we deemed it impractical to analyse groups 3a, 3b and 4 separately. Consequently, we amalgamated them into subgroups 3a–3b–4 and 3b–4 for the purpose of our analysis.

Validation model

Tangri et al developed the KFRE model in Canada in 2011 to predict kidney failure in populations with CKD stages 3–5.23 Subsequently, the model underwent recalibration based on a comprehensive meta-analysis encompassing 31 cohorts from over 30 countries and involving more than 72 000 participants.30 The four-variable KFRE, which includes age, patient sex, eGFR and ACR, collected concurrently for each patient, presents an appealing alternative due to its reliance on a limited number of variables that are readily available within the Peruvian health system. This version offers two prediction horizons: a short-term 2-year prediction and a long-term 5-year prediction (see equations in online supplemental table S1).
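As an illustration of how the four-variable equation turns the predictors into a risk, the sketch below uses the coefficients and baseline survival values commonly reported in the literature for the recalibrated non-North American KFRE. These constants are an assumption of this sketch and should be checked against online supplemental table S1, which remains the authoritative source for the equations used in this study. The function name is hypothetical.

```python
import math

# Baseline survival for the non-North American four-variable KFRE, as commonly
# reported in the literature (assumption: verify against supplemental table S1).
BASELINE_SURVIVAL = {2: 0.9832, 5: 0.9365}


def kfre4_risk(age: float, male: bool, egfr: float, acr_mg_g: float,
               horizon: int = 2) -> float:
    """Predicted probability of kidney failure within `horizon` years (2 or 5).

    age in years, eGFR in mL/min/1.73 m², ACR in mg/g (must be > 0).
    """
    linear_predictor = (
        -0.2201 * (age / 10 - 7.036)
        + 0.2467 * ((1.0 if male else 0.0) - 0.5642)
        - 0.5567 * (egfr / 5 - 7.222)
        + 0.4510 * (math.log(acr_mg_g) - 5.137)
    )
    return 1 - BASELINE_SURVIVAL[horizon] ** math.exp(linear_predictor)


# Hypothetical patient: 70-year-old man, eGFR 35, ACR 300 mg/g.
for years in (2, 5):
    print(years, round(kfre4_risk(70, True, 35, 300, horizon=years), 4))
```

Because the two horizons share the same linear predictor and differ only in baseline survival, the 5-year risk for a given patient is always higher than the 2-year risk.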

Predictors

The four predictors of the four-variable KFRE are age (years), sex (male/female), eGFR (mL/min/1.73 m²) and ACR (mg/g) (see online supplemental table S2). The eGFR was estimated using the 2009 CKD Epidemiology Collaboration formula20 31 (see online supplemental methods, section 1.3 for details). The health establishments within the Rebagliati Network adhered to standardised care protocols for patients with CKD, which include specific laboratory procedures. Serum creatinine, essential for estimating eGFR, was derived from blood samples and measured using an assay with calibration traceable to an isotope dilution mass spectrometry reference measurement procedure. For the computation of uACR, urine creatinine and albumin levels were ascertained through quantitative and automated laboratory tests using a random urine sample. Each hospital’s qualified personnel ensured the verification of preanalytical conditions. Urine samples were collected in 10–15 mL containers and transported at temperatures between 4°C and 8°C to the respective laboratory for daily processing. The entire analytical process adhered to good laboratory and analytical quality control practices. We sourced all data on these variables from UMERC, a computer application specifically designed for this task.
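For readers unfamiliar with the 2009 CKD Epidemiology Collaboration creatinine equation, a minimal sketch follows. The function name is hypothetical, serum creatinine is assumed to be in mg/dL, and the race coefficient of the published 2009 equation is exposed as an optional argument defaulting to off; whether and how it was applied in this study is described in the online supplemental methods, not here.

```python
def ckd_epi_2009(scr_mg_dl: float, age: float, female: bool,
                 black: bool = False) -> float:
    """eGFR (mL/min/1.73 m²) from the 2009 CKD-EPI creatinine equation."""
    kappa = 0.7 if female else 0.9          # sex-specific creatinine threshold
    alpha = -0.329 if female else -0.411    # exponent applied below the threshold
    ratio = scr_mg_dl / kappa
    egfr = 141 * min(ratio, 1) ** alpha * max(ratio, 1) ** -1.209 * 0.993 ** age
    if female:
        egfr *= 1.018
    if black:  # assumption: optional, off by default in this sketch
        egfr *= 1.159
    return egfr


# Hypothetical example: 65-year-old woman with serum creatinine 1.4 mg/dL.
print(round(ckd_epi_2009(1.4, 65, female=True), 1))
```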

Outcome variable

The outcome variable in this study was kidney failure, defined as end-stage renal disease necessitating KRT. The initiation of haemodialysis or peritoneal dialysis, as indicated by a nephrologist based on clinical parameters of uraemia and an eGFR of less than 15 mL/min/1.73 m², constitutes KRT.

In estimating the observed risk of kidney failure, we accounted for the competing event of death without KRT. We sourced data on the date of KRT initiation from the dialysis database and corroborated this information with the digital clinical history. The date of death, up until 31 December 2019, was obtained from the National Registry of Identification and Civil Status of Peru.

Follow-up time

Patients were followed until the occurrence of kidney failure, death or the point of censorship, whichever came first. Observations were censored when a patient was lost to follow-up or at the conclusion of the study (31 December 2019). We selected this end date to exclude data from the pandemic period, during which the health system experienced a collapse, kidney care services were disrupted, and the reliability of the information was compromised.

Statistical analysis

Initial data analysis

We performed an initial data analysis to identify implausible extreme values, missing data and inconsistencies. Plausible extreme data were retained without any transformation in the main analysis. Numerical and categorical variables were described using median (IQRs) and absolute frequencies (percentages), respectively.

Estimate of observed risk

We estimated non-parametric cumulative incidence function (CIF) curves and their 95% CIs using the Aalen-Johansen estimator32 for kidney failure, treating death without kidney failure as a competing risk.
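To make the estimator concrete, the sketch below computes an Aalen-Johansen cumulative incidence for one cause in the presence of a competing event. It is a minimal illustration with a hypothetical function name and status coding (0 = censored, 1 = kidney failure, 2 = death without kidney failure); it omits the confidence intervals and is not the R code used in the study.

```python
import numpy as np


def aalen_johansen_cif(time, status, cause=1):
    """Cumulative incidence of `cause` at each distinct event time.

    status: 0 = censored, 1 = kidney failure, 2 = death without kidney failure.
    Returns (event_times, cif) as NumPy arrays.
    """
    time = np.asarray(time, dtype=float)
    status = np.asarray(status)
    event_times = np.unique(time[status != 0])

    event_free = 1.0   # probability of no event of any type before time t
    cif = 0.0
    cif_values = []
    for t in event_times:
        at_risk = np.sum(time >= t)
        d_cause = np.sum((time == t) & (status == cause))
        d_any = np.sum((time == t) & (status != 0))
        cif += event_free * d_cause / at_risk   # cause-specific increment
        event_free *= 1 - d_any / at_risk       # update all-cause survival
        cif_values.append(cif)
    return event_times, np.array(cif_values)


# Toy data: two kidney failures, one death, one censored observation.
times, cif = aalen_johansen_cif([1, 2, 3, 4], [1, 2, 1, 0])
print(times, cif)
```

Unlike one minus the Kaplan-Meier estimate with deaths treated as censored, each increment here is weighted by the probability of being free of both events, so patients who have died no longer contribute future kidney failure risk.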

Predictive performance of KFRE

We estimated the individual predicted risks of developing kidney failure using the four-variable KFRE for 2-year and 5-year horizons (prediction formulas in online supplemental table S1). We assessed the performance of the models based on discrimination and calibration measures27 28 according to TRIPOD guidelines. Additionally, we considered the risk of death without kidney failure and based our analysis workflow on two recently published methodological guides on external validation of prediction models in the presence of competing risks.33 34

Discrimination is a relative measure of how well the model distinguishes between patients with or without the condition of interest.34 35 To assess discrimination, we estimated the truncated concordance index (C-index) at 2 and 5 years for each model and the time-dependent area under the receiver operating characteristic (ROC) curve based on cumulative sensitivity and dynamic specificity (C/D time-dependent area under the curve (AUC-td)).36 A C-index or C/D AUC-td of 1 indicates perfect discrimination, 0.5 indicates no discrimination, and values ≥0.8 are generally considered appropriate for prognostic models.35 We accounted for the competing risk of death without kidney failure by censoring patients who died at infinity, so that they remain in the risk set as patients who will not go on to develop kidney failure.33 34
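The idea of keeping patients who died in the risk set can be sketched with a simplified truncated C-index. This is an illustrative, unweighted version with a hypothetical function name: the published methods additionally use inverse probability of censoring weights, which are left out here, so the sketch should not be read as the estimator used in the study.

```python
import math


def truncated_c_index(time, status, risk, horizon):
    """Unweighted truncated C-index with a competing event.

    status: 0 = censored, 1 = kidney failure, 2 = death without kidney failure.
    A pair (i, j) is comparable when i has kidney failure by `horizon` and j is
    either still under observation after i's event time or has already died
    (deaths are treated as if censored at infinity).
    """
    concordant, comparable = 0.0, 0
    n = len(time)
    for i in range(n):
        if status[i] != 1 or time[i] > horizon:
            continue
        for j in range(n):
            if i == j:
                continue
            still_observed = time[j] > time[i]
            died_earlier = status[j] == 2 and time[j] <= time[i]
            if not (still_observed or died_earlier):
                continue
            comparable += 1
            if risk[i] > risk[j]:
                concordant += 1.0
            elif risk[i] == risk[j]:
                concordant += 0.5   # tied predictions count as half
    return concordant / comparable if comparable else math.nan


# Toy data in which higher predicted risk always matches earlier kidney failure.
print(truncated_c_index([1, 2, 3, 4], [1, 2, 1, 0], [0.9, 0.1, 0.8, 0.2], horizon=5))
```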

Calibration is a measure that indicates how well the absolute predicted risks agree with the observed risks. These observed risks were estimated using the CIF to take into account the competing risk of death without kidney failure.33 34 We assessed calibration-in-the-large using the observed to expected (O/E) ratio, a measure of mean calibration, and the calibration intercept, a measure of weak calibration. We also assessed weak calibration through the calibration slope. Moderate calibration was assessed by inspecting calibration plots.

We estimated the O/E ratio. An O/E of 1 indicates perfect global calibration, an O/E >1 indicates underestimation of the average risk and an O/E <1 indicates overestimation of the average risk. The calibration intercept, which we also estimated, is another measure of average overestimation or underestimation. An intercept of 0 indicates perfect agreement between the average predicted and observed risks. An intercept <0 indicates overestimation, and an intercept >0 indicates underestimation, of the average risk. We also estimated the calibration slope. A slope of 1 reflects ideal agreement. A slope less than 1 indicates that the predicted risks are too extreme (too high for high-risk patients and too low for low-risk patients), while a slope greater than 1 indicates that the predictions do not show enough variation. To formally test for statistical evidence of miscalibration, we performed a Wald test of the joint contribution of the intercept and slope, as previously described for the calibration of prediction models.33 34
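The O/E ratio is simple arithmetic, which the following sketch illustrates using the rounded 2-year figures reported in the results for CKD stages 3a–4 (observed risk 1.52%, mean predicted risk 0.96%). The function names are hypothetical, and the small gap between the value computed here and the published 1.57 presumably reflects rounding of the published inputs.

```python
def oe_ratio(observed_risk: float, predicted_risks) -> float:
    """Observed cumulative incidence divided by the mean predicted risk."""
    expected = sum(predicted_risks) / len(predicted_risks)
    return observed_risk / expected


def interpret_oe(oe: float) -> str:
    """Direction of mean miscalibration implied by an O/E point estimate."""
    if oe > 1:
        return "underestimation of average risk"
    if oe < 1:
        return "overestimation of average risk"
    return "perfect mean calibration"


# Rounded published values: observed 1.52%, mean predicted 0.96%.
oe = oe_ratio(0.0152, [0.0096])
print(round(oe, 2), interpret_oe(oe))   # 1.58 underestimation of average risk
```

In practice the observed risk comes from the cumulative incidence function at the prediction horizon, and the interpretation should take the confidence interval into account rather than the point estimate alone.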

Calibration plots allow calibration to be assessed in detail by comparing observed individual risks with those predicted. A curve exactly following the 45° line would indicate strong calibration, an ideal that is rarely attainable in practice. A more realistic goal is to assess whether the curve is close to the diagonal, indicating moderate calibration. We plotted calibration curves estimated by smoothed local linear regression (loess) based on pseudo values obtained from cumulative incidence estimates that account for the competing risk of death.33 34 37

Sensitivity analysis

We performed two sensitivity analyses:

  1. The same analysis approach after eliminating the extreme values of ACR by winsorising at the 1st and 99th percentiles of the distribution of this variable.

  2. A predictive performance analysis ignoring competing risk. This analysis relied extensively on the methodology described by McLernon et al.37
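The winsorisation step in the first sensitivity analysis can be sketched as follows. This is a minimal NumPy illustration with a hypothetical function name; the percentile interpolation rule is NumPy's default and may differ slightly from the one used in the original R analysis.

```python
import numpy as np


def winsorise(values, lower_pct=1, upper_pct=99):
    """Clip values below/above the given percentiles to those percentiles."""
    values = np.asarray(values, dtype=float)
    low, high = np.percentile(values, [lower_pct, upper_pct])
    return np.clip(values, low, high)


# Toy ACR-like data (mg/g) with one implausibly large value.
acr = np.array(list(range(1, 101)) + [100_000], dtype=float)
acr_w = winsorise(acr)
print(acr.max(), acr_w.max())
```

Predicted risks would then be recomputed from the winsorised ACR before repeating the performance assessment.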

General approach

The data preparation and all the analyses were carried out with the statistical programme R V.4.2.1 for Windows 11 (64-bit). Except for the C-index, all the 95% CIs were Wald-type.33 34 The 95% CI for the C-index was obtained with the percentile bootstrap method using 1000 bootstrap samples.33 34
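The percentile bootstrap can be illustrated with a generic sketch. The function name, the fixed seed and the use of the sample mean as the statistic are assumptions made for illustration; for the C-index, whole patient records (time, status and predicted risk) would be resampled together and the C-index recomputed in each replicate.

```python
import numpy as np


def percentile_bootstrap_ci(data, statistic, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for `statistic` evaluated on rows of `data`."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = len(data)
    replicates = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)        # resample rows with replacement
        replicates[b] = statistic(data[idx])
    lower, upper = np.percentile(replicates, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lower), float(upper)


# Toy example: 95% CI for the mean of 100 values.
values = np.arange(100.0)
print(percentile_bootstrap_ci(values, np.mean))
```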

Results

Study population

Among the 22 744 patients with CKD screened between 1 January 2013 and 31 December 2017, at 17 hospitals within the Rebagliati Healthcare Network in Lima, only 13 890 had complete ACR data, of whom 7519 patients were eligible due to a diagnosis of CKD 3a–4, while 2798 were eligible for the CKD 3b–4 subgroup (figure 1). All eligible patients had complete data on outcome, age and sex. The number of outcome events was over 100 in all populations, except for kidney failure at 2 years in patients with CKD stages 3b–4 (n=88), and thus estimates in this group should be interpreted with caution.

Figure 1

Study flowchart. CKD, chronic kidney disease; VISARE, Kidney Health Surveillance Subsystem.

Within the CKD 3a–4 subgroup, 114 patients developed kidney failure within 2 years, while 239 developed kidney failure within 5 years. Moreover, 563 patients died without experiencing kidney failure within 2 years, and 1400 patients died without experiencing kidney failure within 5 years. Within the CKD 3b–4 group, 88 patients developed kidney failure within 2 years, and 182 developed kidney failure within 5 years, while 300 patients died without experiencing kidney failure within 2 years, and 683 patients died without experiencing kidney failure within 5 years. The median observation time was 4.9 years, and the maximum follow-up was 7.8 years in the CKD 3a–4 group.

Table 1 summarises the baseline characteristics of the study population, and online supplemental results provide a breakdown of the study population’s characteristics according to kidney failure at 2 and 5 years for CKD 3a–4 (see online supplemental table S3) and CKD 3b–4 (see online supplemental table S4), respectively. The numbers of cases of kidney failure at 2 years were low for the subpopulations with stages 3a (n=26), 3b (n=36) and 4 (n=52) (see online supplemental table S5). Similarly, the 5-year case numbers were low for subpopulations with stages 3a (n=57), 3b (n=81) and 4 (n=101) (online supplemental table S5). Therefore, evaluating predictive performance in these specific subgroups was unreliable. The distribution of patients in stages 3a–4 and 3b–4 who entered the analysis in each of the 17 health facilities of the EsSalud Rebagliati Network is shown in online supplemental tables S6 and S7.

Table 1

Baseline characteristics of the study population according to CKD stages

Observed and predicted risk of kidney failure

Figure 2 displays the observed risk of kidney failure and death without kidney failure for both study populations. The 2-year and 5-year observed risks of kidney failure in patients with CKD stages 3a–4 were 1.52% and 3.37%, respectively (online supplemental table S8). In patients with CKD 3b–4, the 2-year and 5-year observed risks of kidney failure were 3.15% and 6.87%, respectively (online supplemental table S9). The distribution of the 2-year and 5-year predicted risk by KFRE is shown in online supplemental figure S1.

Figure 2

Cumulative incidence function curves for kidney failure (sky-blue line) and death before kidney failure (red line) in patients with (A) CKD stages 3a–3b–4 and (B) CKD stages 3b–4. CKD, chronic kidney disease.

KFRE predictive performance

The KFRE demonstrated good discriminatory ability across all time horizons and study populations, as evidenced by C-index and C/D AUC-td values exceeding 0.8 (table 2). In contrast, miscalibration tests showed that the data had low compatibility with good calibration of KFRE at all time horizons and in all groups (all p values ≤0.001), providing evidence of miscalibration of the model (table 2).

Table 2

Performance measures of KFRE in the external dataset of patients with CKD stages 3a–3b–4 and 3b–4

Regarding calibration in the large, for patients with CKD stages 3a–4, the 2-year average observed risk of kidney failure was 1.52%, while the average risk predicted by KFRE was lower at 0.96%, yielding an O/E ratio of 1.57 (95% CI 1.39 to 1.76). This indicates an overall underestimation of the actual 2-year risk of kidney failure by the model. In patients with CKD stages 3b–4, a similar pattern of underestimation of the actual 2-year risk of kidney failure was observed (O/E ratio: 1.33; 95% CI 1.13 to 1.54). In contrast, the imprecision of the calibration intercepts made these estimates less useful for evaluating the calibration in-the-large of KFRE at 2 years for both populations.

For the 5-year KFRE model, evidence of poor calibration in the large was also observed, although in the opposite direction, suggesting an overprediction. In this case, the O/E ratio was less useful for evaluating the long-term calibration of KFRE due to the high imprecision of its estimates. However, the calibration intercepts revealed that, on average, KFRE overestimated the actual 5-year risk of kidney failure for both the CKD stages 3a–3b–4 population (calibration intercept: −0.26; 95% CI −0.45 to −0.07) and the CKD stages 3b–4 subgroup (calibration intercept: −0.29; 95% CI −0.48 to −0.1) (table 2).

KFRE also showed evidence of poor weak calibration. At 2 years, there was statistical evidence of overly extreme predictions (ie, very high and low) for the population with CKD stages 3a–3b–4 (calibration slope: 0.79; 95% CI 0.61 to 0.96) and CKD stages 3b–4 (calibration slope: 0.82; 95% CI 0.6 to 1.03), although for the latter the 95% CI marginally included 1. In the case of the 5-year KFRE, overly extreme predictions were also found in both groups, with calibration slopes of 0.75 (95% CI 0.65 to 0.86) for the population with CKD stages 3a–3b–4 and 0.79 (95% CI 0.65 to 0.92) for the population with CKD stages 3b–4 (table 2).

Regarding evidence of poor moderate calibration, the calibration curves (figure 3) revealed that the underestimation of the actual risk of kidney failure at 2 years was mainly concentrated in patients with a predicted risk of less than 0.3–0.4 among patients with CKD stages 3a–4 (figure 3A). Conversely, the overestimation of the actual risk of kidney failure at 5 years mainly occurred in individuals with a KFRE-predicted risk greater than 0.2 among patients with CKD stages 3a–4 (figure 3B). A similar pattern and magnitude of underestimation of actual risk at 2 years and overestimation of actual risk at 5 years was observed in patients with CKD stages 3b–4 (figure 3C,D).

Figure 3

Calibration curves for each group and prediction horizon. The x-axis shows the risk predicted by the KFRE model, and the y-axis represents the observed risk estimated using the cumulative incidence function to consider the competing risk of death without kidney failure. CKD, chronic kidney disease; KFRE, kidney failure risk equation.

Sensitivity analysis: impact of outliers in ACR

The distribution of the four variables constituting the KFRE equation and the distribution of risks predicted by KFRE are depicted in online supplemental figures S2 and S1, respectively. We observed that age and eGFR did not have extreme values (online supplemental figure S2A,B). On the other hand, the ACR variable exhibited very extreme values (online supplemental figure S2C,D), prompting a sensitivity analysis to evaluate the robustness of our predictive performance assessment after mitigating the influence of these extreme values. Winsorisation of ACR’s extreme values was applied at its 1st and 99th percentiles, and risks predicted by KFRE were recalculated using the winsorised ACR variable. Despite notable changes in ACR’s distribution after winsorisation (online supplemental figure S3), the distribution of risks predicted by KFRE did not exhibit discernible alterations relative to the original data (online supplemental figure S4). Median and mean values of risks predicted by KFRE before and after winsorisation were strikingly similar, as were variability measures such as SD, IQR and range (online supplemental table S10). As anticipated, the predictive performance of KFRE on the winsorised data closely resembled that obtained on the original data (online supplemental table S11 and figure S5).

Sensitivity analysis: predictive performance assessment ignoring competing risk

We assessed the extent to which the predictive performance results differed when not accounting for competing risks. We found that the 2-year incidence of kidney failure in CKD stages 3a–4, when not considering competing risk, was 1.58%, only slightly higher than when accounting for competing risk (1.52%) (online supplemental figure S6). At 5 years, these differences became more pronounced but remained relatively small, with a 3.37% incidence when considering competing risk and 4.24% when not considering competing risk. In the CKD stages 3b–4 population, the 5-year difference was only about 2 percentage points (6.89% when considering competing risk vs 8.99% when not considering competing risk) (online supplemental figure S6). Consequently, the assessment of KFRE’s predictive performance without considering competing risks also found that KFRE was miscalibrated (online supplemental table S12 and figure S6); notably, however, the magnitude of miscalibration was substantially smaller when ignoring competing risks than when competing risks were considered (online supplemental figure S7).

Discussion

Principal findings

We conducted an independent external validation of the four-variable KFRE for kidney failure prognosis at 2 and 5 years in patients with CKD at stages 3a–4 and 3b–4 from Peru. Despite showing good discrimination to predict kidney failure, KFRE exhibited poor calibration.

Poor calibration in the large resulted in the model underestimating the average actual risk of developing kidney failure in the short term (2 years) and overestimating the average actual risk of kidney failure in the long term (5 years) in patients with CKD stages 3a–4. This pattern of poor calibration was also observed in the subgroup of patients with CKD stages 3b–4. KFRE also had poor weak calibration manifested as very extreme predictions, while poor moderate calibration was evident in the underestimation of actual short-term risk in patients with KFRE-predicted risks below 0.3–0.4 and overestimation of individual long-term risks, primarily in individuals with a KFRE-predicted risk greater than 0.2.

Comparison with previous literature

The KFRE has been externally validated in several independent studies worldwide.16 23 30 38–49 However, the majority of these validations have been conducted in North American23 30 39 41 43–48 or European countries,16 30 40 49 with more recent validations taking place in Asian countries.38 42 On conducting a systematic literature search, we were unable to identify any published external validation studies of KFRE specifically within the Latin American population at primary care level. A single study did incorporate a Latin American cohort, comprising Chilean and Brazilian patients; however, extrapolating the study data to the current CKD patient population presents a challenge, as the cohort’s patient recruitment took place between 1996 and 1998.30 In the past two decades, significant advancements in CKD management, including diagnostic methods, treatment strategies and patient care, have contributed to improved patient outcomes. As a result, the external validation results from the Chilean and Brazilian patient cohort30 may no longer accurately represent current CKD populations in these countries. This emphasises a notable gap in the literature and underscores the need for updated KFRE validation studies in Latin America, ensuring its applicability and accuracy across diverse regional populations.

In line with prior research, the non-North American versions of the KFRE models at 2-year and 5-year horizons exhibited good discrimination in our study, as they have across diverse population groups from several countries.16 30 38 40 42 49 The initial external validation of the non-North American KFRE, a meta-analysis comprising 13 cohorts including Chilean and Brazilian patients,30 reported pooled C-statistics of 0.9 and 0.88 for predicting 2-year and 5-year kidney failure, respectively.30 A more recent study observed lower discrimination values, with C-indexes ranging from 0.76 to 0.84 for 2-year KFRE and 0.75 to 0.81 for 5-year KFRE in cohorts from Germany, Italy, the Netherlands, Poland, Switzerland and the United Kingdom, none of which included Latin American countries.16 Despite the lower values in the recent study, the discrimination remained good. These findings are consistent with other studies reporting C-index values greater than 0.8 for the non-North American KFRE at 2 and 5 years.38 40 42 49

In contrast to discrimination, our study’s calibration assessment results differ from the initial study that recalibrated and validated KFRE for non-North American populations.30 Although the few studies that assess calibration of the non-North American version of KFRE have reported poor calibration,33 40 42 49 most primarily focus on moderate calibration using calibration curves, with limited attention given to calibration in the large or weak calibration. Our findings for moderate calibration are in line with previous studies that identified overprediction of kidney failure risk at 5 years,33 40 42 49 particularly in groups with high predicted risk (>0.3 to 0.4).42 49

On the other hand, the 2-year results display greater heterogeneity among existing studies: some cohorts show good calibration,33 42 others overpredict risk in high- and low-risk groups.40 In our study, we found that KFRE exhibits an opposite pattern of underprediction at 2 years. Regarding calibration in the large, only Ramspek et al16 assessed this aspect, finding an overprediction of the average 5-year risk by >10% while observing good calibration in the large at 2 years. This contrasts with our study, which identified underestimation of the actual average risk by KFRE at 2 years in patients with CKD 3a–4 and CKD 3b–4.

Differences in the case mix between our study and the initial validation study by Tangri et al30 could explain the observed discrepancies in the performance of KFRE. Case mix, which accounts for variations in patient characteristics and comorbidities across different cohorts, is often influenced by factors such as prevalence of certain conditions, healthcare systems, and demographic profiles.50 For instance, our study reported a prevalence of 24.5% for diabetes and 59.7% for hypertension, whereas Tangri et al30 showed a higher prevalence of 33% for diabetes and 74% for hypertension, underscoring the differences in case mix. In our study, more than half of patients were classified as having a moderate stage of severity of CKD (stage G3a), and we did not include any patients with advanced CKD (stage G5) (table 1). In comparison, Tangri et al30 included stage 5 CKD patients, although it is difficult to assess the impact of these differences as Tangri et al did not report information on the stage distribution of their study population. We also did not observe significant differences in eGFR and albuminuria distributions, recognised markers for kidney failure prognosis47 in patients with CKD (see online supplemental table S13). The initial study reported mean (SD) values of 47 mL/min/1.73 m² (12 mL/min/1.73 m²) for eGFR and a 34% prevalence of albuminuria for their non-North American population, while our study reported similar values of 46.2 mL/min/1.73 m² (9.8 mL/min/1.73 m²) and 36.5%, respectively (online supplemental table S13). The actual event risk may also explain the model’s miscalibration. Kidney failure incidence rates suggest that the risk of kidney failure was lower in the patients with CKD stages 3a–4 of our study (7.4 per 1000 person-years compared with 9.2 per 1000 person-years).

Another partial explanation for the pattern of long-term overprediction we found for the KFRE model, especially in advanced CKD populations,43–45 is that the model does not account for death without kidney failure as a competing risk.33 34 This competing risk is crucial for patients with advanced CKD, especially in frail or older populations requiring long-term predictions with more frequent death events.51 Most existing models censor patients who die, leading to overestimation of the actual risk.52 Ramspek et al16 and Ravani et al53 found that KFRE overestimated the actual average risk of terminal CKD by 10–18% and 1–27% at 5 years, respectively, with overestimation increasing over time among high-risk individuals, attributable to the competing event. For this reason, we considered the competing risk of death without kidney failure in our study. The initial study validating KFRE for non-North American populations also evaluated the impact of competing risk but found no significant differences.

In our study involving patients with CKD stages 3a–4, 7.5% died without kidney failure at 2 years of follow-up, and 20.5% at 5 years (online supplemental table S8). Conversely, the cumulative incidence of kidney failure was low, with 1.5% at 2 years and 3.4% at 5 years. This demonstrates the relatively minor impact of competing risk at 2 years, which becomes substantially more significant at 5 years. Even without considering competing risk, the Cox analysis revealed miscalibration, displaying the same patterns as the competing risk analysis, although the degree of miscalibration would have been less pronounced.

Although the Cox and competing risk analyses differ at the 2-year horizon, these disparities are minimal in our study. In contrast, at 5 years, marked differences emerge, with the Cox analysis evidently biasing the performance evaluation. Therefore, we chose to report the competing risk analysis as our primary method and the Cox analysis as secondary. This approach better reflects the increasing impact of competing risk as the competing event (death without kidney failure) becomes more frequent, as observed in the 5-year assessment.
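The mechanism behind this divergence can be illustrated with a minimal sketch on simulated data. It compares a naive estimate that censors deaths (one minus Kaplan-Meier) with the Aalen-Johansen cumulative incidence, which treats death without kidney failure as a competing event. The constant hazards (0.01 and 0.05 per year), the cohort size, and all function names are hypothetical choices made for illustration; they are not fitted to the study cohort, and this is not the study's analysis code.

```python
import random

def simulate_cohort(n, hazard_kf, hazard_death, horizon, seed=0):
    """Simulate (time, status) pairs under constant, independent hazards.

    status: 0 = event-free at the horizon (administratively censored),
            1 = kidney failure, 2 = death without kidney failure.
    All parameter values passed in are assumptions, not study estimates.
    """
    rng = random.Random(seed)
    cohort = []
    for _ in range(n):
        t_kf = rng.expovariate(hazard_kf)
        t_death = rng.expovariate(hazard_death)
        if min(t_kf, t_death) >= horizon:
            cohort.append((horizon, 0))
        elif t_kf < t_death:
            cohort.append((t_kf, 1))
        else:
            cohort.append((t_death, 2))
    return cohort

def one_minus_km(cohort, horizon):
    """Naive risk of kidney failure: 1 - Kaplan-Meier, deaths treated as censored."""
    at_risk = len(cohort)
    surv = 1.0
    for t, status in sorted(cohort):
        if t > horizon:
            break
        if status == 1:
            surv *= 1.0 - 1.0 / at_risk
        at_risk -= 1  # every processed subject leaves the risk set
    return 1.0 - surv

def aalen_johansen(cohort, horizon):
    """Cumulative incidence of kidney failure with death as a competing event."""
    at_risk = len(cohort)
    event_free = 1.0  # probability of being free of BOTH events just before t
    cif = 0.0
    for t, status in sorted(cohort):
        if t > horizon:
            break
        if status == 1:
            cif += event_free / at_risk
        if status in (1, 2):
            event_free *= 1.0 - 1.0 / at_risk
        at_risk -= 1
    return cif

HORIZON = 5.0
cohort = simulate_cohort(5000, hazard_kf=0.01, hazard_death=0.05,
                         horizon=HORIZON, seed=1)
km_risk = one_minus_km(cohort, HORIZON)
aj_risk = aalen_johansen(cohort, HORIZON)
print(f"1 - KM (deaths censored): {km_risk:.4f}")
print(f"Aalen-Johansen CIF:       {aj_risk:.4f}")
```

On the same data, the naive estimate can never fall below the Aalen-Johansen estimate, because censoring deaths implicitly assumes that patients who died would otherwise have remained at risk of kidney failure. The gap widens as deaths accumulate, which is consistent with the larger differences observed at the 5-year horizon.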

Strengths and limitations of this study

This study has several strengths. As far as we know, it is the second investigation of the KFRE in Latin America, expanding on the limited research in this region. The only previous study in the region was a meta-analysis that validated the original equation for non-North American populations, which included cohorts from Chile and Brazil.30 However, these cohorts are no longer current, as they date back to 1996–1998.30 By providing the first external validation of KFRE in Peru, this study fills a critical gap in the literature and offers valuable insights into the applicability of KFRE in a contemporary Latin American context. Because the study draws on a retrospective cohort of over 7000 patients across 17 primary and secondary care EsSalud health establishments in Lima, the capital city accounting for a third of Peru’s population, the findings have a certain degree of generalisability. The results are particularly relevant for EsSalud-insured patients in Lima, who make up a significant proportion of CKD patients in the country.

Employing robust statistical methods and sound analytical techniques, the study appropriately assesses the performance of KFRE in predicting kidney failure while considering the competing risk of death without kidney failure. This approach helps avoid overestimation of the observed risk and reduces bias in performance assessment, as opposed to solely relying on Cox methods, which were originally considered in the study protocol.33 34 51 54 The decision to use this competing risk approach as a primary analysis, with Cox methods employed in a sensitivity analysis, was informed by contemporary evidence demonstrating that accounting for competing risks offers a less biased and more accurate estimation of the actual kidney failure risks.33 53 As such, this study adheres to best practices in the field and contributes essential knowledge regarding KFRE’s performance in Peru and Latin America more broadly.

Although our study presents valuable findings, it is important to acknowledge several limitations. First, we used secondary data routinely recorded by multiple evaluating clinicians across 17 healthcare centres in Lima. On the one hand, using routine clinical data has the advantage of potentially reflecting the model’s performance more accurately in real-world clinical practice. On the other hand, despite standardisation of laboratory measurements as part of the National Kidney Program in Peru, clinical registries are inherently susceptible to errors in data recording, which introduces the potential for measurement error.

Another limitation stems from our use of renal replacement therapy (RRT) initiation as an indicator of kidney failure. While this operational definition aligns with the criteria employed by the studies that developed and validated the original KFRE,23 30 and mirrors the predominant approach in external validations of KFRE across different settings,16 39 41 43 45 46 48 49 it carries the risk of misclassifying patients who have chosen conservative treatment. However, we expect the impact of this misclassification to be relatively low in our setting, primarily because conservative management is used infrequently in Peru, particularly among patients covered by EsSalud, our national social security system. EsSalud beneficiaries in Lima have complete financial coverage for RRT, which eliminates financial barriers to access and reduces the risk of patients not receiving RRT when needed. Nonetheless, future prospective studies should explore alternative operational definitions of kidney failure, which could encompass not only the initiation of RRT but also stage 5 CKD without RRT initiation. Adopting such an approach would, however, require extensive updating of the original KFRE equations, a task that extends beyond the scope of our current study.

Lastly, although the similarities among EsSalud service networks in Lima may support the notion that our findings could be generalisable to other networks in the city, it is crucial to recognise that Lima is not representative of the entirety of Peru, and EsSalud is not the sole healthcare system in the country.55 56 Significant disparities exist in healthcare services provided outside Lima and between different healthcare systems, such as Comprehensive Health Insurance and the Health of the Armed and Police Forces. These variations may influence KFRE’s predictive performance, necessitating specific external validation studies in these populations due to the distinct differences among Peruvian health subsystems.

Implications for clinical practice

The observed differences in KFRE model performance among various cohorts highlight the necessity of broadening external validation across diverse populations and settings.25 Although our study demonstrates KFRE’s capacity to effectively discriminate between patients who will and will not develop kidney failure at 2 and 5 years, it also reveals the model’s shortcomings in accurately predicting individual risks, underestimating them in the short term and overestimating them in the long term. Sole reliance on KFRE’s discrimination ability may have adverse implications for patients. Given our findings that long-term overestimation of kidney failure risk occurs in patients with low predicted risk and short-term underestimation occurs in those with high predicted risk, this pattern of poor moderate calibration could result in detrimental patient outcomes.

For example, overestimation of risk (overprediction) in patients with a lower true long-term progression risk may lead to unnecessary referrals for dialysis preparation, provoking unwarranted anxiety for patients and potentially increasing the risk of death from causes other than kidney failure, such as cardiovascular events that might have been prevented had the patient remained at the primary care level. In contrast, underestimation of risk (underprediction) in patients with a higher true short-term progression risk might cause avoidable delays in their referral and preparation for dialysis.

Future research

Future research should focus on ensuring model recalibration for clinically relevant populations in Peru, considering our findings that reveal KFRE miscalibration for EsSalud patients in Lima. It is imperative that external validation and recalibration assessments expand to a national level, encompassing the broader EsSalud population. This evidence could expedite the incorporation of KFRE into EsSalud’s clinical practice guidelines.

In line with the recommendations previously put forward by Ramspek et al,33 it is essential to examine the influence of case mix heterogeneity on model performance, for example, through regional comparisons within Peru. This will help in understanding KFRE’s limitations and proposing specific recalibrations to enhance performance at the local level. When conducting external validation and recalibration for these new populations, researchers should account for competing risks. Further studies should assess KFRE’s clinical utility and determine optimal risk thresholds, if necessary, for informed decision-making while taking clinical relevance into account. Moreover, research should evaluate the impact of KFRE use on significant patient and healthcare system outcomes, including dialysis complications, mortality, and emergency admissions, among others. Notably, these investigations should be conducted within the context of clinical trials.

This post was originally published on https://bmjopen.bmj.com