
Research Methods & Reporting

Bias by censoring for competing events in survival analysis

BMJ 2022; 378 doi: https://doi.org/10.1136/bmj-2022-071349 (Published 13 September 2022) Cite this as: BMJ 2022;378:e071349
  1. Maarten Coemans, biostatistician1 2,
  2. Geert Verbeke, professor of biostatistics3,
  3. Bernd Döhler, statistician4,
  4. Caner Süsal, professor of immunology4 5,
  5. Maarten Naesens, professor of nephrology1 6
  1. Department of Microbiology, Immunology and Transplantation, KU Leuven, Leuven, Belgium
  2. Leuven Biostatistics and Statistical Bioinformatics Centre (L-Biostat), KU Leuven, Leuven, Belgium
  3. Interuniversity Institute for Biostatistics and Statistical Bioinformatics (I-Biostat), Universiteit Hasselt and KU Leuven, Hasselt and Leuven, Belgium
  4. Institute of Immunology, University of Heidelberg, Heidelberg, Germany
  5. Transplant Immunology Research Centre of Excellence, Koç University, Istanbul, Turkey
  6. Department of Nephrology and Renal Transplantation, University Hospitals Leuven, Leuven, Belgium
  Correspondence to: M Naesens maarten.naesens{at}kuleuven.be (or @mnaesens on Twitter)
  • Accepted 28 July 2022

In survival analysis, competing events preclude the occurrence of the event of interest. The censoring of competing events is common in medical studies but leads to biased cumulative incidence estimators. Competing risks methods, such as the non-parametric Aalen-Johansen method or the semi-parametric Fine and Gray model, alleviate this bias and should be preferred above the Kaplan-Meier method and the Cox model, respectively. As an illustrative example, in a large European cohort, we report on the differences in the cumulative incidence estimates of graft failure after kidney transplantation, caused by censoring for recipient death.

In time-to-event or survival modelling, an important aim is to estimate the cumulative incidence (ie, the absolute risk by time t) of an event of interest. In medical applications, such events include death due to cancer, infection or cardiovascular causes, time to remission after treatment, and time to graft failure after transplantation. The occurrence of these events is often precluded by another earlier event (a competing risk) that, when censored for, leads to biased cumulative incidence estimators. Examples of such competing events are death due to other causes when the interest is in death due to cancer, death before remission when analysing time to oncological remission, and death with a functioning graft when interested in graft failure after transplantation. Non-fatal competing events also exist—for instance, receiving a second transplant organ when analysing the time to graft failure, receiving a vaccine when analysing the time to infection, or receiving a competing diagnosis in studies analysing the time to which diagnosis (out of several) occurs first.1 Multiple competing risks (eg, different causes of death, different diagnoses) can be present.

Censoring competing events is common in medical studies across all specialties.2 When estimating cumulative incidences, however, this censoring is only justified if interest is in a hypothetical population in which the competing event does not occur, and if the event of interest and the competing events are independent.34 In all other circumstances, traditional estimators of the cumulative incidence (with censored competing risks) tend to be biased upward, sometimes even leading to impossible probabilities. An example of this has been discussed by Wolkewitz and colleagues,5 who obtained a combined probability of 114% at day 40 after admission for the competing events nosocomial pneumonia and hospital discharge without infection.

Such a probability of more than 100% is clearly impossible and illustrates that spectacular errors can be made when competing risks are not considered. However, not all errors are this obvious: the bias is often subtle and too easily dismissed by clinicians or researchers. The bias stems from the definition of censoring, which implies that a censored individual can still experience the event of interest after the censoring time point. Because the event of interest cannot happen after a competing event has occurred, adapted statistical techniques are required, especially when aiming at prediction. Indeed, studies that focus on predicting the risk of an event need unbiased estimators of cumulative incidence. By contrast, when studies focus on cause (ie, assessing the relation between a risk factor and the hazard of an event of interest), hazard ratios from a more traditional analysis (eg, Cox model) remain valid.67
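How a summed probability can exceed 100% can be sketched with a small calculation. The following is only an illustration under strong assumptions that are not taken from the article: two independent competing events with constant (exponential) hazards, and hypothetical daily hazard values. Under these assumptions, 1 minus Kaplan-Meier with the competing event censored converges to 1 − exp(−h·t), whereas the cumulative incidence that accounts for the competing event has a simple closed form.

```python
import math

def naive_cumulative_incidence(hazard, t):
    """Limit of 1 minus Kaplan-Meier when the competing event is censored
    (assumes a constant hazard and independent competing events)."""
    return 1.0 - math.exp(-hazard * t)

def true_cumulative_incidence(hazard, competing_hazard, t):
    """Cumulative incidence that accounts for the competing event
    (closed form for two constant cause specific hazards)."""
    total = hazard + competing_hazard
    return hazard / total * (1.0 - math.exp(-total * t))

# Hypothetical daily hazards, chosen only for illustration
H_EVENT_A = 0.03
H_EVENT_B = 0.04
DAY = 40

naive_sum = (naive_cumulative_incidence(H_EVENT_A, DAY)
             + naive_cumulative_incidence(H_EVENT_B, DAY))
true_sum = (true_cumulative_incidence(H_EVENT_A, H_EVENT_B, DAY)
            + true_cumulative_incidence(H_EVENT_B, H_EVENT_A, DAY))

print(f"sum of censored (naive) estimates: {naive_sum:.3f}")
print(f"sum of competing risks estimates:  {true_sum:.3f}")
```

With these hypothetical values the naive sum exceeds 1, while the competing risks sum equals the probability of having had either event and therefore cannot exceed 1.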

Summary points

  • Censoring competing events is common in medical studies across all specialties

  • By censoring competing events, traditional survival techniques such as the Kaplan-Meier method and the Cox model provide upward biased estimators of the cumulative incidence of an event of interest

  • This bias increases with time and with higher incidences of the competing event

  • To obtain an unbiased estimator of the cumulative incidence of an event of interest, competing event censoring should be abandoned, in favour of competing risks methods

Example: censoring for recipient death after kidney transplantation

An example of the widespread use of competing risk censoring can be found in kidney transplantation research where the main outcome, graft failure after transplantation (ie, return to dialysis or repeat transplantation), is often censored for death with a functioning graft. In this field, censoring for death is so universal that it is even included in the endpoint definition of many clinical trials: death censored graft failure.

In this example, the competing events (graft failure and death with a functioning graft) are mutually exclusive. This scenario differs from other contexts in which, for example, death from any cause is the event studied and other non-fatal events might occur earlier, so that the events are not mutually exclusive. In such a setting of semi-competing risks, if the interest is in the fatal event (death), one should opt for a model in which the non-fatal event is treated as an effect modifier of the risk of death, rather than as a competing event.8

The incidence of death with a functioning graft is comparable to the incidence of graft failure, so censoring for recipient death leads to a non-negligible (upward) bias in the estimator of the cumulative incidence of graft failure. Based on a single centre study, for example, El Ters et al9 observed an over-estimation of 13 percentage points (34% minus 21%) in older recipients 20 years after transplantation. Given the increasing interest in implementing prognostic models in medicine,10 also specifically in kidney transplantation (eg, finding surrogate endpoints for clinical trials11 and for individual patient monitoring), the accuracy of such prognostications becomes important.12 We therefore exemplify the bias in a large European registry of kidney transplantations (see also box 1). More specifically, we show these biases after transplantation in the short term up to 10 years after transplantation, in both high and low risk groups for recipient death, and according to non-parametric (Kaplan-Meier versus Aalen-Johansen) and semi-parametric (Cox versus Fine and Gray) statistical methods. We provide clear guidance on how to overcome these biases.

Box 1

Example of death censored graft failure after kidney transplantation to illustrate the bias in a cumulative incidence estimator

  • Graft failure and recipient death with a functioning graft are two easily understood competing events

  • Typically, the problem of competing events is circumvented by censoring graft failure for recipient death

  • Because recipient death with a functioning graft is a frequent competing event, censoring of this event results in a non-negligible (upward) bias in the estimator of the cumulative incidence of graft failure

  • Donor age, an easily understood background risk factor, affects both the incidence of graft failure and recipient death, and thus can be used to create high and low risk groups that show different degrees of bias


Collaborative Transplant Study

To illustrate the bias in the estimator of the cumulative incidence of graft failure due to censoring for recipient death, we use single kidney transplantations in adults in Europe performed between 2000 and 2019 (n=167 190), recorded in the Collaborative Transplant Study.13 This study, to which many transplant centres have contributed voluntarily since 1982, receives active support of more than 400 transplant centres across 42 countries worldwide and is the largest kidney transplantation database in Europe, adhering to a rigorous methodology.13 More details on the data collection and demographics are available in the supplementary methods. For all analyses, we used SAS 9.4 (SAS Institute, Cary, NC). Details on evaluating assumptions, baseline risk estimation, and the cause specific hazards approach are available in the supplementary methods, while box 2 provides a glossary of the terms used in this work.

Box 2

Glossary

Event of interest

  • The outcome central to the study. Survival analysis evaluates the time to this event. In this article’s example, the event of interest is graft failure after kidney transplantation.

Competing event

  • An event that precludes the event of interest from occurring. Multiple competing events can be present, but the most obvious competing event is death. In this article’s example, the competing event is recipient death with a functioning graft.

Cumulative incidence of an event

  • The probability of that event occurring by time t. It is the complement (1 minus) of the survival function and equals the absolute risk of the event occurring by time t. Its formula applied to the example of graft failure is the following:

  • $F_{gf}(t)=\int_0^t S(s)\,h_{gf}(s)\,ds$

  • (where $F_{gf}(t)$ is the cumulative incidence of graft failure by time $t$, $S(s)$ is the overall survival function at time $s$, and $h_{gf}(s)$ is the cause specific hazard of graft failure (ie, death censored) at time $s$).14

  • The overall survival function $S(s)$ is the survival function of the composite event of graft failure and death with a functioning graft and lies therefore always lower than $S_{gf}(s)$, the death censored survival curve of graft failure. Hence, replacing $S(s)$ by $S_{gf}(s)$, thus introducing death censoring, leads to an over-estimation of the cumulative incidence of graft failure.

Biased estimator

  • When the estimators’ expected value is systematically deviating from the quantity that is to be estimated. In our illustrative case, differences between estimated cumulative incidences indicate the (upward) bias underlying the competing risk censored estimators.

(Cause specific) hazard rate

  • The instantaneous rate (probability per unit time)7 of the event of interest occurring, given that neither the event of interest nor any competing event has occurred before. The hazard rate at time t is the ratio of the probability of the event occurring in an infinitesimally small interval after t, to the probability of being free of any event at t, divided by the length of the interval. The risk set is all individuals who have not experienced any event before (neither the event of interest nor any competing events).

Subdistribution hazard rate

  • The instantaneous rate (probability per unit time)7 of the event of interest occurring, given that the event of interest has not occurred before. The subdistribution hazard rate at time t is the ratio of the probability of the event of interest occurring in an infinitesimally small interval after t, to the probability of being free of the event of interest at t, divided by the length of the interval. Compared with the (cause specific) hazard rate, the risk set is different; individuals who previously experienced a competing event are considered to remain indefinitely at risk for the event of interest thereafter.

Hazard ratio

  • The multiplicative change in the hazard rate of an event with increasing covariate values. The hazard ratio is not equal to the relative risk.15

Subdistribution hazard ratio

  • The multiplicative change in the subdistribution hazard rate of an event with increasing covariate values. The subdistribution hazard ratio is not equal to the relative risk.7

Non-parametric survival methods

  • These methods (eg, the Kaplan-Meier and the Aalen-Johansen estimator) make no assumptions about the underlying distribution of survival times, leading to estimated cumulative incidences as step functions. These methods (with, for example, the log rank or Gray’s test) allow testing for differences in survival or in cumulative incidence between two or more groups. Since no assumptions on the relative position of the cumulative incidences (between groups) are made, all non-parametric curves are estimated independently, making these methods ideal to visualise the raw data.

Semi-parametric survival models

  • These models (eg, Cox16 and Fine and Gray17) make no assumptions about the underlying distribution of survival times but do allow for a relative effect parameter of multiple continuous or categorical covariates on the (subdistribution) hazard rate. Supplemented with a non-parametric estimator of the baseline cumulative incidence (eg, Breslow), semi-parametric survival models estimate conditional cumulative incidences as step functions.3

  • Since the Cox model is a proportional hazards model and the Fine and Gray model is a proportional subdistribution hazards model, they restrict the hazard rates and subdistribution hazard rates, respectively, to be proportional. When the proportionality assumption is not met, the predicted cumulative incidence curves resulting from these models could deviate considerably from the raw data pattern, which is not the case with the non-parametric methods.

Parametric survival models

  • Models that assume a specific distribution (eg, exponential or Weibull) for the underlying survival times, thereby providing smooth cumulative incidence curves that allow for out-of-sample predictions. Extra flexibility in the underlying survival time distribution can be provided by flexible parametric survival models.18


Absolute and relative bias of the Kaplan-Meier versus the Aalen-Johansen estimator

The Kaplan-Meier method estimates the event-free survival or, when complemented (1 minus), the cumulative incidence of an event. Typically, in Kaplan-Meier analyses, competing risks are censored, leading to the analysis of a hypothetical population in which these competing events do not exist (and thus with the assumption that all patients eventually will experience the event of interest). Such a hypothetical population is rarely useful in medical science and is also only interpretable when the event of interest and the competing events occur independently of each other.4 In case of independence, the Kaplan-Meier estimator of the cumulative incidence refers to a world where all individuals live long enough to experience the event. Such a hypothetical world could be relevant for engineers who develop mechanical heart valves and who do not want the durability assessment to be affected by the death of a patient.4 However, doctors take the risk of death into account when prescribing potentially harmful treatments for clinical events, and thus operate in a world where competing events can occur. When not accounting for competing events, the summation of the Kaplan-Meier cumulative incidences could, for example, lead to impossible probabilities that exceed 100%.15

The Aalen-Johansen method similarly estimates the cumulative incidence of the event of interest, but deliberately accounts for competing events by acknowledging that after a competing event occurrence the event of interest can no longer happen (which is the opposite of the censoring mechanism).19 As such, these cumulative incidence estimates pertain to the real world situation in which people can experience different types of events. In the presence of censored competing events, the Kaplan-Meier estimator of the cumulative incidence is biased upward since censoring assumes that the event of interest will still occur after the censoring time point, leading to a positive event probability (according to a redistribution algorithm)20 that should be zero after the competing event. The Aalen-Johansen estimator alleviates this over-estimation bias of the Kaplan-Meier method by removing patients who experience a competing event from the risk set, while imposing a zero probability for the event of interest.
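The difference between the two estimators can be sketched in a few lines of code. The following is a minimal, unoptimised illustration on hypothetical toy data (it is not the SAS code used for the analyses in this article). The event coding (0=censored, 1=event of interest, 2=competing event) and the function name are assumptions made for this example.

```python
from itertools import groupby

def cumulative_incidence_estimates(times, events, cause=1):
    """Return (time, 1 minus Kaplan-Meier, Aalen-Johansen) at each event time.

    events: 0 = censored, `cause` = event of interest, other codes = competing.
    The Kaplan-Meier estimate treats competing events as censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    km_survival = 1.0       # survival for `cause`, competing events censored
    overall_survival = 1.0  # probability of being free of any event
    aj_incidence = 0.0
    results = []
    for t, group in groupby(data, key=lambda row: row[0]):
        codes = [code for _, code in group]
        d_cause = sum(1 for code in codes if code == cause)
        d_any = sum(1 for code in codes if code != 0)
        if d_any > 0:
            # Aalen-Johansen: weight the hazard by overall survival just before t
            aj_incidence += overall_survival * d_cause / n_at_risk
            km_survival *= 1.0 - d_cause / n_at_risk
            overall_survival *= 1.0 - d_any / n_at_risk
            results.append((t, 1.0 - km_survival, aj_incidence))
        n_at_risk -= len(codes)  # everyone observed at t leaves the risk set
    return results

# Hypothetical toy data: follow-up times and event codes
toy_times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_events = [1, 2, 1, 0, 2, 1, 2, 1, 0, 1]

interest = cumulative_incidence_estimates(toy_times, toy_events, cause=1)
competing = cumulative_incidence_estimates(toy_times, toy_events, cause=2)
for t, km, aj in interest:
    print(f"t={t:>2}  1-KM={km:.3f}  Aalen-Johansen={aj:.3f}")
```

In this sketch the two estimators share the same risk set and the same increment in the numerator; they differ only in the survival function that weights each increment, which is why the Kaplan-Meier based estimate can never fall below the Aalen-Johansen estimate.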

In the example of kidney transplantation, because of censoring for recipient death, the cumulative incidence of graft failure (fig 1, top left graphs) is over-estimated by 0.48 percentage points (13.04% minus 12.56%) and 2.08 percentage points (23.61% minus 21.53%) at five and 10 years after transplantation, respectively, with the Kaplan-Meier method compared with the Aalen-Johansen method (table 1). In relative terms, these differences amounted to 3.8% and 9.7%, respectively. Analogously, when analysing the cumulative incidence of recipient death (censored for graft failure; fig 1, top right graphs), these over-estimations amounted to 0.77 percentage points (9.65% minus 8.88%) at five years, and 2.81 percentage points (20.73% minus 17.92%) at 10 years, corresponding to relative differences of 8.7% and 15.7%. All observed differences in estimates increase with time after transplantation, as does the underlying estimator bias.
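The absolute and relative differences quoted above follow directly from the reported estimates. The short sketch below only redoes that arithmetic; the function name and dictionary labels are chosen for this example, and the relative difference is taken with the Aalen-Johansen estimate as denominator, as described in the legend of fig 1.

```python
def differences(km_percent, aj_percent):
    """Absolute difference (percentage points) and relative difference (%)
    of the Kaplan-Meier estimate compared with the Aalen-Johansen estimate."""
    absolute = km_percent - aj_percent
    relative = 100.0 * absolute / aj_percent
    return round(absolute, 2), round(relative, 1)

# (Kaplan-Meier %, Aalen-Johansen %) as reported in the text
reported = {
    "graft failure, 5 years": (13.04, 12.56),
    "graft failure, 10 years": (23.61, 21.53),
    "recipient death, 5 years": (9.65, 8.88),
    "recipient death, 10 years": (20.73, 17.92),
}

for label, (km, aj) in reported.items():
    absolute, relative = differences(km, aj)
    print(f"{label}: {absolute} percentage points, {relative}% relative")
```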

Fig 1

Cumulative incidence plots of graft failure and death with a functioning graft overall (first and second rows) and grouped by donor age of <65 and ≥65 years (ie, low and high risk groups, respectively; third and fourth rows), in the Collaborative Transplant Study (n=167 190) database, according to the (1 minus) Kaplan-Meier and the Aalen-Johansen method. Relative difference in second and fourth row graphs=relative difference between the Kaplan-Meier and the Aalen-Johansen method, calculated as the absolute difference (pink area in first and blue and orange area in the third row graphs) divided by the Aalen-Johansen estimate of the cumulative incidence (dashed lines in first and third row graphs). These graphs illustrate the over-estimation of the cumulative incidence with the Kaplan-Meier estimator, compared with the Aalen-Johansen estimator

Table 1

Cumulative incidence estimates of graft failure at five and 10 years after kidney transplantation, in the Collaborative Transplant Study (n=167 190) database, by analysis method, including the difference between estimates of the (non-parametric and semi-parametric) biased and unbiased estimators


Bias of the Kaplan-Meier estimator, according to the background risk of the competing event

Not only does the (upward) bias in the Kaplan-Meier estimator always increase with time, it also increases with the frequency of the competing event,1 because the Aalen-Johansen estimator, in its calculation, uses the overall (ie, event of interest plus competing events) survival function. Consequently, when a competing event occurs, the overall survival function decreases while the Kaplan-Meier survival function (in which the competing events are censored) remains stable. The more competing events, the stronger the divergence (see also box 2 for explanation of cumulative incidence).
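This reasoning can be written out as a short derivation based on the cumulative incidence formula in box 2. The symbols $H_{gf}$ and $H_{d}$ (cumulative cause specific hazards of graft failure and of death with a functioning graft) and $B(t)$ (the difference between the two target quantities) are notation introduced here for illustration and do not appear in the article.

```latex
\begin{align*}
S_{gf}(t) &= \exp\{-H_{gf}(t)\}, \qquad
S(t) = \exp\{-H_{gf}(t) - H_{d}(t)\} = S_{gf}(t)\,e^{-H_{d}(t)} \\[4pt]
1 - S_{gf}(t) &= \int_0^t S_{gf}(s)\,h_{gf}(s)\,ds
  \qquad \text{(target of 1 minus Kaplan-Meier, death censored)} \\[4pt]
F_{gf}(t) &= \int_0^t S(s)\,h_{gf}(s)\,ds
  \qquad \text{(target of Aalen-Johansen)} \\[4pt]
B(t) &= \{1 - S_{gf}(t)\} - F_{gf}(t)
  = \int_0^t S_{gf}(s)\,\bigl\{1 - e^{-H_{d}(s)}\bigr\}\,h_{gf}(s)\,ds \;\ge\; 0
\end{align*}
```

Because the integrand is non-negative, $B(t)$ cannot decrease as $t$ grows, and because $1 - e^{-H_{d}(s)}$ grows with the cumulative hazard of the competing event, a more frequent competing event yields a larger difference.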

To illustrate this, in the example of kidney transplantation, donor age at time of transplantation above and below 65 years was used to create a high and low risk group, respectively, for both kidney graft failure and recipient death. Kidneys from donors aged 65 years and older carry an increased risk of allograft failure. Moreover, recipients of kidneys from high risk donors are more likely to be older themselves and thus have an increased risk of death with a functioning graft.21 In the example, graft failure occurred in 12.1% and death with a functioning graft in 16.1% of recipients in the low risk group (donors aged <65 years; n=135 316); 6375 (20.0%) graft failures and 6390 (20.1%) deaths with functioning graft occurred in the high risk group (donors aged ≥65 years; n=31 874). In the high risk group, the cumulative incidence of graft failure was over-estimated by 1.32 and 5.38 percentage points at five and 10 years, respectively, with the Kaplan-Meier method compared with the Aalen-Johansen method, amounting to relative differences of 7.4% and 19.2%, respectively (fig 1, bottom left graphs; and table 1). In the low risk group, over-estimation of the Kaplan-Meier method equated to 0.36 and 1.62 percentage points, respectively, corresponding to relative differences of 3.2% and 8.0%, respectively. For the cumulative incidence of recipient death with a functioning graft (fig 1, bottom right graphs), the relative over-estimation was 13.5% at five years and 24.0% at 10 years in the high risk group, and 7.8% and 14.4% in the low risk group, respectively. The group at high risk for the competing event thus showed considerably larger absolute and relative differences than the low risk group.

Biased semi-parametric cumulative incidence estimators—Cox v Fine and Gray competing risks model

The non-parametric Kaplan-Meier and Aalen-Johansen methods allow for a one dimensional comparison of the cumulative incidence between groups, which can be adjusted for confounding via, for example, inverse probability of treatment weighting.22 By contrast, the semi-parametric Cox16 and competing risks Fine and Gray model17 allow for the easy inclusion of multiple continuous and categorical covariates and provide relative effect parameters (hazard and subdistribution hazard ratios) for each. After meeting the underlying assumptions of proportionality and linearity for the covariate effects and with an additional estimation of the baseline hazard function (eg, Breslow estimator),23 these models also provide cumulative incidence curves (conditional on specific covariate values).

Nonetheless, in the presence of competing risks, the cumulative incidence estimators of both semi-parametric models are biased. Analogous to the Kaplan-Meier method, the cumulative incidence estimator of the Cox model is upward biased. The Fine and Gray model accounts for competing events by modelling the subdistribution hazard of the event of interest, which is directly related to the cumulative incidence. However, since there are as many subdistribution hazards as competing events, and all of these are modelled separately, small inconsistencies (in either way) in the estimator of the cumulative incidence could still occur.24

These inconsistencies (bias) can be resolved by estimating the cumulative incidence of the event of interest via a combination formula that includes the cause specific hazards of both the event of interest and the competing event.3 14 24 This combination formula produces an unbiased estimator of the cumulative incidence but, unlike the Fine and Gray model, requires a model for each event type, provides no insight into covariate effects on the cumulative incidence scale, and is not user friendly.14 Therefore, Fine and Gray’s easy-to-use cumulative incidence estimator is still preferred when confronted with competing risks, at the cost of a small remaining bias.
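The idea of combining cause specific hazards can be sketched numerically. The example below is not the authors' implementation; it integrates the box 2 formula with a midpoint rule, and the baseline hazards and hazard ratios are hypothetical constants standing in for the output of two fitted cause specific models (one per event type).

```python
import math

def cif_from_cause_specific(h_interest, h_competing, t_max, steps=10_000):
    """Integrate S(s) * h_interest(s) over [0, t_max] with a midpoint rule,
    where S(s) = exp(-(cumulative hazard of all event types up to s))."""
    dt = t_max / steps
    cumulative_hazard_all = 0.0
    cif = 0.0
    for k in range(steps):
        mid = (k + 0.5) * dt
        hi = h_interest(mid)
        hc = h_competing(mid)
        # overall survival evaluated at the midpoint of the interval
        survival_mid = math.exp(-(cumulative_hazard_all + 0.5 * (hi + hc) * dt))
        cif += survival_mid * hi * dt
        cumulative_hazard_all += (hi + hc) * dt
    return cif

# Hypothetical yearly baseline hazards and covariate effects (assumptions)
BASE_GF, BASE_DEATH = 0.02, 0.025
HR_GF, HR_DEATH = 1.5, 1.2

def hazard_graft_failure(s):
    return BASE_GF * HR_GF

def hazard_death(s):
    return BASE_DEATH * HR_DEATH

cif_10y = cif_from_cause_specific(hazard_graft_failure, hazard_death, t_max=10.0)

# Closed form for constant hazards, used here only as a check
total = BASE_GF * HR_GF + BASE_DEATH * HR_DEATH
closed_form = BASE_GF * HR_GF / total * (1.0 - math.exp(-total * 10.0))
print(f"numerical: {cif_10y:.6f}  closed form: {closed_form:.6f}")
```

The sketch makes the practical drawback visible: the cumulative incidence of graft failure depends on the covariate effects in both models, so no single coefficient summarises the effect of a covariate on the cumulative incidence.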

Biased semi-parametric cumulative incidence estimators, according to the background risk of the competing event

In the example of kidney transplantation, we included donor age as a continuous covariate in the Cox model and the Fine and Gray competing risks model for graft failure, while controlling for the effect of recipient age. As suggested by a visual linearity and proportionality check for both models, we included donor age as a linear and as a quadratic, proportional effect (supplementary fig S1). Subsequently, we compared the cumulative incidence estimates of these models for donor ages 20, 40, 60, and 70 years with those of the unbiased cause specific hazards approach (see supplementary fig S2 for the cause specific model of recipient death).

We saw a consistent over-estimation of the cumulative incidence of graft failure by the Cox model (fig 2 and table 1). For donor ages 20, 40, 60, and 70 years, this over-estimation amounted to 0.18, 0.23, 0.39, and 0.54 percentage points at five years, respectively, and to 0.88, 1.15, 1.85, and 2.48 percentage points at 10 years, respectively. The relative differences amounted to 2.4%, 2.5%, 2.7%, and 2.7% at five years and to 6.6%, 7.0%, 7.3%, and 7.4% at 10 years (fig 2 and table 1). By comparison, the Fine and Gray cumulative incidence estimates deviated by 0.01, 0.05, 0.27, and 0.83 percentage points for donor ages 20, 40, 60, and 70 years at five years, and by 0.37, 0.27, 0.85, and 1.64 percentage points at 10 years, respectively (fig 2 and table 1). In relative terms, these differences amounted to 0.1%, 0.5%, 1.9%, and 4.2% at five years, and to 2.8%, 1.6%, 3.4%, and 4.9% at 10 years (fig 2 and table 1). Although the observed absolute differences were limited, the Cox model generally deviated more than the Fine and Gray model. The observed relative differences of the Cox model increased steadily with time, while those of the Fine and Gray model remained relatively stable.

Fig 2

Cumulative incidence plots of graft failure after kidney transplantation, conditional on donor age of 20, 40, 60, and 70 years (top graph) and relative difference of the Cox and Fine and Gray model compared with the cause specific (unbiased) approach (bottom graph). Recipient age was kept fixed at 50 years (mean age in data from the Collaborative Transplant Study)

Interpretation and usefulness of hazard and subdistribution hazard ratios

Crucial to the correct interpretation of results from both the Cox and Fine and Gray model is the understanding of the hazard and subdistribution hazard rate, respectively. The rate of an event of interest is not equal to the risk of this event. More precisely, the hazard rate (Cox model) is the instantaneous rate (ie, probability per unit time)7 that the event of interest will occur, given that neither the event of interest nor any competing event has occurred before. The subdistribution hazard rate (Fine and Gray model) is the instantaneous rate of the event of interest occurring, given that the event of interest has not occurred before. Both rates differ in their risk set. The hazard rate is calculated for individuals who are free of all types of events (the event of interest as well as the competing events), whereas for the calculation of the subdistribution hazard rate, individuals who have previously experienced a competing event remain indefinitely at risk for the event of interest (see also box 2).
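The difference between the two risk sets can be made concrete with a toy example. The sketch below uses hypothetical data and an assumed event coding (0=censored, 1=event of interest, 2=competing event). It is deliberately simplified: with censoring, the Fine and Gray estimator additionally down-weights individuals with an earlier competing event by inverse probability of censoring weights, which is not shown here.

```python
def risk_sets(times, events, t, cause=1):
    """Indices of individuals at risk just before time t.

    Cause specific risk set: no event of any type and not censored before t.
    Subdistribution risk set: as above, plus everyone who had a competing
    event before t (they are kept at risk indefinitely).
    """
    cause_specific = [i for i, u in enumerate(times) if u >= t]
    subdistribution = [
        i for i, (u, code) in enumerate(zip(times, events))
        if u >= t or (code != 0 and code != cause)
    ]
    return cause_specific, subdistribution

# Hypothetical toy data
toy_times = [2, 3, 5, 6, 8]
toy_events = [2, 1, 0, 2, 1]

cause_specific_set, subdistribution_set = risk_sets(toy_times, toy_events, t=7)
print("cause specific risk set:", cause_specific_set)
print("subdistribution risk set:", subdistribution_set)
```

At time 7, only the last individual is still free of any event, whereas the subdistribution risk set also retains the two individuals who experienced the competing event at times 2 and 6.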

Similarly, hazard ratios and subdistribution hazard ratios are relative rates, not relative risks (see box 2).715 The magnitude of the hazard or subdistribution hazard ratio is not transferable to the relative risk. Even when the relative rate is constant, the relative risk most likely varies over time owing to its dependence on the baseline risk function.15 Nonetheless, when no competing risks are present, we can safely say that a hazard ratio greater than 1 corresponds to an increased risk (ie, increased cumulative incidence) of the event of interest. In the presence of competing risks, the same holds true for the subdistribution hazard ratio of the Fine and Gray model, but not necessarily for the hazard ratio of the Cox model.7
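That a constant hazard ratio does not translate into a constant relative risk can be checked with a small calculation. The sketch assumes no competing risks, proportional hazards, and a constant baseline hazard; the hazard ratio of 2 and the baseline hazard of 0.05 per year are hypothetical values chosen for illustration.

```python
import math

def relative_risk(hazard_ratio, baseline_hazard, t):
    """Ratio of cumulative incidences at time t under proportional hazards
    with a constant baseline hazard and no competing risks."""
    baseline_risk = 1.0 - math.exp(-baseline_hazard * t)
    exposed_risk = 1.0 - math.exp(-hazard_ratio * baseline_hazard * t)
    return exposed_risk / baseline_risk

HAZARD_RATIO = 2.0      # hypothetical, constant over time
BASELINE_HAZARD = 0.05  # hypothetical, per year

relative_risks = {
    t: relative_risk(HAZARD_RATIO, BASELINE_HAZARD, t) for t in (1, 5, 10, 20, 50)
}
for t, rr in relative_risks.items():
    print(f"t={t:>2} years: relative risk={rr:.2f}")
```

Under these assumptions the relative risk starts close to the hazard ratio and shrinks towards 1 as the baseline risk accumulates, while remaining above 1 throughout.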

Consider, for example, a new preventive treatment for cardiovascular disease for patients who have received kidney transplants that, compared with patients who have received standard care, lowers the hazard rate of death with a functioning graft (hazard ratio <1 in the Cox model), but has no influence on the hazard rate of graft failure (hazard ratio=1 in the Cox model). Since fewer patients with the preventive treatment will die than with the standard care, more of them are susceptible to graft failure, thereby increasing the cumulative incidence of graft failure in the preventive treatment group. In the Fine and Gray competing risks model, this increase of the cumulative incidence of graft failure in the preventive treatment group would be evidenced by an increase in the subdistribution hazard rate of graft failure (subdistribution hazard ratio >1) owing to a decreased risk set for graft failure (fewer competing events=fewer individuals at indefinite risk).
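This thought experiment can be reproduced numerically. The sketch below assumes constant cause specific hazards, for which the cumulative incidence has a closed form; the yearly hazard values and the hazard ratio of 0.5 for death are hypothetical and serve only to illustrate the direction of the effect.

```python
import math

def cumulative_incidence_graft_failure(h_graft_failure, h_death, t):
    """Cumulative incidence of graft failure by time t for constant
    cause specific hazards of graft failure and death."""
    total = h_graft_failure + h_death
    return h_graft_failure / total * (1.0 - math.exp(-total * t))

# Hypothetical yearly cause specific hazards
H_GRAFT_FAILURE = 0.02          # identical in both groups (hazard ratio = 1)
H_DEATH_STANDARD = 0.03
H_DEATH_TREATED = 0.03 * 0.5    # hazard ratio for death < 1 with treatment
YEARS = 10

cif_standard = cumulative_incidence_graft_failure(
    H_GRAFT_FAILURE, H_DEATH_STANDARD, YEARS)
cif_treated = cumulative_incidence_graft_failure(
    H_GRAFT_FAILURE, H_DEATH_TREATED, YEARS)

print(f"standard care:        {cif_standard:.4f}")
print(f"preventive treatment: {cif_treated:.4f}")
```

Although the hazard of graft failure is identical in both groups, the cumulative incidence of graft failure is higher with the preventive treatment, simply because more patients survive long enough to experience graft failure.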

By relating the preventive treatment to the hazard rates of both the event of interest and the competing events (via hazard ratios), the competing risk censored Cox models accurately describe the causal relation between the treatment (covariate) and the different outcomes. Hazard ratios therefore remain the recommended measure in studies focusing on causation. However, for prediction purposes and medical decision making, actual risks (ie, the cumulative incidence) are the main interest. The subdistribution hazard ratios from the Fine and Gray model describe relative covariate effects on the cumulative incidence scale; therefore, reporting of these subdistribution hazard ratios is recommended.67 Nevertheless, because the subdistribution hazard ratios can only be used to indicate that the cumulative incidence increases (subdistribution hazard ratio >1), decreases (subdistribution hazard ratio <1), or remains stable (subdistribution hazard ratio=1), reporting the corresponding estimated cumulative incidences is also advised. These cumulative incidences from the Fine and Gray model take the competing events into consideration, are as intuitive as the cumulative incidence plots of the non-parametric methods, and can also be adjusted for covariates. For the example of kidney transplantation, the subdistribution hazard ratios and their interpretation are reported in supplementary table S2, and the cumulative incidence estimates are shown in table 1 and fig 2 (top graph).

Parametric survival models and logistic regression as alternatives

Competing risks survival models are not necessarily restricted to non-parametric or semi-parametric approaches as described above. Parametric survival models exist that provide smooth cumulative incidence curves, thereby allowing for out-of-sample predictions. Their drawback is the mandatory choice of a specific form of the baseline hazard function. Royston and Parmar18 largely relaxed this restriction by introducing flexible parametric models, while additionally proposing the restricted mean lifetime as an alternative measure to the hazard ratio (also applicable under non-proportional hazards). Despite predictions that are attainable in one estimation step and easily incorporated time dependent effects, these models have their own technical challenges (eg, knot placement) and interpretational difficulties. More information on their advantages and disadvantages, and their extension to the competing risks framework, can be found in Mozumder et al.25

Finally, logistic regression models are also capable of correctly accounting for competing events. By contrast with the subdistribution hazard ratios from the Fine and Gray model, odds ratios from a logistic regression model allow for a direct interpretation in terms of relative risks. However, logistic regression models cannot account for censoring or for the time until an event, and can only compare cumulative incidences at the end of follow-up. When follow-up is (almost) complete, which was not the case in the example of kidney transplantation, logistic regression could be a valuable alternative for competing risks survival models.26

Conclusion

In survival analysis, when estimating cumulative incidences, censoring of competing events should be avoided (box 3). This article, which uses kidney transplantation as an example, advocates for the systematic replacement of the Kaplan-Meier method by the competing risks Aalen-Johansen method, and for the replacement of the standard Cox model by the competing risks Fine and Gray model, when cumulative incidence estimation is of interest. Similar differences in cumulative incidence estimates due to competing risk censoring were found in example studies on native (not transplanted) kidneys,27 breast cancer,28 and stroke,29 showing the general applicability of these recommendations.

Box 3

Why competing risk censoring should be avoided in survival prediction

  • When competing events are censored, the estimator of the cumulative incidence is upward biased

  • This bias is not easily quantifiable, but increases with time and with the frequency of the competing event(s)1

  • In the presence of competing events, competing risks methods alleviate this bias and should therefore always be preferred when estimating cumulative incidences
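
The upward bias summarised in the points above can be reproduced on a very small dataset. The following is a minimal Python sketch, not the code used for the analyses in this article: the function name `cumulative_incidence`, the status coding (0 = censored, 1 = event of interest, 2 = competing event), and the four-patient toy data are all assumptions made for illustration. It computes, side by side, one minus the Kaplan-Meier estimate with competing events censored and the Aalen-Johansen estimate.

```python
import numpy as np

# Status coding (an assumption of this sketch):
# 0 = censored, 1 = event of interest, 2 = competing event.

def cumulative_incidence(time, status, cause=1):
    """Return rows (t, 1 - KM with competing events censored, Aalen-Johansen)."""
    time = np.asarray(time, dtype=float)
    status = np.asarray(status, dtype=int)
    event_times = np.unique(time[status > 0])  # sorted distinct event times

    surv_any = 1.0  # probability of being free of any event just before t
    surv_km = 1.0   # Kaplan-Meier survival when competing events are censored
    cif_aj = 0.0    # Aalen-Johansen cumulative incidence of `cause`
    rows = []
    for t in event_times:
        at_risk = np.sum(time >= t)
        d_cause = np.sum((time == t) & (status == cause))
        d_any = np.sum((time == t) & (status > 0))
        # Aalen-Johansen: weight the cause specific hazard by the
        # probability of still being free of every event type.
        cif_aj += surv_any * d_cause / at_risk
        surv_any *= 1.0 - d_any / at_risk
        # Kaplan-Meier: competing events only leave the risk set.
        surv_km *= 1.0 - d_cause / at_risk
        rows.append((t, 1.0 - surv_km, cif_aj))
    return np.array(rows)

# Hypothetical toy data: four patients, no censoring.
time = [1, 2, 3, 4]
status = [1, 2, 1, 2]
for t, km_based, aj in cumulative_incidence(time, status):
    print(f"t={t:.0f}  1-KM={km_based:.3f}  Aalen-Johansen={aj:.3f}")
```

In this toy example, two of the four patients experience the event of interest, and the Aalen-Johansen estimate at the end of follow-up equals that proportion (0.5), whereas the Kaplan-Meier based estimate reaches 0.625. With complete follow-up, the Aalen-Johansen estimate coincides with the simple proportion of patients who had the event of interest.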

Ethics statements

Ethical approval

The work of the Collaborative Transplant Study is approved by the ethics committee of the Medical Faculty of Heidelberg University (No 083/2005).

Data availability statement

The raw data are available on request to the Collaborative Transplant Study in accordance with the consents of the patients, participating transplant centres, and registries.

Acknowledgments

We thank transplant registries Eurotransplant, Italian National Transplant Centre, Catalan Transplantation Organisation, Dutch Transplant Foundation, and UK Transplant for collaboration and data exchange with the Collaborative Transplant Study; the transplantation centres or hospitals that provided data for this study to the Collaborative Transplant Study for their generous support; and Ronald Geskus for his assistance with calculating the unbiased cumulative incidence functions according to the cause specific hazards approach.

Footnotes

  • Contributors: MC, GV, and MN designed the illustrative example. BD and CS were involved in the data collection. All authors have been working for several years in analysing and interpreting kidney transplantation data and provided statistical and clinical expertise. MN is the guarantor of this article. The corresponding author attests that all listed authors meet the authorship criteria and that no others meeting the criteria have been omitted.

  • Funding: There was no specific funding provided for the study. MC had financial support from the Fonds Wetenschappelijk Onderzoek (Research Foundation–Flanders) and the Agency for Innovation and Entrepreneurship by an “Applied Biomedical Research with a Primary Social Finality” project grant IWT.150199; MN is senior clinical investigator of the Fonds Wetenschappelijk Onderzoek (Research Foundation–Flanders) (grant 1844019N). The funding agencies had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication. All authors had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.

  • Competing interests: All authors have completed the ICMJE uniform disclosure form at https://www.icmje.org/disclosure-of-interest/ and declare: no support from any organisation for the submitted work; no financial relationships with any organisations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.

  • Provenance and peer review: Not commissioned; externally peer reviewed.

  • Patient and public involvement: No patients or members of the public were involved in setting the research question, determining the sample, or designing or implementing the study. No patients or members of the public were asked to advise on the interpretation or writing up of results.

References