Development and Modeling of Decision Tree for Survival Data with Multiple Events Using Deviance and Cox-Snell Residuals within Node Homogeneity Technique
DOI:
https://doi.org/10.35882/ijahst.v2i3.9
Keywords:
Martingale residual, Cox-Snell residual, deviance residual, CART, within-node homogeneity
Abstract
In medical studies, a patient commonly experiences more than one event rather than only the single event of interest. This exposes the individual to multiple risks, and medical practitioners need to account for these risks in relation to prognostic factors. Many classical methods exist for handling multiple events in survival data; however, these methods break down when the top-down effect of the prognostic factors is considered concurrently and when the risks of events are correlated (competing risks). This study aimed to develop a decision tree using a within-node homogeneity procedure in survival analysis with multiple events, in order to classify individual risks under competing risks. Since the Classification and Regression Tree (CART) methodology involves recursive partitioning of covariates into different subgroups, this study considers the use of deviance and modified Cox-Snell residuals as the measure of impurity in CART during partitioning. The flexibility and predictive accuracy of the learning algorithm were then compared with other existing methods through simulation and freely available real-life data. The simulation results revealed that using deviance and Cox-Snell residuals as the response in the within-node homogeneity classification tree performs better than using other residuals, irrespective of the performance index. Results from empirical studies of the two real-life data sets showed that the proposed model with the Cox-Snell residual (deviance = 16.6498) performs better than both the Martingale residual (deviance = 160.3592) and the deviance residual (deviance = 556.8822). In conclusion, using the Cox-Snell residual (mean square error (MSE) = 0.01783563) as the measure of impurity in CART gave better performance than the other residual methods (MSE = 0.1853148 and 0.8043366). This implies that the proposed methods are capable of accounting for individual effects based on the prognostic biomarkers.
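The idea of using a residual as the response in a within-node homogeneity tree can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: a single event type rather than competing risks, a null-model Nelson-Aalen cumulative hazard in place of a fitted model, the standard (not the modified) Cox-Snell residual, and a within-node sum of squared errors as the impurity. All function names (`nelson_aalen`, `survival_residuals`, `best_split`) are hypothetical.

```python
import numpy as np

def nelson_aalen(time, event):
    """Nelson-Aalen cumulative hazard evaluated at each subject's own time."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    event_times = np.unique(time[event == 1])
    # Hazard increment at each distinct event time: events / number at risk.
    inc = np.array([event[time == t].sum() / (time >= t).sum() for t in event_times])
    cum = np.concatenate([[0.0], np.cumsum(inc)])
    # Number of distinct event times <= t_i indexes the step function.
    return cum[np.searchsorted(event_times, time, side="right")]

def survival_residuals(time, event):
    """Cox-Snell, martingale and deviance residuals under a null model (assumption)."""
    event = np.asarray(event, dtype=int)
    H = nelson_aalen(time, event)
    cox_snell = H                      # r_i = H(t_i)
    martingale = event - H             # M_i = delta_i - r_i
    with np.errstate(divide="ignore", invalid="ignore"):
        log_term = np.where(event == 1, np.log(H), 0.0)
    inner = np.maximum(-2.0 * (martingale + event * log_term), 0.0)
    deviance = np.sign(martingale) * np.sqrt(inner)
    return cox_snell, martingale, deviance

def within_node_sse(y):
    """Impurity of a node: sum of squared deviations from the node mean."""
    return float(((y - y.mean()) ** 2).sum()) if y.size else 0.0

def best_split(x, y, min_leaf=1):
    """Threshold on one covariate that minimises the summed child impurity."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    best_cut, best_sse = None, within_node_sse(y)
    for cut in np.unique(x)[:-1]:
        left, right = y[x <= cut], y[x > cut]
        if left.size < min_leaf or right.size < min_leaf:
            continue
        sse = within_node_sse(left) + within_node_sse(right)
        if sse < best_sse:
            best_cut, best_sse = cut, sse
    return best_cut, best_sse

# Toy data (simulated here purely for illustration).
rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=200).astype(float)
time = rng.exponential(scale=np.where(x == 1, 1.0, 3.0))
event = rng.integers(0, 2, size=200)
cs, mart, dev = survival_residuals(time, event)
cut, sse = best_split(x, cs, min_leaf=10)
```

Applying `best_split` recursively to the resulting child nodes, with a stopping rule such as a minimum node size, would grow a full tree; swapping `cs` for `mart` or `dev` changes which residual drives the partitioning.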
License
Copyright (c) 2022 Kazeem Dauda
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).