Forecasting Pediatric Medical Expenses Using Machine Learning: A Case Study of the Toto Afya Card in Tanzania
Abstract
This study aimed to predict medical expenses using machine learning (ML) algorithms in order to improve the accuracy and efficiency of healthcare cost estimation. Traditionally, medical expenses have been estimated through actuarial analyses, manual assessments, or linear models based on historical data. However, such methods often fail to account for complex relationships among the many variables that drive healthcare costs, leading to less precise predictions. ML offers a potential solution by capturing these complex interactions. The study explored four ML models, namely Linear Regression, Random Forest, XGBoost, and CatBoost, to predict medical expenses using a dataset that included socio-demographic and healthcare-related factors. These algorithms were selected for their ability to handle large datasets, non-linear relationships, and categorical features. The results show that CatBoost and XGBoost perform better than traditional methods. The study also discusses the modeling challenges posed by socio-economic conditions, healthcare infrastructure, and demographic variations.