South China Journal of Preventive Medicine ›› 2026, Vol. 52 ›› Issue (3): 264-268. doi: 10.12183/j.scjpm.2026.0264

• Original Article •

An evaluation of tuberculosis incidence trends in Xinjiang using LSTM and LSTM-XGBoost models

Ma Xiaowei1, Gulina Badeerhan2, Yipaer Aiheiti2, Zulihumaer Aierken2, Wang Senlu2, Wang Xijiang2   

  1. School of Public Health, Xinjiang Medical University, Urumqi, Xinjiang 830017, China;
    2. Xinjiang Uygur Autonomous Region Center for Disease Control and Prevention
  • Received: 2025-04-21 Online: 2026-03-20 Published: 2026-04-07

Abstract: Objective To forecast pulmonary tuberculosis incidence trends in five counties and cities of the Xinjiang Uygur Autonomous Region (Xinjiang) using a Long Short-Term Memory (LSTM) network and an LSTM-XGBoost hybrid model, and to provide a scientific basis for tuberculosis prevention and control strategies in these localities. Methods The epidemiological characteristics of tuberculosis in Habahe County, Nilek County, Korla City, Pishan County, and Luopu County from 2011 to 2023 were first described. An LSTM neural network model and an LSTM-XGBoost hybrid model were then each built from the annual reported incidence rates of pulmonary tuberculosis for 2011-2022 in the five localities. The predictive performance of the two models for 2017-2023 was evaluated and compared, and both models were used to project incidence trends for 2024-2030. Results The average reported incidence rates from 2011 to 2023 were 112.21/100 000 in Habahe County, 101.85/100 000 in Nilek County, 56.86/100 000 in Korla City, 249.79/100 000 in Pishan County, and 359.78/100 000 in Luopu County. For the 2017-2023 period, the LSTM-XGBoost model had lower Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Root Mean Square Error (RMSE) values than the standalone LSTM model in several of the studied areas. The trends forecast for 2024-2030 by the LSTM-XGBoost model were similar to those of the LSTM model. By 2030, the projected incidence rates are: Habahe County, 30.48/100 000 (95% CI: 25.0/100 000-36.0/100 000); Nilek County, 3.90/100 000 (95% CI: 3.2/100 000-4.6/100 000); Korla City, 24.46/100 000 (95% CI: 20.1/100 000-28.9/100 000); Pishan County, 89.43/100 000 (95% CI: 73.3/100 000-105.5/100 000); and Luopu County, 89.92/100 000 (95% CI: 73.7/100 000-106.1/100 000). Relative to the actual incidence rates in 2015, the anticipated reductions by 2030 are 75.2% for Habahe County, 96.7% for Nilek County, 61.8% for Korla City, 68.4% for Pishan County, and 78.8% for Luopu County. Conclusion Both the LSTM neural network model and the LSTM-XGBoost model can predict tuberculosis incidence trends. The LSTM-XGBoost model showed better predictive performance on the key metrics in the majority of the counties and cities. The projections indicate a general downward trend in the annual incidence of pulmonary tuberculosis across the five localities. To achieve the 2030 planning objectives, each jurisdiction, particularly those with a high epidemic burden, needs to implement more targeted and comprehensive prevention and control measures.
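For readers unfamiliar with the error metrics and the relative-reduction figures reported in the abstract, the following is a minimal sketch of how such quantities are commonly computed. It is not the authors' code: the function names and the sample values are hypothetical, chosen only for illustration, and the standard textbook definitions of MAE, MAPE, and RMSE are assumed.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error: average size of the forecast errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error: penalises large errors more heavily than MAE."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def percent_reduction(baseline, projected):
    """Relative decline of a projected rate against a baseline rate, in percent."""
    return 100.0 * (baseline - projected) / baseline


# Hypothetical incidence rates per 100 000 (NOT taken from the study).
observed = [100.0, 80.0, 50.0]
forecast = [90.0, 84.0, 55.0]

print(f"MAE  = {mae(observed, forecast):.2f}")
print(f"MAPE = {mape(observed, forecast):.2f}%")
print(f"RMSE = {rmse(observed, forecast):.2f}")
print(f"Reduction = {percent_reduction(120.0, 30.0):.1f}%")
```

Lower values of all three error metrics indicate a closer fit to the observed series, which is the sense in which the abstract compares the two models.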

Key words: Pulmonary tuberculosis, LSTM neural network, LSTM-XGBoost model

CLC Number: R183.3