South China Journal of Preventive Medicine ›› 2019, Vol. 45 ›› Issue (1): 26-31. doi: 10.13217/j.scjpm.2019.0026

• Original Article •

Risk assessment of dengue fever based on random forest model

HUANG Yu-lin1, ZHAO Yong-qian1, CAO Zheng2, LIU Tao3, DENG Ai-ping4, XIAO Jian-peng3, ZHANG Bing3, ZHU Guang-hu3, PENG Zhi-qiang4, MA Wen-jun3   

  1. Jinan University Faculty of Medical Science, Guangzhou 510632, China;
    2. School of Geographical Sciences, Guangzhou University;
    3. Guangdong Provincial Institute of Public Health, Guangdong Provincial Center for Disease Control and Prevention;
    4. Guangdong Provincial Center for Disease Control and Prevention
  • Received: 2018-09-12 Published: 2019-04-19

Abstract: Objective To construct a small-spatial-scale dengue risk assessment tool based on the random forest model, so as to provide a scientific basis for the prevention and control of dengue fever. Methods Data on dengue cases and related factors from February 2012 to September 2014 were used as the training set, and random forest regression (RFR) models were constructed separately for the frequency, duration, and intensity of dengue fever. Data on dengue cases and related factors from October 2014 to March 2015 were used as the testing set to verify the accuracy of the models. Results The correlation coefficients between incidence and the frequency, duration, and intensity of dengue fever were all higher than 0.7. On the training set, the pseudo R-squared values of the frequency, duration, and intensity models were 96.72%, 91.98%, and 90.1%, and the cross-validated mean square errors (MSEs) of the models were 0.0019, 1.4246, and 1.8811, respectively. In a comparison of the accuracy of RFR, support vector regression (SVR), the generalized linear model (GLM), and the generalized additive model (GAM), the MSEs of RFR and SVR were much lower than those of GLM and GAM. Conclusion The RFR models constructed with the frequency, duration, and intensity of dengue fever as outcome variables and meteorological, environmental, and socioeconomic characteristics as predictors are more accurate than GLM and GAM and can be used as a risk assessment tool for the prevention and control of dengue outbreaks.
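
The two accuracy measures reported above, MSE and pseudo R-squared, can be illustrated with a minimal sketch. It assumes one common definition of pseudo R-squared for random forest regression, 1 - MSE / Var(y), where Var(y) is the population variance of the observed outcome; the abstract does not state which definition or software the authors used. The function names and toy values are hypothetical and are not taken from the study.

```python
from statistics import fmean, pvariance


def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return fmean([(o - p) ** 2 for o, p in zip(observed, predicted)])


def pseudo_r_squared(observed, predicted):
    """Assumed definition: 1 - MSE / Var(observed), with population variance."""
    return 1.0 - mean_squared_error(observed, predicted) / pvariance(observed)


# Hypothetical toy values for illustration only, not data from the study.
observed = [0.0, 1.0, 2.0, 3.0, 4.0]
predicted = [0.5, 1.0, 2.0, 3.0, 3.5]

mse = mean_squared_error(observed, predicted)
r2 = pseudo_r_squared(observed, predicted)
print(f"MSE = {mse:.4f}, pseudo R-squared = {r2:.2%}")
```

Under this definition, a lower MSE relative to the variance of the outcome gives a pseudo R-squared closer to 100%. This is why MSEs for outcomes on different scales (such as frequency versus duration) are not directly comparable, while their pseudo R-squared values are.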

Key words: Dengue, Random forest regression, Risk assessment

CLC Number: R183.5