Volume 25, Number 12—December 2019

Predicting Dengue Outbreaks in Cambodia

Anthony Cousien, Julia Ledien, Kimsan Souv, Rithea Leang, Rekol Huy, Didier Fontenille, Sowath Ly, Veasna Duong, Philippe Dussart, Patrice Piola1, Simon Cauchemez1, and Arnaud Tarantola1
Author affiliations: Institut Pasteur du Cambodge, Phnom Penh, Cambodia (A. Cousien, J. Ledien, K. Souv, D. Fontenille, S. Ly, V. Duong, P. Dussart, P. Piola); Institut Pasteur, Paris, France (A. Cousien, S. Cauchemez); CNRS, Paris (A. Cousien, S. Cauchemez); National Center for Entomology, Parasitology and Malaria Control, Phnom Penh (R. Leang, R. Huy); Institut Pasteur, Noumea, New Caledonia (A. Tarantola)



In Cambodia, dengue outbreaks occur each rainy season (May–October) but vary in magnitude. Using national surveillance data, we designed a tool that can predict 90% of the variance in peak magnitude by April, when typically <10% of dengue cases have been reported. This prediction may help hospitals anticipate excess patients.

Dengue is endemic to Cambodia; outbreaks are seasonal, occurring during the rainy season (May–October). However, the magnitude of outbreaks varies from year to year. When the epidemic is particularly large, the influx of patients with severe dengue in pediatric hospitals may saturate the healthcare system and negatively affect quality of care. Yet adequate supportive care is crucial for patients with severe dengue and can decrease the fatality rate to <1% (1). Early prediction of the size of nascent dengue epidemics may improve healthcare planning and optimize allocation of healthcare resources. We used surveillance data to build a simple early warning tool based on the reported number of cases early in the season. Compared with other approaches used to predict dengue epidemics (2–6), this tool is simple: its only input is the number of cases reported early in the season.

The Study

We used the monthly number of probable dengue cases reported by the National Dengue Surveillance System (NDSS) in Cambodia during 2004–2016. The NDSS includes passive surveillance of probable dengue pediatric inpatients reported by public hospitals to the Communicable Diseases Center of the Ministry of Health and a sentinel, pediatric hospital–based active surveillance system managed by the National Dengue Control Program of the National Center for Parasitology, Entomology and Malaria Control, Ministry of Health. A probable dengue case was defined as an acute febrile illness with ≥2 of the following: headache, retro-orbital pain, myalgia, arthralgia, rash, hemorrhage, and leukopenia, combined with either 1) a posteriori virologic confirmation, serologic confirmation, or both or 2) presence of ≥1 laboratory-confirmed case at the same location and time (7).

Figure 1. Monthly number of probable dengue cases reported to the National Dengue Surveillance System in Cambodia, 2004–2016. Dark gray bars represent the 3 months (February, March, and April) used as predictors for the magnitude of the following peak. For each year, the month corresponding to the peak of the epidemic is indicated.

From January 2004 through December 2016, NDSS reported 215,574 probable dengue cases (Figure 1). During this period, we observed 2 outbreaks of particularly high magnitude, in 2007 (dengue virus serotype 3) and 2012 (dengue virus serotype 1). The magnitude of these outbreaks reached ≈10,000 cases versus the usual number of <5,000 cases. Incidence was always lowest during the dry season (i.e., November–April); the nadirs usually occurred in February and the peaks in July (8 times), August (4 times), and June (1 time, in 2007). On average, only 6.1% of the cases reported during a season (i.e., from February through January of the following year) are observed before the end of April (range 2.7%–9.0% of cases). We wanted to ascertain whether the small number of cases reported at the season’s onset (i.e., up to April) could be used as an early warning tool for predicting the magnitude of that season’s epidemic.
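The early-season share described above can be computed directly from monthly counts. The following is a minimal sketch, assuming a season runs from February through January of the following year, as defined in the text. The function names and the example counts are hypothetical and are not NDSS data.

```python
from collections import defaultdict


def season_of(year, month):
    """Label a month by its season: February of `year` through January of `year + 1`."""
    return year if month >= 2 else year - 1


def early_share(monthly_cases, last_early_month=4):
    """Fraction of each season's cases reported from February through `last_early_month`.

    `monthly_cases` maps (year, month) to a case count.
    """
    totals = defaultdict(int)
    early = defaultdict(int)
    for (year, month), cases in monthly_cases.items():
        season = season_of(year, month)
        totals[season] += cases
        if 2 <= month <= last_early_month:
            early[season] += cases
    return {s: early[s] / totals[s] for s in totals if totals[s] > 0}


# Hypothetical counts for one season (February 2005 through January 2006).
counts = [20, 30, 50, 150, 400, 600, 500, 300, 150, 80, 40]
monthly_cases = {(2005, m): c for m, c in zip(range(2, 13), counts)}
monthly_cases[(2006, 1)] = 30

for season, share in early_share(monthly_cases).items():
    print(f"season {season}: {share:.1%} of cases reported by end of April")
```

Changing `last_early_month` to 2 or 3 gives the corresponding share for February alone or for February through March.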

Figure 2. Dengue cases in Cambodia, 2004–2016. A) Observed versus predicted magnitude of the peak for each dengue season. We used a simple linear regression model, M = α + βN, in which M indicates the magnitude of the peak and N the number of reported dengue-like cases in April. The black line represents the expected results with perfect prediction. B) Results for the leave-one-out cross-validation procedure.

We observed a strong linear correlation between the magnitude of the peak and the number of cases reported at the beginning of the season, in February (Pearson correlation coefficient r = 0.78), March (r = 0.88), April (r = 0.95), February–March (r = 0.86), March–April (r = 0.95), and February–April (r = 0.94). Fitting a simple linear regression model to the data, we estimated that the number of cases reported explained the following proportions of the variance in peak magnitude: February (61%), March (78%), April (91%), February–March (73%), March–April (90%), and February–April (88%). The magnitude was therefore best predicted by the number of dengue cases reported in April. This simple model offered excellent accuracy for predicting the magnitude of the peak in the 2 largest outbreaks; the absolute percentage error was 2.5% for 2007 and 1.9% for 2012 (Figure 2, panel A). Predictions relying on data from March were also acceptably accurate; the error was larger, but the model was still able to predict a larger than usual magnitude (Appendix Figure 1).
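To illustrate the calculation, the sketch below fits the model M = α + βN by ordinary least squares and returns the Pearson correlation coefficient. For a simple linear regression with an intercept, the proportion of variance explained equals r squared, which is why r = 0.95 for April corresponds to roughly 90% of variance explained. The function name and the case counts are hypothetical, and the source does not state which software the authors used.

```python
def fit_line(n, m):
    """Ordinary least squares fit of m = alpha + beta * n.

    Returns (alpha, beta, r), where r is the Pearson correlation coefficient.
    """
    k = len(n)
    mean_n = sum(n) / k
    mean_m = sum(m) / k
    sxx = sum((x - mean_n) ** 2 for x in n)
    syy = sum((y - mean_m) ** 2 for y in m)
    sxy = sum((x - mean_n) * (y - mean_m) for x, y in zip(n, m))
    beta = sxy / sxx
    alpha = mean_m - beta * mean_n
    r = sxy / (sxx * syy) ** 0.5
    return alpha, beta, r


# Hypothetical April counts (N) and peak magnitudes (M), one pair per season.
april_cases = [100, 150, 200, 400, 120, 180]
peak_cases = [2100, 3000, 4200, 8100, 2300, 3700]

alpha, beta, r = fit_line(april_cases, peak_cases)
print(f"alpha = {alpha:.1f}, beta = {beta:.2f}, r = {r:.3f}, variance explained = {r * r:.1%}")

# Predicted peak for a new season with a hypothetical 250 cases reported in April.
print(f"predicted peak: {alpha + beta * 250:.0f} cases")
```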

To evaluate the performance of our model in a real-life situation, when the outcome of the ongoing epidemic remains unknown, we used a leave-one-out cross-validation procedure (8–10). We obtained the predicted value for season s by fitting our regression model to the 12 other seasons (i.e., excluding season s from the set of observations used to fit the parameters of the model [the training dataset]). The predictive power of our best fitting model remained very high; it was able to explain 90% of the variance of the magnitude of epidemics (Figure 2, panel B).
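The leave-one-out procedure can be sketched as follows: each season is held out in turn, the regression is refitted on the remaining seasons, and the held-out peak is predicted from its own early-season count. The summary statistic used here (1 minus the ratio of squared prediction errors to total variation around the observed mean) is an assumption, because the text does not spell out how variance explained was computed. All names and data are hypothetical.

```python
def ols(n, m):
    """Least squares estimates (alpha, beta) for m = alpha + beta * n."""
    k = len(n)
    mean_n = sum(n) / k
    mean_m = sum(m) / k
    sxx = sum((x - mean_n) ** 2 for x in n)
    beta = sum((x - mean_n) * (y - mean_m) for x, y in zip(n, m)) / sxx
    return mean_m - beta * mean_n, beta


def loo_predictions(n, m):
    """Predict each season's peak from a model fitted to all other seasons."""
    preds = []
    for s in range(len(n)):
        train_n = n[:s] + n[s + 1:]  # training set without season s
        train_m = m[:s] + m[s + 1:]
        alpha, beta = ols(train_n, train_m)
        preds.append(alpha + beta * n[s])
    return preds


def variance_explained(observed, predicted):
    """1 - SSE / SST; an assumed definition, negative if worse than the observed mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - sse / sst


# Hypothetical April counts and peak magnitudes, one pair per season.
april_cases = [100, 150, 200, 400, 120, 180]
peak_cases = [2100, 3000, 4200, 8100, 2300, 3700]

preds = loo_predictions(april_cases, peak_cases)
print(f"leave-one-out variance explained: {variance_explained(peak_cases, preds):.1%}")
```

Because the held-out season never contributes to the fitted parameters, this estimate mimics forecasting a season whose peak is not yet known.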

Our dataset contains information for only 2 large epidemics (2007 and 2012). If we trained the model on these 2 large epidemics only, performance would remain very good (98% variance explained). In contrast, when both epidemics were excluded from the training dataset, their magnitude was underestimated by 35% (2007) and 32% (2012). As expected, to be properly calibrated, the model needs to be trained on a mix of small and large epidemics; if 1 category is excluded from the training dataset, performance may be substantially degraded.

Of note, this loss of accuracy is mostly an issue for large epidemics. Given the small number of such epidemics in our dataset, robustly demonstrating predictability from this dataset alone remains difficult. We therefore explored whether similar patterns could be observed in 4 other countries in Southeast Asia: Thailand, Vietnam, Laos, and the Philippines (11–14). To be comparable with our analysis for Cambodia, we used the month at which >5% of cases have been observed on average (Appendix Tables 4, 5). The results were promising for Vietnam (variance explained in the leave-one-out procedure was 64.3%), the Philippines (45.8%), and Thailand (33.4%) but poor for Laos (–53.5%) (Appendix Figures 3–6). This variability could be explained by several factors: national surveillance system characteristics, demographics, land cover, healthcare systems, or climate; all of these factors can affect dengue epidemiology and reporting. This analysis supports the observation made for Cambodia that the number of dengue cases reported early in the epidemic year may provide early insight into the probable scale of the forthcoming epidemic.
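A negative value such as the one reported for Laos is possible when variance explained is computed from out-of-sample predictions. The following is one common definition, given here as an assumption because the text does not state the exact formula; the symbols with a hat and a bar are introduced for illustration.

```latex
% M_s: observed peak magnitude in season s
% \hat{M}_{-s}: prediction for season s from a model fitted without season s
% \bar{M}: mean observed peak magnitude over all seasons
R^2_{\mathrm{LOO}} = 1 - \frac{\sum_{s} \left( M_s - \hat{M}_{-s} \right)^2}{\sum_{s} \left( M_s - \bar{M} \right)^2}
```

Under this definition, a value of –53.5% would mean that the squared leave-one-out prediction errors sum to about 1.5 times the total variation of the peaks around their mean; that is, the model would predict worse than simply using the average peak magnitude.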


The correlation between the number of patients hospitalized with probable dengue during the interepidemic period (i.e., the dry season) and the magnitude of the next outbreak peak during the rainy season was strong, even from February, which corresponds to the nadir of the incidence curves. Using dengue surveillance data for the end of the dry season (April), we were able to predict the magnitude of the peak for the next dengue outbreak, when typically <10% of cases have been observed and the peak is 2–3 months away. These results suggest that the intensity of rainfall during the rainy season is not a major determinant of the occurrence of major outbreaks in Cambodia and that the outbreaks could be explained by conditions already present during the early stages of the outbreak (i.e., the part of the population immune to the circulating strains or weather conditions during the dry season). Our analysis is limited by the small number of epidemic seasons that are available to train our model for Cambodia (in particular, the small number of large epidemics), but similar patterns were observed in some other countries in Southeast Asia.

In a setting where resources are limited and where pediatric hospitals face several other health issues (diarrheal diseases, other infectious diseases), the number of available beds and medical staff and the quantity of medical supplies are usually appropriate for an average dengue outbreak. This simple tool can help hospitals plan in accordance with the predicted magnitude of the seasonal outbreak.

Dr. Cousien is a postdoctoral researcher at the Institut Pasteur. His research interests include the modeling of arbovirus epidemics.



This study was supported by the Agence Française de Développement (ECOMORE Project), the Investissement d’Avenir program, the Laboratoire d’Excellence Integrative Biology of Emerging Infectious Diseases program (grant ANR-10-LABX-62-IBEID), the Models of Infectious Disease Agent Study of the National Institute of General Medical Sciences, and the AXA Research Fund.



  1. World Health Organization. Dengue and severe dengue [cited 2017 Sep 7].
  2. Lowe R, Gasparrini A, Van Meerbeeck CJ, Lippi CA, Mahon R, Trotman AR, et al. Nonlinear and delayed impacts of climate on dengue risk in Barbados: A modelling study. PLoS Med. 2018;15:e1002613.
  3. Lowe R, Barcellos C, Coelho CAS, Bailey TC, Coelho GE, Graham R, et al. Dengue outlook for the World Cup in Brazil: an early warning model framework driven by real-time seasonal climate forecasts. Lancet Infect Dis. 2014;14:619–26.
  4. Lauer SA, Sakrejda K, Ray EL, Keegan LT, Bi Q, Suangtho P, et al. Prospective forecasts of annual dengue hemorrhagic fever incidence in Thailand, 2010-2014. Proc Natl Acad Sci U S A. 2018;115:E2175–82.
  5. Johansson MA, Reich NG, Hota A, Brownstein JS, Santillana M. Evaluating the performance of infectious disease forecasts: A comparison of climate-driven and seasonal dengue forecasts for Mexico. Sci Rep. 2016;6:33707.
  6. Reich NG, Lauer SA, Sakrejda K, Iamsirithaworn S, Hinjoy S, Suangtho P, et al. Challenges in real-time prediction of infectious disease: a case study of dengue in Thailand. PLoS Negl Trop Dis. 2016;10:e0004761.
  7. Cambodian Ministry of Health. Standard Operating Procedures of the National Dengue Sentinel Surveillance System, National Dengue Control Program. Phnom Penh (Cambodia): The Ministry; 2010.
  8. Stone M. Cross-validatory choice and assessment of statistical predictions. J R Stat Soc B. 1974;36:111–33.
  9. Allen DM. The relationship between variable selection and data augmentation and a method of prediction. Technometrics. 1974;16:125–7.
  10. Geisser S. The predictive sample reuse method with applications. J Am Stat Assoc. 1975;70:320–8.
  11. Van Panhuis W, Cross A, Burke D, Choisy M. Project Tycho: Philippines (dengue) dataset: 1955–2010: Project Tycho [cited 2019 Sep 13].
  12. Van Panhuis W, Cross A, Burke D, Choisy M. Project Tycho: Thailand (dengue) dataset: 1958–2010 [cited 2019 Sep 13].
  13. Van Panhuis W, Cross A, Burke D, Choisy M. Project Tycho: Lao People’s Democratic Republic (dengue) dataset 1979–2010: Project Tycho [cited 2019 Sep 13].
  14. Van Panhuis W, Cross A, Burke D, Choisy M. Project Tycho: Vietnam (dengue) dataset: 1960–2010 [cited 2019 Sep 13].





DOI: 10.3201/eid2512.181193

Original Publication Date: October 31, 2019

1These senior authors contributed equally to this article.




Address for correspondence: Patrice Piola, Institut Pasteur du Cambodge, Epidemiology and Public Health Unit, 5 Blvd Monivong, BP 983, Phnom Penh, Cambodia



Page created: November 18, 2019
Page updated: November 18, 2019
Page reviewed: November 18, 2019
The conclusions, findings, and opinions expressed by authors contributing to this journal do not necessarily reflect the official position of the U.S. Department of Health and Human Services, the Public Health Service, the Centers for Disease Control and Prevention, or the authors' affiliated institutions. Use of trade names is for identification only and does not imply endorsement by any of the groups named above.