Zoonotic Source Attribution of Salmonella enterica Serotype Typhimurium Using Genomic Surveillance Data, United States
Shaokang Zhang, Shaoting Li, Weidong Gu, Henk den Bakker, Dave Boxrud, Angie Taylor, Chandler Roe, Elizabeth Driebe, David M. Engelthaler, Marc Allard, Eric Brown, Patrick McDermott, Shaohua Zhao, Beau B. Bruce, Eija Trees, Patricia I. Fields, and Xiangyu Deng
Author affiliations: University of Georgia Center for Food Safety, Griffin, Georgia, USA (S. Zhang, S. Li, H. den Bakker, X. Deng); Centers for Disease Control and Prevention, Atlanta, Georgia, USA (W. Gu, B.B. Bruce, E. Trees, P.I. Fields); Minnesota Department of Health, St. Paul, Minnesota, USA (D. Boxrud, A. Taylor); Translational Genomics Research Institute, Flagstaff, Arizona, USA (C. Roe, E. Driebe, D.M. Engelthaler); US Food and Drug Administration, College Park, Maryland, USA (M. Allard, E.W. Brown); US Food and Drug Administration, Laurel, Maryland, USA (P. McDermott, S. Zhao)
Figure 3. Source prediction by Random Forest classifier. A) Predicted source probabilities for zoonotic Salmonella enterica serotype Typhimurium isolates. Each vertical line in a panel is color coded in proportion to the predicted source probabilities: cyan, bovine; yellow, poultry; blue, swine; light green, wild bird. B) Comparison of SDIs of predicted probabilities between BPSW and non-BPSW isolates. For each isolate, SDI was calculated among the predicted probabilities of the 4 sources. Red horizontal lines indicate median SDI values; blue box tops and bottoms indicate interquartile ranges; whiskers indicate maximum and minimum SDI values. C) Receiver operating characteristic (ROC) curve for differentiating BPSW and non-BPSW isolates by using the SDI of predicted source probabilities. The AUC was 0.8, suggesting good binary classification. Red line indicates ROC curve; dotted line indicates the diagonal across the ROC space. D) Summary of source prediction results for 1,473 Salmonella Typhimurium isolates. Rectangles with solid and dashed lines represent precise (SDI <0.45) and imprecise (SDI >0.45) predictions, respectively. Dark gray rectangles, BPSW isolates; light gray rectangles, non-BPSW isolates. The number in each enclosed area is the number of isolates in that category. The sizes of enclosed and gray areas are proportional to the numbers of isolates they represent. The 70 precisely but incorrectly predicted BPSW isolates are shown with an outline. The 51 precisely predicted human isolates were attributed to zoonotic sources: cyan, bovine; yellow, poultry; blue, swine; light green, wild bird. The sizes of source-colored rectangles are proportional to the numbers of isolates in the predicted source classes. AUC, area under the ROC curve; BPSW, bovine, poultry, swine, or wild bird; SDI, Simpson diversity index.
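The per-isolate SDI calculation and the precise/imprecise split described in panels B and D can be illustrated with a minimal Python sketch. The caption does not state which form of the Simpson index was used, so the Gini-Simpson form (1 minus the sum of squared probabilities) is assumed here; the function names and example probabilities are hypothetical, and only the 0.45 cutoff comes from the caption.

```python
from typing import Dict, Tuple

# Cutoff quoted in the figure caption (precise: SDI < 0.45).
SDI_THRESHOLD = 0.45


def simpson_diversity(probs: Dict[str, float]) -> float:
    """Return the Simpson diversity index of a set of source probabilities.

    Assumption: the Gini-Simpson form, 1 - sum(p_i^2), is used. The caption
    does not confirm the exact formula. Probabilities are renormalized so
    that they sum to 1 before the index is computed.
    """
    total = sum(probs.values())
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    return 1.0 - sum((p / total) ** 2 for p in probs.values())


def summarize_prediction(probs: Dict[str, float]) -> Tuple[str, float, bool]:
    """Return (most probable source, SDI, is_precise) for one isolate.

    The caption leaves SDI == 0.45 undefined; this sketch treats it as
    imprecise.
    """
    sdi = simpson_diversity(probs)
    top_source = max(probs, key=probs.get)
    return top_source, sdi, sdi < SDI_THRESHOLD


# Hypothetical isolates: one dominated by a single source, one evenly split.
confident = {"bovine": 0.90, "poultry": 0.05, "swine": 0.03, "wild bird": 0.02}
uniform = {"bovine": 0.25, "poultry": 0.25, "swine": 0.25, "wild bird": 0.25}

for name, probs in (("confident", confident), ("uniform", uniform)):
    source, sdi, precise = summarize_prediction(probs)
    label = "precise" if precise else "imprecise"
    print(f"{name}: top source = {source}, SDI = {sdi:.4f} ({label})")
```

Under this assumed form, SDI for 4 sources ranges from 0 (all probability on one source) to 0.75 (probability split evenly), so a low SDI corresponds to a prediction concentrated on a single source.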
Page created: December 18, 2018
Page updated: December 18, 2018
Page reviewed: December 18, 2018