Volume 14, Number 11—November 2008

Metagenomic Diagnosis of Bacterial Infections

Shota Nakamura, Norihiro Maeda, Ionut Mihai Miron, Myonsun Yoh, Kaori Izutsu, Chidoh Kataoka, Takeshi Honda, Teruo Yasunaga, Takaaki Nakaya, Jun Kawai, Yoshihide Hayashizaki, Toshihiro Horii, and Tetsuya Iida
Author affiliations: Osaka University, Suita, Japan (S. Nakamura, I.M. Miron, M. Yoh, K. Izutsu, C. Kataoka, T. Honda, T. Yasunaga, T. Nakaya, T. Horii, T. Iida); RIKEN Yokohama Institute, Yokohama, Japan (N. Maeda, J. Kawai, Y. Hayashizaki)



To test the ability of high-throughput DNA sequencing to detect bacterial pathogens, we used it on DNA from a patient’s feces during and after diarrheal illness. Sequences showing best matches for Campylobacter jejuni were detected only in the illness sample. Various bacteria may be detectable with this metagenomic approach.

Infectious diseases are caused by various pathogens, including as-yet unidentified microorganisms. Because procedures for detecting and identifying pathogens vary according to the target microorganism, clinical examinations require a variety of media, reagents, and culture methods. In addition, conventional examination protocols usually require much labor, time, and skill, thus forming an obstacle to a prompt diagnosis.

Newly developed, “next-generation” DNA sequencers can determine >100 megabases of DNA sequences per run (1). These new technologies eliminate the bacterial cloning step used in traditional Sanger sequencing; instead, they amplify single isolated DNA molecules and analyze them with massively parallel processing. To develop a new system to promptly detect and identify various infectious pathogens, we tapped into the potential of these novel sequencers. We directly detected the causative pathogenic microbe in a clinical human sample (diarrheic feces) by means of unbiased high-throughput DNA sequencing.

The Study

A 34-year-old man had become ill after eating dinner out with his family. After 3 days, severe diarrhea, stomach ache, and shivering developed in the only 3 persons (the patient plus 2 family members) who had eaten undercooked chicken that night. Four days after onset of clinical signs, feces were collected from the patient and stored in a freezer at –80°C. At a clinical laboratory in Osaka, Japan, conventional culture methods were used to examine the sample for possible bacterial enteropathogens (2), and specific reverse transcriptase–PCR was used to test for norovirus (3); however, no candidate pathogens were detected.

We therefore analyzed this fecal sample for possible pathogens by means of high-throughput DNA sequencing. DNA was extracted from the diarrhea sample (hereafter referred to as the illness DNA sample) with a QIAamp DNA Stool Mini Kit (QIAGEN, Valencia, CA, USA). After the man had completely recovered 3 months later, another fecal sample was collected (hereafter referred to as the recovery DNA sample) and maintained at –80°C until DNA extraction. Both DNA samples were subjected to unbiased high-throughput DNA sequencing with a GS20 sequencer (454 Life Sciences, Branford, CT, USA) (4).


Figure. Comparison of the organisms from which the best matches for the sequences were derived from a BLASTN search with an expect-value cutoff of 10⁻⁵. A) DNA from nondiarrheic fecal sample collected 3 months after patient had recovered. B) DNA from diarrheic fecal sample collected while patient was ill.

Sequencing produced 96,941 effective sequences for the illness DNA sample and 106,327 for the recovery sample. The average length of the sequences was 102.1 bp. The DNA sequences obtained were searched with the BLASTN program against the National Center for Biotechnology Information nucleotide sequence database. The BLASTN output was then analyzed by using a classification system consisting of the Center’s taxonomy database and its searching system. This system, built with Perl and a MySQL database, facilitates the identification of scientific names and statistical analysis. The Figure shows the organisms from which the best-matching database sequences were derived for each queried sequence (expect [E]-value <10⁻⁵). For both DNA samples, ≈20% of the total sequences showed best matches to currently reported bacterial DNA sequences. The Table shows the frequency distributions of species from which close matches for the sequences were derived (E-value <10⁻⁴⁰). The most frequently detected bacterial species in both samples belonged to the phylum Bacteroidetes, part of the normal flora of the human intestine. No major differences were found in the frequency of the species between the illness and recovery DNA samples.
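
The best-match classification described above can be sketched in Python. This is only an illustration under stated assumptions, not the authors' Perl and MySQL system: it assumes the BLASTN hits have already been reduced to three tab-separated columns (read ID, organism name, E-value), which is a hypothetical simplified layout, and the function names and toy rows are invented for the example.

```python
from collections import Counter


def parse_tsv(lines):
    """Yield (read_id, organism, evalue) from tab-separated lines.

    The three-column layout is an assumption made for this sketch.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        read_id, organism, evalue = line.split("\t")
        yield read_id, organism, float(evalue)


def best_hits(rows, evalue_cutoff=1e-5):
    """Keep, for each read, the hit with the lowest E-value below the cutoff."""
    best = {}
    for read_id, organism, evalue in rows:
        if evalue >= evalue_cutoff:
            continue  # the text uses a strict "<" threshold
        if read_id not in best or evalue < best[read_id][1]:
            best[read_id] = (organism, evalue)
    return best


def count_by_organism(best):
    """Count reads per organism of their best match."""
    return Counter(organism for organism, _ in best.values())


# Toy rows, invented for illustration only (not data from the study).
toy = [
    "read1\tBacteroides vulgatus\t1e-30",
    "read1\tBacteroides fragilis\t1e-12",
    "read2\tCampylobacter jejuni\t1e-45",
    "read3\tEscherichia coli\t0.01",
]
counts = count_by_organism(best_hits(parse_tsv(toy)))
strict_counts = count_by_organism(best_hits(parse_tsv(toy), evalue_cutoff=1e-40))
```

With the default cutoff, read3 is dropped because its only hit is too weak; passing `evalue_cutoff=1e-40` mirrors the stricter threshold used for the Table and leaves only the strongest match.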

A striking difference between the 2 samples, however, was that 156 sequences of the illness DNA sample showed best matches for the sequences derived from Campylobacter jejuni, but no sequences of the recovery DNA sample showed any such significant matches. The C. jejuni sequences from the illness DNA sample included many housekeeping genes, such as the genes for the ribosomal RNAs and DNA polymerases (Appendix Table); thus, they strongly suggested the presence of C. jejuni in the illness fecal sample.
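
The comparison between the 2 samples amounts to asking which organisms have best matches in one sample and none in the other. A minimal Python sketch follows; the function name and the `min_reads` parameter are hypothetical, the C. jejuni counts come from the text, and the Bacteroides counts are placeholders chosen only to make the example run.

```python
def unique_to_first(first, second, min_reads=1):
    """Return organisms with >= min_reads best matches in `first` and none in `second`."""
    return {
        organism: n
        for organism, n in first.items()
        if n >= min_reads and second.get(organism, 0) == 0
    }


# C. jejuni counts are from the text; Bacteroides counts are placeholders.
illness = {"Campylobacter jejuni": 156, "Bacteroides vulgatus": 1000}
recovery = {"Campylobacter jejuni": 0, "Bacteroides vulgatus": 1100}

flagged = unique_to_first(illness, recovery)
```

A filter like this only nominates candidates; as in the study, a flagged organism would still need confirmation by an independent method such as specific PCR or culture.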

Because C. jejuni is a bacterium that causes acute gastroenteritis and is normally not present in the intestines of healthy persons (5,6), these results prompted us to reexamine the illness fecal sample for C. jejuni. For the illness sample but not the recovery DNA sample, Campylobacter-specific PCR (7) produced a typical banding pattern that is unique to C. jejuni (data not shown). The recovery rate of Campylobacter spp. from patient specimens substantially decreases when the specimens are frozen before isolation (8). To obtain higher recovery of Campylobacter spp. and thus validate the presence of C. jejuni in the illness sample, we performed cultures with enrichment and selective media again on the frozen illness fecal sample (5). C. jejuni–like bacteria with corkscrew motility grew on selective agar plates. Biochemical identification using the API Campy kit (API-bioMérieux, Marcy L’Etoile, France) demonstrated that the organism was C. jejuni, thus proving its presence in the illness fecal sample.

Conclusions


We directly detected a bacterial pathogen in a patient sample by using high-throughput DNA sequencing. This finding implies that, in principle, any kind of bacterial pathogen may be detectable with a common procedure. The method is directly applicable not only to fecal samples but also to other types of clinical samples; it could detect and identify bacterial pathogens that are usually difficult to ascertain with conventional examination procedures. Because this approach has major potential for detecting pathogens in various infectious diseases, it warrants further investigation.

The approach reported here also enabled us to directly analyze the ratio of pathogenic to commensal bacteria in the human intestine. Assessment of the relative population of intestinal bacteria would enable us to investigate the dynamics of bacterial pathogens in human intestines, in relation to associated intestinal microbial flora, during infectious disease processes.
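
As a rough worked example of such a ratio, the read counts reported for the illness sample can be turned into relative proportions. The sketch below is an assumption-laden approximation: it treats the "≈20%" bacterial share as exactly 0.20 and ignores that the counts may have been taken at different E-value cutoffs; the variable and function names are ours.

```python
def fraction(part, whole):
    """Return part / whole, rejecting a non-positive denominator."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole


illness_reads = 96_941   # effective sequences, illness sample (from the text)
cjejuni_reads = 156      # best matches to C. jejuni (from the text)
bacterial_share = 0.20   # "≈20%" in the text, treated as exact here (assumption)

# Share of all reads, and of the estimated bacterial reads, matching C. jejuni.
share_of_all = fraction(cjejuni_reads, illness_reads)
share_of_bacterial = fraction(cjejuni_reads, illness_reads * bacterial_share)

print(f"{share_of_all:.2%} of all reads, {share_of_bacterial:.2%} of bacterial reads")
```

Under these assumptions, C. jejuni accounts for roughly 0.16% of all reads and about 0.8% of the reads assigned to bacteria, which illustrates how small the pathogen signal can be relative to the commensal background.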

Many causative agents of emerging infectious diseases are of animal origin, and many are previously identified microbes (9,10). Because a vast amount of genome information about various microorganisms is continually being accumulated in databases, the approach we used will become increasingly useful. Recent metagenomic studies have identified unknown virus pathogens (11–13). Using the present approach to analyze various clinical cases, especially of outbreaks of infectious diseases with as-yet unidentified causative agents, may lead to the discovery of novel bacteria that are currently not known to be pathogenic to humans.

The current cost for high-throughput sequencing may limit the use of this method to specialized purposes, such as the hunt for novel pathogens for research or detection of bioterrorism (14). However, because the progress of DNA sequencing technology has been rapid (1), the cost, time, and labor for sequencing have been greatly reduced, and this trend will likely continue for the foreseeable future (15). Therefore, high-throughput DNA sequencing may soon be adopted as the main method for examining microorganisms in major clinical laboratories. The data presented here represent an example of this major innovation in the field of clinical examination for causative agents of infectious diseases.

Dr Nakamura is a researcher in the Section of Bioinformatics, Thailand-Japan Research Collaboration Center on Emerging and Reemerging Infections, Research Institute for Microbial Diseases, Osaka University. His research interests have included crystallographic analysis for biomacromolecules, which he currently applies to his work in bioinformatics.

Acknowledgments



We are grateful to Y. Nagai and Y. Okamoto for their help coordinating this study, to R. Dryselius and Y. Nishimune for their helpful suggestions, to M. Tagami and H. Sano for technical support, and to N.M.Q. Palacpac for valuable comments on the text.

This study was supported by the Program of Founding Research Centers for Emerging and Reemerging Infectious Diseases, by Grants-in-Aid for Scientific Research, and by a Research Grant for the RIKEN Genome Exploration Research Project (to Y. H.) from the Ministry of Education, Culture, Sports, Science and Technology of Japan.

This study was approved by the ethical review committees of the Research Institute for Microbial Diseases, Osaka University, and RIKEN. The sequencing data reported here are available in the Short Read Archive database at the National Center for Biotechnology Information under accession no. SRA001127.

References



  1. Service RF. Gene sequencing: the race for the $1000 genome. Science. 2006;311:1544–6.
  2. Saidi SM, Iijima Y, Sang WK, Mwangudza AK, Oundo JO, Taga K, et al. Epidemiological study on infectious diarrheal diseases in children in a coastal rural area of Kenya. Microbiol Immunol. 1997;41:773–8.
  3. Sakon N, Yamazaki K, Yoda T, Tsukamoto T, Kase T, Taniguchi K, et al. Norovirus storm in Osaka, Japan, last winter (2006/2007). Jpn J Infect Dis. 2007;60:409–10.
  4. Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, et al. Genome sequencing in microfabricated high-density picolitre reactors. Nature. 2005;437:376–80.
  5. Penner JL. The genus Campylobacter: a decade of progress. Clin Microbiol Rev. 1988;1:157–72.
  6. Young KT, Davis LM, DiRita VJ. Campylobacter jejuni: molecular biology and pathogenesis. Nat Rev Microbiol. 2007;5:665–79.
  7. Fermér C, Engvall EO. Specific PCR identification and differentiation of the thermophilic campylobacters, Campylobacter jejuni, C. coli, C. lari, and C. upsaliensis. J Clin Microbiol. 1999;37:3370–3.
  8. Altekruse SF, Stern NJ, Fields PI, Swerdlow DL. Campylobacter jejuni—an emerging foodborne pathogen. Emerg Infect Dis. 1999;5:28–35.
  9. Morens DM, Folkers GK, Fauci AS. The challenge of emerging and re-emerging infectious diseases. Nature. 2004;430:242–9.
  10. Jones KE, Patel NG, Levy MA, Storeygard A, Balk D, Gittleman JL, et al. Global trends in emerging infectious diseases. Nature. 2008;451:990–3.
  11. Cox-Foster DL, Conlan S, Holmes EC, Palacios G, Evans JD, Moran NA, et al. A metagenomic survey of microbes in honey bee colony collapse disorder. Science. 2007;318:283–7.
  12. Palacios G, Druce J, Du L, Tran T, Birch C, Briese T, et al. A new arenavirus in a cluster of fatal transplant-associated diseases. N Engl J Med. 2008;358:991–8.
  13. Finkbeiner SR, Allred AF, Tarr PI, Klein EJ, Kirkwood CD, Wang D. Metagenomic analysis of human diarrhea: viral detection and discovery. PLoS Pathog. 2008;4:e1000011.
  14. Lim DV, Simpson JM, Kearns EA, Kramer MF. Current and developing technologies for monitoring agents of bioterrorism and biowarfare. Clin Microbiol Rev. 2005;18:583–607.
  15. von Bubnoff A. Next-generation sequencing: the race is on. Cell. 2008;132:721–3.





DOI: 10.3201/eid1411.080589




Address for correspondence: Tetsuya Iida, International Research Center for Infectious Diseases, Research Institute for Microbial Diseases, Osaka University, 3-1 Yamadaoka, Suita, Osaka 565-0871, Japan


The conclusions, findings, and opinions expressed by authors contributing to this journal do not necessarily reflect the official position of the U.S. Department of Health and Human Services, the Public Health Service, the Centers for Disease Control and Prevention, or the authors' affiliated institutions. Use of trade names is for identification only and does not imply endorsement by any of the groups named above.