Disclaimer: Early release articles are not considered as final versions. Any changes will be reflected in the online version in the month the article is officially released.

Volume 30, Number 8—August 2024
Research

Standardized Phylogenetic Classification of Human Respiratory Syncytial Virus Below the Subgroup Level

Author affiliations: University of Washington, Seattle, Washington, USA (S. Goya); University of Cambridge, Cambridge, UK (C. Ruis); University of Basel and SIB, Basel, Switzerland (R.A. Neher, C. Roemer, L. Urbanska); National Institute for Public Health and the Environment, Bilthoven, the Netherlands (A. Meijer, L.D. Presser); World Health Organization Collaborating Centre for Reference and Research on Influenza, Melbourne, Victoria, Australia (A. Aziz); University of California Santa Cruz, Santa Cruz, California, USA (A.S. Hinrichs, J. McBroome); National Institute for Communicable Diseases of the National Health Laboratory Service, Johannesburg, South Africa (A. von Gottberg, J.N. Bhiman, J. Everatt, N. Wolter); University of Witwatersrand, Johannesburg, South Africa (A. von Gottberg, J.N. Bhiman, N. Wolter); University of KwaZulu-Natal, Durban, South Africa (D.G. Amoako); Universidad Nacional de La Plata, Buenos Aires, Argentina (D. Acuña, M. Viegas); National Scientific and Technical Research Council, Buenos Aires, Argentina (D. Acuña, M. Viegas); Theiagen Genomics, Highlands Ranch, Colorado, USA (J.R. Otieno); Autonomous University of San Luis Potosí, San Luis Potosí, Mexico (J.C. Muñoz-Escalante); Rega Institute for Medical Research, Leuven, Belgium (K. Ramaekers); University of Edinburgh, Edinburgh, Scotland, UK (K. Duggan); University of Pretoria, Pretoria, South Africa (M. Venter); University of Texas Medical Branch, Galveston, Texas, USA (T.C.T. Peret); Tehran University of Medical Sciences, Tehran, Iran (V. Salimi); ICMR National Institute of Virology, Pune, India (V. Potdar); National Institute of Health Doutor Ricardo Jorge, Lisbon, Portugal (V. Borges)

Abstract

A globally implemented unified phylogenetic classification for human respiratory syncytial virus (HRSV) below the subgroup level remains elusive. We formulated a global consensus on HRSV classification on the basis of the challenges and limitations of our previous proposals and the future of genomic surveillance. From a high-quality curated dataset of 1,480 HRSV-A and 1,385 HRSV-B genomes submitted to the National Center for Biotechnology Information Virus and GISAID (https://www.gisaid.org) public sequence databases through March 2023, we categorized HRSV-A/B sequences into lineages based on phylogenetic clades and amino acid markers. We defined 24 lineages within HRSV-A and 16 within HRSV-B and provided guidelines for defining prospective lineages. Our classification demonstrated robustness in its applicability to both complete and partial genomes. We envision that this unified HRSV classification proposal will strengthen HRSV molecular epidemiology on a global scale.

Human respiratory syncytial virus (HRSV) is a leading cause of acute lower respiratory tract infection in children, the elderly, and immunocompromised persons. In 2023, the US Food and Drug Administration and the European Medicines Agency approved the first HRSV vaccines (1,2). Simultaneously, a monoclonal antibody was approved for widespread use in infants, not limited to high-risk and premature children (3). The availability of HRSV immunization highlights the role of molecular epidemiology as a tool to monitor the efficacy of those interventions. Standards for HRSV nomenclature for sharing of viral isolates and sequences in databases have been published (4). Nevertheless, a standardized HRSV phylogenetic classification system has yet to be defined and implemented.

Figure 1

The structure and genome of human respiratory syncytial virus (HRSV). A) Schematic of the HRSV virion structure detailing the location of structural proteins. B) Schematic of the HRSV genome organization with the approximated location of genes highlighted; the exact location slightly differs between subgroups and strains. The location of the second hypervariable region in the G gene, used originally for molecular epidemiology classification, is detailed. Red arrow in panel B indicates location of the G gene 72-nt duplication in HRSV-A and 60-nt duplication in HRSV-B. Figure created with BioRender (https://www.biorender.com). ORF, open reading frame; NS, nonstructural protein; N, nucleocapsid; P, phosphoprotein; M, matrix protein; SH, small hydrophobic protein; G, attachment glycoprotein; F, fusion glycoprotein; M2, M2 protein; L, large polymerase protein

In 2022, HRSV was designated as the species Orthopneumovirus hominis within the Pneumoviridae family. Below species level are 2 antigenic groups, known as HRSV subgroup A (HRSV-A) and B (HRSV-B), that were previously referred to as subtypes (4–6). Within each subgroup, genotypes were initially defined based on statistically supported phylogenetic clades inferred with the second hypervariable region of the G gene (Figure 1, panels A, B) (7). The G gene, encoding the attachment glycoprotein, exhibits the highest genetic and antigenic variability. Of note, the gene has undergone a duplication of a 72-nt fragment in HRSV-A and a 60-nt fragment in HRSV-B (Figure 1, panel B) (8,9).

To identify emerging genotypes, researchers have used genetic distances between phylogenetic clades and distinctive genetic features, accompanied by variable nomenclature based on the gene (GA1–GA7 in HRSV-A and GB1–GB4 in HRSV-B), country and subgroup (SAB1–SAB4 for South African genotypes in HRSV-B), or city and province (NA1–NA2 [Niigata] and ON1 [Ontario] in HRSV-A, BA1–BA9 [Buenos Aires] in HRSV-B) (7–16). Since 2020, alternative phylogenetic reclassifications have been proposed; Goya et al. established a hierarchical classification system for HRSV phylogenies, comprising genotypes, subgenotypes, and lineages, using the G gene (17). That framework enabled laboratories without capacity for whole-genome sequencing to conduct molecular epidemiology studies. Independently, Ramaekers et al. (18) proposed reclassifications into lineages and Chen et al. (19) into genotypes using complete HRSV genomes. Those approaches support comprehensive monitoring of viral evolution across all genes, including the F gene encoding the fusion protein, a crucial target for monoclonal antibodies and the foundation of approved HRSV vaccines (Figure 1, panel A). Of note, challenges in HRSV molecular epidemiology persisted within the reclassification-defined categories because of reliance on genetic or patristic distances between tree tips or nodes.

The milestones achieved in HRSV interventions have renewed interest in addressing the challenge of classifying HRSV below the subgroup level. Those advances prompted establishment of the HRSV Genotyping Consensus Consortium (RGCC), formed by HRSV and virus evolution experts aiming to provide standardized criteria for harmonizing global HRSV molecular surveillance efforts. We present a novel framework for HRSV classification below the subgroup level, based on current knowledge of HRSV diversity and evolution, focused on practical implementation for molecular epidemiology.

Methods

HRSV Sequences Dataset

We downloaded HRSV complete genomes from the National Center for Biotechnology Information Virus (https://www.ncbi.nlm.nih.gov/labs/virus) and GISAID EpiRSV (https://www.gisaid.org) databases through March 11, 2023, filtering for sequences >14,000 nt in length that were obtained from human hosts and included the year and country of sample collection (Appendix 1 Figure 1). We reserved sequences containing nucleotide ambiguities, indicating inadequate sequencing depth, for epidemiologic analysis but excluded them from formal lineage definition (Appendix 1).

We aligned sequences with MAFFT version 7.490, and inspected and corrected alignment artifacts, mainly in the G gene, with AliView version 1.28 (https://ormbunkar.se/aliview) (20,21). We trimmed alignment ends to encompass complete genomes from the first codon of the first gene (NS1) to the last codon of the last gene (L). We accepted partial genomes if the missing sequence was within 50 nt of the genome ends. We used RSVsurver (https://rsvsurver.bii.a-star.edu.sg) to identify and remove genomes with nucleotide insertions or deletions causing a frameshift in any open reading frame. After trimming the alignment, we detected identical sequences and removed the redundancy using BBMap (https://jgi.doe.gov/data-and-tools/software-tools/bbtools), resulting in the final set of 1,538 HRSV-A and 1,387 HRSV-B genomes (Appendix 1 Figure 1).
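The curation criteria above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study (which relied on MAFFT, AliView, RSVsurver, and BBMap); the Record class, its field names, and the helper functions are hypothetical, and the sketch covers only the length, host, metadata, ambiguity, and redundancy rules stated in the text.

```python
from __future__ import annotations

from dataclasses import dataclass

UNAMBIGUOUS = set("ACGT")
MIN_LENGTH = 14_000  # length filter stated in the text (>14,000 nt)


@dataclass(frozen=True)
class Record:
    """Hypothetical container for one downloaded genome and its metadata."""
    name: str
    sequence: str
    host: str
    year: int | None
    country: str | None


def passes_download_filter(rec: Record) -> bool:
    """Length >14,000 nt, human host, known collection year and country."""
    return (
        len(rec.sequence) > MIN_LENGTH
        and rec.host.lower() == "human"
        and rec.year is not None
        and bool(rec.country)
    )


def has_ambiguities(rec: Record) -> bool:
    """True if any base is outside A, C, G, T (unaligned sequence assumed)."""
    return any(base not in UNAMBIGUOUS for base in rec.sequence.upper())


def deduplicate(records: list[Record]) -> list[Record]:
    """Keep the first record seen for each distinct sequence string."""
    seen: set[str] = set()
    unique = []
    for rec in records:
        key = rec.sequence.upper()
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def curate(records: list[Record]) -> tuple[list[Record], list[Record]]:
    """Return (genomes for lineage definition, genomes kept for epidemiology only)."""
    kept = [r for r in records if passes_download_filter(r)]
    ambiguous = [r for r in kept if has_ambiguities(r)]
    clean = deduplicate([r for r in kept if not has_ambiguities(r)])
    return clean, ambiguous
```

Separating the ambiguous genomes instead of discarding them mirrors the text: they are excluded from lineage definition but remain available for epidemiologic analysis.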

Phylogenetic Analysis

We constructed maximum-likelihood phylogenetic trees with IQ-TREE version 2.2.0 (http://www.iqtree.org) (Appendix 1). We considered monophyletic clades statistically supported when the SH-aLRT value was >80% and the UFBoot2 value was >90% (22,23) (Appendix 1). We assessed temporal signal with TempEst version 1.5.3 (http://tree.bio.ed.ac.uk/software/tempest), and we inferred molecular-clock phylogenies with TreeTime (https://github.com/neherlab/treetime) (24).
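As a small illustration of the support criterion, the sketch below checks a single internal-node label against both thresholds. It assumes the label has the form "SH-aLRT/UFBoot" (for example, "85.3/97"), which is how IQ-TREE annotates branches when both tests are requested; the function names are hypothetical, and treating both thresholds as strict inequalities follows the wording in the text.

```python
from __future__ import annotations

SH_ALRT_MIN = 80.0  # threshold from the text (>80%)
UFBOOT_MIN = 90.0   # threshold from the text (>90%)


def parse_support(label: str) -> tuple[float, float] | None:
    """Split an 'SH-aLRT/UFBoot' node label into two floats, or return None."""
    parts = label.strip().split("/")
    if len(parts) != 2:
        return None
    try:
        return float(parts[0]), float(parts[1])
    except ValueError:
        return None


def is_supported(label: str) -> bool:
    """True only when both support values exceed their thresholds."""
    support = parse_support(label)
    if support is None:
        return False
    sh_alrt, ufboot = support
    return sh_alrt > SH_ALRT_MIN and ufboot > UFBOOT_MIN
```

Requiring both values guards against clades that score well on one test only; nodes with missing or malformed labels are treated as unsupported.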

We performed ancestral sequence reconstruction using the Augur bioinformatic toolkit version 23.1.0 (https://docs.nextstrain.org/projects/augur/en/23.1.0) (25). We assessed recombination events using the alignment-based method RDP4 (http://web.cbio.uct.ac.za/~darren/rdp.html) and the phylogeny-based method TreeKnit (https://pierrebarrat.github.io/TreeKnit.jl) (Appendix 1). We inferred the amino acid substitutions linked to the clades in the tree using Augur and automated the initial screening of lineages with Autolin (26). We manually curated the amino acid comparisons among monophyletic clusters to resolve conflicts arising from internal (nested) lineages and to confirm the presence of the lineage-defining amino acids in >90% of the clade’s sequences. Results are available at https://github.com/rsv-lineages/Classification_proposal.

Results

Baseline Agreements on the HRSV Classification Definition

Our proposed classification establishes HRSV lineages for viruses below subgroup level. Studies have shown that HRSV phylogenetic trees constructed with complete genomes exhibit superior resolution (17–19). Therefore, we defined a classification system based on maximum-likelihood phylogenetic trees inferred from complete HRSV genomes. The maximum-likelihood algorithm formulates hypotheses about the evolutionary relationships among sequences; the implementation within IQ-TREE handles large datasets, which makes it particularly well suited to infer HRSV genomic phylogeny including sequences collected >50 years ago (22). We defined complete HRSV genomes as the nucleotide sequences spanning from the first codon of the first gene (NS1) to the last codon of the last gene (L). We considered genomes almost complete if the sequence information gaps were within a 50-nt window at the genome ends. To define lineages, we only used genomes without nucleotide ambiguities (in accordance with the IUPAC code for nucleotide degeneracy).

Genomic Dataset Used for Lineages Definition

Figure 2

The global HRSV genomics surveillance landscape. HRSV genomes from National Center for Biotechnology Information Virus and GISAID (https://www.gisaid.org) databases through March 11, 2023, that met inclusion criteria used for classification are shown by year of sample collection and subgroup (A) and by country of origin (B). HRSV, human respiratory syncytial virus.

Applying the established baseline agreements, we gathered 1,538 HRSV-A and 1,387 HRSV-B high-quality genomes from public databases. The dataset revealed limited global HRSV genomic surveillance; <20 genomes were deposited annually through 2007 (Figure 2, panel A; Appendix 1 Figure 2). Since 2008, the number of genomes and the representation of countries have improved; a surge occurred after 2021, probably driven by the expansion of viral genomics since the SARS-CoV-2 pandemic and the approval of HRSV prophylactic treatments (Figure 2, panel A; Appendix 1 Figure 2). Considering delays in genome deposition in public databases, the number of genomes collected in 2022 may be higher than the number used in this study. Regarding geographic representation, 9 countries (Australia, United Kingdom, New Zealand, United States, Argentina, Kenya, Morocco, Netherlands, and Brazil) submitted >100 genomes; only the United Kingdom achieved uninterrupted surveillance since 2008, but Australia deposited the most genomes globally (Figure 2, panel B).

Accurate Root Placement in HRSV Phylogenetic Trees

We reconstructed maximum-likelihood phylogenetic trees for the HRSV-A and HRSV-B datasets. We used 2 approaches to root the trees: the use of an outgroup, a conventional method for inferring the tree root using sequences known to be evolutionarily distant; and phylodynamic analysis, integrating temporal and phylogenetic patterns in virus evolution (Appendix 1). Both approaches consistently identified the same root for each subgroup cluster (Appendix 1 Figure 3). Phylodynamic analysis also identified 58 outlier sequences for HRSV-A and 2 for HRSV-B that were excluded from lineage designation. The final dataset considered for lineage designation comprised 1,480 HRSV-A and 1,385 HRSV-B genomes (Appendix 2 Table).

HRSV Lineage Definition

We defined HRSV lineage as a statistically supported monophyletic cluster comprising >10 sequences and characterized by >5 aa substitutions, compared to the parental lineage. The lineage-defining amino acids, present in >90% of the sequences within the clade, may be found in any of the viral proteins.
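The three criteria in this definition can be expressed as a short check. The following Python sketch is only an illustration under stated assumptions: each sequence in a candidate clade is represented as a set of substitution labels relative to the parental lineage (the "protein:substitution" label format and the function names are hypothetical), statistical support is passed in as a precomputed flag, and the thresholds are treated as strict inequalities because that is how they are written in the text.

```python
from collections import Counter


def defining_substitutions(clade, min_prevalence=0.90):
    """Substitutions (relative to the parental lineage) carried by more than
    min_prevalence of the sequences in the clade."""
    counts = Counter(sub for subs in clade for sub in set(subs))
    n = len(clade)
    return sorted(s for s, c in counts.items() if c / n > min_prevalence)


def qualifies_as_lineage(clade, supported, min_sequences=10, min_substitutions=5):
    """Apply the three criteria: statistical support, clade size, and number
    of lineage-defining amino acid substitutions in any viral protein."""
    if not supported or len(clade) <= min_sequences:
        return False
    return len(defining_substitutions(clade)) > min_substitutions
```

Because the prevalence rule tolerates a few sequences lacking a given substitution, occasional reversions at hotspot sites do not by themselves disqualify a clade.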

Phylogenetic classifications vary among viral species; each aims to define clusters reflecting the heterogeneity of the viral population, considering each virus’s unique evolutionary characteristics and using arbitrary thresholds for long-term applicability (27–29). Inherent bias exists in any classification system because of the availability and spatiotemporal representation of sequences. Therefore, to enable early detection of novel lineages, our HRSV lineage definition did not include criteria requiring sequences from different outbreaks or countries. However, we propose establishing a threshold of >10 genomes for defining a lineage to monitor HRSV strains circulating within communities.

We observed that the presence of distinctive signature amino acids shared by the sequences of a phylogenetic clade, in comparison with the parental lineage, is a simple method to identify a new lineage. Distance-based methods (i.e., average nucleotide genetic distances, average patristic distances, or patristic distances between nodes) need phylogenies with complete datasets to define new categories and become complex with rapid increases in available sequences (16–19). In our proposal, we initially screened different amino acid thresholds in an automated manner, ranging from 1 to 10 lineage-defining amino acids (Appendix 1). The number of small lineages decreased as the number of lineage-defining amino acids increased, and 5 amino acids resulted in an intermediate complexity of lineages defined for both HRSV subgroups. Furthermore, we proposed that the lineage-defining amino acids should be conserved in >90% of the genomes within a clade, considering the potential reversion in some of the genomes within highly mutated hotspot sites. We acknowledge that other thresholds for the number of genomes or amino acids could be useful, but we emphasize that the key to establishing a global consensus is clear operational guidelines and a robust classification, 2 aspects that our proposal fulfills.

HRSV Lineage Nomenclature

Figure 3

Human respiratory syncytial virus A lineage classification and seasonality. A) HRSV-A maximum-likelihood phylogenetic tree (1,480 sequences), colored by lineage classification. Black star indicates A.D lineage, defined by the 72-nt duplication in the G gene. Scale bar indicates substitutions per site. B) Simplified scheme of the lineage designation to highlight the presence of nested lineages. The amino acid changes in the F glycoprotein are listed next to lineage name and colored according to their location in the fusion protein.

Figure 4

Human respiratory syncytial virus B lineage classification and seasonality. A) HRSV-B maximum-likelihood phylogenetic tree (1,385 sequences), colored according to lineage classification. Black star indicates B.D lineage, defined by the 60-nt duplication in the G gene. Scale bar indicates substitutions per site. B) Simplified scheme of the lineage designation to highlight the presence of nested lineages. The amino acid changes in the F glycoprotein are listed next to lineage name and colored according to their location in the fusion protein.

We defined the lineage nomenclature by integrating the HRSV subgroup letter and ascending ordinal numbers, separated by dots to represent nested lineages (Figure 3, panels A, B; Figure 4, panels A, B). Furthermore, we assigned a distinct nomenclature to the 72-nt (24-aa) G-gene duplication within HRSV-A and the 60-nt (20-aa) G-gene duplication within HRSV-B. Those genetic events are epidemiologically relevant because only viruses with the G-gene duplication have been detected since 2017 (30–33). To track those viruses, we used the alias D, specifically A.D (historically, ON1 genotype) for HRSV-A and B.D (historically, BA genotype) for HRSV-B, with nested lineages designated by increasing ordinal numbers. In summary, letters A and B indicate the HRSV subgroup at the beginning of the lineage name, C is unused, and D serves as an alias for the 72-nt and 60-nt duplications within the G gene. In addition, lineage names are limited to 3 numerical levels of nested lineages; deeper levels are replaced by aliases starting from E, preventing indefinite accumulation of numbers. For example, B.D.4.1.1 lineage has descendant lineages named B.D.E.1–B.D.E.4 instead of B.D.4.1.1.1–B.D.4.1.1.4, where E represents 4.1.1 (Figure 4, panels A, B). The nomenclature is based on the tree topology, reflecting the order of the nodes from the root to the tips, but it is unrelated to the sequence collection date or date of the most recent common ancestor of the lineage.
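The alias rule can be made concrete with a short Python sketch that expands and compresses lineage names. The alias table contains only the single alias given in the text (E standing for 4.1.1 under B.D); the function names and the table layout are hypothetical, and aliases nested inside other aliases would need repeated expansion, which this sketch does not attempt.

```python
# Only alias stated in the text: B.D.E stands for B.D.4.1.1.
ALIASES = {"B.D.E": "B.D.4.1.1"}


def expand(name, aliases=ALIASES):
    """Replace a leading alias with the full numeric path it stands for."""
    for short, full in aliases.items():
        if name == short or name.startswith(short + "."):
            return full + name[len(short):]
    return name


def compress(name, aliases=ALIASES):
    """Shorten names that go deeper than an aliased lineage. The aliased
    lineage itself (B.D.4.1.1) keeps its numeric name, as in the text."""
    for short, full in aliases.items():
        if name.startswith(full + "."):
            return short + name[len(full):]
    return name


def parent(name, aliases=ALIASES):
    """Name of the lineage one level up, or None for a bare subgroup letter."""
    tokens = expand(name, aliases).split(".")
    if len(tokens) <= 1:
        return None
    return compress(".".join(tokens[:-1]), aliases)


def is_descendant(name, ancestor, aliases=ALIASES):
    """True if name is nested (at any depth) within ancestor."""
    return expand(name, aliases).startswith(expand(ancestor, aliases) + ".")
```

For example, expand("B.D.E.3") gives "B.D.4.1.1.3" and parent("B.D.E.1") gives "B.D.4.1.1", so tree-based relationships can still be recovered from the shortened names.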

To remain functional, a nomenclature system requires periodic updates as new lineages emerge. Therefore, we have established 2 open repositories on GitHub containing definitions of each lineage, signature mutations, and representative sequences. The repositories are available at https://github.com/rsv-lineages/lineage-designation-A and https://github.com/rsv-lineages/lineage-designation-B; they are intended to provide up-to-date definitions and serve as a platform for discussion and designation of novel lineages.

Lineages within the HRSV-A and HRSV-B Rooted Trees

We reconstructed ancestral sequences at the root of the phylogenetic trees. Although the sequences are not biologically real, they served as surrogate parental lineages during initial classification. Identifying monophyletic clusters with >10 sequences and >5 aa changes compared with the reconstructed root sequence, we defined 3 HRSV-A lineages (A.1–A.3) and 4 HRSV-B lineages (B.1–B.4). We were unable to classify 2 sequences, EPI-ISL-15771600_USA_1956 (GISAID) and MG642074_USA_1980 (GenBank), perhaps because they belong to underrepresented extinct lineages.

We further analyzed the first lineages in an iterative manner to identify nested lineages; as a result, we identified a total of 24 lineages within HRSV-A, and 16 within HRSV-B (Figure 3; Figure 4). Close to the root of the HRSV-B tree, extinct lineages were underrepresented, comprising <10 sequences but featuring >5 distinct amino acids (B.1, B.3, B.4). Despite the low number of sequences, we included them as lineages to trace evolutionary branches that gave rise to currently circulating lineages. In addition, A.D.2 is slightly below the sequence threshold; nonetheless, we kept the lineage category to emphasize the common ancestor among A.D.2.1 and A.D.2.2.

We scrutinized the presence and absence of the duplication in the G gene across each tree. Although patterns were mostly as expected for a single historical duplication event, some genomes within the clade carrying the duplication in G lacked it. Those sequences were dispersed across the phylogenetic tree rather than forming the monophyletic cluster we would expect from a true loss, which suggests the virus did not lose the nucleotide duplication (Appendix 1 Figure 4). Instead, reads from certain short-read next-generation sequencing technologies are similar in length to the duplicated region, which potentially masked the duplication when consensus genomes were assembled against reference sequences that do not possess it. Therefore, we recommend using such data with quality-filtered reads of a length >150 nt to avoid this problem.

Lineage-defining amino acids were present in all HRSV proteins and were primarily identified within the G protein (Tables 1, 2). Also, the lineage-defining amino acids in the L polymerase protein were noteworthy, contributing to the distinction of 21 of 24 HRSV-A lineages and 15 of 16 HRSV-B lineages (Tables 1, 2). Of interest, the F protein contributed to defining 14 lineages in HRSV-A and 13 in HRSV-B (Figure 3, panel B; Figure 4, panel B). The G and F surface glycoproteins are likely under selection pressure from antibody-mediated immunity and exhibit a robust phylogenetic signal (18,31). Whereas the G protein displays substantial nucleotide and amino acid sequence plasticity, the F protein experiences strong negative selection, likely attributed to functional or structural constraints (34). For instance, the fusion peptide is the only region in F without lineage-defining amino acids (Figure 3, panel B; Figure 4, panel B). Although the low diversity of the F protein is promising for HRSV interventions, monitoring the F protein during global implementation is essential to estimate the antigenic impact of amino acid substitutions.

Using G and F Sequences with the HRSV Lineage Classification System

The main challenge for global expansion of HRSV genomics is the absence of a cost-effective, globally standardized and validated methodology for sequencing, in contrast to SARS-CoV-2 or influenza virus (35,36). In addition, limited funding and infrastructure cause some laboratories to prefer sequencing the G gene only (37–39). Although we highly recommend using complete genomes for HRSV lineage assignment to ensure the maximum accuracy of the classification and monitor the amino acid changes in all viral proteins, partial genomes covering the G and F genes can be used because overall they reproduce the topology of the HRSV tree (17,18). We do not recommend the use of smaller G gene regions such as the second hypervariable region (250-nt length at the 3′ gene end) (Figure 1) that was used historically for molecular epidemiology because previous reports showed a decreased phylogenetic signal (17). The use of G, F, or both genes for lineage classification should rely on phylogenetic associations with reference sequences. Of note, using only G and F genes is inadequate for defining novel lineages because of the inability to detect lineage-defining amino acids across all viral proteins. Our analysis showed minimal misclassification (1.2%) in HRSV-A and none in HRSV-B when using only the G gene (Appendix 1 Figure 5). However, the G ectodomain alone resulted in an 18.86% misclassification rate for HRSV-A and none for HRSV-B. The F gene alone had misclassification rates of 38.18% for HRSV-A and 1.23% for HRSV-B because of polytomies affecting lineage assignments within A.D.1 and A.D.5. Combining G and F gene fragments reduced misclassification to 0.07% for HRSV-A and none for HRSV-B, indicating that this approach provides optimal resolution for both subgroups (Appendix 1 Figure 5).
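A misclassification percentage of this kind can be computed by comparing, sequence by sequence, the lineage assigned from a partial region with the lineage assigned from the complete genome. The Python sketch below assumes that definition of misclassification; the exact procedure used in the study is described in Appendix 1 and is not reproduced here, and the function name and the sequence identifiers in the usage note are hypothetical.

```python
def misclassification_rate(reference, partial):
    """Percentage of shared sequences whose partial-region lineage differs
    from the complete-genome lineage.

    Both arguments map a sequence identifier to a lineage name."""
    shared = reference.keys() & partial.keys()
    if not shared:
        raise ValueError("no sequences in common between the two assignments")
    wrong = sum(1 for seq_id in shared if reference[seq_id] != partial[seq_id])
    return 100.0 * wrong / len(shared)
```

For instance, if 1 of 4 sequences is assigned to A.D.1 from the F gene but to A.D.1.5 from the complete genome, the function returns 25.0.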

Prospective HRSV Lineage Assignment and Definition

Assigning sequences to the existing lineages can be automated using online tools such as NextClade (https://clades.nextstrain.org) (40), ReSVidex (https://cacciabue.shinyapps.io/resvidex_wg), INSaFLU (https://insaflu.insa.pt) (41), or UShER (https://usher.bio) (42). However, to define a novel lineage, we encourage users to follow our guidelines (Appendix 1), available on GitHub (https://github.com/orgs/rsv-lineages/repositories). We anticipate new lineages of HRSV-A/B will continue to emerge, and we envision updating our proposed nomenclature to incorporate new lineages. We encourage reporting of new HRSV lineages at the RGCC GitHub page as an issue within the corresponding repository for HRSV-A/B. The RGCC study group will evaluate the newly proposed lineage and update reference alignments if confirmed.

Importantly, assigning the lineage of a query sequence does not require the use of complete genomes or the absence of nucleotide ambiguities; rather, it requires a supported association within a phylogenetic clade. However, defining a new lineage requires the use of complete genomes without ambiguities, because amino acid characterization of all viral proteins is essential.

Molecular Epidemiology of HRSV with Proposed Classification

Figure 5

Temporal distribution of HRSV-A and HRSV-B lineages. A total of 2,744 HRSV-A genomes and 2,443 HRSV-B genomes available in public databases through March 2023 were included. HRSV, human respiratory syncytial virus.

We described the HRSV molecular epidemiology including all available genomes, even those previously discarded during the dataset curation. We analyzed the seasonality of lineages using a dataset comprising 2,277 HRSV-A and 2,058 HRSV-B genomes, revealing notable co-circulation and lineage replacement over time (Figure 5). In HRSV-A, A.1 and A.2 lineages are extinct: the last detected sequences of A.1 were collected in 1995 and of A.2 in 2015. Since 2011, A.D and nested lineages have continued to circulate; A.D.2.2 and A.D.4 were detected in 2013, indicating rapid divergence of the HRSV-A viruses with the 72-nt duplication in the G gene. In HRSV-B, lineages B.1, B.2, B.3, and B.4 exhibited strong lineage replacement (Figure 5). Although the lineage with a 60-nt duplication in the G gene (B.D lineage) was detected in 1999, complete genomes became available in 2005 (8). By 2009, only B.D and nested lineages were detected, and since 2017, only B.D.4 and nested lineages have been observed.

HRSV lineages may have been underrepresented before the COVID-19 pandemic because of limited genomic surveillance. However, our classification system allows for updates if prepandemic genomes meeting lineage criteria are shared. Some lineages, such as A.D.3.1, A.D.5.2, and A.D.5.3 in HRSV-A and B.D.E.1 and B.D.E.3 in HRSV-B, appear to be exclusive to the postpandemic period, although most of their lineage-defining amino acids were present in parental prepandemic strains. For instance, A.D.5.2 was recognized as a distinct lineage with the emergence of the C26Y substitution in M2–2, whereas other signature amino acids were present in a 2019 parental lineage genome (GenBank accession no. MZ515825). Detection of postpandemic lineages does not contradict studies reporting no new postpandemic genotypes because those studies relied on earlier classification systems (43–46). The possibility that these new lineages circulated before the pandemic depends on the deposition of genomes.

Some of the lineages were detected in specific countries (Appendix 1 Figure 6). For example, A.D.1 descendant lineages, A.D.5.3, and most B.D.E.4 cases were identified in Australia or New Zealand. Contemporary lineages, such as B.D.4.1.1 and its descendants B.D.E.1 and B.D.E.3, predominantly consisted of sequences from the United Kingdom. Global genomic surveillance bias presents a major confounding factor in lineage geodetection; for instance, most of the earliest lineages were detected in the United States, the principal contributor of HRSV genomes until 2007 (Appendix 1 Figures 2, 6).

Discussion

Consensus classification of HRSV below the subgroup level has been a challenge for multiple decades. Collaboratively, the HRSV molecular evolution research community, along with experts in the evolution of other respiratory viruses, has worked toward establishing a unified global classification system through the HRSV Genotyping Consensus Consortium (RGCC) initiative. Our proposal categorizes HRSV-A/B sequences into lineages based on phylogenetic associations and amino acid markers, relying on complete genomes. Partial or low-quality genomes can be assigned to the existing lineages, emphasizing the robustness of this system. We developed standard guidelines for lineage definition and assignment and created online resources for updates, ensuring long-term utility. Defining a viral category below species through a phylogenetic-based classification is challenging; the system must exhibit reproducibility, balance complexity, and be updatable to capture the level of heterogeneity useful for viral surveillance. Our proposal addresses those requirements comprehensively.

HRSV is not an emerging virus; it generates annual outbreaks with co-circulation and replacement in the prevalence of its antigenic subgroups. Although some HRSV genomes were collected from clinical samples >50 years ago, the largest increase in the number of genomes has occurred since 2021. A limitation of our definition is the uncertainty of the antigenic effect of individual amino acid substitutions on lineages. Hence, whole-genome surveillance and the study of lineage-phenotype associations are both essential, as observed in genetic and antigenic characterization in influenza to estimate the effectiveness of immunization (47). In 2023, recombinant F protein vaccines were approved; as their implementation progresses, we will learn how the vaccines affect viral evolution. We expect our unification proposal for the phylogenetic classification of HRSV to support spatiotemporal comparative lineage surveillance and detection of emerging lineages. In addition, we anticipate studies of association between lineages and the severity of HRSV disease, as well as associations of particular lineages with patients’ demographic characteristics.

Dr. Goya is a postdoctoral researcher in the Department of Laboratory Medicine and Pathology at the University of Washington. Her work focuses on respiratory virus evolution and interactions with the immune system.

Acknowledgments

We acknowledge the authors who have shared HRSV genomes on the public databases National Center for Biotechnology Information Virus, GenBank, European Nucleotide Archive, DDBJ, and GISAID EpiRSV. We thank the researchers and public health scientists who provided valuable comments during the initial stages of the RSV Genotyping Consensus Consortium’s work.

Authors between the second and last were listed alphabetically by name.

R.A.N. consults for Moderna on matters of virus evolution. N.W. has received grant funding from the Bill and Melinda Gates Foundation and Sanofi. The authors received no financial support for the research, authorship, or publication of this article.

This article was preprinted at https://www.medrxiv.org/content/10.1101/2024.02.13.24302237v1.

References

  1. European Medicines Agency. Arexvy. 2023 [cited 2024 Jun 28]. https://www.ema.europa.eu/en/medicines/human/EPAR/arexvy
  2. US Food and Drug Administration. Abrysvo. 2023 [cited 2024 Jun 28]. https://www.fda.gov/vaccines-blood-biologics/abrysvo
  3. US Food and Drug Administration. Nirsevimab. 2023 [cited 2023 Nov 12]. https://www.fda.gov/news-events/press-announcements/fda-approves-new-drug-prevent-rsv-babies-and-toddlers
  4. Salimi V, Viegas M, Trento A, Agoti CN, Anderson LJ, Avadhanula V, et al. Proposal for human respiratory syncytial virus nomenclature below the species level. Emerg Infect Dis. 2021;27:1–9.
  5. Anderson LJ, Hierholzer JC, Tsou C, Hendry RM, Fernie BF, Stone Y, et al. Antigenic characterization of respiratory syncytial virus strains with monoclonal antibodies. J Infect Dis. 1985;151:626–33.
  6. Tian D, Battles MB, Moin SM, Chen M, Modjarrad K, Kumar A, et al. Structural basis of respiratory syncytial virus subtype-dependent neutralization by an antibody targeting the fusion glycoprotein. Nat Commun. 2017;8:1877.
  7. Peret TCT, Hall CB, Schnabel KC, Golub JA, Anderson LJ. Circulation patterns of genetically distinct group A and B strains of human respiratory syncytial virus in a community. J Gen Virol. 1998;79:2221–9.
  8. Trento A, Galiano M, Videla C, Carballal G, García-Barreno B, Melero JA, et al. Major changes in the G protein of human respiratory syncytial virus isolates introduced by a duplication of 60 nucleotides. J Gen Virol. 2003;84:3115–20.
  9. Eshaghi A, Duvvuri VR, Lai R, Nadarajah JT, Li A, Patel SN, et al. Genetic variability of human respiratory syncytial virus A strains circulating in Ontario: a novel genotype with a 72 nucleotide G gene duplication. PLoS One. 2012;7:e32807.
  10. Venter M, Madhi SA, Tiemessen CT, Schoub BD. Genetic diversity and molecular epidemiology of respiratory syncytial virus over four consecutive seasons in South Africa: identification of new subgroup A and B genotypes. J Gen Virol. 2001;82:2117–24.
  11. Cui G, Zhu R, Qian Y, Deng J, Zhao L, Sun Y, et al. Genetic variation in attachment glycoprotein genes of human respiratory syncytial virus subgroups A and B in children in recent five consecutive years. PLoS One. 2013;8:e75020.
  12. Hirano E, Kobayashi M, Tsukagoshi H, Yoshida LM, Kuroda M, Noda M, et al. Molecular evolution of human respiratory syncytial virus attachment glycoprotein (G) gene of new genotype ON1 and ancestor NA1. Infect Genet Evol. 2014;28:183–91.
  13. Blanc A, Delfraro A, Frabasile S, Arbiza J. Genotypes of respiratory syncytial virus group B identified in Uruguay. Arch Virol. 2005;150:603–9.
  14. Dapat IC, Shobugawa Y, Sano Y, Saito R, Sasaki A, Suzuki Y, et al. New genotypes within respiratory syncytial virus group B genotype BA in Niigata, Japan. J Clin Microbiol. 2010;48:3423–7.
  15. Shobugawa Y, Saito R, Sano Y, Zaraket H, Suzuki Y, Kumaki A, et al. Emerging genotypes of human respiratory syncytial virus subgroup A among patients in Japan. J Clin Microbiol. 2009;47:2475–82.
  16. Muñoz-Escalante JC, Comas-García A, Bernal-Silva S, Robles-Espinoza CD, Gómez-Leal G, Noyola DE. Respiratory syncytial virus A genotype classification based on systematic intergenotypic and intragenotypic sequence analysis. Sci Rep. 2019;9:20097.
  17. Goya S, Galiano M, Nauwelaers I, Trento A, Openshaw PJ, Mistchenko AS, et al. Toward unified molecular surveillance of RSV: A proposal for genotype definition. Influenza Other Respir Viruses. 2020;14:274–85.
  18. Ramaekers K, Rector A, Cuypers L, Lemey P, Keyaerts E, Van Ranst M. Towards a unified classification for human respiratory syncytial virus genotypes. Virus Evol. 2020;6:veaa052.
  19. Chen J, Qiu X, Avadhanula V, Shepard SS, Kim DK, Hixson J, et al. Novel and extendable genotyping system for human respiratory syncytial virus based on whole-genome sequence analysis. Influenza Other Respir Viruses. 2022;16:492–500.
  20. Katoh K, Misawa K, Kuma K, Miyata T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002;30:3059–66.
  21. Larsson A. AliView: a fast and lightweight alignment viewer and editor for large datasets. Bioinformatics. 2014;30:3276–8.
  22. Minh BQ, Schmidt HA, Chernomor O, Schrempf D, Woodhams MD, von Haeseler A, et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol Biol Evol. 2020;37:1530–4.
  23. Hoang DT, Chernomor O, von Haeseler A, Minh BQ, Vinh LS. UFBoot2: improving the ultrafast bootstrap approximation. Mol Biol Evol. 2018;35:518–22.
  24. Sagulenko P, Puller V, Neher RA. TreeTime: Maximum-likelihood phylodynamic analysis. Virus Evol. 2018;4:vex042.
  25. Huddleston J, Hadfield J, Sibley TR, Lee J, Fay K, Ilcisin M, et al. Augur: a bioinformatics toolkit for phylogenetic analyses of human pathogens. J Open Source Softw. 2021;6:2906.
  26. McBroome J, de Bernardi Schneider A, Roemer C, Wolfinger MT, Hinrichs AS, O’Toole AN, et al. A framework for automated scalable designation of viral pathogen lineages from genomic data. Nat Microbiol. 2024;9:550–60.
  27. O’Toole Á, Scher E, Underwood A, Jackson B, Hill V, McCrone JT, et al. Assignment of epidemiological lineages in an emerging pandemic using the pangolin tool. Virus Evol. 2021;7:veab064.
  28. World Health Organization/World Organisation for Animal Health/Food and Agriculture Organization (WHO/OIE/FAO) H5N1 Evolution Working Group. Revised and updated nomenclature for highly pathogenic avian influenza A (H5N1) viruses. Influenza Other Respir Viruses. 2014;8:384–8.
  29. Hassan AS, Pybus OG, Sanders EJ, Albert J, Esbjörnsson J. Defining HIV-1 transmission clusters based on sequence data. AIDS. 2017;31:1211–22.
  30. Streng A, Goettler D, Haerlein M, Lehmann L, Ulrich K, Prifert C, et al. Spread and clinical severity of respiratory syncytial virus A genotype ON1 in Germany, 2011-2017. BMC Infect Dis. 2019;19:613.
  31. Goya S, Lucion MF, Shilts MH, Juárez MDV, Gentile A, Mistchenko AS, et al. Evolutionary dynamics of respiratory syncytial virus in Buenos Aires: viral diversity, migration, and subgroup replacement. Virus Evol. 2023;9:vead006.
  32. Liang X, Liu DH, Chen D, Guo L, Yang H, Shi YS, et al. Gradual replacement of all previously circulating respiratory syncytial virus A strain with the novel ON1 genotype in Lanzhou from 2010 to 2017. Medicine (Baltimore). 2019;98:e15542.
  33. van Niekerk S, Venter M. Replacement of previously circulating respiratory syncytial virus subtype B strains with the BA genotype in South Africa. J Virol. 2011;85:8789–97.
  34. Hause AM, Henke DM, Avadhanula V, Shaw CA, Tapia LI, Piedra PA. Sequence variability of the respiratory syncytial virus (RSV) fusion gene among contemporary and historical genotypes of RSV/A and RSV/B. PLoS One. 2017;12:e0175792.
  35. Quick J. nCoV-2019 sequencing protocol v1. 2020 Jan [cited 2023 Nov 12]. https://www.protocols.click/view/ncov-2019-sequencing-protocol-bbmuik6w
  36. Zhou B, Wentworth DE. Influenza A virus molecular virology techniques. In: Kawaoka Y, Neumann G, editors. Influenza virus: methods and protocols. Totowa, NJ: Humana Press; 2012. p. 175–92 [cited 2023 Nov 12].
  37. Dong X, Deng YM, Aziz A, Whitney P, Clark J, Harris P, et al. A simplified, amplicon-based method for whole genome sequencing of human respiratory syncytial viruses. J Clin Virol. 2023;161:105423.
  38. Wang L, Ng TFF, Castro CJ, Marine RL, Magaña LC, Esona M, et al. Next-generation sequencing of human respiratory syncytial virus subgroups A and B genomes. J Virol Methods. 2022;299:114335.
  39. Presser LD, van den Akker WMR, Meijer A, for PROMISE investigators. Respiratory Syncytial Virus European Laboratory Network 2022 survey: need for harmonization and enhanced molecular surveillance. J Infect Dis. 2023 Aug 14;jiad341.
  40. Aksamentov I, Roemer C, Hodcroft EB, Neher RA. Nextclade: clade assignment, mutation calling and quality control for viral genomes. J Open Source Softw. 2021;6:3773.
  41. Borges V, Pinheiro M, Pechirra P, Guiomar R, Gomes JP. INSaFLU: an automated open web-based bioinformatics suite “from-reads” for influenza whole-genome-sequencing-based surveillance. Genome Med. 2018;10:46.
  42. Turakhia Y, Thornlow B, Hinrichs AS, De Maio N, Gozashti L, Lanfear R, et al. Ultrafast Sample placement on Existing tRees (UShER) enables real-time phylogenetics for the SARS-CoV-2 pandemic. Nat Genet. 2021;53:809–16.
  43. Redlberger-Fritz M, Springer DN, Aberle SW, Camp JV, Aberle JH. Respiratory syncytial virus surge in 2022 caused by lineages already present before the COVID-19 pandemic. J Med Virol. 2023;95:e28830.
  44. Goya S, Sereewit J, Pfalmer D, Nguyen TV, Bakhash SAKM, Sobolik EB, et al. Genomic characterization of respiratory syncytial virus during 2022‒23 outbreak, Washington, USA. Emerg Infect Dis. 2023;29:865–8.
  45. Adams G, Moreno GK, Petros BA, Uddin R, Levine Z, Kotzen B, et al. Viral lineages in the 2022 RSV surge in the United States. N Engl J Med. 2023;388:1335–7.
  46. Dolores A, Stephanie G, Mercedes S NJ, Érica G, Mistchenko AS, Mariana V. RSV reemergence in Argentina since the SARS-CoV-2 pandemic. J Clin Virol. 2022;149:105126.
  47. van Roekel C, Poukka E, Turunen T, Nohynek H, Presser L, Meijer A, et al.; PROMISE Investigators. Effectiveness of immunisation products against medically attended respiratory syncytial virus infection: generic protocol for a test-negative case-control study. J Infect Dis. 2024;229(Supplement_1):S92–9.

Suggested citation for this article: Goya S, Ruis C, Neher RA, Meijer A, Aziz A, Hinrichs AS, et al. Standardized phylogenetic classification of human respiratory syncytial virus below the subgroup level. Emerg Infect Dis. 2024 Aug [date cited]. https://doi.org/10.3201/eid3008.240209

DOI: 10.3201/eid3008.240209

Original Publication Date: July 11, 2024

1These authors were co–principal investigators.

2These authors contributed equally to this article.

3Current affiliation: Department of Pathobiology, University of Guelph, Guelph, Ontario, Canada.

4Current affiliation: Sciensano, Infectious Diseases in Humans, Unit (Re)-Emerging Viruses, Brussels, Belgium.

Address for correspondence: Stephanie Goya, Department of Laboratory Medicine and Pathology, University of Washington Medical Center, 850 Republican St, Seattle, WA 98109, USA

Page created: June 21, 2024
Page updated: July 11, 2024
Page reviewed: July 11, 2024
The conclusions, findings, and opinions expressed by authors contributing to this journal do not necessarily reflect the official position of the U.S. Department of Health and Human Services, the Public Health Service, the Centers for Disease Control and Prevention, or the authors' affiliated institutions. Use of trade names is for identification only and does not imply endorsement by any of the groups named above.