Characterization of a Novel HIV-1 Circulating Recombinant Form (CRF80_0107) Among Men Who Have Sex with Men in China
Abstract
Since the emergence of CRF55_01B among men who have sex with men (MSM) in China, a growing number of circulating recombinant forms (CRFs) and unique recombinant forms have been identified in this population. Here we characterize a novel CRF (CRF80_0107) consisting of CRF01_AE and CRF07_BC segments, identified in three epidemiologically unlinked MSM. Two near full-length genome (NFLG) sequences were amplified and sequenced in two halves from RNA extracted from the plasma of two MSM in Beijing. A third, gag-pol sequence (accession number KX198573), isolated from an MSM in Hebei province, was obtained from the Los Alamos HIV Sequence Database. Phylogenetic analysis based on NFLG sequences revealed that CRF80_0107 formed a monophyletic cluster with a bootstrap value of 100%. Recombination analysis demonstrated that the genome of CRF80_0107 is divided into eight segments by seven breakpoints. Subregion trees constructed by the neighbor-joining method confirmed that these segments originated from CRF01_AE and CRF07_BC strains circulating among MSM in China. The emergence of CRF80_0107 indicates the frequent generation of novel recombinant forms and the increasing complexity of the HIV-1 epidemic among MSM in China. This highlights the importance of monitoring HIV-1 molecular epidemiological characteristics and the urgency of curbing the HIV-1 epidemic among MSM in China.
Since the human immunodeficiency virus (HIV) was first reported in the United States in 1981, it has undergone extremely complicated molecular evolution. HIV is divided into two types, HIV-1 and HIV-2. The HIV-1 strains, which drive the global pandemic, are subdivided into four groups (M, O, N, and P). Within group M, the most prevalent group, nine subtypes (A, B, C, D, F, G, H, J, and K) and two sets of sub-subtypes (A1, A2 and F1, F2) are recognized. The high level of genetic variation, together with the highly recombinogenic reverse transcriptase of HIV-1, contributes to the emergence of an increasing number of circulating recombinant forms (CRFs). A CRF can be designated only after it has been identified in at least three epidemiologically unlinked individuals; otherwise, the recombinant is defined as a unique recombinant form (URF).
To date, 96 CRFs have been assigned, and 88 of them have been published with public sequence data available in the Los Alamos HIV Sequence Database (www.hiv.lanl.gov). No CRF had been identified among men who have sex with men (MSM) until CRF51_01B was reported among MSM in Singapore in 2012.1 The first CRF among MSM in China, CRF55_01B, was identified several months later in two different regions at the same time.2 Soon afterward, a large-scale survey of HIV-1 strains among MSM in 11 provinces in China identified another CRF, designated CRF59_01B, in four provinces: Liaoning, Hunan, Guangdong, and Yunnan.3 CRF67_01B and CRF68_01B were subsequently isolated from MSM in Anhui4 and quickly confirmed by Guo5 among MSM in Jiangsu. In the early 2000s, CRF07_BC was introduced into MSM in China and quickly became one of the dominant subtypes in the population. It is therefore unsurprising that a second-generation CRF (CRF79_0107) originating from CRF01_AE and CRF07_BC has been found in MSM.6 Meanwhile, several URFs comprising CRF01_AE and CRF07_BC lineages have recently emerged among MSM in Beijing and Sichuan,7–9 indicating the ongoing generation of recombinant viruses that further complicates the genetic diversity of HIV-1 among MSM.
In this study, we characterize a novel recombinant strain consisting of eight segments derived from CRF01_AE and CRF07_BC. This strain was isolated from two HIV-1-positive plasma samples, YA285 and YA376, collected from epidemiologically unlinked MSM in Beijing, China, on January 26, 2011, and February 23, 2012, respectively. Both participants signed informed consent before the investigation and the collection of peripheral blood. Patient YA285, a 25-year-old unmarried man, had a CD4+ T cell count of 635 cells/μL and a viral load of 16,600 copies/mL, whereas patient YA376, a 37-year-old unmarried man, had a CD4+ T cell count of 287 cells/μL and a viral load of 39,400 copies/mL. The demographic characteristics of the study subjects harboring CRF80_0107 recombinants are shown in Table 1.
| Strain name | Sampling year | Sampling region | Gender | Age | Ethnic group | Marriage | Risk factor | Accession no. |
|---|---|---|---|---|---|---|---|---|
| YA285 | 2011 | Beijing | Male | 25 | Han | Unmarried | MSM | MH843712 |
| YA376 | 2012 | Beijing | Male | 37 | Han | Unmarried | MSM | MH843713 |
| 13LF080 | 2013 | Langfang, Hebei | Male | 29 | Han | Married | MSM | KX198573 |
For near full-length genome (NFLG) amplification and sequencing, RNA was extracted from each plasma sample using the High Pure Viral RNA Kit (Roche) and reverse transcribed into cDNA using the SuperScript IV First-Strand Synthesis System (Invitrogen). The NFLG was then amplified in two halves with High Fidelity Taq (Invitrogen) as described previously.10 PCR-positive products were purified and sequenced by Tianyi Huiyuan Life Science and Technology Company (Beijing, China) with several specific primers. All sequence fragments were edited and assembled into contiguous sequences using ContigExpress software. Finally, an NFLG of 8,966 bp for YA285 (positions 638 to 9,600 relative to the HXB2 reference) and an NFLG of 8,986 bp for YA376 (positions 636 to 9,600 relative to the HXB2 reference) were obtained.
These two NFLG sequences were submitted to the Basic Local Alignment Search Tool (BLAST) to search for similar sequences, but no sequence with high similarity (>95%) was found. Considering that more partial genomic sequences than NFLG sequences have been submitted to the Los Alamos database, we performed additional BLAST searches using the gag and pol gene regions of the YA285 NFLG sequence separately. A 4,340 bp sequence (KX198573) identified in our previous research11 showed 99% similarity to both the gag and pol regions of YA285. After detailed quality control of the KX198573 sequence, the 291 bases at the 3′ end were deleted because of poor quality. The two NFLG sequences obtained in this study, together with KX198573, were then aligned with 29 reference sequences, comprising 2 HIV-1 reference strains for each subtype and sub-subtype (A1, A2, B, C, D, F1, F2, G, H, J, and K) and for CRF01_AE, CRF07_BC, and CRF08_BC, plus 1 group O reference, using the HIVAlign tool (https://www.hiv.lanl.gov/content/sequence/VIRALIGN/viralign.html). The alignment was then manually edited using BioEdit software. A phylogenetic tree was constructed by the neighbor-joining method based on the Kimura 2-parameter model with 1,000 bootstrap replicates in MEGA6 software. For recombination analysis, the jumping profile hidden Markov model (jpHMM) (http://jphmm.gobics.de/submission_hiv) was applied. Subsequently, subregion neighbor-joining trees were constructed to determine the origin of each segment according to the jpHMM result.
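To illustrate the distance measure underlying the neighbor-joining trees, the following is a minimal sketch of the standard Kimura 2-parameter distance between two aligned sequences, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. The function name and the pairwise handling of gaps and ambiguous bases are assumptions made for illustration; this is not MEGA6's implementation, and the gamma-distributed rate correction mentioned in the figure legends is omitted.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned nucleotide sequences.

    Hypothetical helper: alignment columns where either sequence has a gap
    or an ambiguous base are skipped (an assumed pairwise-deletion policy).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if x == y:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    inner = (1 - 2 * p - q) * math.sqrt(max(1 - 2 * q, 0.0))
    if inner <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(inner)
```

A distance matrix built from such pairwise values is the input that a neighbor-joining algorithm clusters; bootstrap support is then obtained by resampling alignment columns and repeating the procedure.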
The phylogenetic tree showed that the three sequences from epidemiologically unlinked patients formed a distinct monophyletic cluster, distantly related to all known HIV-1 subtypes/CRFs, with a high bootstrap value of 100% (Fig. 1). As shown by the recombination analysis (Fig. 2), the two NFLG sequences were composed of CRF01_AE and subtypes B and C. The mosaic structure of the recombinant genome was as follows: segment I, CRF01_AE (790–2,384 nt); segment II, C (2,385–3,445 nt); segment III, CRF01_AE (3,446–5,006 nt); segment IV, C (5,007–5,648 nt); segment V, CRF01_AE (5,649–6,382 nt); segment VI, C (6,383–8,428 nt); segment VII, CRF01_AE (8,429–8,828 nt); and segment VIII, BC (8,829–9,412 nt). The partial gag-pol sequence (KX198573) shared the same breakpoints with YA285 and YA376. The subtype C and BC segments were further submitted to BLAST and showed the highest similarity to CRF07_BC strains. Subregion phylogenetic analyses of the eight genomic segments were then conducted to explore their likely parental lineages. High bootstrap values in the subregion trees supported a close relationship of each segment with either the CRF01_AE or the CRF07_BC references. Therefore, the detailed map of the recombinant genome is as follows: segment I, CRF01_AE (790–2,384 nt); segment II, CRF07_BC (2,385–3,445 nt); segment III, CRF01_AE (3,446–5,006 nt); segment IV, CRF07_BC (5,007–5,648 nt); segment V, CRF01_AE (5,649–6,382 nt); segment VI, CRF07_BC (6,383–8,428 nt); segment VII, CRF01_AE (8,429–8,828 nt); and segment VIII, CRF07_BC (8,829–9,412 nt). Subregion phylogenetic analyses also indicated that the CRF01_AE segments belonged to CRF01_AE cluster 5, which circulates mainly among the MSM population in China.12 The CRF01_AE segments therefore most likely originated from the MSM population rather than from heterosexual or injecting drug user (IDU) populations. Similarly, analysis of the CRF07_BC segments suggested that they may have originated from CRF07_BC strains circulating among the MSM population in China (Fig. 3).
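The final mosaic map can also be written down as a small data structure, which makes it easy to check that the eight segments are contiguous and separated by seven breakpoints. The sketch below is illustrative only: the coordinates and parental assignments are those reported above, while the `Segment` type and the helper functions are hypothetical names introduced here.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    label: str   # Roman numeral used in the text
    parent: str  # assigned parental lineage
    start: int   # first HXB2 position (inclusive)
    end: int     # last HXB2 position (inclusive)


# Mosaic map of CRF80_0107 as reported in the text (HXB2 numbering).
CRF80_0107 = [
    Segment("I", "CRF01_AE", 790, 2384),
    Segment("II", "CRF07_BC", 2385, 3445),
    Segment("III", "CRF01_AE", 3446, 5006),
    Segment("IV", "CRF07_BC", 5007, 5648),
    Segment("V", "CRF01_AE", 5649, 6382),
    Segment("VI", "CRF07_BC", 6383, 8428),
    Segment("VII", "CRF01_AE", 8429, 8828),
    Segment("VIII", "CRF07_BC", 8829, 9412),
]


def breakpoints(segments: list) -> list:
    """Return the HXB2 positions at which the parental lineage changes."""
    points = []
    for left, right in zip(segments, segments[1:]):
        if right.start != left.end + 1:
            raise ValueError(f"gap or overlap between {left.label} and {right.label}")
        if left.parent != right.parent:
            points.append(right.start)
    return points


def parent_at(segments: list, position: int) -> Optional[str]:
    """Return the parental lineage covering an HXB2 position, or None if unmapped."""
    for seg in segments:
        if seg.start <= position <= seg.end:
            return seg.parent
    return None


def parental_lengths(segments: list) -> dict:
    """Total number of mapped HXB2 positions assigned to each parental lineage."""
    totals = {}
    for seg in segments:
        totals[seg.parent] = totals.get(seg.parent, 0) + (seg.end - seg.start + 1)
    return totals
```

With this representation, `len(breakpoints(CRF80_0107))` evaluates to 7, matching the seven breakpoints described in the text, and `parent_at(CRF80_0107, 3000)` returns `"CRF07_BC"` because position 3,000 falls within segment II.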

FIG. 1. Phylogenetic analysis of the NFLG sequences. Sequences of YA285, YA376, and KX198573 were aligned with 29 reference sequences, comprising 2 HIV-1 reference strains for each subtype and sub-subtype (A1, A2, B, C, D, F1, F2, G, H, J, and K) and for CRF01_AE, CRF07_BC, and CRF08_BC, plus one group O reference, using the HIVAlign tool (www.hiv.lanl.gov), and then edited manually in BioEdit software. A neighbor-joining tree was constructed in MEGA6 software based on the Kimura 2-parameter model of nucleotide substitution with gamma-distributed rates among sites and 1,000 bootstrap replicates. The sequences of CRF80_0107 are marked with “•”. The scale bar represents 5% genetic distance. Only bootstrap values >70% are shown at the corresponding nodes of the tree. CRF, circulating recombinant form; NFLG, near full-length genome.

FIG. 2. Recombination breakpoint analyses of CRF80_0107. (A) Posterior probabilities of the subtypes. (B) Genomic map of CRF80_0107. Both maps were generated using the jumping profile hidden Markov model (jpHMM).

FIG. 3. Subregion tree analyses of the CRF80_0107 genome. (A) Region I (HXB2: 790–2,384 nt) of the CRF80_0107 genome map is shown as representative of all CRF01_AE segments in the mosaic structure. (B) Region II (HXB2: 2,385–3,445 nt) of the CRF80_0107 genome map is shown as representative of all CRF07_BC segments in the mosaic structure. The subregion neighbor-joining trees were constructed in MEGA6 software based on the Kimura 2-parameter model of nucleotide substitution with gamma-distributed rates among sites and 1,000 bootstrap replicates. The sequences of CRF80_0107 are marked with “•”. The scale bar represents 5% genetic distance. Only bootstrap values >70% are shown at the corresponding nodes of the tree. Other subregion trees are available in the Supplementary Data. MSM, men who have sex with men.
In recent years, the rapid increase of HIV-1 infection among MSM has become the greatest concern in the HIV-1 epidemic in China. Among cases newly diagnosed each year, the proportion attributed to male homosexual transmission increased from 2.5% in 2006 to 25.8% in 2014. In addition to the rapid spread among MSM, HIV diversity has also been increasing. Several subtypes, including B, CRF01_AE, and CRF07_BC, have been found cocirculating in the MSM population.13,14 The cocirculation of more than one subtype in the same population creates the conditions for new recombinant strains to emerge. Therefore, with its high prevalence and genetic diversity, the MSM population might become a new recombination hotspot. In previous studies, several CRFs have been reported among the MSM population in China.2–4 CRF80_0107 is the second CRF identified among the MSM population whose genome comprises CRF01_AE and CRF07_BC. With the rapid spread of CRF07_BC among MSM in China, CRF01_AE and CRF07_BC have become the two most dominant subtypes in the population, and more novel CRFs comprising CRF01_AE and CRF07_BC are likely to be identified.
In conclusion, we characterized a novel CRF (CRF80_0107) whose genome is composed of CRF01_AE and CRF07_BC segments separated by seven breakpoints. A series of recombinant forms involving CRF01_AE, CRF07_BC, and subtype B have appeared among MSM in China recently. The emergence of CRF80_0107 indicates that CRF01_AE and CRF07_BC have spread among individuals with similar behavior, which will further complicate the molecular epidemiology of HIV-1 among MSM in China. The result also highlights the importance of monitoring HIV-1 molecular epidemiological characteristics and the urgent need to curb the HIV epidemic among MSM in China.
Sequence Data
The genome sequences of YA285 and YA376 were deposited in GenBank under accession numbers MH843712 and MH843713, respectively.
Acknowledgments
This work was funded by the National Natural Science Foundation of China (no. 81773493), Beijing Municipal Science & Technology Project (D141100000314001), and the State Key Laboratory of Pathogen and Biosecurity (Academy of Military Medical Science). This work was supported by the grants 2017YFC1200800, 16CXZ030, BWS17J032, and 17-163-12-ZT-005-038-01.
Author Disclosure Statement
No competing financial interests exist.
References
- 1. Identification of new CRF51_01B in Singapore using full genome analysis of three HIV type 1 isolates. AIDS Res Hum Retroviruses 2012;28:527–530.
- 2. Genome sequences of a novel HIV-1 circulating recombinant form, CRF55_01B, identified in China. Genome Announc 2013;1. DOI: 10.1128/genomeA.00050-12.
- 3. Identification and characterization of a novel HIV-1 circulating recombinant form (CRF59_01B) identified among men-who-have-sex-with-men in China. PLoS One 2014;9:e99693.
- 4. New emerging recombinant HIV-1 strains and close transmission linkage of HIV-1 strains in the Chinese MSM population indicate a new epidemic risk. PLoS One 2013;8:e54322.
- 5. A novel HIV-1 CRF01_AE/B recombinant among men who have sex with men in Jiangsu Province, China. AIDS Res Hum Retroviruses 2014;30:706–710.
- 6. Genome sequence of a novel HIV-1 circulating recombinant form (CRF79_0107) identified from Shanxi, China. AIDS Res Hum Retroviruses 2017;33:1056–1060.
- 7. Identification of a novel CRF01_AE/CRF07_BC recombinant form in men who have sex with men in Sichuan, China. AIDS Res Hum Retroviruses 2016;32:718–721.
- 8. Characterization of a novel HIV-1 second-generation recombinant form in men who have sex with men in Beijing, China. AIDS Res Hum Retroviruses 2017;33:1175–1179.
- 9. Characterization of a new HIV-1 CRF01_AE/CRF07_BC recombinant virus form among men who have sex with men in Beijing, China. AIDS Res Hum Retroviruses 2018;34:550–554.
- 10. Genetic characterization of 13 subtype CRF01_AE near full-length genomes in Guangxi, China. AIDS Res Hum Retroviruses 2010;26:699–704.
- 11. HIV-1 molecular epidemiology among newly diagnosed HIV-1 individuals in Hebei, a low HIV prevalence province in China. PLoS One 2017;12:e0171481.
- 12. The rapidly expanding CRF01_AE epidemic in China is driven by multiple lineages of HIV-1 viruses introduced in the 1990s. AIDS 2013;27:1793–1802.
- 13. Characterization of HIV-1 subtypes and viral antiretroviral drug resistance in men who have sex with men in Beijing, China. AIDS 2007;21:S59–S65.
- 14. Identification of 3 distinct HIV-1 founding strains responsible for expanding epidemic among men who have sex with men in 9 Chinese cities. J Acquir Immune Defic Syndr 2013;64:16–24.

