Less than two months after the first cases were identified in Wuhan, SARS-CoV-2, the virus responsible for the ongoing pandemic, was detected in Iran in February 2020 (1). As of January 23, 2022, about 6,250,490 cases and 132,230 deaths had occurred in Iran (2). SARS-CoV-2, the seventh identified human coronavirus, shares 79% nucleotide sequence similarity with SARS-CoV, and its similarity with SARS-like CoV ZXC21 and bat CoV RaTG13 reaches 89% and 96.3%, respectively. Nucleotide sequence similarities and homology modeling studies of the receptor-binding domain structure support a zoonotic origin for this novel virus (3, 4). This novel β-coronavirus is an enveloped, positive-strand RNA virus. Its genome is about 30 kb in size and consists of 11 open reading frames (ORFs) (5-7). The structural proteins of the virus are encoded by the S, E, M, and N genes, 16 nonstructural proteins (NSPs) are encoded by ORF1ab, and accessory proteins are encoded by ORF3a, ORF6, ORF7, ORF8, and ORF10 (8-10). The spike protein is composed of 1273 amino acids and two subunits, S1 and S2. This protein has several functional domains and multiple proteolytic cleavage sites, both between these two subunits and within S2 itself (7). The receptor-binding domain (RBD) spans amino acids 319 to 541 of the S1 subunit (11). The mutation rate of RNA viruses is millions of times higher than that of their hosts, which can play a role in changing evolvability and virulence (12). One of the main reasons for the ongoing pandemic is the mutations in the S protein, which is involved in binding to the ACE2 receptor and entering the cell (13). Some of these substitutions reduce the effect of antibodies produced against previous variants by changing key parts of the S protein. This allows new variants to partially escape the immune response produced after a prior infection.
For instance, the N501Y mutation in the RBD region, seen in notable variants such as Alpha, Beta, Gamma, and Omicron, results in stronger binding of the virus to the ACE2 receptor (14). The K417N/T mutation in this region can cause immunological escape and is associated with a conformational change in the S protein (15-17). The E484K mutation has been reported to increase the infectivity of new variants and to help the virus escape from the immune system (15, 18). The L452R mutation is associated with an increased transmission rate, increased infectivity, and decreased neutralization by therapeutic antibodies (19). In the Omicron variant, G339D, N440K, and Q493K, along with S477N, T478K, and N501Y, increased the binding affinity, unlike the S371L, S373P, S375F, K417N, G446S, E484A, G496S, Q498R, and Y505H substitutions, which made predicting its transmissibility and potential immune escape risk challenging (14).
Because of the substitutions in new variants and the mutations that may occur in the future, sequence analysis of the variants circulating in each region might be essential for developing vaccines capable of eliciting broadly neutralizing antibodies. Detecting new mutations and monitoring dominant mutations might be beneficial in designing effective drugs, and thus in reducing the prevalence of SARS-CoV-2 and controlling the spread of new variants. In this study, the time of prevalence of variants and mutations related to the RBD region was investigated retrospectively.
Isolates Collection
Forty-one oro-nasopharyngeal swab samples were randomly selected from COVID-19 patients in Ahvaz, Iran, between September 2020 and May 2021.
Inclusion and Exclusion Criteria
Inpatients with respiratory symptoms whose oro-nasopharyngeal swab samples were confirmed positive for SARS-CoV-2 by Real-Time PCR were included in our study. Samples with Ct values above 35 were not included, and those with weak bands after electrophoresis were excluded from the study.
cDNA Synthesis
Total RNA was extracted from 200 μL oro-nasopharyngeal swab samples using the SINAPURE RNA extraction Kit (Sinaclon, Iran) according to the manufacturer's instructions and stored at −80 °C until further processing. The SinaClon First Strand cDNA Synthesis Kit (Sinaclon, Iran) was used to produce cDNA from the extracted RNA samples. The primer set was designed using the NCBI Primer-BLAST tool according to the SARS-CoV-2 reference sequence (GenBank: NC_045512.2) and the sequences of some other variants (including Alpha, Beta, Gamma, Delta, and Omicron). The primer sequences were F: 5′-GTACGTTGAAATCCTTCACTG-3′ and R: 5′-CCTGATAAAGAACAGCAACCT-3′. These primers were intended to amplify a fragment of 939 nucleotides, including the RBD coding region and its surroundings, spanning nucleotides 22464 to 23402 of the SARS-CoV-2 reference genome (NC_045512.2). The primer set was synthesized by Metabion International AG, Germany.
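As a quick consistency check on the amplicon coordinates, the following minimal sketch recomputes the fragment length from the stated start and end positions and tests whether the amplicon covers the codons of the RBD (amino acids 319 to 541). The S gene start coordinate (21563 in NC_045512.2) is an assumption taken from the GenBank annotation and is not stated in this paper; the function names are illustrative only.

```python
# Amplicon coordinates as given in the text (1-based, inclusive, NC_045512.2).
AMPLICON_START = 22464
AMPLICON_END = 23402

# ASSUMPTION: first nucleotide of the S gene in NC_045512.2 (GenBank
# annotation); this value is not stated in the paper.
S_GENE_START = 21563

# RBD boundaries in spike amino acid numbering, as given in the Introduction.
RBD_FIRST_AA = 319
RBD_LAST_AA = 541


def amplicon_length(start, end):
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


def codon_to_genome_range(aa_pos, gene_start=S_GENE_START):
    """Genome coordinates (first, last nucleotide) of a spike codon."""
    first = gene_start + 3 * (aa_pos - 1)
    return first, first + 2


rbd_start = codon_to_genome_range(RBD_FIRST_AA)[0]
rbd_end = codon_to_genome_range(RBD_LAST_AA)[1]
covers_rbd = AMPLICON_START <= rbd_start and rbd_end <= AMPLICON_END

print(amplicon_length(AMPLICON_START, AMPLICON_END))  # 939
print(rbd_start, rbd_end, covers_rbd)                 # 22517 23185 True
```

Under the assumed gene start, the RBD coding region lies entirely inside the amplicon, with flanking sequence on both sides.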
RBD Amplification and Sequencing
RBD coding fragments were amplified in 25 μL PCR reaction volumes containing 2.5 μL of template, 1 μL of forward primer and 1 μL of reverse primer (10 μM each), 12.5 μL of 2X PCRBIO VeriFi Mix with Dye (containing PFU DNA polymerase), and 8 μL of nuclease-free water. PCR amplification was performed under the following conditions: an initial 120 s denaturation step at 95°C, followed by 40 amplification cycles, each including a 20 s denaturation step at 94°C, a 45 s annealing step at 55°C, and a 45 s elongation step at 72°C, then a 2 min final extension at 72°C and a final hold at 25°C. After electrophoretic separation on a 1.5% agarose gel, the products were examined under UV light. Approved PCR products were sent for bidirectional DNA sequencing (Biocardiogenetic, Iran).
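The reaction setup and thermal profile above can be checked with simple arithmetic. The sketch below sums the listed component volumes and the programmed step durations; it is only an illustration of the stated protocol, and the computed time excludes temperature ramping, which depends on the thermocycler.

```python
# Reaction components in microlitres, as listed in the text.
reaction_mix_ul = {
    "template": 2.5,
    "forward primer (10 uM)": 1.0,
    "reverse primer (10 uM)": 1.0,
    "2X PCRBIO VeriFi Mix with Dye": 12.5,
    "nuclease-free water": 8.0,
}
total_volume_ul = sum(reaction_mix_ul.values())

# Thermal profile in seconds, as listed in the text.
INITIAL_DENATURATION_S = 120
CYCLES = 40
CYCLE_STEPS_S = {"denaturation": 20, "annealing": 45, "elongation": 45}
FINAL_EXTENSION_S = 120

# Programmed step time only; ramping between temperatures is instrument
# dependent and is NOT included here.
programmed_time_s = (
    INITIAL_DENATURATION_S
    + CYCLES * sum(CYCLE_STEPS_S.values())
    + FINAL_EXTENSION_S
)

print(total_volume_ul)    # 25.0
print(programmed_time_s)  # 4640
```

The components add up to the stated 25 μL, and the programmed steps total 4640 s (a little over 77 minutes) before ramp times.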
Bioinformatics Analysis
Geneious Prime (2019.1.3) software was used for trimming sequences, de novo assembly, and generating consensus sequences. The obtained sequences were aligned in MEGA X to the RBD coding region of the reference sequence (GenBank: NC_045512.2) and of the Alpha, Beta, Gamma, Delta, Omicron, and some other variants to identify SNPs and amino acid replacements and to perform phylogenetic analysis. The Interactive Tree of Life (iTOL) tool was used to annotate and visualize the phylogenetic tree. Statistical summaries of the variation data were produced using IBM SPSS Statistics 26.
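To illustrate how SNPs in an aligned coding sequence translate into amino acid replacements, here is a minimal Python sketch. It is not the MEGA X workflow used in this study; it assumes two gap-free, in-frame, equal-length sequences, and the short fragments in the usage lines are hypothetical toy inputs, not sequences from our samples.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def call_substitutions(ref, sample, first_codon):
    """Compare two aligned, in-frame, gap-free sequences codon by codon.

    Returns a list of (label, kind) tuples, where kind is "nonsynonymous"
    or "synonymous". `first_codon` is the amino acid number of the first
    codon. Stop codons ('*') are reported like any other change.
    """
    if len(ref) != len(sample) or len(ref) % 3 != 0:
        raise ValueError("sequences must be aligned, equal length, and in frame")
    calls = []
    for i in range(0, len(ref), 3):
        ref_codon = ref[i:i + 3].upper()
        alt_codon = sample[i:i + 3].upper()
        if ref_codon == alt_codon:
            continue
        ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
        position = first_codon + i // 3
        if ref_aa == alt_aa:
            calls.append((f"{ref_aa}{position}", "synonymous"))
        else:
            calls.append((f"{ref_aa}{position}{alt_aa}", "nonsynonymous"))
    return calls


# HYPOTHETICAL two-codon fragments, not sequences from this study.
print(call_substitutions("AATGGT", "TATGGT", first_codon=501))
# [('N501Y', 'nonsynonymous')]
print(call_substitutions("TGTGTT", "TGCGTT", first_codon=432))
# [('C432', 'synonymous')]
```

Real alignments contain gaps and ambiguous bases, which this sketch deliberately does not handle.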
Our study included 24 male and 17 female patients with a mean age of 54±14 years. They were selected from inpatients with positive Real-Time PCR test results.
No mutations were observed in the RBD region of 15 samples. A total of 35 SNPs, resulting in 34 amino acid substitutions and one synonymous substitution, were detected in the RBD region of the other 26 samples. All missense mutations were observed in the receptor-binding motif (RBM) region.
The N501Y substitution was the most frequent mutation, seen in 13 samples (37.14% of observed mutations). The L452R, T478K, and S477N substitutions were the next most frequent, observed in 8 (22.86%), 7 (20%), and 4 (11.43%) samples, respectively. The S477G and Y449N mutations were each found in only one sample (2.86% each). The only synonymous substitution in this study was observed at C432 in one sample.
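The percentages above are each substitution's share of the 35 observed SNPs. A minimal sketch that reproduces them from the reported counts (the dictionary keys are labels chosen here for illustration):

```python
# Number of samples carrying each RBD substitution, as reported in the text.
counts = {
    "N501Y": 13,
    "L452R": 8,
    "T478K": 7,
    "S477N": 4,
    "S477G": 1,
    "Y449N": 1,
    "C432 (synonymous)": 1,
}

# Each occurrence is one SNP, so the total is the sum of all counts.
total_snps = sum(counts.values())

# Share of each substitution among all observed SNPs, in percent.
percentages = {name: round(100 * n / total_snps, 2) for name, n in counts.items()}

print(total_snps)  # 35
for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```

The counts sum to 35, and the single synonymous change accounts for 1 of 35 SNPs, about 2.86%.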
In all samples with the T478K mutation, the L452R mutation was also present. However, the L452R substitution was not always accompanied by T478K. The Y449N mutation was observed only in the presence of N501Y, whereas in most cases the N501Y substitution was not accompanied by Y449N.
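These co-occurrence statements can be expressed as subset checks on the sets of samples carrying each mutation. The sketch below uses the sample numbers listed in Table 1 and its footnote; assigning N501Y to sample no. 40 is an inference from the text (Y449N was seen only together with N501Y, and N501Y was found in 13 samples), and the helper names are illustrative.

```python
def sample_range(first, last):
    """Inclusive range of sample numbers as a set."""
    return set(range(first, last + 1))


# Sample numbers per mutation, taken from Table 1 and its footnote.
# ASSUMPTION: sample 40 (Y449N) also carries N501Y, as implied by the text.
carriers = {
    "N501Y": {16, 18} | sample_range(24, 33) | {40},
    "Y449N": {40},
    "L452R": {20} | sample_range(34, 39) | {41},
    "T478K": sample_range(34, 39) | {41},
}


def always_with(a, b):
    """True if every sample carrying mutation `a` also carries mutation `b`."""
    return carriers[a] <= carriers[b]


print(always_with("T478K", "L452R"))  # True
print(always_with("L452R", "T478K"))  # False
print(always_with("Y449N", "N501Y"))  # True
print(always_with("N501Y", "Y449N"))  # False
print(sorted(carriers["L452R"] - carriers["T478K"]))  # [20]
```

The only sample with L452R but without T478K is sample no. 20.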
As shown in Figure 2, the N501Y substitution was observed from December 2020 onward, which may indicate the beginning of the gradual dominance of the Alpha variant in the region. The L452R and T478K mutations were seen together from May 2021 and became dominant, which may suggest the time when the Delta variant prevailed in Khuzestan province.
As shown in Table 1, sample no. 20, isolated in December 2020, prior to the emergence of the Delta variant, carries the L452R mutation. This single mutation in the RBD region, shown in the phylogenetic tree (Figure 3), is mostly found in the Delta and Epsilon variants.
Figure 1. Percentage of observed amino acid substitutions in the Receptor Binding Domain of SARS-CoV-2
Figure 2. Frequency of detected substitutions in different months of study
Figure 3. The neighbor-joining phylogenetic tree based on the RBD coding sequences of circulating SARS-CoV-2 variants in Khuzestan province and circulating variants worldwide was created by MEGA X using the maximum composite likelihood model with 1000 bootstrap replicates. Subsequently, the Newick format was exported and used to beautify the phylogenetic tree in the Interactive Tree of Life Tool. Bootstrap values less than 40% are not shown.
Table 1. Existence of predominantly detected substitutions of the study in different lineages and variants.
| Lineages/Variants/Samples | S477N | S477G | N501Y | L452R | T478K |
|---|---|---|---|---|---|
| Wuhan-Hu-1, B.1.1.214, B.1.2, B.1.243, … | | | | | |
| Alpha | | | ✔ | | |
| Beta | | | ✔ | | |
| Gamma | | | ✔ | | |
| Delta | | | | ✔ | ✔ |
| Omicron | ✔ | | ✔ | | ✔ |
| Mu | | | ✔ | | |
| Theta | | | ✔ | | |
| Kappa | | | | ✔ | |
| Epsilon | | | | ✔ | |
| B.1.160, B.1.526, B.1.404, | ✔ | | | | |
| B.1.1, B.1.2, B.1.311,… | | ✔ | | | |
| Samples no. 1, 2, 4, 6-10, 12-14, 17, 21-23 | | | | | |
| Samples no. 3, 11, 15, 19 | ✔ | | | | |
| Sample no. 5 | | ✔ | | | |
| Samples no. 16, 18, 24-33 | | | ✔ | | |
| Sample no. 20 | | | | ✔ | |
| Samples no. 34-39, 41 | | | | ✔ | ✔ |
The C432 synonymous substitution in sample no. 37 and the Y449N mutation in sample no. 40 are not included in the table.
In this study, a total of 35 SNPs were identified in the RBD region of 41 samples. As shown in the phylogenetic tree (Figure 3), no mutations were observed in the RBD region of 15 samples, which might represent variants such as B.1.1.214, B.1.2, and B.1.243 that do not carry any amino acid substitution in their RBD region. Non-synonymous substitutions were the predominant mutation type in this study, as the single synonymous substitution accounted for only 3% of the observed mutations. All non-synonymous substitutions were detected in the receptor-binding motif (RBM).
According to studies, the mutation rate in the SARS-CoV-2 genome is ~3 nucleotides per month (20). The mutation rate has been observed to be higher in the ORF1ab, ORF3a, ORF8, N, and S coding regions of the genome (21, 22). About four months after the definition of the Variant of Concern for the novel coronavirus was proposed by the WHO, in late June 2020, four Variants of Concern were introduced: Alpha, Beta, Gamma, and Delta. The Alpha variant was first detected in the UK in September 2020 and spread worldwide after becoming the dominant lineage in that country. This variant has the N501Y substitution in the RBD region and 13 other non-synonymous substitutions in other regions of its genome. We observed that the N501Y mutation, which may represent the Alpha variant, was not detected in Khuzestan until December 2020. This mutation decreases the neutralization capacity of antibodies produced against the primary Wuhan-Hu-1 variant (23-25). Detection of this mutation in the RBD region indicates that the variant may contain mutations in other regions of the genome, which are associated with an increased transmission rate and escape from the immune system.
The Beta variant, which has the N501Y, E484K, and K417N mutations in its RBD region and five other lineage-defining mutations in the S protein, was identified in November 2020 in South Africa (26). Investigations revealed that the Beta variant has higher transmissibility than the previous variants (27). The Gamma variant, which has 21 lineage-defining mutations, including seven mutations in the S protein coding region in addition to the RBD mutations, appeared in November 2020 in Brazil. The RBD region of the Gamma variant, like that of the Beta variant, has the N501Y and E484K mutations, although it carries the K417T substitution instead of K417N (28). This variant also has higher transmissibility than the Alpha variant (29). The Delta variant, which has the L452R and T478K mutations in its RBD region, appeared in April 2021 in India. This variant was designated a Variant of Concern in May 2021 due to its high transmission rate and potential immune escape (30). In May 2021, all but one of our samples carried the RBD mutations associated with the Delta variant, L452R and T478K. This may indicate the time when the Delta variant began to prevail in Khuzestan. The Epsilon variant, which appeared in the US in November 2020, had significantly higher infectivity than Alpha. The L452R mutation in its RBD region caused stronger attachment of the spike to its receptor. This substitution was also associated with immune escape (31, 32). In a December 2020 sample, prior to the emergence of the Delta variant, we identified a single L452R mutation in the RBD, which, as shown in Figure 3, exists in the Epsilon variant. Identification of this mutation may indicate the arrival of the Epsilon variant or of other variants whose presence has not been registered in Iran.
The Omicron variant, which has 26 amino acid substitutions in the S protein, was detected in November 2021 in South Africa. The large number of mutations in its spike protein distinguishes Omicron from other Variants of Concern (33, 34). Mutations such as S371L, S373P, S375F, T478K, Q493R, Q498R, and N501Y in the RBD region increase affinity for binding to the ACE2 receptor, and substitutions at the furin cleavage site facilitate cell entry and increase infectivity (35, 36). The reinfection risk, immune escape ability, and increased infectivity of the Omicron variant have caused a great deal of concern around the world (37). The T478K, N501Y, and S477N substitutions observed in this study are three of the mutations that exist in the Omicron variant. As shown in Table 1, in addition to Alpha, Beta, Gamma, and Omicron, the N501Y mutation has also been identified in the Mu and Theta variants.
In our study, the first substitution we identified was S477N, in a sample from September 2020. This mutation was reported to have appeared in August 2020 in variants circulating in Australia. It increases the fitness of the S protein in binding to the ACE2 receptor and is associated with a rise in the mortality rate around the world (38). This substitution was also widespread in variants circulating in Europe in the fall of 2020 (39). Observation of this mutation may be related to the entrance of variants such as B.1.160, B.1.526, and B.1.404 into Khuzestan province. As shown in Table 1, this mutation is one of the substitutions also seen in the Omicron variant. In the October 2020 sample, we identified S477G, another common substitution in the RBD region that increases the strength of S protein binding to its receptor (40). Detection of this mutation may indicate the entry of variants such as B.1.1, B.1.2, and B.1.311 into Khuzestan province. In this study, December 2020 and May 2021 were found to be the probable times at which the Alpha and Delta variants, respectively, became prevalent in Khuzestan. Also, in a December 2020 sample, a mutation was detected that, among the Variants of Interest and Variants of Concern, was seen only in Epsilon; this could indicate the possible arrival in Iran of unconfirmed variants such as Epsilon.
In the RBD region of a sample from May 2021, we detected the Y449N substitution along with the N501Y, which might be related to some Alpha variants (GenBank: OD977282.1, OU325242.1). The Y449N mutation has a negative effect on spike binding to the receptor and reduces viral fitness, which is probably why this mutation is so rare in the sequences studied around the world (41).
The authors appreciate all efforts made by the laboratory personnel of Razi's Hospital, including providing the samples for the research.
All the data used in this research are available and ready for recheck in case they are needed.
All the processes of this study were approved by the Iranian National Committee for Ethics in Biomedical Research (IR.AJUMS.REC.1400.416). Written consent forms were also obtained from all patients.
All authors meet the ICMJE authorship criteria as below. The conception and design of the study were carried out by N.N, M.R, and M.P. Measurement was done by M.R, N.N, R.P and M.P. The data were analyzed by M.B and interpreted by M.R. M.B and N.N contributed to the drafting of the paper and its revision and are responsible for the intellectual content and the final approval of the version to be published.
This study was supported by the Infectious and Tropical Diseases Research Center, Health Research Institute, Ahvaz Jundishapur University of Medical Sciences, Iran with registration number U-00188.
Conflicts of Interest
The authors declare that no conflict of interest exists.
Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.