ZIKV-002 Challenge Stock Summary

2018-02-23

Virus stock info: Zika virus/R.macaque-tc/UGA/1947/MR766-3329

ZIKV strain MR766 (GenBank: LC002520) was originally isolated from a sentinel rhesus monkey on 20 April 1947 in the Zika Forest, Entebbe, Uganda. The virus, which had undergone 149 suckling mouse brain passages and two rounds of amplification on Vero cells, was obtained from Brandy Russell (CDC, Ft. Collins, CO). Virus stocks were prepared by inoculation onto a confluent monolayer of C6/36 mosquito cells.

Harvest Date: 15 February 2016        Titer:  6.77 log10 PFU/ml
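The titer is reported on a log10 scale. As a small worked conversion to a linear scale (the function name is illustrative only and is not part of any workflow described here):

```python
def log10_titer_to_pfu_per_ml(log10_titer: float) -> float:
    """Convert a titer given in log10 PFU/ml to PFU/ml."""
    return 10 ** log10_titer


# Titer of this stock as stated above: 6.77 log10 PFU/ml
stock_pfu_per_ml = log10_titer_to_pfu_per_ml(6.77)
print(f"{stock_pfu_per_ml:.2e} PFU/ml")
```

This works out to roughly 5.9 x 10^6 PFU/ml.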

Unlike the ZIKV-001 challenge stock, whose consensus sequence was identical to the GenBank sequence of the parental virus, this challenge stock has one significant consensus-level change relative to the GenBank sequence for ZIKV MR766. There are also several synonymous sites where variation is fixed in the challenge stock relative to the GenBank sequence. I do not know whether these differences were already present in the source ZIKV MR766 virus that we received and amplified, or whether they accumulated during in vitro expansion of the virus to prepare the challenge stock.

Analysis method

To map and call variants, I used a modified version of the Zequencer workflow I developed to analyze ZIKV-001 data. This version of the workflow:

- removes duplicate reads using the bbmap dedupe.sh script

- trims low quality sequence and removes adapter sequences from the ends of reads

- filters out reads shorter than 150bp after trimming

- maps reads to the reference sequence using the bbmap algorithm in local alignment mode, using the normal sensitivity preset

- calls variants supported by >5% of reads with a P-value < 10e-60, and applies a minimum strand-bias P-value of 10e-5 when strand bias exceeds 65%

This version of the workflow does not rely on any external plug-ins and can be run using only integrated plug-ins available in Geneious Pro 9.1.2.
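The read-length and variant-calling thresholds listed above can be written out as a simple filter. This is only a sketch of the logic, not the Geneious implementation: the `VariantCall` record and its field names are hypothetical, "10e-60" and "10e-5" are read as 10^-60 and 10^-5, and the strand-bias rule is interpreted as "reject a variant whose strand bias exceeds 65% if its strand-bias P-value is below the minimum". All three are assumptions.

```python
from dataclasses import dataclass

# Thresholds from the workflow description. Reading "10e-60" and "10e-5"
# as 10^-60 and 10^-5 is an assumption.
MIN_READ_LENGTH = 150            # bp, after trimming
MIN_VARIANT_FREQUENCY = 0.05     # variants must exceed 5% of reads
MAX_VARIANT_P_VALUE = 1e-60
MIN_STRAND_BIAS_P_VALUE = 1e-5
STRAND_BIAS_CUTOFF = 0.65


@dataclass
class VariantCall:
    """Hypothetical record for one called variant (not a Geneious type)."""
    frequency: float            # fraction of reads supporting the variant
    p_value: float              # variant P-value
    strand_bias: float          # fraction of supporting reads on the dominant strand
    strand_bias_p_value: float  # P-value reported for the strand bias


def keep_read(trimmed_length: int) -> bool:
    """Keep reads that are at least 150 bp long after trimming."""
    return trimmed_length >= MIN_READ_LENGTH


def keep_variant(v: VariantCall) -> bool:
    """Apply the frequency, P-value, and strand-bias thresholds."""
    if v.frequency <= MIN_VARIANT_FREQUENCY:
        return False
    if v.p_value >= MAX_VARIANT_P_VALUE:
        return False
    # Assumed interpretation: the strand-bias P-value is only checked
    # once the bias itself exceeds the 65% cutoff.
    biased = v.strand_bias > STRAND_BIAS_CUTOFF
    if biased and v.strand_bias_p_value < MIN_STRAND_BIAS_P_VALUE:
        return False
    return True
```

For example, a variant at 76.5% frequency with a P-value of 7.8E-95 and no strand bias passes, while one at 4% frequency is dropped regardless of its P-value.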

Download the workflow

Assessment of challenge stock variants

The most interesting region of variability is at positions 1430-1441 (relative to GenBank LC002520), where the majority of reads have a four-amino-acid in-frame deletion. Reads that do not have the deletion have two non-synonymous nucleotide changes in the same region.
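The deletion coordinates can be checked arithmetically. In the variant table later in this summary, Minimum minus CDS Position equals 106 for every coding-region row, so the sketch below assumes the polyprotein CDS starts at nucleotide 107 of LC002520. That start position is inferred from the table, not checked against the GenBank record, and the function names are illustrative.

```python
# Assumed: inferred from the variant table (Minimum - CDS Position = 106).
CDS_START = 107


def genome_to_cds(pos: int) -> int:
    """Convert a 1-based genome coordinate to a 1-based CDS coordinate."""
    return pos - CDS_START + 1


def codon_number(cds_pos: int) -> int:
    """Return the 1-based codon index containing a 1-based CDS coordinate."""
    return (cds_pos - 1) // 3 + 1


def describe_deletion(start: int, end: int) -> dict:
    """Summarize a deletion spanning genome positions start..end (inclusive)."""
    length = end - start + 1
    cds_start = genome_to_cds(start)
    cds_end = genome_to_cds(end)
    return {
        "length_nt": length,
        "in_frame": length % 3 == 0,
        # True when the deletion begins on the first base of a codon
        "codon_aligned": (cds_start - 1) % 3 == 0,
        "first_codon": codon_number(cds_start),
        "last_codon": codon_number(cds_end),
    }


deletion = describe_deletion(1430, 1441)
print(deletion)
```

Under that assumption the deletion is 12 nt long, starts on a codon boundary, and removes polyprotein codons 442 to 445, which is consistent with a four-amino-acid in-frame deletion.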

There are a number of putative variants in the 5' and 3' UTRs:

5' UTR



3' UTR

All variants

Here is a table of all variants observed in the challenge stock at >5% frequency, along with their predicted protein effect. Variants at sites within the region encoding the polyprotein are highlighted in yellow. Note that the only non-synonymous substitutions predicted to change the amino acid sequence are located in the same region as the deletion that is present in many reads. In other words, some reads have a deletion while others have two non-synonymous substitutions.

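| Name | Type | Minimum | Maximum | Length | Amino Acid Change | CDS Position | Coverage | Protein Effect | Variant Frequency | Variant P-Value (approximate) |
|---|---|---|---|---|---|---|---|---|---|---|
| A | Polymorphism | 10 | 10 | 1 | | | 51 | | 76.50% | 7.80E-95 |
| CGC | Polymorphism | 15 | 17 | 3 | | | 55 -> 59 | | 69.5% -> 72.7% | 1.20E-99 |
| T | Polymorphism | 22 | 22 | 1 | | | 150 | | 28.70% | 1.30E-87 |
| C | Polymorphism | 27 | 27 | 1 | | | 263 | | 44.90% | 5.80E-254 |
| A | Polymorphism | 28 | 28 | 1 | | | 300 | | 13.70% | 6.60E-65 |
| G | Polymorphism | 28 | 28 | 1 | | | 300 | | 25.70% | 1.50E-143 |
| A | Polymorphism | 29 | 29 | 1 | | | 300 | | 25.70% | 1.50E-143 |
| C | Polymorphism | 29 | 29 | 1 | | | 300 | | 13.70% | 7.50E-61 |
| T | Polymorphism | 32 | 32 | 1 | | | 311 | | 24.80% | 3.60E-142 |
| CGC | Polymorphism | 35 | 37 | 3 | | | 392 -> 396 | | 19.4% -> 19.6% | 4.00E-133 |
| C | Polymorphism | 45 | 45 | 1 | | | 655 | | 11.60% | 2.70E-143 |
| A | Polymorphism | 326 | 325 | 0 | | 220 | 12,516 | Frame Shift | 6.60% | 0 |
| C | Polymorphism | 1,099 | 1,099 | 1 | | 993 | 19,229 | None | 82.00% | 0 |
| | Polymorphism | 1,430 | 1,441 | 12 | TVND -> | 1,324 | 17679 -> 18169 | Deletion | 79.8% -> 82.0% | 0 |
| T | Polymorphism | 1,431 | 1,431 | 1 | T -> I | 1,325 | 18,049 | Substitution | 18.80% | 0 |
| C | Polymorphism | 1,443 | 1,443 | 1 | I -> T | 1,337 | 18,126 | Substitution | 16.50% | 0 |
| T | Polymorphism | 1,444 | 1,444 | 1 | | 1,338 | 18,155 | None | 82.60% | 0 |
| C | Polymorphism | 1,669 | 1,669 | 1 | | 1,563 | 19,889 | None | 99.70% | 0 |
| C | Polymorphism | 4,519 | 4,519 | 1 | | 4,413 | 39,374 | None | 99.20% | 0 |
| T | Polymorphism | 5,177 | 5,177 | 1 | | 5,071 | 22,582 | None | 10.20% | 0 |
| C | Polymorphism | 5,512 | 5,512 | 1 | | 5,406 | 25,111 | None | 99.80% | 0 |
| A | Polymorphism | 7,084 | 7,084 | 1 | | 6,978 | 22,290 | None | 12.00% | 0 |
| A | Polymorphism | 7,531 | 7,531 | 1 | | 7,425 | 23,541 | None | 11.40% | 0 |
| T | Polymorphism | 7,741 | 7,741 | 1 | | 7,635 | 22,615 | None | 8.10% | 0 |
| T | Polymorphism | 10,138 | 10,138 | 1 | | 10,032 | 32,760 | None | 11.00% | 0 |
| C | Polymorphism | 10,343 | 10,343 | 1 | | 10,237 | 25,396 | None | 81.30% | 0 |
| CT | Polymorphism | 10,611 | 10,612 | 2 | | | 5489 -> 5551 | | 34.7% -> 34.9% | 0 |
| CCA | Polymorphism | 10,615 | 10,615 | 1 | | | 5,416 | | 35.10% | 0 |
| A | Polymorphism | 10,618 | 10,618 | 1 | | | 5,111 | | 37.40% | 0 |
| TC | Polymorphism | 10,620 | 10,621 | 2 | | | 5066 -> 5074 | | 37.6% -> 37.7% | 0 |
| CTG | Polymorphism | 10,625 | 10,627 | 3 | | | 4877 -> 4905 | | 39.4% -> 39.6% | 0 |
| T | Polymorphism | 10,635 | 10,635 | 1 | | | 4,210 | | 46.40% | 0 |
| CA | Polymorphism | 10,637 | 10,638 | 2 | | | 4145 -> 4148 | | 47.00% | 0 |
| A | Polymorphism | 10,640 | 10,640 | 1 | | | 4,051 | | 48.20% | 0 |
| CT | Polymorphism | 10,642 | 10,643 | 2 | | | 3972 -> 3978 | | 49.0% -> 49.1% | 0 |
| G | Polymorphism | 10,645 | 10,645 | 1 | | | 3,951 | | 49.40% | 0 |
| TTC | Polymorphism | 10,653 | 10,655 | 3 | | | 3579 -> 3632 | | 49.4% -> 50.0% | 0 |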
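To see which of these variants would change the consensus sequence, the table can be filtered by frequency. The sketch below transcribes the Minimum, Maximum, and Variant Frequency columns; where the table gives a frequency range, the lower bound is used. Treating "consensus-level" as a frequency above 50% is an assumption of this sketch, and the names are illustrative.

```python
# (minimum, maximum, variant frequency in percent), transcribed from the table.
# For frequency ranges the lower bound is used.
VARIANTS = [
    (10, 10, 76.5), (15, 17, 69.5), (22, 22, 28.7), (27, 27, 44.9),
    (28, 28, 13.7), (28, 28, 25.7), (29, 29, 25.7), (29, 29, 13.7),
    (32, 32, 24.8), (35, 37, 19.4), (45, 45, 11.6), (326, 325, 6.6),
    (1099, 1099, 82.0), (1430, 1441, 79.8), (1431, 1431, 18.8),
    (1443, 1443, 16.5), (1444, 1444, 82.6), (1669, 1669, 99.7),
    (4519, 4519, 99.2), (5177, 5177, 10.2), (5512, 5512, 99.8),
    (7084, 7084, 12.0), (7531, 7531, 11.4), (7741, 7741, 8.1),
    (10138, 10138, 11.0), (10343, 10343, 81.3), (10611, 10612, 34.7),
    (10615, 10615, 35.1), (10618, 10618, 37.4), (10620, 10621, 37.6),
    (10625, 10627, 39.4), (10635, 10635, 46.4), (10637, 10638, 47.0),
    (10640, 10640, 48.2), (10642, 10643, 49.0), (10645, 10645, 49.4),
    (10653, 10655, 49.4),
]

# Assumed cutoff for calling a variant "consensus-level".
CONSENSUS_CUTOFF_PERCENT = 50.0


def consensus_level(variants, cutoff=CONSENSUS_CUTOFF_PERCENT):
    """Return the variants whose frequency exceeds the cutoff."""
    return [v for v in variants if v[2] > cutoff]


majority = consensus_level(VARIANTS)
for minimum, maximum, freq in majority:
    print(f"{minimum}-{maximum}: {freq:.1f}%")
```

With this cutoff, nine entries are above 50%, including the deletion at 1430-1441.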