Hi, there:
I used snp-sites to generate a VCF file from a FASTA file of ~2 million SARS-CoV-2 genomes that I downloaded from GISAID.
Below is the first record:

I also manually extracted this first record, ran a tabulation of the genotype values, and got the following (count, then genotype index):
343927 0
1682817 1
2 2
4 3
5 4
6 5
2 6
4 7
I think 0 is for the REF, while 1-7 are for the 7 ALT alleles (*, A, K, C, S, T, Y) respectively.
But I am not sure what exactly the first ALT allele, *, represents. It has a count of 1682817.
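In case it helps, the tabulation can be reproduced with something like the Python sketch below. It assumes the record is a single tab-separated VCF line with REF in column 4, ALT in column 5, and one haploid genotype index per sample from column 10 onward. The function name `tabulate_genotypes` and the toy record are made up for illustration; they are not my actual data.

```python
from collections import Counter


def tabulate_genotypes(vcf_line):
    """Count how often each allele index occurs in one VCF record.

    Assumes haploid genotypes written as plain integers (0 = REF,
    1..n = ALT alleles in the order listed in the ALT column).
    """
    fields = vcf_line.rstrip("\n").split("\t")
    ref = fields[3]
    alts = fields[4].split(",")
    alleles = [ref] + alts          # index 0 is REF, the rest are ALT
    counts = Counter(fields[9:])    # sample columns start at column 10
    rows = []
    for gt, n in counts.items():
        if gt.isdigit():            # skip missing calls such as "."
            idx = int(gt)
            rows.append((idx, alleles[idx], n))
    return sorted(rows)


# Toy record (hypothetical): REF=G, ALT=*,A and six samples.
toy = "1\t1\t.\tG\t*,A\t.\t.\t.\tGT\t0\t1\t1\t2\t0\t1"
for idx, allele, n in tabulate_genotypes(toy):
    print(n, idx, allele)
```

Printing the allele next to each index makes it easy to check which count belongs to which ALT entry.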
Thank you very much & best regards,
Jie