How to filter biallelic SNP sites from a VCF file

We can simply use bcftools to filter biallelic SNPs. I used bcftools v1.9 in this post.

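bcftools view -m 2 -M 2 -v snps \
 -i 'INFO/AF[0] > 0 & INFO/AF[0] < 1' \
 -Oz -o outfile.vcf.gz \
 infile.vcf.gz

With -m 2 -M 2 -v snps, only biallelic SNPs are kept (-m and -M set the minimum and maximum number of alleles listed in REF and ALT). These options alone do not remove monomorphic sites, where every sample carries only the REF allele or only the ALT allele. Adding -i 'INFO/AF[0] > 0 & INFO/AF[0] < 1' removes them by keeping only sites whose ALT allele frequency is strictly between 0 and 1. INFO/AF holds one value per ALT allele, so at a biallelic site the only value is AF[0]. The filter relies on the INFO/AF tag being present and matching the samples in the file.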
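To make it concrete which records survive, here is a toy sketch that mimics the same site-level logic with awk on a tiny uncompressed VCF. It is an illustration only, not a replacement for bcftools: the function name keep_polymorphic_biallelic_snps, the temporary demo file, and the five records are all made up for this example, and the awk logic assumes uppercase single-base alleles and an AF tag in the INFO column.

```shell
#!/bin/sh
# Toy illustration only: mimic the site-level logic of the bcftools filter
# with awk. The function name and the demo records are hypothetical.

demo_vcf=$(mktemp)

# Five records: a polymorphic biallelic SNP, two monomorphic SNPs (AF=0 and
# AF=1), a multiallelic SNP, and a biallelic insertion.
{
  printf '##fileformat=VCFv4.2\n'
  printf '##INFO=<ID=AF,Number=A,Type=Float,Description="Allele frequency">\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf 'chr1\t100\t.\tA\tG\t.\t.\tAF=0.25\n'
  printf 'chr1\t200\t.\tC\tT\t.\t.\tAF=0\n'
  printf 'chr1\t300\t.\tG\tA,T\t.\t.\tAF=0.1,0.2\n'
  printf 'chr1\t400\t.\tT\tTA\t.\t.\tAF=0.5\n'
  printf 'chr1\t500\t.\tA\tC\t.\t.\tAF=1\n'
} > "$demo_vcf"

keep_polymorphic_biallelic_snps() {
  awk -F '\t' '
    # Pass header lines through unchanged.
    /^#/ { print; next }
    # Require a single-base REF and exactly one single-base ALT.
    # This drops multiallelic sites (ALT contains a comma) and indels.
    $4 !~ /^[ACGTN]$/ || $5 !~ /^[ACGT]$/ { next }
    {
      # Pull the AF value out of the INFO column.
      af = ""
      n = split($8, info, ";")
      for (i = 1; i <= n; i++)
        if (info[i] ~ /^AF=/) af = substr(info[i], 4)
      # Keep the site only if 0 < AF < 1; sites without AF are dropped.
      if (af != "" && af + 0 > 0 && af + 0 < 1) print
    }
  ' "$1"
}

# Print the surviving data records (header lines hidden for brevity).
keep_polymorphic_biallelic_snps "$demo_vcf" | grep -v '^#'
```

Only the record at position 100 is printed: positions 200 and 500 are monomorphic, 300 is multiallelic, and 400 is an indel.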

You can also use vcftools to do this, but I don't recommend it because it is slower than bcftools.

References

  • bcftools: https://samtools.github.io/bcftools/bcftools.html
