We can use bcftools to filter biallelic SNPs. I used bcftools v1.9 in this post.
bcftools view -m 2 -M 2 -v snps \
-i 'INFO/AF[0]>0 & INFO/AF[1] < 1' \
-Oz -o outfile.vcf.gz \
infile.vcf.gz
With -m 2 -M 2 -v snps, only biallelic SNPs are kept, but these options do not remove sites where every sample carries only the REF allele or only the ALT allele. These monomorphic sites can be removed by adding -i 'INFO/AF[0]>0 & INFO/AF[1] < 1'. The expression reads the INFO/AF tag, so the input file must carry one.
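To make the two conditions concrete, here is a rough sketch that mimics them with plain awk on a tiny toy VCF. It only illustrates the logic and is not how bcftools evaluates the expression internally. It assumes INFO/AF holds a single ALT allele frequency for each biallelic site; the file name toy.vcf and every record in it are made up for this example.

```shell
# Build a toy, header-less VCF body (8 tab-separated columns per record).
# All records are invented for illustration.
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  1 100 . A  G   . . 'AF=0.25' \
  1 200 . A  G   . . 'AF=0' \
  1 300 . C  T   . . 'AF=1' \
  1 400 . C  T,G . . 'AF=0.2,0.1' \
  1 500 . AT A   . . 'AF=0.5' \
  1 600 . G  C   . . 'DP=10;AF=0.4' > toy.vcf

# Keep a record only if it is a biallelic SNP (single-base REF, one
# single-base ALT) and its ALT allele frequency is strictly between 0 and 1.
kept=$(awk -F'\t' '
  /^#/ { next }                                     # skip header lines, if any
  $4 !~ /^[ACGT]$/ || $5 !~ /^[ACGT]$/ { next }     # not a biallelic SNP
  {
    af = ""
    n = split($8, kv, ";")                          # INFO is key=value;key=value
    for (i = 1; i <= n; i++)
      if (kv[i] ~ /^AF=/) af = substr(kv[i], 4)
    if (af != "" && af + 0 > 0 && af + 0 < 1) print $2   # print POS of kept sites
  }
' toy.vcf)

printf '%s\n' "$kept"
```

Only positions 100 and 600 survive: 200 and 300 are monomorphic (AF of 0 or 1), 400 is multiallelic, and 500 is a deletion rather than a SNP.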
You can also use vcftools to do this, but I would not recommend it because it is slower than bcftools.
References
- bcftools: https://samtools.github.io/bcftools/bcftools.html