Germline Variant Small Hard Filtering

DRAGEN provides post-VCF variant filtering based on annotations present in the VCF records. Default and non-default variant hard filtering are described below. However, due to the nature of DRAGEN’s algorithms, which incorporate the hypothesis of correlated errors from within the core of variant caller, the pipeline has improved capabilities in distinguishing the true variants from noise, and therefore the dependency on post-VCF filtering is substantially reduced. For this reason, the default post-VCF filtering in DRAGEN is very simple.

Default Small Variant Hard Filtering

The default filters in the germline pipeline are as follows:

##FILTER=<ID=DRAGENSnpHardQUAL,Description="Set if true:QUAL < 10.41">
##FILTER=<ID=DRAGENIndelHardQUAL,Description="Set if true:QUAL < 7.83">
##FILTER=<ID=LowDepth,Description="Set if true:DP <= 1">
##FILTER=<ID=PloidyConflict,Description="Genotype call from variant caller not consistent with chromosome ploidy">
##FILTER=<ID=base_quality,Description="Site filtered because median base quality of alt reads at this locus does not meet threshold">
##FILTER=<ID=lod_fstar,Description="Variant does not meet likelihood threshold (default threshold is 6.3)">
DRAGENSnpHardQUAL and DRAGENIndelHardQUAL: For all contigs other than the mitochondrial contig, the default hard filtering consists of thresholding the QUAL value only. A different default QUAL threshold value is applied to SNP and INDEL.
lod_fstar: For the mitochondrial contig, the default hard filtering consists of thresholding the LOD score only.
base_quality: For the mitochondrial contig, this filter applies to sites where the median base quality of alt reads falls below a threshold.
LowDepth: This filter is applied to all variants call with FORMAT/DP <= 1/.
PloidyConflict: This filter is applied to all variant calls on chrY of a female subject, if female is specified on the DRAGEN command line, of if female is detected by the Ploidy Estimator.

Non-Default Small Variant Hard Filtering

DRAGEN supports basic filtering of variant calls as described in the VCF standard. You can apply any number of filters with the --vc-hard-filter option, which takes a semicolon-delimited list of expressions, as follows:

<filter ID>:<snp|indel|all>:<list of criteria>,

where the list of criteria is itself a list of expressions, delimited by the || (OR) operator in this format:

<annotation ID> <comparison operator> <value>

The meaning of these expression elements is as follows:

filterID—The name of the filter, which is entered in the FILTER column of the VCF file for calls that are filtered by that expression.
snp/indel/all—The subset of variant calls to which the expression should be applied.
annotation ID—The variant call record annotation for which values should be checked for the filter. Supported annotations include FS, MQ, MQRankSum, QD, and ReadPosRankSum.
comparison operator—The numeric comparison operator to use for comparing to the specified filter value. Supported operators include <, ≤, =, ≠, ≥, and >.

For example, the following expression would mark with the label "SNP filter" any SNPs with FS < 2.1 or with MQ < 100, and would mark with "indel filter" any records with FS < 2.2 or with MQ < 110:

--vc-hard-filter=”SNP filter:snp:FS < 2.1 || MQ < 100; indel filter:indel:FS < 2.2 || MQ < 110”

This example is for illustration purposes only and is NOT recommended for use with DRAGEN V3 output. Illumina recommends using the default hard filters.

The only supported operation for combining value comparisons is OR, and there is no support for arithmetic combinations of multiple annotations. More complex expressions may be supported in the future.