Multisample CNV VCF Output
The records in a multisample CNV VCF differ slightly from the single-sample case. The major differences are as follows:
• The per-record entries are split into segments at the union of all input samples' breakpoints, so the overall VCF contains more entries.
• The QUAL column is not used and its value is “.”. The per-sample quality is carried in the SAMPLE columns with the QS tag.
• The FILTER column indicates PASS if any of the individual SAMPLE columns PASS. Otherwise, it indicates SampleFT.
• The per-sample annotations are carried over from their originating calls. The single-sample filters are applied at the sample level and are emitted in the FT annotation.
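As a rough illustration of the first and third points, the sketch below splits a region at the union of all samples' breakpoints and derives a record-level FILTER value from per-sample FT values. It is not the caller's actual implementation: the function names `joint_segments` and `record_filter`, the half-open `(start, end)` interval representation, and the filter names `filterA` and `filterB` are all assumptions made for this example.

```python
def joint_segments(per_sample_segments):
    """Split calls at the union of all samples' breakpoints.

    per_sample_segments maps a sample name to a list of (start, end)
    intervals (hypothetical representation, half-open coordinates).
    """
    # Collect every start and end position seen in any sample.
    breakpoints = sorted(
        {pos
         for segs in per_sample_segments.values()
         for start, end in segs
         for pos in (start, end)}
    )
    segments = []
    for start, end in zip(breakpoints, breakpoints[1:]):
        # Keep only pieces that lie inside at least one original call.
        covered = any(
            s <= start and end <= e
            for segs in per_sample_segments.values()
            for s, e in segs
        )
        if covered:
            segments.append((start, end))
    return segments


def record_filter(sample_ft_values):
    """Return PASS if any sample's FT is PASS, otherwise SampleFT."""
    return "PASS" if any(ft == "PASS" for ft in sample_ft_values) else "SampleFT"


calls = {"sampleA": [(100, 500)], "sampleB": [(300, 800)]}
print(joint_segments(calls))
print(record_filter(["filterA", "PASS"]))      # filterA is a placeholder name
print(record_filter(["filterA", "filterB"]))   # no sample passes
```

Two overlapping calls with different breakpoints become three records here, which is why a multisample VCF holds more entries than any of its single-sample inputs.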
Additionally, if a valid pedigree is used, de novo calling is performed, which adds the following two FORMAT annotations to the proband sample:
##FORMAT=<ID=DQ,Number=1,Type=Float,Description="De novo quality">
##FORMAT=<ID=DN,Number=1,Type=String,Description="Possible values are ‘Inherited’, ‘DeNovo’ or ‘LowDQ’. Threshold for a passing de novo call is DQ > 0.100000">
Although the joint segmentation stage produces many entries in the VCF, the number of de novo events can be found by extracting the entries that have DN and DQ annotations. When de novo calling is performed, these records are also extracted and converted to GFF3.
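The extraction described above can be sketched with plain text parsing. This is a minimal example, not the tool's own extraction step: the function name `annotated_records`, the sample names, and the example records are made up for illustration, and the sketch assumes an uncompressed, tab-delimited VCF in which missing values are written as ".".

```python
def annotated_records(vcf_lines):
    """Yield (chrom, pos, sample, DN, DQ) for samples carrying DN and DQ."""
    samples = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        keys = fields[8].split(":")
        if "DN" not in keys or "DQ" not in keys:
            continue  # record has no de novo annotations at all
        for name, column in zip(samples, fields[9:]):
            values = dict(zip(keys, column.split(":")))
            dn, dq = values.get("DN", "."), values.get("DQ", ".")
            if dn == "." or dq == ".":
                continue  # this sample is not annotated
            yield fields[0], int(fields[1]), name, dn, float(dq)


# Hypothetical records, for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER",
               "INFO", "FORMAT", "father", "mother", "proband"]),
    "\t".join(["chr1", "1000", ".", "N", "<DEL>", ".", "PASS", "END=2000",
               "GT:QS:FT:DQ:DN", "./.:.:PASS:.:.", "./.:.:PASS:.:.",
               "0/1:50:PASS:0.5:DeNovo"]),
    "\t".join(["chr1", "3000", ".", "N", "<DUP>", ".", "PASS", "END=4000",
               "GT:QS:FT", "./.:.:PASS", "./1:40:PASS", "./1:45:PASS"]),
]

hits = list(annotated_records(example))
de_novo = [h for h in hits if h[3] == "DeNovo"]
print(hits)
print(len(de_novo))
```

The final filter on `DN == "DeNovo"` is an extra step added in this sketch for cases where only passing de novo calls are of interest; whether it is needed depends on which DN values appear in the annotated records.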