Target Counts

The target counts stage is the first processing stage of the DRAGEN CNV pipeline. This stage bins the alignments into intervals. The primary analysis format for CNV processing is the target counts file, which contains the feature signals that are extracted from the alignments for use in downstream processing. The binning strategy, the interval sizes, and the interval boundaries are controlled by the target counts generation options and by the normalization technique used.

When working with whole genome sequence data, the intervals are autogenerated from the reference hashtable. Only the primary contigs from the reference hashtable are considered for binning. You can specify additional contigs to bypass with the --cnv-skip-contig-list option.

With whole exome sequence data, the target BED file supplied with the --cnv-target-bed option is used to determine the intervals for analysis.

The target counts stage generates a .target.counts.gz file. In the normalization stage, you can later supply this file in place of a BAM or CRAM file by specifying it with the --cnv-input option. The .target.counts.gz file is an intermediate file for the DRAGEN CNV pipeline and should not be modified.

The .target.counts.gz file is a tab-delimited compressed text file with the following columns:

• Contig identifier
• Start position
• End position
• Target interval name
• Count of alignments in this interval
• Count of improperly paired alignments in this interval

An example of a *.target.counts.gz file is shown below.

contig  start   stop    name                 SampleName  improper_pairs
1       565480  565959  target-wgs-1-565480  7           6
1       566837  567182  target-wgs-1-566837  9           0
1       713984  714455  target-wgs-1-713984  34          4
1       721116  721593  target-wgs-1-721116  47          1
1       724219  724547  target-wgs-1-724219  24          21
1       725166  725544  target-wgs-1-725166  43          12
1       726381  726817  target-wgs-1-726381  47          14
1       753243  753655  target-wgs-1-753243  31          2
1       754322  754594  target-wgs-1-754322  27          0
1       754594  755052  target-wgs-1-754594  41          0
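
The column layout above can be read programmatically for inspection purposes. The following Python snippet is a minimal sketch, not part of DRAGEN. It assumes that the file has the six tab-delimited columns listed above, that both count columns hold integers as in the example, and that any line starting with "#" or with the literal column name "contig" is a header to skip. The names TargetCount and read_target_counts are hypothetical helpers introduced for illustration. The file is opened read-only, because the intermediate file should not be modified.

```python
import csv
import gzip
from typing import Iterator, NamedTuple


class TargetCount(NamedTuple):
    """One interval row from a .target.counts.gz file (illustrative type)."""
    contig: str
    start: int
    stop: int
    name: str
    count: int            # alignments counted in this interval
    improper_pairs: int   # improperly paired alignments in this interval


def read_target_counts(path: str) -> Iterator[TargetCount]:
    """Yield one TargetCount per data row of a gzip-compressed counts file.

    Assumption: lines starting with '#' and the column-name line
    (first field 'contig') are headers and carry no interval data.
    """
    with gzip.open(path, "rt", newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if not row or row[0].startswith("#") or row[0] == "contig":
                continue
            contig, start, stop, name, count, improper = row[:6]
            yield TargetCount(
                contig, int(start), int(stop), name, int(count), int(improper)
            )
```

As a usage example, the records can be filtered to list intervals in which most alignments are improperly paired. The file name sample.target.counts.gz is a placeholder.

```python
for rec in read_target_counts("sample.target.counts.gz"):
    if rec.count > 0 and rec.improper_pairs / rec.count > 0.5:
        print(rec.contig, rec.start, rec.stop, rec.name)
```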