This release brings a major update to our experimental neural-network-based VariantRecalibrator replacement, initial MAF support in Funcotator, as well as some updates to Mutect2 and the CNV tools.
As usual, a docker image for this release can be downloaded from https://hub.docker.com/r/broadinstitute/gatk/
Summary of changes in this release:
- A major update to our experimental neural-network-based suite of variant scoring tools, which will eventually replace the VariantRecalibrator (#4245)
  - The NeuralNetInferenceTool has been renamed to CNNScoreVariants
  - Baseline models are now included in the distribution.
  - Added additional tools to write tensors and to train your own models given a VCF of validated calls, an unfiltered VCF and a confident region: CNNVariantTrain, CNNVariantWriteTensors and FilterVariantTranches
  - Read-level 2D models are now supported via the tensor-type read_tensor argument. 2D models at present are significantly slower than the 1D models.
- Funcotator:
  - Added prototype support for outputting MAF files (and many bug fixes) (#4472)
- Mutect2:
- CNV tools:
  - Replaced CollectFragmentCounts with CollectReadCounts. (#4564)
  - Allowed use of zero eigensamples in DenoiseReadCounts. (#4411)
  - Changed filtering of normal hets on overlap with copy-ratio intervals in ModelSegments to be consistent with filtering of case hets. (#4510)
  - Updated PostprocessGermlineCNVCalls (segments VCF writing, WDL scripts, unit tests, integration tests) (#4396)
- Miscellaneous changes:
  - Concordance: added option to analyze contributions of different filters (#4520)
  - Exposed the -pairHMM/--pair-hmm-implementation argument in HaplotypeCaller, which was previously hidden (#4494)
  - Set the default samjdk.compression_level to 2 (was previously 1) (#4547)
  - Upgraded to Spark 2.2.0 (#4314)
  - Changed Spark sharding of queryname-sorted bams to better handle secondary and supplementary reads (#4473)
  - Added logging output to the bam writing step for spark tools (#4501)
  - git-lfs is now required to compile the GATK
  - Added a registry for deprecated/unported tools. (#4505)
  - Updated the Hadoop GCS connector from 1.6.1 to 1.6.3. (#4590)
  - Added a large runtime resource directory to git-lfs, and exposed it to the Docker build. (#4530)
  - We now include full tool documentation in the GATK binary distribution zip (#4377)
  - Made our maven artifacts much smaller by preventing gradle uploadArchives from including distZip and distTar (#4569)
  - Added chr20 and chr21 alt contigs to the GRCh38 reference snippet used for testing (#4548)
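
Several of the changes above surface as command-line arguments, so a rough sketch of how they might be invoked may help. The Python snippet below only assembles and prints `gatk` command lines; it does not run them. The helper `gatk_command`, all file names, the choice of `LOGLESS_CACHING` as the pairHMM implementation, and the use of `PrintReads` with the launcher's `--java-options` to override `samjdk.compression_level` are illustrative assumptions rather than part of these release notes. Check each tool's `--help` output for the exact arguments in your version.

```python
import shlex


def gatk_command(tool, tool_args, java_options=None):
    """Build the argv list for a `gatk` launcher call (hypothetical helper; runs nothing)."""
    argv = ["gatk"]
    if java_options:
        # JVM system properties are assumed to be passed via the launcher's --java-options.
        argv += ["--java-options", " ".join(java_options)]
    argv.append(tool)
    for flag, value in tool_args:
        argv += [flag, value]
    return argv


# Score variants with a read-level 2D model (placeholder file names).
# The 2D model reads the alignments as well, so a BAM is supplied with -I.
score_2d = gatk_command(
    "CNNScoreVariants",
    [
        ("-R", "reference.fasta"),
        ("-I", "reads.bam"),
        ("-V", "input.vcf.gz"),
        ("-O", "scored.vcf.gz"),
        ("--tensor-type", "read_tensor"),
    ],
)

# Pick a pairHMM implementation explicitly in HaplotypeCaller.
# LOGLESS_CACHING is assumed to be one of the accepted values.
call_variants = gatk_command(
    "HaplotypeCaller",
    [
        ("-R", "reference.fasta"),
        ("-I", "reads.bam"),
        ("-O", "calls.vcf.gz"),
        ("--pair-hmm-implementation", "LOGLESS_CACHING"),
    ],
)

# Override the new default compression level (2) when writing a BAM.
write_bam = gatk_command(
    "PrintReads",
    [("-I", "reads.bam"), ("-O", "copy.bam")],
    java_options=["-Dsamjdk.compression_level=1"],
)

for argv in (score_2d, call_variants, write_bam):
    print(shlex.join(argv))
```

Building the argument list separately from executing it makes it easy to review or log the exact command before handing it to something like `subprocess.run`.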