github broadinstitute/gatk 4.0.3.0

6 years ago

This release brings a major update to our experimental neural-network-based VariantRecalibrator replacement, initial MAF support in Funcotator, as well as some updates to Mutect2 and the CNV tools.

As usual, a Docker image for this release can be downloaded from https://hub.docker.com/r/broadinstitute/gatk/

Summary of changes in this release:

  • A major update to our experimental neural-network-based suite of variant scoring tools, which will eventually replace the VariantRecalibrator (#4245)

    • The NeuralNetInferenceTool has been renamed to CNNScoreVariants
    • Baseline models are now included in the distribution.
    • Added tools to write tensors and to train your own models from a VCF of validated calls, an unfiltered VCF, and a confident region: CNNVariantTrain, CNNVariantWriteTensors, and FilterVariantTranches
    • Read-level 2D models are now supported via the --tensor-type read_tensor argument. At present, 2D models are significantly slower than the 1D models.
  • Funcotator:

    • Added prototype support for outputting MAF files (and many bug fixes) (#4472)
  • Mutect2:

    • CalculateContamination now emits its segmentation, which the Mutect2 germline model uses (#4509)
    • Added an option to emit (but still filter) all germline sites in Mutect2 (#4522)
    • Made the number of samples required to include a variant site in the Mutect2 PON adjustable (#4566)
    • Enabled Oncotator filtering in the Mutect2 WDL. (#4423)
  • CNV tools:

    • Replaced CollectFragmentCounts with CollectReadCounts. (#4564)
    • Allowed use of zero eigensamples in DenoiseReadCounts. (#4411)
    • Changed filtering of normal hets on overlap with copy-ratio intervals in ModelSegments to be consistent with filtering of case hets. (#4510)
    • Updated PostprocessGermlineCNVCalls (segments VCF writing, WDL scripts, unit tests, integration tests) (#4396)
  • Miscellaneous changes:

    • Concordance: added option to analyze contributions of different filters (#4520)
    • Exposed the -pairHMM/--pair-hmm-implementation argument in HaplotypeCaller, which was previously hidden (#4494)
    • Set the default samjdk.compression_level to 2 (was previously 1) (#4547)
    • Upgraded to Spark 2.2.0 (#4314)
    • Changed Spark sharding of queryname-sorted BAMs to better handle secondary and supplementary reads (#4473)
    • Added logging output to the BAM-writing step for Spark tools (#4501)
    • git-lfs is now required to compile the GATK
    • Added a registry for deprecated/unported tools. (#4505)
    • Updated the Hadoop GCS connector from 1.6.1 to 1.6.3. (#4590)
    • Added a large runtime resource directory to git-lfs, and exposed it to the Docker build. (#4530)
    • We now include full tool documentation in the GATK binary distribution zip (#4377)
    • Made our Maven artifacts much smaller by preventing Gradle's uploadArchives task from including distZip and distTar (#4569)
    • Added chr20 and chr21 alt contigs to the GRCh38 reference snippet used for testing (#4548)
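
As a rough illustration of the CNNScoreVariants item above, the following Python sketch assembles a command line for the renamed tool. It only builds the argument list and does not run GATK. The helper name cnn_score_variants_cmd and all file names are hypothetical. The -R, -V, -O and -I arguments are assumed to behave as they do in other GATK4 tools, and --tensor-type read_tensor is spelled as in these notes; check the tool documentation bundled with the release before relying on any of them.

```python
import shlex


def cnn_score_variants_cmd(reference, variants, output, reads=None):
    """Build an argument list for `gatk CNNScoreVariants` (hypothetical helper).

    Without `reads`, the tensor type is left at the tool's default.
    With `reads`, the BAM is passed and the read-level 2D model is
    requested via --tensor-type read_tensor, as described in these notes.
    """
    cmd = [
        "gatk", "CNNScoreVariants",
        "-R", reference,   # reference FASTA (assumed standard GATK4 flag)
        "-V", variants,    # VCF to score (assumed standard GATK4 flag)
        "-O", output,      # scored output VCF (assumed standard GATK4 flag)
    ]
    if reads is not None:
        # 2D models need the reads; expect them to be slower than 1D models.
        cmd += ["-I", reads, "--tensor-type", "read_tensor"]
    return cmd


if __name__ == "__main__":
    # Placeholder file names; print the commands rather than executing them.
    print(shlex.join(cnn_score_variants_cmd("ref.fasta", "calls.vcf.gz", "scored.vcf.gz")))
    print(shlex.join(cnn_score_variants_cmd("ref.fasta", "calls.vcf.gz", "scored.vcf.gz",
                                            reads="sample.bam")))
```

Returning a list rather than a single string keeps the arguments safe to hand to subprocess.run without shell quoting issues, should you choose to execute the command.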
