Highlights of this release include a preview version of a future neural-network-based VQSR replacement, the ability to generate a VCF from the GermlineCNVCaller output, allele-specific annotation support in GenomicsDBImport, as well as a number of important post-4.0 bug fixes. See below for the full list of changes.
As usual, a docker image for this release can be downloaded from https://hub.docker.com/r/broadinstitute/gatk/
Changes in this release:
- New experimental tool NeuralNetInference (#4097)
  - An eventual VQSR replacement.
  - Performs variant score inference with a 1D convolutional neural network using a pre-trained model. This is faster but not as high quality as the 2D model, which is coming along with training and tranche-style filtering in the next GATK release (#4245).
  - Tool name subject to change!
- GenomicsDBImport:
  - Add support for allele-specific annotations (#4261) (#3707)
  - Allow sample names with whitespace in the sample name map file (#3982)
  - Fix segfault crash on long path names (#4160)
  - Allow multiple import commands to be run in the same workspace directory (#4106)
  - Fix segfault crash during import when flag fields not declared in the VCF header (#3736)
  - Improve warning message when PLs are dropped for records with too many alleles (#3745)
- CNV tools:
- HaplotypeCaller:
  - Fix the --min-base-quality-score/-mbq argument, which previously had no effect (#4128). This fix also affects Mutect2.
  - Fix a "contig must be non-null and not equal to *, and start must be >= 1" error by patching an edge case in the ReadClipper code: when reverting soft-clipped bases of a read at the start of a contig, don't explode if you end up with an empty read (#4203)
- Mutect2:
  - Smarter contamination model (#4195)
  - Removed the --dbsnp and --comp arguments. The best practice now is to pass in gnomAD as the germline-resource.
  - Removed a number of other arguments that were HaplotypeCaller-specific and not appropriate for Mutect2, such as --emit-ref-confidence.
  - Mutect2 WDL: CRAM support (#4297)
  - Mutect2 WDL: Compressed vcf output and Funcotator options (#4271)
  - Miscellaneous WDL cleanup
- HaplotypeCallerSpark:
  - Fixes to the tool that make its output much closer to that of the non-Spark HaplotypeCaller (#4278). Note that this tool (unlike the non-Spark HaplotypeCaller) is still in beta, and should not be used for any real work. There are still major performance issues with the tool that in practice prevent running on certain kinds of large data and in certain modes.
  - Disallow writing a .vcf.gz when in GVCF mode, as this combination currently doesn't work (#4277)
- BwaSpark:
  - Set a more reasonable default set of read filters (#4286)
- PathSeq:
  - Add WDL for running the PathSeq pipeline with a README and example JSON input. (#4143)
- Fix piping between Picard tools run via the GATK by changing logging output to stderr (#4167)
- Disallow unindexed block-compressed tribble files as input to walkers (#4240) (#4224). This works around a bug in HTSJDK that could cause such files to appear truncated. Until the HTSJDK bug is fixed, block-compressed .vcf.gz files (and similar files) will need to be accompanied by an index, which can be generated using the IndexFeatureFile tool.
- Restore .list as an allowed extension for files containing multiple values for command-line arguments (#4270). The previous extension .args is also still allowed. This feature allows users to provide a file ending in .list or .args containing all of the values for an argument that accepts multiple values (for example: a list of BAM files), instead of typing all the values individually on the command line.
- Fix conda environment creation to work better with the release distribution. (#4233)
- IndexFeatureFile: more informative error message when trying to index a malformed file (#4187)
- Suggest using BED files as a way to resolve ambiguous interval queries. (#4183)
- Set Spark parameter userClassPathFirst = false #3933 (#3946)
- Update to HTSJDK 2.14.1 (#4210)
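
As an illustration of the restored .list extension mentioned above, here is a minimal Python sketch that writes a set of values to an argument file. It assumes the file format is one value per line, which the notes above do not spell out. The helper name write_argument_list, the BAM paths, and the output filename inputs.list are all hypothetical, chosen only for this example.

```python
from pathlib import Path
from typing import Iterable


def write_argument_list(values: Iterable[str], destination: Path) -> Path:
    """Write one argument value per line to a .list or .args file.

    Assumption: the argument file holds one value per line.
    """
    # Only the two extensions named in the release notes are accepted here.
    if destination.suffix not in {".list", ".args"}:
        raise ValueError(f"expected a .list or .args file, got {destination.name!r}")
    # Trim surrounding whitespace and drop empty entries.
    cleaned = [v.strip() for v in values if v.strip()]
    destination.write_text("".join(f"{v}\n" for v in cleaned))
    return destination


if __name__ == "__main__":
    # Hypothetical BAM paths, used only for illustration.
    bams = ["sample1.bam", "sample2.bam", "sample3.bam"]
    list_file = write_argument_list(bams, Path("inputs.list"))
    print(list_file.read_text(), end="")
```

The resulting file could then be supplied as the value of an argument that accepts multiple values (for example, the list of input BAM files), instead of repeating that argument once per file on the command line.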