github broadinstitute/gatk 4.0.12.0

5 years ago

Highlights of this release include support for outputting phased variants in HaplotypeCaller/Mutect2, restoring the --include-non-variant-sites argument to GenotypeGVCFs, a port of the GATK3 tool VariantEval, a new library (Disq, https://github.com/disq-bio/disq) for working with BAM/CRAM/VCF/etc. formats on Spark, and GCS (Google Cloud Storage) support in Funcotator.

As usual, a docker image for this release can be downloaded from https://hub.docker.com/r/broadinstitute/gatk/

Full list of changes in this release:

  • HaplotypeCaller/Mutect2

    • Output VCF spec-compliant phased variants in HaplotypeCaller and Mutect2
    • Added an experimental adaptive pruning option for local assembly (#5473)
    • Improved implementation of allele-specific new qual (#5460)
    • Use cigar complexity to break ties in uninformative reads' best haplotypes (#5359)
    • Improved handling of regions that are too short after trimming in HaplotypeCaller and in Mutect2 (Closes issue #5079)
    • Optimization in CigarUtils to shortcut to M-only CIGAR when provably optimal (#5466)
    • Changed SUPPORTED_ALLELES_TAG from SA to XA (#5418)
  • HaplotypeCaller

    • Fixed bug in GGA mode caused by split multiallelic sites with genotypes (#5365)
    • The debug command line argument is now passed correctly in HaplotypeCaller (fixed issue #4943) (#5455)
  • Mutect2

    • Big improvements to CalculateContamination's model for determining hom alt sites (#5413)
    • Reduce false negatives from mapping quality filter on long indels in Mutect2 (#5497)
    • Added a mismatch ratio option in realignment filter (#5501)
    • Made Mutect2 read position filter default much less stringent (#5487)
    • Fixed M2 bug for germline resources with AF=. (#5442)
    • Fix read position annotation bug in M2 filter (#5495)
    • Cleaner Mutect2 VCF fields (#5510)
    • Moved PerAlleleAnnotations to the INFO field (#5518)
    • Removed unnecessary inheritance of M2 filtering arguments collection (#5498)
  • GenotypeGVCFs

    • Restored the --include-non-variant-sites argument from GATK3 to GenotypeGVCFs (#5219)
  • Ported the GATK3 tool VariantEval to GATK4 (#5043)

  • Replaced the Hadoop-BAM library with the newly-developed Disq library (https://github.com/disq-bio/disq) for efficiently working with BAM/CRAM/VCF/etc. formats on Spark (#5138)

    • Improves Spark performance across the board, and fixes many edge-case bugs in Hadoop-BAM
  • Funcotator

    • Added GCS support to Funcotator data sources, so that data sources can now be accessed directly from GCS buckets (#5425)
    • Added support for annotating 5'/3' flanks (#5403)
    • Funcotator now creates default annotations for difficult variants. (#5374)
    • Funcotator can now create annotations for symbolic alleles and masked alleles (#5406)
    • Funcotator now can match between hg19 and b37 data sources. (#5491)
    • Added in regression tests and fixes for correctness of many annotations (#5302)
    • Now DE_NOVO_START_IN_FRAME and DE_NOVO_START_OUT_FRAME are correct. (#5357)
    • Added cDNA Strings for Intronic Variants (#5321)
    • VCF data sources create an ID field for the ID of the variant used for the annotation (#5327)
    • Funcotator now computes MT protein changes. (#5361)
    • Funcotator now correctly populates transcript position. (#5380)
    • Added a script that can create data sources from BED files. (#5438)
    • Updated testing Gencode data sources to fully exercise test data set (#5423)
    • Moved validation test data out of large files area. (#5381)
    • Updated top-level class documentation for Funcotator. (#4655)
    • Added scripts to lift over gnomAD. Also fixed bugs in Funcotator NIO. (#5514)
  • HaplotypeCallerSpark

    • Added a "strict mode" that allows HaplotypeCallerSpark to closely match the output of the regular HaplotypeCaller (#5416)
    • Now extends AssemblyRegionWalkerSpark (#5386)
  • MarkDuplicatesSpark: Added a few of the remaining unimplemented useful features from Picard (#5377)

  • CNV workflows

    • Changed FilterIntervals to operate on the intersection of intervals in all inputs. (#5408)
    • Fixed RAM usage parameter error in combine_tracks.wdl (#5358)
    • Various other improvements to combine_tracks.wdl (#5384)
    • Fixed gCNV WDL broken by Cromwell update on FireCloud. (#5407)
    • Replaced bash script in gCNV ScatterIntervals task with updated version of IntervalListTools. (#5414)
  • CNNScoreVariants

    • Check for and require hardware AVX support (#5291)
  • Changed SelectVariants so that it can handle multiple rsIDs separated by ';' in a VCF file (#5464)

  • Miscellaneous Changes

    • Added setIsUnplaced() to the GATKRead API to distinguish reads with no mapping information (#5320)
    • Fixed an integer overflow bug in the RMSMappingQuality annotation (#5435)
    • Fixed floating-point bug in MannWhitneyU on some JVMs. (#5371)
    • Standardized the output argument for LeftAlignIndels (#5474)
    • SplitIntervals now produces an .interval_list file (#5392)
    • Fixed a bug with GATK_GCS_STAGING in the GATK launcher script #1338 (#5452)
    • Added ExampleReadWalkerWithVariantsSpark.java and tests (#5289)
    • Add description getter and javadoc in GATKReportTable (#5443)
    • Fixed message in GATKAnnotationPluginDescription (#5444)
    • Replaced some uses of PrintWriter (#5461)
    • Refactor GVCFWriter to allow push/pull iteration. (#5311)
    • Add scripts/dataproc-cluster-ui to release bundle. (#5401)
    • Marked VariantAnnotator as a @DocumentedFeature (#5480)
    • Removed obsolete intel conda environment references. (#5482)
    • Deleted the CountSet class (#5467)
    • Test framework: disabled gcloud login on travis for non-cloud non-wdl tests (#5335)
    • Updated Spark scripts to reflect changes from #5386 and #5127. (#5415)
    • Fixed jexl logging and updated VariantFiltration doc. (#5422)
    • Fixed some dead links in the README (#5405)
  • Dependencies

    • Updated htsjdk to 2.18.1 (#5486)
    • Updated Picard to 2.18.16. (#5412)
    • Updated Intel-GKL dependency to 8.6 (#5463)
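
For readers unsure what the SelectVariants change (#5464) covers: the ID column of a VCF record may hold several identifiers joined by ';', so a record can match a requested rsID even when that rsID is not the whole field. The sketch below illustrates that matching rule only. It is not GATK's implementation, and the function names `split_ids` and `matches_any_id` are hypothetical.

```python
def split_ids(id_field):
    """Split a VCF ID column value into individual IDs.

    A value of '.' (or an empty string) means the record has no ID.
    """
    if id_field in ("", "."):
        return []
    return [part for part in id_field.split(";") if part]


def matches_any_id(id_field, wanted_ids):
    """Return True if any ID in the ';'-separated field is in wanted_ids."""
    wanted = set(wanted_ids)
    return any(part in wanted for part in split_ids(id_field))


if __name__ == "__main__":
    # Hypothetical rsIDs, used only for illustration.
    keep = {"rs123", "rs999"}
    for field in ("rs123;rs456", "rs456", "."):
        print(field, matches_any_id(field, keep))
```

Comparing the whole ID string against the requested set would miss the first record above, which is the kind of case the changelog entry describes.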
