This release brings a critical bug fix to the GenomicsDBImport tool related to sample ordering, plus a new tool, FixCallSetSampleOrdering, to repair VCFs generated using a pre-4.beta.6 version of the tool. See the description of the bug in #3682 to determine whether you are affected. Do not run FixCallSetSampleOrdering unless you are sure that you are affected by the bug in #3682.
Other highlights include upgrading to the latest version of the Picard tools, and adding engine support for reading Gencode GTF files.
A Docker image for this release can be found in the broadinstitute/gatk repository on Docker Hub. Within the image, cd into /gatk, then run gatk-launch commands as usual.
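As a rough sketch of the Docker workflow described above, the shell snippet below wraps the image in a small helper function. The image tag 4.beta.6 and the helper name run_gatk are assumptions made for illustration (check Docker Hub for the actual tag), and the sketch assumes the image defines no entrypoint that would intercept the command.

```shell
# Assumed image tag; verify the real tag name on Docker Hub before use.
GATK_IMAGE="broadinstitute/gatk:4.beta.6"

# Hypothetical helper: run gatk-launch inside a throwaway container.
# -w /gatk sets the working directory, which mirrors "cd into /gatk".
run_gatk() {
  docker run --rm -w /gatk "$GATK_IMAGE" ./gatk-launch "$@"
}
```

With the function defined, run_gatk --list should print the available tools. Working on local files would additionally require a volume mount (docker run -v), which is left out here.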
Note: Due to our current dependency on a snapshot of google-cloud-java, this release cannot be published to Maven Central.
Full list of changes for this release:
- Fixed sample name reordering bug in GenomicsDBImport (#3667)
- New tool FixCallSetSampleOrdering to repair VCFs affected by #3682 (#3675)
- Integrate latest Picard tools via Picard jar. (#3620)
- Adding in codec to read from Gencode GTF files. Fixes #3277 (#3410)
- Upgrade to HTSJDK version 2.12.0 (#3634)
- Upgrade to GKL version 0.7 (#3615)
- Upgrade to GenomicsDB version 0.7.0 (#3575)
- Upgrade Mockito from 1.10.19 -> 2.10.0. (#3581)
- Add GVCF support to VariantsSparkSink (#3450)
- Fix writing variants to GCS buckets (#3485)
- Support unmapped reads in Spark. (#3369)
- Correct gVCF header lines (#3472)
- Dump more evidence info for SV pipeline debugging (#3691)
- Add omitFromCommandLine=true for example tools (#3696)
- Change gatkDoc and gatkTabComplete build tasks to include Picard. (#3683)
- Adding data.table R package. (#3693)
- Added a missing newline in ParamUtils method. (#3685)
- Fix minor HTML issues in ReadFilter documentation (#3654)
- Add CRAM integration tests for HaplotypeCaller. (#3681)
- Fix SamAssertionUtils SortSam call. (#3665)
- Add ExtremeReadsTest (#3070)
- Removed the required FASTA reference input that was previously needed (for its dict) to sort variants in the output VCF; the header of the input SAM/BAM is now used instead (#3673)
- re-enable snappy use in htsjdk (#3635)
- Fix #3612 (#3613)
- pass read metadata to all code that needs to translate contig ids using read metadata (#3671)
- quick fix for broken read (mapped to no ref bases) (#3662)
- Fix log4j logging by removing extra copy from the classpath. #2622 (#3652)
- add suggestion to regularly update gcloud to README (#3663)
- Automatically distribute the BWA-MEM index image file to executors for BwaSpark (#3643)
- Have PSFilter strip mate number from read names (#3640)
- Added the tool PreprocessIntervals that bins the intervals given by the user to be used for coverage collection. (#3597)
- Cpx SV PR series, part-4 (#3464)
- fixed bug in which F1R2 and F2R1 annotation kept discarded alleles (#3636)
- imprecise deletion calling (#3628)
- Significant improvements to CalculateContamination (#3638)
- Adds supplementary alignment info into fastq output, also additional… (#3630)
- Adding tool to annotate with pair orientation info (#3614)
- add elapsed time to assembly info in intervals file (#3629)
- Created a VariantAnnotationArgumentCollection to reduce code duplication and added a StandardM2Annotation group (#3621)
- Docs for turning assembled haplotypes into variant alleles (#3577)
- Simplify spark_eval scripts and improve documentation. (#3580)
- Renames StructuralVariantContext to SVContext. (#3617)
- Added KernelSegmenter. (#3590)
- Fix bug in allele order independent comparison (#3616)
- Docs for local assembly (#3363)
- Added a method to VariantContextUtils which supports alt allele order independent comparison of variant contexts. (#3598)
- Fixed incorrect logger in CollectAllelicCounts and RecalibrationReport. (#3606)
- updating to newer htsjdk snapshot (#3588)
- clear diffuse high frequency kmers (#3604)
- update SmithWatermanAligner in preparation for native optimized aligner (#3600)
- added spark tool for extracting original SAM records based on a file containing read names (#3589)
- update README with correct path to install_R_packages.R #3601 (#3602)
- HostAlignmentReadFilter and PSScorer use only identity scores and exp… (#3537)
- Fixed alt-allele count in AllelicCountCollector and changed unspecified alleles in AllelicCount to N. (#3550)
- Fix bad version check in manage_sv_pipeline.sh (#3595)
- Use a handmade TestReferenceMultiSource in tests instead of a mock. (#3586)
- Repackage ReadFilter plugin tests (#3525)
- BamOut in M2 WDL and unsupported version with NIO for SpecOps Team (#3582)
- Changed the path for posting the test reports
- updates sv manager and cluster creation scripts to utilize dataproc cluster timed self-termination feature (#3579)
- Implemented watershed algorithm for finding local minima in 1D data based on topological persistence. (#3515)
- Reduce number of output partitions in PathSeqPipelineSpark (#3545)
- add gathering of imprecise evidence links and extend evidence intervals to make links coherent in most cases (#3469)
- Refactor PrimaryAlignmentReadFilter to PrimaryLineReadFilter (#3195)
- Update ReadFilters documentation (#3128)
- Changes in BwaMemIntegrationTest to avoid a 3-4 minutes runtime. (#3563)
- Make error informative for non-diploid family likelihoods #3320 (#3329)
- TableFeature javadoc and more tests (#3175)
- Re-enable ancient BED test in IndexFeatureFile. (#3507)
- add external evidence stream for CNVs (#3542)
- clip M2 alleles before emitting in case some alleles were dropped (#3509)
- Docs for M2 filtering (#3560)
- Fix static test blocks and @BeforeSuite usages to prevent excessive code execution when tests aren't included in a suite. (#3551)
- hide prototyping tools in sv package from help message (they remain runnable if you know they exist) (#3556)
- Add support for running tools with omitFromCommandLine=true (#3486)
- Adds utility methods to ReadUtils and CigarUtils. (#3531)
- Cpx SV PR series, part-3 (#3457)