github broadinstitute/picard 3.2.0

8 days ago

What's Changed

Important CRAM writing bug fix

A serious bug which can cause corrupted reads in CRAM files was discovered and fixed in this release. This bug was introduced in HTSJDK 3.0.0, and affects Picard versions 2.27.3 through 3.1.1 and GATK versions 4.3 through 4.5.
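As a quick way to check an installed version against the ranges stated above, here is a minimal sketch. The helper names (`parse_version`, `is_affected`) are hypothetical, and it assumes plain dotted numeric version strings (no suffix such as `-SNAPSHOT`).

```python
# Inclusive affected ranges, copied from the release note above.
AFFECTED = {
    "picard": ((2, 27, 3), (3, 1, 1)),
    "gatk": ((4, 3), (4, 5)),
}


def parse_version(text):
    """Turn '3.1.1' into (3, 1, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def is_affected(tool, version):
    """Return True if `version` of `tool` falls inside the affected range."""
    low, high = AFFECTED[tool]
    v = parse_version(version)
    # Compare only as many components as each bound specifies, so that
    # GATK 4.5.0.0 still counts as inside "4.3 through 4.5".
    return low <= v[:len(low)] and v[:len(high)] <= high
```

For example, `is_affected("picard", "3.1.1")` is true, while `is_affected("picard", "3.2.0")` is false.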

The bug occurs when a read is aligned starting at exactly position 1 on a reference contig. It therefore doesn't generally affect human autosome and X/Y contigs, because they tend to start with a large number of N bases and reads are not aligned at exactly position 1. The exceptions are T2T references and use cases such as mitochondrial calling.
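To illustrate the condition described above, the following rough sketch scans SAM-formatted text (for example, the output of `samtools view -h`) and lists mapped reads whose alignment starts at position 1. The function name is hypothetical, and the check covers only the single condition named here, not the full set of trigger conditions. It is no substitute for the GATK detector tool.

```python
def reads_starting_at_position_one(sam_lines):
    """Yield (read_name, contig) for mapped records whose POS field is 1."""
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # a SAM record has 11 mandatory fields
            continue
        name, contig = fields[0], fields[2]
        flag, pos = int(fields[1]), int(fields[3])
        is_unmapped = bool(flag & 0x4)  # SAM flag bit 0x4: segment unmapped
        # POS is the 1-based leftmost mapping position.
        if not is_unmapped and contig != "*" and pos == 1:
            yield name, contig
```

A contig that appears in the result is one where the condition above holds for at least one read.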

For more information on the conditions that trigger this bug, see this post.

GATK 4.6 includes a tool called CRAMIssue8768Detector that can scan a CRAM file and report whether it is affected, and if so which regions in the file are corrupt. If you suspect that some of your CRAM files may have been affected, please run this tool on them for confirmation!

See also samtools/htsjdk#1708 for more information.

Better support for remote files

Improvements to allow direct access to remote files continue. It's now possible to use a remote reference file without localizing it in many cases. Files available through http URLs are now directly accessible as well (e.g. https://example.com/my.bam). This allows direct access to signed URLs, although index and supporting files may not be discovered automatically.

New features for flow based reads

  • MarkDuplicates strategy of flow based reads that looks only at the qualities close to the end of the read by @ilyasoifer in #1942
  • CollectQualityYieldMetricsFlowSpace tool by @dror27 in #1932

New Options

  • KEEP_ZERO_LENGTH_INTERVALS flag when converting bed -> interval_list by @rickymagner in #1928
  • Make the VCF option in CollectSamErrorMetrics optional. by @nh13 in #1476
  • Add the EXT argument to CollectSamErrorMetrics. by @nh13 in #1478

Bug fixes

Bug fixes to several tools, as well as important CRAM fixes from an updated HTSJDK.

  • MarkDuplicates: Add read group ID instead of string "RG" by @michaelgatzen in #1937
  • Fix CollectHSMetrics - Don't use Coverage Cap on fold score by @JoeVieira in #1913
  • Fix for order flipping in SortingCollection used for MarkDuplicates by @wook-choi in #1945
  • Allow fingerprinting of SAM files that only have a partial dictionary match to the haplotype map by @yfarjoun in #1955
  • Fix a bug in the liftover logic by @yfarjoun in #1956

Documentation and improved error messages

  • Reject piped input (/dev/stdin) for BedToIntervalList by @kockan in #1918
  • Update AbstractAlignmentMerger.java Warning Message for Cross Species Contamination by @gokalpcelik in #1960
  • MergeBamAlignment documentation by @kachulis in #1922
  • Updated SamToFastq documentation by @kockan in #1920

Maintenance and dependency updates

New Contributors

Full Changelog: 3.1.1...3.2.0
