MultiQC/MultiQC v1.14 on GitHub

MultiQC new features

Rewrote the Dockerfile to build multi-arch images (amd64 + arm), run as a non-privileged user and build tools for non-precompiled Python binaries (#1541)
Add a new lint test to check that colour scale names are valid (#1835)
Update GitHub Actions to run tests on a single module if it is the only file affected by the PR (#915)
Add CI testing for Python 3.10 and 3.11
Optimize line-graph generation to remove an n^2 loop (#1668)
Parsing output file column headers is much faster.

MultiQC code cleanup

Remove Python 2-3 compatibility from __future__ imports
Remove unused #!/usr/bin/env python hashbangs from module files
Add new code formatting tool isort to standardise the order and formatting of Python module imports
Add Pycln pre-commit hook to remove unused imports

MultiQC updates

Bugfix: Make config.data_format work again (#1722)
Bump minimum version of Jinja2 to >=3.0.0 (#1642)
Disable search progress bar if running with --quiet or --no-ansi (#1638)
Allow path filters without full paths by trying to prefix analysis dir when filtering (#1308)
Fix sorting of table columns with text values
Don't crash if a barplot is given an empty list of categories (#1540)
New logos! MultiQC is now developed and maintained at Seqera Labs. Updated logos and email addresses accordingly.
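
To illustrate the path-filter change listed above (#1308), here is a minimal sketch of the idea, not MultiQC's actual code: a filter pattern is tried as written, then retried with each analysis directory prefixed. The function name `path_matches` and its parameters are hypothetical and exist only for this example.

```python
import fnmatch
import os


def path_matches(file_path, pattern, analysis_dirs):
    """Return True if file_path matches pattern as given, or with an analysis dir prefixed.

    Hypothetical helper for illustration only; not part of MultiQC.
    """
    # First try the pattern exactly as the user wrote it.
    if fnmatch.fnmatch(file_path, pattern):
        return True
    # Then retry with each analysis directory prefixed to the pattern.
    for analysis_dir in analysis_dirs:
        prefixed = os.path.join(analysis_dir, pattern)
        if fnmatch.fnmatch(file_path, prefixed):
            return True
    return False
```

With this approach a relative pattern such as `sampleA/*` can match files under an analysis directory like `/data/run1` without the user spelling out the full path.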

New Modules

Anglerfish
- A tool designed to assess pool balancing, contamination and insert sizes of Illumina library dry runs on Oxford Nanopore data.
BBDuk
- Combines most common data-quality-related trimming, filtering, and masking operations via kmers into a single high-performance tool.
Cell Ranger
- Works with data from 10X Genomics Chromium. Processes Chromium single cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more.
- New MultiQC module parses Cell Ranger quality reports from VDJ and count analysis
DIAMOND
- A high-throughput program for aligning DNA reads or protein sequences against a protein reference database.
DRAGEN-FastQC
- Illumina Bio-IT Platform that uses FPGA for accelerated primary and secondary analysis
- Finally merged the epic 2.5-year-old pull request, with 3.5k new lines of code.
- Please report any bugs you find!
Filtlong
- A tool for filtering long reads by quality.
GoPeaks
- GoPeaks is used to call peaks in CUT&TAG/CUT&RUN datasets.
HiFiasm
- A haplotype-resolved assembler for accurate HiFi reads
HUMID
- HUMID is a tool to quickly and easily remove duplicate reads from FastQ files, with or without UMIs.
mOTUs
- Microbial profiling through marker gene (MG)-based operational taxonomic units (mOTUs)
Nextclade
- Tool that assigns clades to SARS-CoV-2 samples
Porechop
- A tool for finding and removing adapters from Oxford Nanopore reads
PRINSEQ++
- PRINSEQ++ is a C++ implementation of the prinseq-lite.pl program for filtering, reformatting or trimming genomic and metagenomic sequence data.
UMI-tools
- Work with Unique Molecular Identifiers (UMIs) / Random Molecular Tags (RMTs) and single cell RNA-Seq cell barcodes.

Module updates

Bcftools stats
- Bugfix: Do not show empty bcftools stats variant depth plots (#1777)
- Bugfix: Avoid exception when PSC nMissing column is not present (#1832)
BclConvert
- Handle single-end read data correctly when setting cluster length instead of always assuming paired-end reads (#1697)
- Handle different R1 and R2 read-lengths correctly instead of assuming they are the same (#1774)
- Handle single-index paired-end data correctly
- Added a config option to enable the creation of barplots with undetermined barcodes (create_unknown_barcode_barplots with False as default) (#1709)
BUSCO
- Update BUSCO pass/warning/fail scheme to be clearer for users
Bustools
- Show median reads per barcode statistic
Custom content
- Create a report even if there's only Custom Content General Stats there
- Attempt to coerce line / scatter x-axes into floats so as not to lose labels (#1242)
- Multi-sample line-graph TSV files that have no sample name in row 1 column 1 now use row 1 as x-axis labels (#1242)
fastp
- Add total read count (after filtering) to general stats table (#1744)
- Don't crash for invalid JSON files (#1652)
FastQC
- Report median read-length for fastqc in addition to mean (#1745)
Kaiju
- Don't crash if we don't have any data for the top-5 barplot (#1540)
Kallisto
- Fix ZeroDivisionError when a sample has zero reads (#1746)
Kraken
- Fix duplicate heatmap to account for missing taxa (#1779)
- Make heatmap full width
- Handle empty kreports gracefully (#1637)
- Fix regex error with very large numbers of unclassified reads (#1639)
malt
- Fixed division by 0 in malt module (#1683)
miRTop
- Avoid KeyError - don't assume all fields present in logs (#1778)
Mosdepth
- Don't pad the General Stats table with zeros for missing data (#1810)
Picard
- HsMetrics: Allow custom columns in General Stats too, with HsMetrics_genstats_table_cols and HsMetrics_genstats_table_cols_hidden
Qualimap
- Added additional columns in General Stats for BamQC results that display region on-target stats if a region BED file has been supplied (hidden by default) (#1798)
- Bugfix: Remove General Stats rows for filtered samples (#1780)
RSeQC
- Update geneBody_coverage to plot normalized coverages using a similar formula to that used by RSeQC itself (#1792)
Sambamba Markdup
- Catch zero division in sambamba markdup (#1654)
Samtools
- Added an additional column for flagstat that displays the percentage of mapped reads in a BAM file (hidden by default) (#1733)
VEP
- Don't crash with ValueError if there are zero variants (#1681)
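
Several of the fixes above (Kallisto, malt, Sambamba Markdup) concern division by zero when a sample has no reads. The following is a minimal sketch of that kind of guard, assuming a hypothetical helper `percent_of` that is not part of MultiQC:

```python
def percent_of(part, total):
    """Return part as a percentage of total, or None when total is zero.

    Hypothetical helper for illustration only; not part of MultiQC.
    """
    try:
        return 100.0 * part / total
    except ZeroDivisionError:
        # A sample with zero reads has no meaningful percentage,
        # so report missing data instead of crashing.
        return None
```

Returning `None` rather than `0` is a design choice in this sketch: it keeps "no data" distinct from a real value of zero.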
