MultiQC new features
- Rewrote the Dockerfile to build multi-arch images (amd64 + arm), run through a non-privileged user and build tools for non-precompiled Python binaries (#1541)
- Add a new lint test to check that colour scale names are valid (#1835)
- Update GitHub Actions to run tests on a single module if it is the only file affected by the PR (#915)
- Add CI testing for Python 3.10 and 3.11
- Optimize line-graph generation to remove an n^2 loop (#1668)
- Parsing output file column headers is much faster.
MultiQC code cleanup
- Remove Python 2-3 compatibility from __future__ imports
- Remove unused #!/usr/bin/env python hashbangs from module files
- Add new code formatting tool isort to standardise the order and formatting of Python module imports
- Add Pycln pre-commit hook to remove unused imports
MultiQC updates
- Bugfix: Make config.data_format work again (#1722)
- Bump minimum version of Jinja2 to >=3.0.0 (#1642)
- Disable search progress bar if running with --quiet or --no-ansi (#1638)
- Allow path filters without full paths by trying to prefix analysis dir when filtering (#1308)
- Fix sorting of table columns with text values
- Don't crash if a barplot is given an empty list of categories (#1540)
- New logos! MultiQC is now developed and maintained at Seqera Labs. Updated logos and email addresses accordingly.
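The path-filter change (#1308) can be pictured with a small sketch. This is an illustration only, not MultiQC's actual code: the function name `path_matches` and its arguments are hypothetical, and it assumes glob-style patterns matched with the standard library's `fnmatch`.

```python
import fnmatch
import os


def path_matches(path, patterns, analysis_dir):
    """Return True if `path` matches a pattern as given, or with
    `analysis_dir` prefixed to it (hypothetical helper, for illustration)."""
    for pattern in patterns:
        # Try the pattern as written, then again with the analysis dir prepended
        candidates = [pattern, os.path.join(analysis_dir, pattern)]
        if any(fnmatch.fnmatch(path, candidate) for candidate in candidates):
            return True
    return False
```

Under these assumptions, a short filter such as `sampleA/*` would match `/data/run1/sampleA/fastqc.zip` when the analysis directory is `/data/run1`, without the user having to write out the full path.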
New Modules
- Anglerfish
- A tool designed to assess pool balancing, contamination and insert sizes of Illumina library dry runs on Oxford Nanopore data.
- BBDuk
- Combines most common data-quality-related trimming, filtering, and masking operations via kmers into a single high-performance tool.
- Cell Ranger
- Works with data from 10X Genomics Chromium. Processes Chromium single cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more.
- New MultiQC module parses Cell Ranger quality reports from VDJ and count analysis
- DIAMOND
- A high-throughput program for aligning DNA reads or protein sequences against a protein reference database.
- DRAGEN-FastQC
- Illumina Bio-IT Platform that uses FPGA for accelerated primary and secondary analysis
- Finally merged the epic 2.5-year-old pull request, with 3.5k new lines of code.
- Please report any bugs you find!
- Filtlong
- A tool for filtering long reads by quality.
- GoPeaks
- GoPeaks is used to call peaks in CUT&TAG/CUT&RUN datasets.
- HiFiasm
- A haplotype-resolved assembler for accurate HiFi reads
- HUMID
- HUMID is a tool to quickly and easily remove duplicate reads from FastQ files, with or without UMIs.
- mOTUs
- Microbial profiling through marker gene (MG)-based operational taxonomic units (mOTUs)
- Nextclade
- Tool that assigns clades to SARS-CoV-2 samples
- Porechop
- A tool for finding and removing adapters from Oxford Nanopore reads
- PRINSEQ++
- PRINSEQ++ is a C++ implementation of the prinseq-lite.pl program for filtering, reformatting or trimming genomic and metagenomic sequence data.
- UMI-tools
- Work with Unique Molecular Identifiers (UMIs) / Random Molecular Tags (RMTs) and single cell RNA-Seq cell barcodes.
Module updates
- Bcftools stats
- BclConvert
- Handle single-end read data correctly when setting cluster length instead of always assuming paired-end reads (#1697)
- Handle different R1 and R2 read-lengths correctly instead of assuming they are the same (#1774)
- Handle single-index paired-end data correctly
- Added a config option to enable the creation of barplots with undetermined barcodes (create_unknown_barcode_barplots with False as default) (#1709)
- BUSCO
- Update BUSCO pass/warning/fail scheme to be more clear for users
- Bustools
- Show median reads per barcode statistic
- Custom content
- fastp
- FastQC
- Report median read-length for fastqc in addition to mean (#1745)
- Kaiju
- Don't crash if we don't have any data for the top-5 barplot (#1540)
- Kallisto
- Fix ZeroDivisionError when a sample has zero reads (#1746)
- Kraken
- malt
- Fixed division by 0 in malt module (#1683)
- miRTop
- Avoid KeyError: don't assume all fields are present in logs (#1778)
- Mosdepth
- Don't pad the General Stats table with zeros for missing data (#1810)
- Picard
- HsMetrics: Allow custom columns in General Stats too, with HsMetrics_genstats_table_cols and HsMetrics_genstats_table_cols_hidden
- Qualimap
- RSeQC
- Update geneBody_coverage to plot normalized coverages using a similar formula to that used by RSeQC itself (#1792)
- Sambamba Markdup
- Catch zero division in sambamba markdup (#1654)
- Samtools
- Added additional column for flagstat that displays percentage of mapped reads in a BAM (hidden by default) (#1733)
- VEP
- Don't crash with ValueError if there are zero variants (#1681)
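Several of the fixes in this list (Kallisto #1746, malt #1683, Sambamba Markdup #1654) concern division by zero when a sample has no reads. A generic guard for that situation might look like the sketch below. The helper name `percent` is hypothetical and this is not the code used in those modules; it only shows the general pattern.

```python
def percent(part, total):
    """Return `part` as a percentage of `total`, or None when `total` is zero.

    Hypothetical helper for illustration: returning None rather than 0
    lets the caller treat an empty sample as missing data.
    """
    if not total:
        # Zero (or missing) denominator: no meaningful percentage exists
        return None
    return 100.0 * part / total
```

Whether to return None, 0 or skip the sample entirely is a design choice that depends on how the value is displayed downstream.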