github nextstrain/nextclade_data 2022-09-28--16-01-10--UTC
2022-09-27


New dataset version (tag 2022-09-27T12:00:00Z)

All SARS-CoV-2 datasets

  • Data update: New Pango lineages are included; see cov-lineages/pango-designation@efabcb6...cfe736 for the new designations.
  • Identical sequences have been removed from B.1* lineages to reduce the size of that part of the tree from ~1.6k to ~800.
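
The notes do not say how identical sequences were detected. As a rough illustration only, the sketch below assumes "identical" means an exact string match after case normalization and keeps the first record for each distinct sequence. The function name and the toy records are hypothetical and are not part of the Nextclade dataset tooling.

```python
def dedupe_identical(records):
    """Keep the first (name, sequence) pair for each distinct sequence.

    Assumption: two sequences are "identical" when their upper-cased
    strings match exactly. The real dataset build may use other criteria.
    """
    seen = set()
    kept = []
    for name, seq in records:
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            kept.append((name, seq))
    return kept


# Toy records, invented for illustration.
records = [("a", "ACGT"), ("b", "acgt"), ("c", "ACGA")]
unique = dedupe_identical(records)
print([name for name, _ in unique])
```

Keeping the first occurrence preserves input order, so the surviving representative of each group is predictable.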
BA.2 dataset (experimental)

Monkeypox datasets

hMPXV B.1 dataset
  • Mutations to a genotype found in MPXV-UK_P2 or MPXV-M5312_HM12_Rivers are now "labelled" as rev (reversion to reference). This should help identify positions wrongly called as reference when using the B.1 dataset. Until now, these artefacts were only visible as reversions when using the hMPXV or all-clades datasets.
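
The labelling idea can be sketched as a lookup: if a sample's mutated base matches the state carried by one of the older reference genomes, the mutation is tagged rev. This is a minimal sketch under stated assumptions, not Nextclade's implementation; the positions, bases, and function name are hypothetical.

```python
# Hypothetical ancestral states: position -> nucleotide found in the
# older reference genomes. These values are invented for illustration.
ANCESTRAL = {1200: "C", 5400: "T"}


def label_mutation(pos, alt, ancestral=ANCESTRAL):
    """Return "rev" when the mutated base equals the ancestral state, else None."""
    return "rev" if ancestral.get(pos) == alt else None


# (position, mutated base) pairs observed relative to the B.1 reference.
mutations = [(1200, "C"), (5400, "A"), (9000, "G")]
labels = [label_mutation(pos, alt) for pos, alt in mutations]
print(labels)
```

A cluster of rev labels in one sample would then be a hint that part of the genome was wrongly filled in with reference bases.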
MPXV (All clades)
  • Frame shifts and stop codons that are encountered in a majority of sequences from clades IIa or I are now annotated as "known" mutations, which means that they do not influence the quality score. This should help increase the signal-to-noise ratio when uploading sequences from either of these clades.
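
One way to picture the effect of "known" annotations: events on the known list are set aside, and only the remaining events would count towards a quality penalty. The sketch below is an assumption-laden illustration; the gene names, event tuples, and function name are hypothetical, and the actual scoring in Nextclade is not reproduced here.

```python
# Hypothetical "known" events as (kind, gene) pairs; not from the real dataset.
KNOWN_EVENTS = {("frame_shift", "GENE_A"), ("stop_codon", "GENE_B")}


def split_events(events, known=KNOWN_EVENTS):
    """Separate events into known (ignored for QC) and unexpected (counted)."""
    known_hits = [e for e in events if e in known]
    unexpected = [e for e in events if e not in known]
    return known_hits, unexpected


events = [("frame_shift", "GENE_A"), ("stop_codon", "GENE_C")]
known_hits, unexpected = split_events(events)
# Only the unexpected events would feed into a quality penalty.
print(len(unexpected))
```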
