Monkeypox datasets
New dataset version (tag 2022-08-19T12:00:00Z)
All monkeypox datasets
Clade names now follow the convention agreed during the WHO consultation:
- Clade 1 -> Clade I
- Clade 2 -> Clade IIa
- Clade 3 -> Clade IIb
The common ancestor of clade IIa and clade IIb is called clade II.
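If you have older results that still use the previous names, the renaming above can be applied with a small lookup table. This is only a sketch: the helper name rename_clade is hypothetical, and the exact strings in your own files may differ (for example, they may lack the "Clade" prefix), so adjust the keys accordingly.

```python
# Old-to-new clade names, copied from the list above.
CLADE_RENAMES = {
    "Clade 1": "Clade I",
    "Clade 2": "Clade IIa",
    "Clade 3": "Clade IIb",
}

# Clade IIa and clade IIb share the common ancestor clade II.
PARENT_CLADE = {
    "Clade IIa": "Clade II",
    "Clade IIb": "Clade II",
}


def rename_clade(old_name: str) -> str:
    """Return the new clade name, or the input unchanged if it is not an old name."""
    return CLADE_RENAMES.get(old_name, old_name)


if __name__ == "__main__":
    for old in ("Clade 1", "Clade 2", "Clade 3"):
        new = rename_clade(old)
        parent = PARENT_CLADE.get(new, "(none)")
        print(f"{old} -> {new} (parent: {parent})")
```

Names that are not in the table are passed through untouched, so the helper is safe to run on files that already use the new convention.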
The clade/lineage hierarchy now has a middle level called outbreak. For now there is only one outbreak, hMPXV-1. In the future, other clusters that are worth naming may receive an outbreak name, even if they do not get lineages of their own.
This middle level is shown in Nextclade Web and written to the TSV/CSV output files in the same way as lineages. The field is called outbreak. If a sequence does not belong to an outbreak, the field is empty.
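As an illustration of how the outbreak field might be consumed downstream, the sketch below tallies sequences per outbreak from TSV text and treats an empty field as "no outbreak". The function name count_outbreaks and the sample rows are made up for this example, and the columns other than outbreak are assumptions about your file; only the outbreak column is actually read.

```python
import csv
import io
from collections import Counter


def count_outbreaks(tsv_text: str) -> Counter:
    """Count sequences per outbreak; an empty field is reported as '(none)'."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    counts = Counter()
    for row in reader:
        # The field is empty when the sequence does not belong to an outbreak.
        outbreak = (row.get("outbreak") or "").strip()
        counts[outbreak or "(none)"] += 1
    return counts


# Made-up rows for illustration only; real output has many more columns.
example = (
    "seqName\tclade\toutbreak\tlineage\n"
    "sample_A\tIIb\thMPXV-1\tB.1\n"
    "sample_B\tI\t\t\n"
)

if __name__ == "__main__":
    for name, n in count_outbreaks(example).items():
        print(name, n)
```

For a real file, read its contents and pass the text to the function; for CSV output, change the delimiter to match the separator used in your file.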
Sequences released by GenBank up to 2022-08-18 are included in the new dataset.
MPXV (All clades)
The reconstructed ancestor is now assigned the clade outgroup. Until now its clade field was empty.
The reference.fasta ID has been renamed to reconstructed_ancestral_mpox_in_NC_063383_coordinates, as requested in issue #35. This should not affect most users: the change only alters the name of the reference sequence written to the alignment, and only if you use Nextclade CLI with the --include-reference flag.
hMPXV-1 B.1 dataset
The reference.fasta ID has been renamed to MPXV_USA_2022_MA001_in_NC_063383_coordinates. See the MPXV (All clades) section above for a description of the impact (for most users, none).
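If you have a downstream script that strips the reference from an alignment by its old name, it will need the new IDs. The sketch below shows one way to do that, assuming the reference record's FASTA header begins with exactly the ID given in these notes. The function name drop_reference is hypothetical.

```python
# New reference IDs, copied from the notes above.
REFERENCE_IDS = {
    "reconstructed_ancestral_mpox_in_NC_063383_coordinates",
    "MPXV_USA_2022_MA001_in_NC_063383_coordinates",
}


def drop_reference(fasta_text: str) -> str:
    """Return the FASTA text without records whose ID is a known reference ID."""
    kept = []
    keep = True
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Assumption: the ID is the first whitespace-delimited token of the header.
            tokens = line[1:].split()
            seq_id = tokens[0] if tokens else ""
            keep = seq_id not in REFERENCE_IDS
        if keep:
            kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")


if __name__ == "__main__":
    # Made-up alignment for illustration only.
    aligned = (
        ">MPXV_USA_2022_MA001_in_NC_063383_coordinates\nACGT\n"
        ">sample_A\nAC-T\n"
    )
    print(drop_reference(aligned), end="")
```

Scripts that do not look at the reference record's name need no change.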