stanza 1.9.1 on Python PyPI

multilingual coref!

Added models which cover several different languages: one for the combined Germanic and Romance languages, and one for the Slavic languages available in UDCoref #1406

streamlit visualizer for semgrex/ssurgeon #1396
updates to the constituency parser ensemble #1387
accuracy improvements to the IN_ORDER oracle #1391
Split-only MWT model, which cannot possibly hallucinate text, as sometimes happens for OOV words. Currently available for EN and HE #1417 #1419
download_method=None now turns off HF downloads as well, for use on machines with no internet access #1408 #1399
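
A minimal sketch of the offline setup mentioned above, assuming the needed models are already in the local resources directory. The helper names `offline_pipeline_kwargs` and `build_offline_pipeline` are hypothetical and only for illustration; `stanza.Pipeline` and its `download_method` argument are the real API.

```python
def offline_pipeline_kwargs(lang="en", processors="tokenize"):
    """Keyword arguments for a Pipeline that should not touch the network."""
    return {
        "lang": lang,
        "processors": processors,
        # None disables downloads; per this release that also covers HF models
        "download_method": None,
    }


def build_offline_pipeline(lang="en", processors="tokenize"):
    # Imported lazily so the helper above works without stanza installed.
    import stanza

    return stanza.Pipeline(**offline_pipeline_kwargs(lang, processors))
```

With no downloads allowed, the pipeline can only load what is already on disk, so the models would need to be fetched beforehand (for example with `stanza.download("en")` on a connected machine) and copied over.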

update tqdm usage to remove some duplicate code: #1413 3de69ca
long list of incorrectly tokenized Spanish words added directly to the combined Spanish training data to improve their tokenization: #1410
Occasionally train the tokenizer with the sentence-final punctuation of a batch removed. This helps the tokenizer avoid learning to tokenize the last character regardless of whether it is punctuation. This was also related to the Spanish tokenization issue 56350a0
actually include the visualization: #1421 thank you @bollwyvl
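
The idea behind the tokenizer change above (occasionally removing the final punctuation of a batch) can be sketched roughly as follows. This is an illustration only, not the code from the commit: the function name, the `FINAL_PUNCT` set, the `prob` value, and the representation of a batch as lists of tokens are all assumptions.

```python
import random

# Assumed set of sentence-final punctuation marks, for illustration only.
FINAL_PUNCT = {".", "!", "?", "…", "。"}


def maybe_drop_final_punct(batch, prob=0.1, rng=None):
    """Return a copy of `batch` (a list of token lists), sometimes without
    the final punctuation token of the last sentence."""
    rng = rng or random.Random()
    result = [list(sentence) for sentence in batch]  # never mutate the input
    if not result:
        return result
    last = result[-1]
    # Only drop when something remains and the last token is punctuation.
    if len(last) > 1 and last[-1] in FINAL_PUNCT and rng.random() < prob:
        last.pop()
    return result
```

Seeing some batches end without punctuation gives the model examples where the last character is not a separate token.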