0.13.2, 2016-08-19
- Renamed wordtopics to word_topics in ldamallet, and fixed issue #764. (@bhargavvader, #771)
  - wordtopics is still assigned the value of word_topics to keep backward compatibility, for now
- Renamed the topics and topn parameters to num_topics and num_words in show_topics() and print_topics() (@droudy, #755)
  - In hdpmodel and dtmmodel
  - NOT BACKWARDS COMPATIBLE!
- Added random_state parameter to LdaState initializer and check_random_state() (@droudy, #113)
- Topic coherence update with c_uci, c_npmi measures. LdaMallet, LdaVowpalWabbit support. Add topics parameter to coherencemodel. Can now provide tokenized topics to calculate coherence value. Faster backtracking. (@dsquareindia, #750, #793)
- Added a check for empty (no words) documents before starting to run the DTM wrapper if model = "fixed" is used (DIM model), as this causes an error when such documents are reached in training. (@Eickho, #806)
- New parameters limit, datatype for load_word2vec_format(); lockf for intersect_word2vec_format (@gojomo, #817)
- Changed use_lowercase option in word2vec accuracy to case_insensitive to account for case variations in training vocabulary (@jayantj, #804)
- Link to Doc2Vec on airline tweets example in tutorials page (@544895340, #823)
- Fixed a small error in the Doc2vec notebook tutorial (@charlessutton, #816)
- Bugfix: full2sparse_clipped now uses absolute values (@tmylk, #811)
- WMD docstring: add tutorial link and query example (@tmylk, #813)
- Annoy integration to speed up word2vec and doc2vec similarity. Tutorial update (@droudy, #799, #792)
- Add converter of LDA model between Mallet, Vowpal Wabbit and gensim (@dsquareindia, #798, #766)
- Distributed LDA in different network segments without broadcast (@menshikh-iv, #782)
- Update Corpora_and_Vector_Spaces.ipynb (@megansquire, #772)
- DTM wrapper bug fixes caused by renaming num_words in #755 (@bhargavvader, #770)
- Add LsiModel.docs_processed attribute (@hobson, #763)
- Dynamic Topic Modelling in Python. Google Summer of Code 2016 project. (@bhargavvader, #739, #831)
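
To illustrate why the abs-value bugfix in #811 matters, here is a minimal pure-Python sketch of clipping a dense vector to its largest-magnitude entries. It is not gensim's implementation; the function names `top_n_by_abs` and `top_n_signed` are hypothetical and exist only for this example.

```python
def top_n_signed(vec, topn):
    """Naive clipping: keep the topn largest *signed* values (hypothetical helper)."""
    ranked = sorted(range(len(vec)), key=lambda i: vec[i], reverse=True)
    return [(i, vec[i]) for i in ranked[:max(topn, 0)]]


def top_n_by_abs(vec, topn):
    """Keep the topn entries of greatest magnitude (hypothetical helper).

    Returns (index, value) pairs ordered by decreasing absolute value.
    """
    ranked = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    return [(i, vec[i]) for i in ranked[:max(topn, 0)]]


dense = [0.1, -0.9, 0.5, 0.0]

# Ranking by signed value drops the strongest weight because it is negative.
signed = top_n_signed(dense, 2)

# Ranking by absolute value keeps it.
by_abs = top_n_by_abs(dense, 2)
```

With the sample vector, the signed ranking keeps indices 2 and 0 and discards the entry at index 1, while the abs-value ranking keeps indices 1 and 2.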