piskvorky/gensim 1.0.0 on GitHub

1.0.0, 2017-02-24

Deprecated methods:

To share word vector querying code between different training algorithms (Word2Vec, FastText, WordRank, VarEmbed), we have moved storage and querying of word vectors into a separate class, KeyedVectors.

Two methods and several attributes of the Word2Vec class have been deprecated. The methods are load_word2vec_format and save_word2vec_format. The attributes are syn0norm, syn0, vocab and index2word. They have been moved to the KeyedVectors class.

After upgrading to this release, you might get exceptions about deprecated methods or missing attributes, for example:

DeprecationWarning: Deprecated. Use model.wv.save_word2vec_format instead.
AttributeError: 'Word2Vec' object has no attribute 'vocab'

To remove the exceptions, you should use:
KeyedVectors.load_word2vec_format instead of Word2Vec.load_word2vec_format
word2vec_model.wv.save_word2vec_format instead of word2vec_model.save_word2vec_format
model.wv.syn0norm instead of model.syn0norm
model.wv.syn0 instead of model.syn0
model.wv.vocab instead of model.vocab
model.wv.index2word instead of model.index2word

Changelog of this release:

New features:

Add Author-topic modeling (@olavurmortensen,#893)
Add FastText word embedding wrapper (@jayantj,#847)
Add WordRank word embedding wrapper (@parulsethi,#1066, #1125)
Add Varembed word embedding wrapper (@anmol01gulati, #1067)
Add sklearn wrapper for LDAModel (@AadityaJ,#932)

Deprecated features:

Move load_word2vec_format and save_word2vec_format out of Word2Vec class to KeyedVectors (@tmylk,#1107)
Move properties syn0norm, syn0, vocab, index2word from Word2Vec class to KeyedVectors (@tmylk,#1147)
Remove support for Python 2.6, 3.3 and 3.4 (@tmylk,#1145)

Improvements:

Python 3.6 support (@tmylk, #1077)
Phrases and Phraser allow a generator corpus (@ELind77, #1099)
Ignore DocvecsArray.doctag_syn0norm in save. Fix #789 (@accraze,#1053)
Fix bug in LsiModel that occurs when id2word is a Python 3 dictionary. (@cvangysel,#1103)
Fix broken link to paper in readme (@bhargavvader,#1101)
Lazy formatting in evaluate_word_pairs (@akutuzov,#1084)
Deacc option to keywords pre-processing (@bhargavvader,#1076)
Generate Deprecated exception when using Word2Vec.load_word2vec_format (@tmylk, #1165)
Fix hdpmodel constructor docstring for print_topics (@toliwa, #1152)
Default to per_word_topics=False in LDA get_item for performance (@menshikh-iv, #1154)
Fix bound computation in Author Topic models. (@olavurmortensen, #1156)
Write UTF-8 byte strings in tensorboard conversion (@tmylk,#1144)
Make top_topics and sparse2full compatible with numpy 1.12 strictly int indexing (@tmylk,#1146)

Tutorial and doc improvements:

Clarifying comment in is_corpus func in utils.py (@greninja,#1109)
Tutorial Topics_and_Transformations fix markdown and add references (@lgmoneda,#1120)
Fix doc2vec-lee.ipynb results to match previous behavior (@bahbbc,#1119)
Remove Pattern lib dependency in News Classification tutorial (@luizcavalcanti,#1118)
Corpora_and_Vector_Spaces tutorial text clarification (@lgmoneda,#1116)
Update Transformation and Topics link from quick start notebook (@mariana393,#1115)
Quick Start Text clarification and typo correction (@luizcavalcanti,#1114)
Fix typos in Author-topic tutorial (@Fil,#1102)
Address benchmark inconsistencies in Annoy tutorial (@droudy,#1113)
Add note about Annoy speed depending on numpy BLAS setup in annoytutorial.ipynb (@greninja,#1137)
Add documentation for WikiCorpus metadata. (@kirit93, #1163)
