github jina-ai/jina v1.2.0
🎉 Jina 1.2

3 years ago

We are excited to release Jina 1.2. Jina is the easier way to do neural search in the cloud. Highlights of this release include:

  • Improve performance when handling sparse embeddings.
  • Add support for the Hugging Face 🤗 Inference API.
  • Add support for spell checking.

Release 1.2

⬆️ Major Features and Improvements

Build your search system with sparse embeddings

Here at Jina, our primary goal is to develop a universal framework to support all your neural search use cases. From Jina 1.2 onwards, you can create a neural search application with s p a r s e embeddings (see what I did there?). This is especially handy in use cases such as product catalogs, which are normally encoded in a one-hot vector format. If you are interested in deploying a sparse vector app, check out our documentation guide here. The related pull requests can be found here: #2207, #2233, #2239, #2240, #2271, #2296, #2297, #2309, #2316.
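To illustrate why sparse storage suits one-hot style catalog data, here is a minimal sketch in plain Python. It does not use Jina's API; the vocabulary, the `encode` and `sparse_dot` helpers, and the catalog entries are all hypothetical names made up for this example. Each vector stores only its non-zero positions, so the cost of scoring grows with the number of active attributes rather than the vocabulary size.

```python
from typing import Dict, Iterable

# Hypothetical attribute vocabulary for a tiny product catalog.
vocabulary = ["shoes", "shirts", "red", "blue", "cotton", "leather"]
index_of = {term: i for i, term in enumerate(vocabulary)}

SparseVector = Dict[int, float]  # maps a non-zero position to its value


def encode(attributes: Iterable[str]) -> SparseVector:
    """Encode attributes as a sparse multi-hot vector (non-zeros only)."""
    return {index_of[a]: 1.0 for a in attributes}


def sparse_dot(a: SparseVector, b: SparseVector) -> float:
    """Dot product that only visits the non-zeros of the shorter vector."""
    if len(a) > len(b):
        a, b = b, a
    return sum(value * b.get(i, 0.0) for i, value in a.items())


catalog = {
    "red leather shoes": encode(["shoes", "red", "leather"]),
    "blue cotton shirt": encode(["shirts", "blue", "cotton"]),
}

query = encode(["shoes", "red"])
scores = {name: sparse_dot(query, vec) for name, vec in catalog.items()}
best = max(scores, key=scores.get)
print(best, scores[best])
```

In a real application a library such as SciPy would normally provide the sparse matrix types; the dictionary form above is only meant to show the idea.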

100x performance gain for encoding your data with the Hugging Face 🤗 API 🏎️

Every machine learning engineer knows the pain of lying awake at night worrying if their data is slowly being encoded on their laptop nearby. Make this experience a distant memory of the past (like those days when you could hug your friends), and check out Hugging Face's new Inference API! They've done some fascinating work on speeding up inference for models in the transformers library. You can now benefit from this performance gain by using the TransformerTorchEncoder hub module in your Jina Flow and plugging in your Hugging Face API key. Check out the details here.

Handle misspelling in search queries

We all know that computers can be a little picky. We humans would know that Jan Solo is just a misspelling of the famous Star Wars character Han Solo. Handling such misspelled queries is a complex topic. This release includes a basic solution to this problem. You can now implement a crafter executor which will train a machine learning auto-correction model on your corpus to handle simple misspellings. Find out more at PySpellChecker and jina-hub/crafters/nlp/SpellChecker.
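The idea of learning corrections from your own corpus can be sketched as follows. This is a simplified, frequency-based corrector written for illustration only; it is not the code of the SpellChecker crafter or of PySpellChecker, and the function names (`train`, `edits1`, `correct`) and the sample corpus are assumptions. It counts word frequencies in the corpus, generates every string one edit away from an unknown query word, and picks the most frequent candidate that the corpus contains.

```python
import re
from collections import Counter

LETTERS = "abcdefghijklmnopqrstuvwxyz"


def train(corpus: str) -> Counter:
    """Count lowercase word frequencies in the corpus."""
    return Counter(re.findall(r"[a-z]+", corpus.lower()))


def edits1(word: str) -> set:
    """All strings one delete, transpose, replace or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in LETTERS]
    inserts = [l + c + r for l, r in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)


def correct(word: str, counts: Counter) -> str:
    """Return the word itself if known, else the most frequent near match."""
    word = word.lower()
    if word in counts:
        return word
    # Sort first so that ties between candidates are broken deterministically.
    candidates = sorted(w for w in edits1(word) if w in counts)
    if not candidates:
        return word  # nothing close enough: leave the word unchanged
    return max(candidates, key=counts.get)


corpus = (
    "Han Solo is a smuggler. "
    "Han Solo flies the Millennium Falcon with Chewbacca."
)
counts = train(corpus)
fixed_query = " ".join(correct(w, counts) for w in "Jan Solo".split())
print(fixed_query)  # han solo
```

Note that this sketch only handles single-edit mistakes and lowercases its output, which matches the "simple misspellings" scope described above.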

⚠️ Breaking Changes

  • Combine batching_multi_input decorator into batching. #2269
  • Make the IDs of Peas start at index 0. #2243
  • Improve the APIs on the executors level. #2313

📗 Documentation

🐞 Bug Fixes and Other Changes

Flow

  • Add CANCEL command to ControlRequestProto in order to remove the dealer from the router. #2257
  • Add reload API to Flow. #2278, #2280, #2285
  • Improve the flaky logging when creating a Flow. #2279 @mohamed--abdel-maksoud
  • Fix the wrong assignment to sibling. #2300
  • Add reload API to the RESTful APIs. #2301
  • Fix the flushing issue when a Flow is interrupted by KeyboardInterrupt. #2353

Executors

  • Refactor the evaluator's name. #1570
  • Remove the deprecated code related to training. #2311
  • Improve the usability of CompoundPod. #2329
  • Add PodFactory for abstracting the Pod construction. #2346
  • [Experimental] Split the indexer into dump indexers and query indexers. Introduce BaseDBMSIndexer, BinaryPbDBMSIndexer, KeyValueDBMSIndexer as dump indexers. Introduce BaseQueryIndexer, NumpyQueryIndexer, BinaryPbQueryIndexer as query indexers. #2260, #2310, #2312, #2307
  • [Experimental] Introduce replicas. #2224, #2338
  • [Experimental] Improve the APIs on the executors level. This refactoring greatly improves the usability of Jina when users want to implement customized executors. Check out more details at #2313, #2317, #2327, #2351

Tests

  • Adapt the unit tests to the latest RESTful APIs #2251
  • Add unit tests for plot function #2245
  • Replace the DocumentProto with Document in test_eval_collect_driver.py #2276
  • Add unit tests for zmqlet #2361 @winstonww

Others

  • Remove unnecessary code in BaseAggregateMatchesRankerDriver #2258
  • Improve the get_public_ip function by using multiple threads #2267, #2272, #2277
  • Add WhooshDriver to celebrate April Fools' Day. 😜 #2273
  • Fix scipy version #2293
  • Fix the parameter passing during CI #2328
  • Add EmbeddingClsType for using different embedding types #2318

🙏 Thanks to our Contributors

This release contains contributions from @hanxiao @florian-hoenicke @alexcg1 @davidbp @Yongxuanzhang @bwanglzu @FionnD @Kelton8Z @nan-wang @JoanFM @cristianmtr @deepankarm @mohamed--abdel-maksoud @carlosb1 @winstonww

🙏 Thanks to our Community

And thanks to all of you out there as well! Without you Jina couldn't do what we do. Your support means a lot to us.

🀝 Work with Jina

Want to work with Jina full-time? Check out our openings on our website.
