Jina 0.5.0 Release
We are excited to release Jina 0.5.0. Jina is the easier way to do neural search on the cloud. Highlights of this release include:
- Recursive Document structure
- Native data querying capabilities
- Migration of Executors to Jina Hub
- Support for Mindspore
⬆️ Major Features and Improvements
Completeness
- Introduce recursive Document structure. In short, the protobuf definitions of `Document` and `Chunk` are unified. In this new representation, `Document` has a recursive structure and the deprecated `Chunk` is now a nested `Document` one level deeper. This new proto enables cleaner driver design, yields more consistent low-level APIs, and provides great extensibility for future features. #652, #684, #700, #709, #729, #726
This is a breaking change. If you started using Jina before `0.4.1`, we highly suggest you read our migration guide.
- Add native data querying capabilities. With the new family of Drivers based on `BaseQueryLangDriver`, you can perform standard query operations on the `Document`. Here is a list of the new drivers:
Name | Description | Counterpart in other query languages |
---|---|---|
`FilterQL` | Filter the Document/Chunk by its attributes | `filter` / `where` |
`SelectQL`, `SelectRegQL`, `ExcludeQL`, `ExcludeRegQL` | Select attributes | `select` / `exclude` |
`SliceQL` | Take the first k docs/chunks | `limit` / `take` / slicing |
`SortQL` | Sort a list of `Document`s | `sort` / `order_by` |
`ReverseQL` | Reverse the list of collections | `reverse` |
Check more details at New Query Language Driver.
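To make the counterparts in the table concrete, here is a minimal sketch in plain Python of the five operations applied to a list of records. It does not use Jina's drivers or their configuration; the function names and the dict-based document are hypothetical and chosen only to mirror the table.

```python
from typing import Any, Callable, Dict, List

# Hypothetical stand-in: a document is just a dict of attributes here.
Record = Dict[str, Any]


def filter_docs(docs: List[Record], pred: Callable[[Record], bool]) -> List[Record]:
    """Counterpart of filter/where: keep records matching a predicate."""
    return [d for d in docs if pred(d)]


def select_fields(docs: List[Record], fields: List[str]) -> List[Record]:
    """Counterpart of select: keep only the named attributes."""
    return [{k: d[k] for k in fields if k in d} for d in docs]


def slice_docs(docs: List[Record], start: int, end: int) -> List[Record]:
    """Counterpart of limit/take/slicing."""
    return docs[start:end]


def sort_docs(docs: List[Record], key: str, reverse: bool = False) -> List[Record]:
    """Counterpart of sort/order_by on a single attribute."""
    return sorted(docs, key=lambda d: d[key], reverse=reverse)


def reverse_docs(docs: List[Record]) -> List[Record]:
    """Counterpart of reverse."""
    return docs[::-1]


docs = [
    {"id": 1, "score": 0.2, "lang": "en"},
    {"id": 2, "score": 0.9, "lang": "de"},
    {"id": 3, "score": 0.5, "lang": "en"},
]

# Chain the operations: filter, sort, take the top one, project two fields.
english = filter_docs(docs, lambda d: d["lang"] == "en")
ranked = sort_docs(english, "score", reverse=True)
result = select_fields(slice_docs(ranked, 0, 1), ["id", "score"])
```

In Jina itself these steps are expressed as drivers attached to an executor rather than as function calls; the sketch only shows the data transformations involved.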
Usability
- Migrate executors to Jina Hub. Jina Hub is an open registry for hosting Jina executors via container images. It enables users to ship and exchange reusable components across various Jina search applications. Jina Hub is referenced as a Git submodule in Jina. The Jina team will maintain the executors on Jina Hub. You can build your own executors as well. #852, #842, #848, #855, #857, #861, #860, #871, #872, #879, #880, #854
Check more details at Jina Hub.
Universal
⚠️ Breaking Changes
- Unify `yaml_file` and `image` into `uses`. `uses` accepts a YAML file path, a supported Executor's class name, the content of a YAML config, or a Docker image. For more details, run `jina pod --help` or see the Jina docs. #684
v0.4.0:

    f = (Flow()
         .add(name='from_class', yaml_file='_pass')
         .add(name='from_yaml', yaml_file='mwu.yml')
         .add(name='from_str', yaml_file='!OneHotTextEncoder')
         .add(name='from_docker', image='jinaai/hub.examples.mwu_encoder'))

v0.5.0:

    f = (Flow()
         .add(name='from_class', uses='_pass')
         .add(name='from_yaml', uses='mwu.yml')
         .add(name='from_str', uses='!OneHotTextEncoder')
         .add(name='from_docker', uses='jinaai/hub.examples.mwu_encoder'))
- Replace the `replicas` argument with `parallel` to avoid misunderstanding. `parallel` indicates how many Peas are running in parallel. #700
v0.4.0:

    !Flow
    pods:
      encode:
        uses: helloworld.encoder.yml
        replicas: 2

v0.5.0:

    !Flow
    pods:
      encode:
        uses: helloworld.encoder.yml
        parallel: 2
- Replace `join` with `needs` to improve readability. #762
v0.4.0:

    f = (Flow()
         .add(name='p1', uses='_pass')
         .add(name='p2', uses='_pass', needs='p1')
         .add(name='p3', uses='_pass', needs='p1')
         .needs(['p2', 'p3']))

v0.5.0:

    f = (Flow()
         .add(name='p1', uses='_pass')
         .add(name='p2', uses='_pass', needs='p1')
         .add(name='p3', uses='_pass', needs='p1')
         .join(needs=['p2', 'p3']))
- Introduce recursive Document structure. This affects a wide range of drivers and executors. For the full list, see #702.
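As a rough picture of what the recursive structure means for code that used to handle Documents and Chunks separately, here is a minimal sketch in plain Python. It does not use Jina's protobuf classes; the `Doc` dataclass, its `chunks` field, and the `walk` helper are hypothetical names used for illustration only.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Doc:
    """Hypothetical stand-in for a recursive Document (not Jina's proto)."""
    text: str
    chunks: List["Doc"] = field(default_factory=list)


def walk(doc: Doc, depth: int = 0) -> Iterator[Tuple[int, Doc]]:
    """Yield (depth, doc) pairs, visiting each doc before its nested chunks."""
    yield depth, doc
    for chunk in doc.chunks:
        yield from walk(chunk, depth + 1)


# What used to be a Chunk is simply a Doc nested one level deeper.
root = Doc(
    "hello world. goodbye world.",
    chunks=[Doc("hello world."), Doc("goodbye world.")],
)

levels = [(depth, d.text) for depth, d in walk(root)]
```

Because every level has the same type, one traversal routine can serve any depth, which is the kind of simplification the unified proto is meant to allow.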
Bug Fixes and Other Changes
Flow
- Refactor and improve the code for building the Flow. #685
- Fix `export_api`. #695
- Fix the Pea name. #698
- Fix the bug of two `join` operations in the same Flow. #730
- Add an alias `_pass` for `_forward`; add an argument, `name`, for `Flow.join()` so that one can customize the name of the Pods; add an argument, `uses`, for `Flow.join()`, which unifies the usage of `yaml_path` and `images`. #748
- Improve URL regex pattern matching. #780
Executors
- Add `FeatureAgglomeration`, `TSNEEncoder`, `RandomSparseEncoder`, `RandomGaussianEncoder` in the numeric encoders. #567, #838
- Fix multiple bugs in `MilvusIndexer`. #677, #679
- Support full range of models from 🤗 Transformers. #701
- Fix the type bug in `NgtIndexer`. #742
- Refactor the image crafter. #759
- Refactor the framework-based executors to make it easier to build executors from various DL frameworks. #771, #800
- Add `ImageFlipper`. #777
- Fix `cached_property`. #785
- Add `TorchObjectDetectionSegmenter` in the crafters for object detection. #770, #784, #788
- Fix the bug in cropping the image. #769
- Add a `query_by_id` function for `BaseVectorIndexer` so that we can query by Document id. #827
- Refactor `FaissIndexer`. #825
- Fix a bug in serialization of the indexer. #874
Drivers
- Fix the slicing bug in the `QueryLang` and improve the documentation. #696, #714, #822
- Add `ConcateEmbedDriver` for concatenating vectors. #748
- Fix the default value issue of `level_depth`. #817
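For readers unfamiliar with vector concatenation, the idea can be sketched in a few lines of plain Python. This is not the `ConcateEmbedDriver` implementation; it assumes that concatenating means joining several embeddings end to end into one longer vector, and `concat_embeddings` is a hypothetical helper name.

```python
from typing import List, Sequence


def concat_embeddings(embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Join several vectors end to end into a single longer vector."""
    merged: List[float] = []
    for vec in embeddings:
        merged.extend(vec)
    return merged


# Two hypothetical embeddings, e.g. from different encoders.
text_vec = [0.1, 0.2]
image_vec = [0.3, 0.4, 0.5]
doc_vec = concat_embeddings([text_vec, image_vec])
```

The resulting vector has a length equal to the sum of the input lengths, so any downstream indexer has to be configured for that combined dimension.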
Documentation
- Add a shortcut for search in the docs. You can start searching by hitting the `/` key. #683
- Add section on common practices. #812
- Add a wall of contributors. For our awesome contributors, we've now put your profiles on our README. Thanks to all of you! #832, #835
- Add more explanations for commit messages to make it easier to contribute. #826
- Rephrase unclear wording and fix typos. #722, #731, #740, #768, #818, #820, #821, #837, #849
- Improve visualization and fix cluttered TOC. #801
Protos
- Refactor `tags` from `map` to `Struct`. #719
Tests
- Add unit test for `QueryLang`. #710
- Add tests for `VectorSearchDriver` and `KVSearchDriver`. #733
- Add tests for `EncodeDriver`. #734
- Add tests for `CraftDriver`. #737
- Add tests for `SegmentDriver`. #738
- Add tests for `SliceQL`. #782
- Add tests for `Chunk2DocRankDriver`. #813
- Improve the unit tests for indexers and add type checking. #838, #844
Others
- Add tests and coverage report in CI. Jina's current test coverage is 76.52%. #713, #682
- Add typing to Jina. #761
- Fix the broken labeling action. #787
- Support ignoring packages on the dependency list. #859
- Add missing `Pillow` dependency. #858
Thanks to our Contributors
This release contains contributions from Alex C-G, Andrey Vasnetsov, Anish Pawar, BingHo1013, Emmanuel Adesile, Eric Shen, Han Xiao, JamesTang616, Joan Fontanals Martinez, Kavan72, Maanav Shah, Morry Wang, Nan Wang, Rohan Chaudhari, Shivam Ra, Shivam Raj, Yue Liu, Zenahr Barzani, coolmian, dima, fhaase2, hanxiao, joanna350, roccia, shivam-raj.
Thanks to our Community
And thanks to all of you out there as well! Without you Jina couldn't do what we do. Your support means a lot to us.
Work with Jina
Want to work with Jina full-time? Check out our openings on our website.