Jina v0.7.0

We are excited to release Jina v0.7.0. Jina is an easier way to do neural search on the cloud. Highlights of this release include:

Flow evaluation support
Support for preventing duplicate Documents in the index
Flow visualization support

Release v0.7.0

⬆️ Major Features and Improvements

Completeness

Evaluation is fully supported by Jina. jina.executors.evaluators and jina.drivers.evaluate have been introduced to make this happen. Now you can use different metrics to evaluate the Flow. No matter whether you want to evaluate the whole Flow or just part of it, the evaluation can be done smoothly without stopping the running Flow. #1043, #1086, #1087, #1090, #1092, #1099, #1100, #1102, #1114, #1134

Example code (the Python script, followed by the contents of `index-doc.yml` and `eval.yml`):

_{from jina.flow import Flow
from jina.proto import jina_pb2
from jina.drivers.helper import array2pb
import numpy as np


def get_index_docs():
    doc0 = jina_pb2.Document()
    doc0.tags['id'] = '0'
    doc0.embedding.CopyFrom(array2pb(np.array([1, 1])))
    doc1 = jina_pb2.Document()
    doc1.tags['id'] = '1'
    doc1.embedding.CopyFrom(array2pb(np.array([1, -1])))
    return [doc0, doc1]


# indexed two docs
f_index = (Flow().add(uses='index-doc.yml'))
with f_index:
    f_index.index(input_fn=get_index_docs)


def get_eval_docs():
    doc = jina_pb2.Document()
    doc.embedding.CopyFrom(array2pb(np.array([1, 1])))
    groundtruth = jina_pb2.Document()
    match0 = groundtruth.matches.add()
    match0.tags['id'] = '0'
    match1 = groundtruth.matches.add()
    match1.tags['id'] = '2'
    return [(doc, groundtruth), ]


def validate(resp):
    # retrieved docs with id `0` and `1`
    # relevant docs with id `0` and `2`
    # Precision@2 = 0.5
    assert resp.docs[0].evaluations[0].value == 0.5


# evaluate Precision@2
f_eval = (Flow()
          .add(uses='index-doc.yml')
          .add(uses='eval.yml'))
with f_eval:
    f_eval.search(
        input_fn=get_eval_docs,
        output_fn=validate,
        callback_on_body=True)}

_{!CompoundIndexer
components:
  - !NumpyIndexer
    metas:
      name: vecidx
  - !BinaryPbIndexer
    metas:
      name: docidx
requests:
  on:
    IndexRequest:
      - !VectorIndexDriver
        with:
          executor: vecidx
          traversal_paths: ['r']
      - !KVIndexDriver
        with:
          executor: docidx
          traversal_paths: ['r']
    SearchRequest:
      - !VectorSearchDriver
        with:
          executor: vecidx
          traversal_paths: ['r']
      - !KVSearchDriver
        with:
          executor: docidx
          traversal_paths: ['m']}

_{!PrecisionEvaluator
with:
  eval_at: 2
  id_tag: 'id'}
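
To make the value expected in `validate` concrete, here is a minimal plain-Python sketch of the Precision@2 arithmetic for this example. It does not use Jina's PrecisionEvaluator; `precision_at_k` is a hypothetical helper written only for illustration, and it assumes precision is the number of relevant ids among the top-k retrieved ids divided by k.

```python
def precision_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the first k retrieved ids that are relevant (illustrative helper)."""
    if k <= 0:
        raise ValueError('k must be positive')
    top_k = retrieved_ids[:k]
    relevant = set(relevant_ids)
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    return hits / k


# ids taken from the example above: the index returns docs '0' and '1',
# while the groundtruth lists docs '0' and '2' as relevant
retrieved = ['0', '1']
relevant = ['0', '2']
precision = precision_at_k(retrieved, relevant, k=2)
print(precision)  # 0.5
```

Only doc `0` appears in both lists, so one hit out of two retrieved results gives 0.5.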

To prevent duplicates in the index, UniquePbIndexer and UniqueVectorIndexer are introduced together with the corresponding drivers in jina.drivers.cache. Please refer to docs.jina.ai for more details. #1064, #1081, #1147

Example code:

_{from jina.flow import Flow
from jina.proto import jina_pb2

doc_0 = jina_pb2.Document()
doc_0.text = 'I am doc0'
doc_1 = jina_pb2.Document()
doc_1.text = 'I am doc1'


def assert_num_docs(rsp, num_docs):
    assert len(rsp.IndexRequest.docs) == num_docs


f = Flow().add(
    uses='NumpyIndexer', uses_before='_unique')

with f:
    f.index(
        [doc_0, doc_0, doc_1],
        output_fn=lambda rsp: assert_num_docs(rsp, num_docs=2))}
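
The effect of the `_unique` step can be pictured with a minimal plain-Python sketch. This is not how UniquePbIndexer or the drivers in jina.drivers.cache are implemented; `content_id` and `drop_duplicates` are hypothetical helpers, and the sketch assumes that two Documents count as duplicates when their text hashes to the same id.

```python
import hashlib


def content_id(text):
    """Derive a string id from the text (illustrative; assumes a SHA-1 hex digest)."""
    return hashlib.sha1(text.encode('utf-8')).hexdigest()


def drop_duplicates(texts):
    """Keep only the first occurrence of each distinct text, preserving order."""
    seen = set()
    unique = []
    for text in texts:
        doc_id = content_id(text)
        if doc_id in seen:
            continue  # this content was already indexed, so skip it
        seen.add(doc_id)
        unique.append(text)
    return unique


docs = ['I am doc0', 'I am doc0', 'I am doc1']
unique_docs = drop_duplicates(docs)
print(len(unique_docs))  # 2
```

As in the Flow example, three inputs with one repeated Document leave two entries to index.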

Usability

Add visualization for the Flow. Calling the plot() function of a Flow gives a visual overview of how the Flow is laid out. #1002, #1116

⚠️ Breaking Changes

Document.id, Document.parent_id and Relevance.ref_id are now string types instead of int. Please refer to docs.jina.ai for more details. #1005, #1034, #1136 Accordingly, the following changes are made:
- SortQL.field now uses dunder_get syntax rather than . expansion (e.g. a.b.c -> a__b__c, score.value -> score__value) and now supports dict and list access.
- first_doc_id, random_doc_id and override_doc_id have been removed from CLI.
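
As a rough illustration of the double-underscore convention, the sketch below resolves a path such as score__value against nested dicts, lists and objects. `dunder_lookup` is a hypothetical helper written for this note, not Jina's dunder_get, whose exact behavior may differ.

```python
from types import SimpleNamespace


def dunder_lookup(obj, path):
    """Resolve a path like 'a__b__c' one segment at a time (illustrative helper)."""
    current = obj
    for key in path.split('__'):
        if isinstance(current, dict):
            current = current[key]
        elif isinstance(current, (list, tuple)):
            current = current[int(key)]  # a numeric segment indexes into a sequence
        else:
            current = getattr(current, key)
    return current


# stand-in object with an attribute chain, a dict and a list (not a Jina Document)
doc = SimpleNamespace(
    score=SimpleNamespace(value=0.9),
    tags={'price': 12, 'sizes': ['S', 'M']},
)

print(dunder_lookup(doc, 'score__value'))    # 0.9
print(dunder_lookup(doc, 'tags__price'))     # 12
print(dunder_lookup(doc, 'tags__sizes__1'))  # M
```

A single separator that covers attribute, dict and list access is what lets one field string reach values that plain `.` expansion could not.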
Refactor logger config into YAML. Add --log-config to the jina pea CLI; by default it points to logging.default.yml. --log-sse, --log-profile and --log-with-own-name are deprecated. #1031

The loggers are mapped to the resource files as follows:

Filename	Logger in the code
logging.default.yml	`default_logger` and any logger defined with `JinaLogger()`
logging.docker.yml	`logger` used in the `ContainerPea`
logging.profile.yml	`profile_logger`
logging.remote.yml	`logger` used in the `RemotePea`

Refactor the code for traversing recursive Documents. granularity_range, adjacency_range, recur_on and recursion_order are deprecated and replaced by traversal_paths, which specifies exactly where the traversal should happen. #995, #998, #1001, #1003, #1006, #1007, #1027, #1036, #1044
Protobuf request_id is now a string type. --first-request-id has been removed from the client CLI. --query-uses and --index-uses in the hello-world CLI have been renamed to --uses-query and --uses-index. #1049
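
To illustrate what an explicit traversal path expresses, here is a minimal plain-Python sketch over nested dictionaries. It assumes that 'r' denotes the root Document, 'c' its chunks and 'm' its matches, and that longer paths such as 'cm' chain these steps; `traverse` and `ids_at` are hypothetical helpers, not the Jina driver code.

```python
def traverse(docs, path):
    """Yield the documents reached by following one traversal path (illustrative)."""
    if not path or path == 'r':
        # an exhausted path, or the root marker, selects the current documents
        yield from docs
        return
    step, rest = path[0], path[1:]
    field = {'c': 'chunks', 'm': 'matches'}[step]
    for doc in docs:
        yield from traverse(doc.get(field, []), rest)


# toy document tree using plain dicts instead of protobuf Documents
root = {
    'id': 'root',
    'chunks': [
        {'id': 'chunk0', 'matches': [{'id': 'chunk0-match0'}]},
        {'id': 'chunk1'},
    ],
    'matches': [{'id': 'match0'}],
}


def ids_at(paths):
    """Collect the ids visited for a list of traversal paths, in order."""
    return [d['id'] for p in paths for d in traverse([root], p)]


print(ids_at(['r']))        # ['root']
print(ids_at(['m']))        # ['match0']
print(ids_at(['c', 'cm']))  # ['chunk0', 'chunk1', 'chunk0-match0']
```

Each path names one exact level of the tree, so a driver configured with ['m'] touches only the matches of the root, whereas range-style options describe a span of levels.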

🐞 Bug Fixes and Other Changes

Flow

Refactor the log stream server with Fluentd. Fluentd acts as a daemon that collects logs from different parts of Jina and forwards them to a specific output. See docs.jina.ai for more details. #1002, #999
Add ordinal_idx_arg for batching decorator to support passing ordinal index to indexers #1089
Refactor request_id to uuid #1049
Refactor logger wrapper #1029
Add SSH tunneling for Pod. You can now specify SSH connection information. #1018
Switch to hash function for generating ids #1005, #1034
Support using --uses-before and --uses-after when --parallel=1. Previously, both options only took effect when parallel > 1. _pass and _forward now use RouteDriver by default. #1112
Rename replica_id to pea_id and fix the PeaRoleType #1015
Fix the bug in setting top_k #1133, #1138, #1145

Executors

Add checking for the existence of model paths #1077
Improve exception handling for the failure of loading pre-trained models #1065
Fix typing of indexers #1053
Fix the no attribute error for BaseOnnxEncoder #1107

Drivers

Fix bug in QueryDriver when passing dictionary argument. #1080

CLI

Improve the hubio module. jina hub login supports logging in with OAuth authentication. jina hub list lists the available pods in jina-hub. jina hub push supports building and pushing pod images via Hubapi deployed on AWS API Gateway. #1022, #1041, #1118, #1120, #1135
Add update checking for the jina CLI #1117

Tests & CICD

Refactor test for Python client #1095
Add tests for including examples during ci #1088
Fix dependency conflicts in ci by replacing [match-py-ver] with [cicd] #1101
Improve PR review process by adding CODEOWNERS #1108
Refactor to pytest in testing request #1045
Add unit test for helper #1046
Fix io test #1052
Fix test coverage #1054, #1056
Use pytest fixture to remove tmp files #1021
Refactor the unit tests to pytest style in test_protobuf #1121
Add docker helper test #1115
Add test in the ci for testing examples #1142
Add test in the ci for testing hello-world in docker with no devel installed #1139

Documentation

Add Portuguese translation for README #1097
Add Ukrainian translation for README.md #1124
Fix Russian README #1057
Fix broken links in README #1033, #1037, #105
Fix links in CHANGELOG and CONTRIBUTING #1032
Improve the docstring for rank drivers #1143

Others

Fix duplicate lines in cookiecutter #1063
Fix conflicts between copyright adding action and typing #1023
Move numpy importing inside function #1019
Rename jina_cli to cli #1017
Fix typing error in mypy #1009
Fix line spaces in code #1105

🙏 Thanks to our Contributors

This release contains contributions from Alex C-G, Alex McKenzie, CatStark, Christopher Lennan, Deepankar Mahapatro, Fernanda Kawasaki, Han Xiao, Joan Fontanals Martinez, Ján Jendrušák, Maximilian Werk, Nan Wang, Oleh Yaroshchuk, Pratik Bhavsar, RenrakuRunrat, Rutuja Surve, Sai Sandeep Mutyala, Sergei Averkiev, Susana Guzman, Wang Bo, jancijen, pswu11

🙏 Thanks to our Community

And thanks to all of you out there as well! Without you, Jina couldn't do what we do. Your support means a lot to us.

🤝 Work with Jina

Want to work with Jina full-time? Check out our openings on our website.

jina-ai/serve v0.7.0 🎉 release v0.7.0 on GitHub
