github deepset-ai/haystack v1.19.0

11 months ago

⭐️ Highlights

🔎 Elasticsearch 8 support

We are thrilled to share that Haystack now supports Elasticsearch 8, the latest major version of Elasticsearch, as a Document Store backend. To use Haystack with Elasticsearch 8, install the new elasticsearch8 extra:

pip install farm-haystack[elasticsearch8]

Importing ElasticsearchDocumentStore from haystack.document_stores will automatically choose the correct Document Store based on the version of the installed Elasticsearch client.
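
The version-based selection can be pictured with a small sketch. This is not Haystack's actual import logic: the helper names `client_major_version` and `pick_document_store` and the returned labels are hypothetical, and the sketch only illustrates dispatching on the major version of the installed `elasticsearch` client package.

```python
from importlib import metadata


def client_major_version(package="elasticsearch"):
    """Return the major version of an installed package, or None if it is missing."""
    try:
        return int(metadata.version(package).split(".")[0])
    except metadata.PackageNotFoundError:
        return None


def pick_document_store(major):
    """Map a client major version to a (hypothetical) Document Store label."""
    if major is None:
        raise ImportError("No Elasticsearch client is installed.")
    if major >= 8:
        return "ElasticsearchDocumentStore (Elasticsearch 8 client)"
    return "ElasticsearchDocumentStore (Elasticsearch 7 client)"


if __name__ == "__main__":
    major = client_major_version()
    if major is None:
        print("elasticsearch client not installed")
    else:
        print(pick_document_store(major))
```

In practice you only need the import from haystack.document_stores; the sketch just shows why the same import can resolve to different backends.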

🗂️ RecentnessRanker

We're excited to introduce a new feature to Haystack – a document recentness ranking component! We recognized the importance of ranking documents based on their recentness, especially in scenarios where timely information is critical. For instance, when searching through technical documentation for software releases or news articles, it's essential to prioritize the most up-to-date information. 👇

from haystack.nodes import RecentnessRanker

ranker = RecentnessRanker(
    date_meta_field="date",  # Key pointing to the date field in the metadata.
    ranking_mode="score",
    weight=0.5,  # A 0.5 weight means content relevance and age are averaged.
)

For more details, check out the documentation.
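
To build intuition for what weight=0.5 means, here is a simplified sketch of blending a relevance score with a recency score. It is not the RecentnessRanker implementation: the function `rank_by_recentness`, the dict-based documents, and the rank-based recency score are assumptions made for illustration only.

```python
from datetime import date


def rank_by_recentness(docs, weight=0.5):
    """Blend each document's relevance score with a rank-based recency score.

    docs: list of dicts with a "score" in [0, 1] and a "date" (datetime.date).
    weight: 0.0 keeps pure relevance, 1.0 ranks purely by recency.
    """
    n = len(docs)
    # The newest document gets recency 1.0, the oldest gets 1/n.
    by_age = sorted(docs, key=lambda d: d["date"], reverse=True)
    recency = {id(d): (n - i) / n for i, d in enumerate(by_age)}
    blended = [
        {**d, "score": (1 - weight) * d["score"] + weight * recency[id(d)]}
        for d in docs
    ]
    return sorted(blended, key=lambda d: d["score"], reverse=True)


docs = [
    {"id": "old-but-relevant", "score": 0.9, "date": date(2021, 1, 1)},
    {"id": "new-less-relevant", "score": 0.6, "date": date(2023, 7, 1)},
]
for doc in rank_by_recentness(docs, weight=0.5):
    print(doc["id"], round(doc["score"], 2))
```

With equal weighting, the newer document overtakes the older one even though its relevance score is lower; with weight=0.0 the original relevance order is preserved.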

🧠 Improved support for Anthropic Claude

We're thrilled to announce an important update to Haystack's Anthropic Claude support! This update follows the latest improvements in Anthropic Claude models, notably support for Claude 2 and the much larger context windows these models offer.

Moreover, we've integrated Claude models into our example scripts, making it easier for users to test these cutting-edge models. For instance, check out the updated examples/link_content_blog_post_summary.py script for a demo of Claude summarizing blog posts directly from hyperlinks.

We continue to support the older models (such as claude-v1) alongside the new Claude models. For more details, see the Anthropic Claude documentation.

🚀 Support for Llama 2 on AWS SageMaker

We are excited to share that Haystack now supports models of the Llama 2 family deployed to AWS SageMaker! Once you've deployed your Llama 2 models (including the chat variant) to AWS SageMaker, use them with PromptNode by providing the inference endpoint name, your aws_profile_name, and aws_custom_attributes. 👇

from haystack.nodes import PromptNode

# Pass the name of the SageMaker endpoint serving the Llama 2 model.
prompt_node = PromptNode(
    model_name_or_path="sagemaker-llama-2-endpoint-name",
    model_kwargs={
        "aws_profile_name": "my_aws_profile_name",
        "aws_custom_attributes": {"accept_eula": True},
    },
)
result = prompt_node("Berlin is the capital of")
print(result)

# or the Llama 2 chat model
prompt_node = PromptNode(
    model_name_or_path="sagemaker-llama-2-chat-endpoint-name",
    model_kwargs={
        "aws_profile_name": "my_aws_profile_name",
        "aws_custom_attributes": {"accept_eula": True},
    },
)
# A batch of conversations; each conversation is a list of role/content messages.
chat_conversation = [[
    {"role": "user", "content": "what is the recipe of mayonnaise?"},
]]
result = prompt_node(chat_conversation)
print(result)

For more details on model deployment, check out the documentation.
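
The chat example above passes a nested list: a batch of conversations, each a list of role/content dictionaries. A minimal sketch of a helper that builds this shape is shown below. The helper `build_chat_batch` is hypothetical and not part of Haystack, and the set of allowed roles is an assumption (only "user" appears in the example above).

```python
# Assumed role names; only "user" is confirmed by the example above.
ALLOWED_ROLES = {"system", "user", "assistant"}


def build_chat_batch(*conversations):
    """Turn (role, content) tuples into a list of conversations,
    each a list of {"role": ..., "content": ...} dictionaries."""
    batch = []
    for turns in conversations:
        dialog = []
        for role, content in turns:
            if role not in ALLOWED_ROLES:
                raise ValueError(f"Unsupported role: {role!r}")
            dialog.append({"role": role, "content": content})
        batch.append(dialog)
    return batch


chat_conversation = build_chat_batch(
    [("user", "what is the recipe of mayonnaise?")],
)
print(chat_conversation)
```

The resulting chat_conversation has the same structure as the one passed to prompt_node in the example above.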

🎉 Now using transformers 4.31.0

With this release, Haystack depends on transformers 4.31.0, the latest version of the library at the time of release, which enables support for Llama 2.

🚫 SklearnQueryClassifier deprecation

As of version 1.19, SklearnQueryClassifier is deprecated and will be removed from Haystack in version 1.21. We recommend using the more powerful TransformersQueryClassifier instead. See the announcement for more details.

What's Changed

Pipeline

  • feat: globally disable progress bars by @ZanSara in #5207
  • Add cpu-remote-inference Docker image by @vblagoje in #5225
  • fix: Support isolated node eval in run_batch in Generators by @bogdankostic in #5291
  • feat: support OpenAI-Organization for authentication by @anakin87 in #5292
  • docs: Small documentation updates to dense.py by @sjrl in #5305
  • test: Refactor some retriever tests into unit tests by @sjrl in #5306
  • feat: Add support for meta fields that are lists when using embed_meta_fields by @sjrl in #5307
  • refactor: Extract link retrieval from WebRetriever, introduce LinkContentFetcher by @vblagoje in #5227
  • fix: update WebRetriever docstrings and default mode by @dfokina in #5352
  • added hybrid search example by @nickprock in #5376

DocumentStores

  • fix: Allow filtering on list fields in InMemoryDocumentStore with all operators by @bogdankostic in #5208
  • Fix: FAISSDocumentStore - make write_documents properly work in combination w update_embeddings by @anakin87 in #5221
  • bug: fix for pinecone not working for per document updates by @vblagoje in #5110
  • fix: avoid conflicts with opensearch / elasticsearch magic attributes during bulk requests by @tstadel in #5113
  • ci: Add unit test for Elasticsearch8 by @bogdankostic in #5300
  • feat: Check version of Elasticsearch server and add support for Elasticsearch <= 7.5 by @bogdankostic in #5320

Documentation

  • feat: BM25 retrieval for MemoryDocumentStore by @vblagoje in #5151
  • fix: install inference in REST API tests by @ZanSara in #5252
  • fix: import_utils fetch_archive_from_http - improve url parsing for fetching archive from http by @malte-aws in #5199
  • fix: Improve robustness of get_task HF pipeline invocations by @MichelBartels in #5284
  • feat: introduce Store protocol (v2) by @ZanSara in #5259
  • fix: num_return_sequences should be less than num_beams, not top_k by @faaany in #5280
  • Revert "fix: num_return_sequences should be less than num_beams, not top_k" by @julian-risch in #5434
  • chore: deprecate SklearnQueryClassifier by @anakin87 in #5324
  • fix: Run HFLocalInvocationLayer.supports even if inference packages are not installed by @MichelBartels in #5308
  • fix: a small bug in StopWordsCriteria by @faaany in #5316
  • chore: fix typo in base.py by @eltociear in #5356
  • feat: extend pipeline.add_component to support stores by @ZanSara in #5261
  • proposal: Add RecentnessRanker component by @elundaeva in #5289
  • feat: Add embed_meta_fields to Ranker nodes by @sjrl in #5361
  • feat: Recentness Ranker by @elundaeva in #5301
  • feat: Update Anthropic Claude support with the latest models, new streaming API, context window sizes by @vblagoje in #5406
  • feat: Enable Support for Meta LLama-2 Models in Amazon Sagemaker by @vblagoje in #5437

Full Changelog: v1.18.1...v1.19.0
