github deepset-ai/haystack v2.0.0-beta.3

pre-release, 11 months ago

Release Notes

v2.0.0-beta.3

⬆️ Upgrade Notes

  • If you are using AzureOCRDocumentConverter or TikaDocumentConverter, rename the paths argument to sources when calling the run method.

    An example:

    ```python
    from haystack.components.converters import TikaDocumentConverter

    converter = TikaDocumentConverter()
    converter.run(paths=["paths/to/file1.pdf", "path/to/file2.pdf"])
    ```

    The last line should be changed to:

    ```python
    converter.run(sources=["paths/to/file1.pdf", "path/to/file2.pdf"])
    ```

⚡️ Enhancement Notes

  • Add Markdown MIME type support to the FileTypeRouter class.

  • Refactor the Answer dataclass and the classes that inherited from it. Answer is now a Protocol, and the classes that used to inherit from it now implement that interface instead. We also added a new ExtractiveTableAnswer to be used for table question answering.

    All of these classes are now easily serializable using to_dict() and from_dict(), like Document and components.

  • Make all Converters accept meta in the run method, so that users can provide their own metadata. The length of the meta list should match the number of sources.

  • Make all the Converters accept the sources parameter in the run method. sources is a list that can contain str, Path or ByteStream objects.

  • Renamed the confidence_threshold parameter of the ExtractiveReader to score_threshold, because ExtractedAnswers have a score and this is what the threshold applies to. For consistency, the term confidence is no longer used; score is used instead.

  • Include 'boilerpy3' in the 'haystack-ai' dependencies.
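
To illustrate the Answer refactor noted above, here is a minimal, standard-library-only sketch of what "Answer is a Protocol" means in practice: a class satisfies the interface by having the right members, not by inheriting from Answer. The member names on the protocol (data, query, meta) and the SimpleAnswer class are assumptions made for this example and are not taken from the release notes; Haystack's real definitions may differ.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Protocol, runtime_checkable


@runtime_checkable
class Answer(Protocol):
    """Illustrative structural interface (member names are assumptions)."""

    data: Any
    query: str
    meta: Dict[str, Any]

    def to_dict(self) -> Dict[str, Any]: ...

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "Answer": ...


@dataclass
class SimpleAnswer:
    """Hypothetical answer type. Note that it does NOT inherit from Answer."""

    data: str
    query: str
    meta: Dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> Dict[str, Any]:
        # Convert the dataclass fields into a plain dictionary.
        return asdict(self)

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "SimpleAnswer":
        # Rebuild an instance from the dictionary produced by to_dict().
        return cls(**data)


def describe(answer: Answer) -> str:
    """Accept any object that structurally matches the Answer protocol."""
    return f"{answer.query} -> {answer.data}"


original = SimpleAnswer(data="Berlin", query="What is the capital of Germany?")
restored = SimpleAnswer.from_dict(original.to_dict())
```

Because the protocol is decorated with runtime_checkable, isinstance(original, Answer) succeeds even though SimpleAnswer never names Answer as a base class, and the to_dict()/from_dict() round trip yields an equal object.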

Known Issues

  • Make connect idempotent, allowing the same components to be connected more than once. This is especially useful in Jupyter notebooks. Fixes #6359.
  • Fix "TypeError: descriptor '__dict__' for 'XXX' objects doesn't apply to a 'XXX' object" when running pipelines with debug=True by removing the graph image from the debug payload.
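
As a rough illustration of what an idempotent connect means, the toy class below stores connections in a set, so repeating the same call (for example, by re-running a notebook cell) leaves the graph unchanged. TinyPipeline and the socket names are hypothetical stand-ins for this sketch; this is not Haystack's Pipeline implementation.

```python
from typing import Set, Tuple


class TinyPipeline:
    """Toy stand-in for a pipeline graph (hypothetical, not Haystack's Pipeline)."""

    def __init__(self) -> None:
        # Each edge is a (sender, receiver) pair; a set cannot hold duplicates.
        self.edges: Set[Tuple[str, str]] = set()

    def connect(self, sender: str, receiver: str) -> None:
        # Adding an edge that already exists is a no-op, which makes
        # repeated identical calls safe.
        self.edges.add((sender, receiver))


pipe = TinyPipeline()
pipe.connect("retriever.documents", "reader.documents")
# Same call again, as happens when a notebook cell is executed twice.
pipe.connect("retriever.documents", "reader.documents")
```

After both calls the graph still holds exactly one edge between the two components.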

🐛 Bug Fixes

  • Make TransformersSimilarityRanker run with a list containing a single document as input.
