What's Changed
Since the release of v0.20.4, we have dramatically improved the quality of extractions, significantly reduced the number of runtime errors when adding an episode, and improved the consistency and reliability of graph building.
- OpenSearch Integration for Neo4j by @prasmussen15 in #896
- OpenSearch updates by @prasmussen15 in #906
- Embedding fix by @prasmussen15 in #917
- fix-fulltext-syntax-error by @galshubeli in #914
- Graph quality updates by @prasmussen15 in #922
- Skip entity attribute extraction when no fields defined by @danielchalef in #924
- pre5 by @prasmussen15 in #926
- don't save duplicate edges by @prasmussen15 in #927
- Improve node deduplication w/ deterministic matching, LLM fallbacks by @danielchalef in #929
- Bump v0.30.0pre0 by @danielchalef in #932
- Refactor batch deduplication logic to enhance node resolution and track duplicate pairs (#929) by @danielchalef in #936
- Update pyproject.toml to 0.30.0pre1 by @danielchalef in #938
- Fix index out of range errors in LLM deduplication responses by @danielchalef in #939
- chore: Bump version by @paul-paliychuk in #940
- Improve node dedup prompts by @danielchalef in #942
- bump 0.30.0pre3 by @danielchalef in #946
- fix: Add edge type validation based on node labels by @danielchalef in #948
- Allow Edge extraction to keep discovered edge labels by @danielchalef in #950
- Improve JSON entity extraction prompt by @jackaldenryan in #949
- Make natural language extraction configurable by @danielchalef in #943
- 21 pre 7 by @prasmussen15 in #954
- fix: Prevent duplicate edge facts within same episode by @danielchalef in #955
- bump pre8 by @danielchalef in #956
- chore: Update edge extraction prompt to paraphrase instead of quote by @danielchalef in #957
- Bump version to 0.21.0pre9 by @danielchalef in #958
- fix: Fix typo in JSON entity extraction prompt by @jackaldenryan in #953
- Update Claude review prompt to focus on critical feedback by @danielchalef in #960
- feat: Add optional callback to control node summary generation by @danielchalef in #959
- Bump version to 0.21.0pre10 by @danielchalef in #962
- Refactor issue workflows for improved automation by @danielchalef in #964
- fix: Improve deduplication ID validation and logging by @danielchalef in #965
- filter out falsey values before creating embeddings by @prasmussen15 in #966
- Remove ensure_ascii configuration parameter by @danielchalef in #969
- Optimize edge deduplication prompt for caching and clarity by @danielchalef in #970
- fix: Improve edge extraction entity ID validation by @danielchalef in #968
- Bump version to 0.21.0pre12 by @danielchalef in #967
- validate nodes and edges aren't falsey by @prasmussen15 in #973
- Add group_id parameter to language extraction function by @danielchalef in #952
- Update issue triage workflow to allow non-write users for duplicate checks by @danielchalef in #974
- remove generic aoss_client interactions for release build by @prasmussen15 in #975
Full Changelog: v0.20.4...v0.21.0