Tantivy 0.24.1
- fix: Set rust-version to 1.81
Tantivy 0.24
Tantivy 0.24 will be backwards compatible with indices created with v0.22 and v0.21. The new minimum Rust version will be 1.75. Tantivy 0.23 will be skipped.
Bugfixes
- fix potential endless loop in merge #2457(@PSeitz)
- fix bug that causes out-of-order sstable key. #2445(@fulmicoton)
- fix ReferenceValue API flaw #2372(@PSeitz)
- fix OwnedBytes debug panic #2512(@b41sh)
- catch panics during merges #2582(@rdettai)
- switch from u32 to usize in bitpacker. This enables multivalued columns larger than 4GB, which crashed during merge before. #2581 #2586(@fulmicoton-dd @PSeitz)
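To illustrate why the integer width matters in the bitpacker entry above, here is a minimal standalone sketch. It is not Tantivy's bitpacker code: the helper names `offset_as_u32` and `offset_as_usize` are hypothetical, and the second result assumes a 64-bit target.

```rust
// Hypothetical illustration, not Tantivy's bitpacker: a byte offset past
// 4 GiB does not fit in a u32, but it fits in usize on a 64-bit target.

/// Try to represent a byte offset as u32; returns None if it does not fit.
fn offset_as_u32(offset: u64) -> Option<u32> {
    u32::try_from(offset).ok()
}

/// Try to represent a byte offset as usize (pointer-sized integer).
fn offset_as_usize(offset: u64) -> Option<usize> {
    usize::try_from(offset).ok()
}

fn main() {
    let five_gib: u64 = 5 * 1024 * 1024 * 1024;
    println!("as u32:   {:?}", offset_as_u32(five_gib));
    println!("as usize: {:?}", offset_as_usize(five_gib));
}
```

Using a checked conversion rather than an `as` cast makes the overflow visible instead of silently wrapping.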
Breaking API Changes
Features/Improvements
- Aggregation
  - Support for cardinality aggregation #2337 #2446 (@raphaelcoeffic @PSeitz)
  - Support for extended stats aggregation #2247(@giovannicuccu)
  - Add Key::I64 and Key::U64 variants in aggregation to avoid f64 precision issues #2468(@PSeitz)
  - Faster term aggregation fetch terms #2447(@PSeitz)
  - Improve custom order deserialization #2451(@PSeitz)
  - Change AggregationLimits behavior #2495(@PSeitz)
  - lower contention on AggregationLimits #2394(@PSeitz)
  - fix postcard compatibility for top_hits, add postcard test #2346(@PSeitz)
  - reduce top hits memory consumption #2426(@PSeitz)
  - check unsupported parameters top_hits #2351(@PSeitz)
  - Change AggregationLimits to AggregationLimitsGuard #2495(@PSeitz)
  - add support for counting non integer in aggregation #2547(@trinity-1686a)
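The f64 precision issue behind the Key::I64 and Key::U64 entry above can be sketched in plain Rust. This is a standalone illustration, not Tantivy code; the helper `collide_as_f64` is a hypothetical name. It relies only on the fact that an f64 has a 53-bit significand, so distinct large integers can map to the same f64 value.

```rust
// Standalone illustration (not Tantivy code): two distinct u64 keys can
// become indistinguishable once both are converted to f64.

/// Returns true if `a` and `b` are different integers that convert to
/// the same f64 value.
fn collide_as_f64(a: u64, b: u64) -> bool {
    a != b && (a as f64) == (b as f64)
}

fn main() {
    let a: u64 = 1 << 53;       // exactly representable as f64
    let b: u64 = (1 << 53) + 1; // rounds to the same f64 as `a`
    println!("{a} vs {b} collide: {}", collide_as_f64(a, b));
    println!("1 vs 2 collide: {}", collide_as_f64(1, 2));
}
```

Keeping integer keys in integer-typed variants avoids this kind of collision for large values.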
- Range Queries
  - Support fast field range queries on json fields #2456(@PSeitz)
  - Add support for str fast field range query #2460 #2452 #2453(@PSeitz)
  - modify fastfield range query heuristic #2375(@trinity-1686a)
  - add FastFieldRangeQuery for explicit range queries on fast field (for RangeQuery it is autodetected) #2477(@PSeitz)
- make find_field_with_default return json fields without path #2476(@trinity-1686a)
- Make BooleanQuery support minimum_number_should_match #2405(@LebranceBW)
- RegexPhraseQuery
  RegexPhraseQuery supports phrase queries with regex. E.g. query "b.* b.* wolf" matches "big bad wolf". Slop is supported as well: "b.* wolf"~2 matches "big bad wolf" #2516(@PSeitz)
- Optional Index in Multivalue Columnar Index
  For mostly empty multivalued indices there was a large overhead during creation when iterating all docids (merge case).
  This is alleviated by placing an optional index in the multivalued index to mark documents that have values.
  This will slightly increase space and access time. #2439(@PSeitz)
- Store DateTime as nanoseconds in doc store
  DateTime in the doc store was truncated to microseconds previously. This removes this truncation, while still keeping backwards compatibility. #2486(@PSeitz)
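As a rough illustration of what the microsecond truncation described in the DateTime entry above discards, here is a standalone sketch. It does not use Tantivy's DateTime type; the function `truncate_to_micros` and the sample timestamp are hypothetical.

```rust
// Standalone illustration (not Tantivy's DateTime type): truncating a
// nanosecond timestamp to microsecond precision drops the last three digits.

/// Truncate a nanosecond timestamp to microsecond precision,
/// returning the result still expressed in nanoseconds.
fn truncate_to_micros(timestamp_nanos: i64) -> i64 {
    (timestamp_nanos / 1_000) * 1_000
}

fn main() {
    let ts_nanos: i64 = 1_700_000_000_123_456_789;
    let truncated = truncate_to_micros(ts_nanos);
    println!("original:  {ts_nanos}");
    println!("truncated: {truncated}");
    println!("lost:      {} ns", ts_nanos - truncated);
}
```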
- Performance/Memory
  - lift clauses in LogicalAst for optimized ast during execution #2449(@PSeitz)
  - Use Vec instead of BTreeMap to back OwnedValue object #2364(@fulmicoton)
  - Replace TantivyDocument with CompactDoc. CompactDoc is much smaller and provides similar performance. #2402(@PSeitz)
  - Recycling buffer in PrefixPhraseScorer #2443(@fulmicoton)
- Json Type
- QueryParser
  - fix de-escaping too much in query parser #2427(@trinity-1686a)
  - improve query parser #2416(@trinity-1686a)
  - Support field grouping title:(return AND "pink panther") #2333(@trinity-1686a)
  - allow term starting with wildcard #2568(@trinity-1686a)
- Change in Executor API #2391(@fulmicoton)
- Removed usage of num_cpus #2387(@fulmicoton)
- use binggan for agg and stacker benchmark #2378 #2492(@PSeitz)
- make convert_to_fast_value_and_append_to_json_term pub #2370(@PSeitz)
- Fix trait bound of StoreReader::iter #2360(@adamreichold)