[7.1.0] - 2025-09-06
🇮🇹 csv,conf,v9 edition 🍝
Just in time for csv,conf,v9, we're Bologna-bound and will be talking all things qsv, CSV, open data, metadata standards, AI, POSE and CKAN! For this feature release, we polished describegpt a bit more for the occasion... Towards the "People's API"! Verso l'API del Popolo! (Answering People/Policymaker Interface)
🚀 Enhanced describegpt Command
- Configurable Frequency Limits: Make frequency distribution limit configurable for better control over data analysis
- Few-shot Learning: Add --fewshot-examples option to improve LLM response quality with contextual examples
- Advanced SQL Generation: Fine-tuned SQL generation guidance for better date handling and query optimization
- Conditional SQL Results: Implement conditional --sql-results format for more efficient "SQL RAG" processing. If the generated SQL query executes successfully, the results are saved to the specified file with a .csv extension. If the query fails (a "SQL hallucination"), the file is saved with a .sql extension instead so the user can tweak and edit it.
- TogetherAI Support: Add support for TogetherAI models endpoint, expanding LLM provider options
- Enhanced Error Handling: Improved SQL parsing error handling and more informative error messages
- Disk Cache by Default: The disk cache is now enabled by default for better performance
- TOML Configuration: Migrate prompt files from JSON to the more readable, easier-to-edit TOML format (see https://github.com/dathere/qsv/blob/master/resources/describegpt_defaults.toml)
- Better Local LLM Support: --api-key can now be set to NONE for local LLM configurations that may not necessarily run on localhost (e.g. a shared local LLM service running on the local network)
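The conditional --sql-results behaviour described above can be illustrated with a small sketch. This is not qsv's implementation: `save_sql_results` and the `run_query` callable are hypothetical names introduced here, and the sketch assumes the extension replaces any existing suffix on the specified file name.

```python
from pathlib import Path
from typing import Callable


def save_sql_results(base: Path, sql: str, run_query: Callable[[str], str]) -> Path:
    """Save query output as .csv on success, or the SQL itself as .sql on failure.

    `run_query` is a hypothetical stand-in for whatever executes the generated
    SQL; it is assumed to return CSV text or raise an exception.
    """
    try:
        csv_text = run_query(sql)
    except Exception:
        # The generated SQL did not run: keep it so the user can edit and retry.
        out = base.with_suffix(".sql")
        out.write_text(sql)
        return out
    # The query ran: store its results for downstream "SQL RAG" processing.
    out = base.with_suffix(".csv")
    out.write_text(csv_text)
    return out
```

A caller can then branch on the returned path's suffix to decide whether results are available or the query still needs fixing.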
partition Command Enhancements
- New --limit Option: Implement --limit option to set the maximum number of open files
- Streaming to Enhanced Batching Logic: Convert from streaming to a simplified, two-pass batched approach designed to partition on columns with high cardinality for very large datasets
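To show why batching helps with high-cardinality columns, here is a rough sketch of one way to partition a CSV while capping the number of output files open at once. It is an illustration under assumptions, not qsv's actual logic: `partition_csv` is a hypothetical helper, partition values are assumed to be safe to use as file names, and the sketch re-reads the input once per batch, which may differ from how qsv's two-pass approach works.

```python
import csv
from contextlib import ExitStack
from pathlib import Path


def partition_csv(src: Path, column: str, out_dir: Path, limit: int) -> list[Path]:
    """Split `src` into one CSV per distinct value of `column`.

    At most `limit` output files are open at any time (plus the input file).
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")

    # First pass: collect the header and the distinct partition keys.
    with src.open(newline="") as f:
        reader = csv.DictReader(f)
        header = reader.fieldnames or []
        keys = sorted({row[column] for row in reader})

    out_dir.mkdir(parents=True, exist_ok=True)
    written: list[Path] = []

    # Later passes: handle the keys in batches of at most `limit`.
    for start in range(0, len(keys), limit):
        batch = keys[start:start + limit]
        with ExitStack() as stack:
            writers = {}
            for key in batch:
                path = out_dir / f"{key}.csv"
                handle = stack.enter_context(path.open("w", newline=""))
                writers[key] = csv.DictWriter(handle, fieldnames=header)
                writers[key].writeheader()
                written.append(path)
            # Re-scan the input, keeping only rows that belong to this batch.
            with src.open(newline="") as f:
                for row in csv.DictReader(f):
                    writer = writers.get(row[column])
                    if writer is not None:
                        writer.writerow(row)
    return written
```

The trade-off in this sketch is extra reads of the input in exchange for a bounded number of open file handles, which matters when the partition column has more distinct values than the operating system allows open files.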
Added
- describegpt: add configurable frequency limit #2950
- describegpt: migrate prompt file from JSON to easier-to-edit TOML format #2954
- describegpt: refactor default prompt file; add --fewshot-examples option #2955
- describegpt: add TogetherAI support for models endpoint #2965
- partition: add --limit option #2960
- added Windows ARM64 prebuilt binaries
Changed
- describegpt: enable disk cache by default #2951
- describegpt: Polars SQL generation tweaks #2958
- python: replace deprecated with_gil with attach #2949. This sets the stage for "free-threaded" Python 3.14 support when it's released in October 2025. Buh-bye GIL!
- deps: bump embedded Luau from 0.688 to 0.690 #2967
- deps: bump Polars to 0.50.0 at py-1.33.0 tag
- build(deps): bump actions/setup-python from 5.6.0 to 6.0.0 by @dependabot[bot] in #2962
- build(deps): bump actions/stale from 9 to 10 by @dependabot[bot] in #2963
- build(deps): bump log from 0.4.27 to 0.4.28 by @dependabot[bot] in #2961
- build(deps): bump mlua from 0.11.2 to 0.11.3 by @dependabot[bot] in #2948
- build(deps): bump pyo3 from 0.25.1 to 0.26.0 by @dependabot[bot] in #2946
- build(deps): bump uuid from 1.18.0 to 1.18.1 by @dependabot[bot] in #2956
- build(deps): bump zip from 4.5.0 to 4.6.0 by @dependabot[bot] in #2952
- applied select clippy lints
- updated indirect dependencies
Full Changelog: 7.0.1...7.1.0