github jqnatividad/qsv 0.109.0

16 months ago

This is a monstrous👹 release with lots of new features and improvements!

The biggest new feature is the describegpt command, which lets you use OpenAI's Large Language Models to generate extended metadata from a CSV. We created this command primarily for CKAN and Datapusher+ so we can infer descriptions and tags, and automatically create annotated data dictionaries. It does this using the CSV's summary statistics and frequency tables rather than the raw data, so it works even for very large CSV files without consuming too many OpenAI tokens. This is a very powerful feature and we are looking forward to seeing what people do with it. Thanks @rzmk for all the work on this!
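To illustrate why working from summary statistics and frequency tables keeps token usage low, here is a minimal Python sketch of the idea. It is not describegpt's actual implementation: the function names, the prompt wording, and the toy data are all assumptions made for illustration, and no OpenAI call is made.

```python
import csv
import io
from collections import Counter


def summarize(csv_text, top_n=3):
    """Return per-column counts, cardinality and top-N frequencies for CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name in columns:
            columns[name].append(row[name])

    summary = {}
    for name, values in columns.items():
        non_empty = [v for v in values if v != ""]
        summary[name] = {
            "count": len(values),
            "nulls": len(values) - len(non_empty),
            "cardinality": len(set(non_empty)),
            "top": Counter(non_empty).most_common(top_n),
        }
    return summary


def build_prompt(summary):
    """Render the summary as a compact prompt (hypothetical wording)."""
    lines = ["Describe this dataset and suggest tags. Column summaries:"]
    for name, s in summary.items():
        top = ", ".join(f"{value} ({n})" for value, n in s["top"])
        lines.append(
            f"- {name}: {s['count']} rows, {s['nulls']} empty, "
            f"{s['cardinality']} distinct; most frequent: {top}"
        )
    return "\n".join(lines)


def make_csv(n_rows):
    """Generate a toy two-column CSV with n_rows data rows."""
    cities = ["Manila", "Cebu", "Davao"]
    rows = [
        f"{cities[i % 3]},{'open' if i % 2 == 0 else 'closed'}"
        for i in range(n_rows)
    ]
    return "city,status\n" + "\n".join(rows) + "\n"


small_prompt = build_prompt(summarize(make_csv(30)))
large_prompt = build_prompt(summarize(make_csv(30000)))

print(large_prompt)
# The prompt grows with the number of columns, not the number of rows.
print(len(small_prompt), len(large_prompt))
```

In this sketch the prompt for the 30,000-row file is only a few characters longer than the one for the 30-row file, because only the aggregated counts change.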

This release also features major improvements in the sqlp and joinp commands thanks to all the new capabilities of Polars 0.31.1.

Polars SQL's capabilities have been vastly improved in 0.31.1 with numerous new SQL functions and operators, and they're all available with the sqlp command.

The joinp command has several new options for CSV parsing, for pre-join filtering (--filter-left and --filter-right), and pre-join validation with the --validate option. Two new asof join variants (--left_by and --right_by) were also added.
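As a rough illustration of what pre-join filtering and join validation mean, here is a small pure-Python sketch. It does not use qsv or Polars; the function names, the validation mode strings, and the choice to validate after filtering are assumptions made for this example only.

```python
from collections import Counter


def check_unique(rows, key, side):
    """Raise ValueError if any join key value repeats within rows."""
    counts = Counter(row[key] for row in rows)
    dupes = sorted(k for k, n in counts.items() if n > 1)
    if dupes:
        raise ValueError(f"{side} side has duplicate keys: {dupes}")


def filtered_join(left, right, key, filter_left=None, filter_right=None,
                  validate="manytomany"):
    """Inner join with optional pre-join filters and key validation.

    The validate mode names here are hypothetical labels for this sketch.
    """
    # 1. Pre-join filtering: drop rows before the join sees them.
    if filter_left is not None:
        left = [r for r in left if filter_left(r)]
    if filter_right is not None:
        right = [r for r in right if filter_right(r)]

    # 2. Validation: check key uniqueness on the side(s) the mode requires.
    if validate in ("onetoone", "onetomany"):
        check_unique(left, key, "left")
    if validate in ("onetoone", "manytoone"):
        check_unique(right, key, "right")

    # 3. Plain hash join on the remaining rows.
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


customers = [
    {"id": 1, "name": "Ana"},
    {"id": 2, "name": "Ben"},
    {"id": 3, "name": "Cy"},
]
orders = [
    {"id": 1, "total": 50},
    {"id": 1, "total": 5},
    {"id": 2, "total": 70},
    {"id": 3, "total": 20},
]

# Filtering out small orders first leaves one order per customer,
# so the one-to-one check passes.
joined = filtered_join(customers, orders, "id",
                       filter_right=lambda r: r["total"] >= 20,
                       validate="onetoone")
for row in joined:
    print(row)

# Without the filter, customer 1 has two orders and validation fails.
try:
    filtered_join(customers, orders, "id", validate="onetoone")
except ValueError as err:
    print("validation failed:", err)
```

Filtering before the join shrinks the inputs up front, and validation turns an unexpected duplicate key into an explicit error instead of silently multiplying rows.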

Added

  • describegpt command by @rzmk in #1036
  • describegpt: minor refactoring in #1104
  • describegpt: --key & QSV_OPENAI_API_KEY by @rzmk in #1105
  • describegpt: add --user-agent in help message by @rzmk in #1095
  • describegpt: json output format for redirection by @rzmk in #1107
  • describegpt: add testing (resolves #1114) by @rzmk in #1115
  • describegpt: add --model option (resolves #1101) by @rzmk in #1117
  • describegpt: polishing #1122
  • describegpt: add --jsonl option (resolves #1086) by @rzmk in #1127
  • describegpt: add --prompt-file option (resolves #1085) by @rzmk in #1120
  • joinp: added asof_by join variant; added CSV formatting options consistent with sqlp CSV format options #1090
  • joinp: add --filter-left and --filter-right options #1146
  • joinp: add --validate option #1147
  • fetch & fetchpost: add --no-cache option #1112
  • sniff: detect file kind along with mime type #1137
  • user-agent metadata now contains the current command's name #1093
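For readers unfamiliar with the asof_by join variant listed above, the following pure-Python sketch shows the general idea of an asof join restricted to groups. It assumes a backward strategy (match the latest right row at or before the left row's key) and uses hypothetical function and column names; it is not how joinp or Polars implements it.

```python
import bisect
from collections import defaultdict


def asof_join_by(left, right, on, by):
    """Backward asof join within groups.

    For each left row, attach the columns of the latest right row that has
    the same `by` value and an `on` value <= the left row's `on` value.
    Unmatched left rows get None for the right-hand columns.
    """
    # Group the right rows by the `by` column, sorted on the asof key.
    groups = defaultdict(list)
    for row in sorted(right, key=lambda r: r[on]):
        groups[row[by]].append(row)
    keys = {g: [r[on] for r in rows] for g, rows in groups.items()}

    # Columns contributed by the right side (everything except the keys).
    extra_cols = [c for c in right[0] if c not in (on, by)] if right else []

    out = []
    for row in left:
        group = row[by]
        # Position just past the last right key <= the left key.
        pos = bisect.bisect_right(keys.get(group, []), row[on])
        match = groups[group][pos - 1] if pos > 0 else None
        merged = dict(row)
        for col in extra_cols:
            merged[col] = match[col] if match is not None else None
        out.append(merged)
    return out


trades = [
    {"ticker": "A", "t": 3, "qty": 10},
    {"ticker": "B", "t": 3, "qty": 5},
    {"ticker": "A", "t": 1, "qty": 1},
]
quotes = [
    {"ticker": "A", "t": 2, "price": 100},
    {"ticker": "B", "t": 1, "price": 50},
    {"ticker": "B", "t": 4, "price": 55},
    {"ticker": "A", "t": 5, "price": 101},
]

result = asof_join_by(trades, quotes, on="t", by="ticker")
for row in result:
    print(row)
```

Here each trade picks up the most recent quote for its own ticker only; the trade for ticker A at t=1 has no earlier quote, so its price stays None.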

Changed

Fixed

  • fmt: Quote ASCII format differently by @LemmingAvalanche in #1075
  • apply: make dynfmt subcommand case sensitive. Fixes #1126 #1130
  • applydp: make dynfmt case-sensitive #1131
  • describegpt: docs/Describegpt.md: typo 'a' --> 'an' by @rzmk in #1135
  • tojsonl: support snappy-compressed input. Fixes #1133 #1145
  • security.md: fix mailto text by @rzmk in #1079

New Contributors

Full Changelog: 0.108.0...0.109.0
