github dathere/qsv 4.0.0

[4.0.0] - 2025-04-13

Highlights:

This is a major release with numerous improvements!

  • qsv can now read more file formats by leveraging the Polars engine:
    • Arrow/IPC, Avro, Parquet, JSON (JSON array) and JSONL
    • Automatic decompression of compressed CSV dialects (csv, tsv/tab & ssv) in the gzip (.gz), zlib (.zlib) and zstd (.zst) compression formats (e.g. data.csv.gz, data.tsv.zst, data.ssv.zlib)
      qsv lens data.csv.gz
      qsv sample 1000 data.parquet | qsv stats | qsv lens
      qsv frequency data.tab.zlib | qsv lens
      qsv search Waldo data.ssv.zst | qsv table
      qsv select 2-5 data.jsonl | qsv lens
      
  • New geoconvert command for converting spatial formats to CSV:
    • GeoJSON
       # convert TX_cities.geojson to CSV, filter out the geometry column and browse with lens
       qsv geoconvert TX_cities.geojson geojson csv | qsv select '!geometry' | qsv lens
      
    • Shapefile (SHP)
  • Enhanced split command with new --filter option:
    • Similar to GNU split
    • Spawns a subprocess for each chunk
      # split input.csv into outdir, each chunk having 100,000 rows, gzip compressing each chunk
      qsv split --size 100000 outdir data.csv --filter 'gzip $FILE'
      
  • Expanded to command:
    • added LibreOffice/OpenOffice Calc (ODS) support
    • re-enabled parquet generation now that it's using Arrow instead of DuckDB (which made for very long compiles)
  • New uniqueCombinedWith JSON Schema custom keyword in validate command:
    • Allows validating uniqueness across multiple columns
    • Useful for composite key validation
  • QSV_DOTENV_PATH now supports the sentinel value "<NONE>" to disable dotenv processing altogether.
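
To make the composite-key check behind uniqueCombinedWith concrete, here is a minimal Python sketch of the idea: a row is flagged when its combination of values across the chosen columns has already been seen. This is an illustration only, not qsv's implementation or its schema syntax; the function name find_duplicate_keys and the sample columns (order_id, line_no, sku) are hypothetical.

```python
import csv
import io


def find_duplicate_keys(csv_text, key_columns):
    """Return (row_number, key) pairs for rows whose combined key was already seen."""
    seen = set()
    duplicates = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Data rows are numbered from 1; the header row is not counted.
    for row_number, row in enumerate(reader, start=1):
        # The composite key is the tuple of values in the selected columns.
        key = tuple(row[col] for col in key_columns)
        if key in seen:
            duplicates.append((row_number, key))
        else:
            seen.add(key)
    return duplicates


# Hypothetical sample data: order_id alone repeats, and so does line_no,
# but the pair (order_id, line_no) is expected to be unique.
sample = """order_id,line_no,sku
1001,1,A-17
1001,2,B-03
1002,1,A-17
1001,2,C-44
"""

print(find_duplicate_keys(sample, ["order_id", "line_no"]))
```

In the sample, only the fourth data row is reported, because its (order_id, line_no) pair repeats the second row's. Checking either column alone would flag rows that are legitimately distinct, which is why a combined check is useful for composite keys.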

Added

  • geoconvert: new command to convert spatial formats to CSV by @rzmk in #2681 & #2688
  • split: add --filter option #2660
  • sqlp: add decimal type support #2646
  • to: add back to parquet support #2665
  • feat: Extended auto decompression support. In addition to snappy auto-decompression, auto-decompress CSV dialects (tsv/tab & ssv files) using gzip, zlib and zstd compression formats #2671
  • to: add ODS support #2674
  • validate: add uniqueCombinedWith custom JSON Schema Validation keyword #2636
  • feat: prompt: add supported file formats to the dialog box filter when the polars feature is enabled #2667
  • feat: add QSV_POLARS_FLOAT_PRECISION env var #2678
  • tests: add tests for https://100.dathere.com/lessons/3 by @rzmk in #2638

Changed

  • qsvdp binary variant can now use the geocode & geoconvert commands 50f0046
  • geocode feature now gates the geocode & geoconvert commands 9d046e8
  • stats: made stdin handling more robust by adding delimiter inferencing ddecd98
  • feat: setting QSV_DOTENV_PATH to sentinel value "<NONE>" disables dotenv processing #2684
  • refactor: polars special formats support #2683
  • contrib(completions): update completions to v3.3.0 by @rzmk in #2626
  • contrib(completions): update completions for qsv v4.0.0 by @rzmk in #2677
  • deps: bump polars to 0.46.0 at py-1.27.1 tag #2675 and e5d29d7
  • build(deps): bump actions/setup-python from 5.4.0 to 5.5.0 by @dependabot in #2627
  • build(deps): bump arboard from 3.4.1 to 3.5.0 by @dependabot in #2653
  • build(deps): bump chrono-tz from 0.10.2 to 0.10.3 by @dependabot in #2623
  • build(deps): bump crossbeam-channel from 0.5.14 to 0.5.15 by @dependabot in #2672
  • build(deps): bump csvs_convert from 0.11.0 to 0.11.1 by @dependabot in #2686
  • build(deps): bump data-encoding from 2.8.0 to 2.9.0 by @dependabot in #2685
  • build(deps): bump flate2 from 1.1.0 to 1.1.1 by @dependabot in #2649
  • build(deps): bump flexi_logger from 0.29.8 to 0.30.0 by @dependabot in #2650
  • build(deps): bump flexi_logger from 0.30.0 to 0.30.1 by @dependabot in #2651
  • build(deps): bump governor from 0.8.1 to 0.9.0 by @dependabot in #2625
  • build(deps): bump governor from 0.9.0 to 0.10.0 by @dependabot in #2631
  • build(deps): bump jsonschema from 0.29.0 to 0.29.1 by @dependabot in #2635
  • build(deps): bump log from 0.4.26 to 0.4.27 by @dependabot in #2622
  • build(deps): bump mimalloc from 0.1.44 to 0.1.45 by @dependabot in #2652
  • build(deps): bump minijinja from 2.8.0 to 2.9.0 by @dependabot in #2643
  • build(deps): bump minijinja-contrib from 2.8.0 to 2.9.0 by @dependabot in #2642
  • build(deps): bump pyo3 from 0.24.0 to 0.24.1 by @dependabot in #2645
  • build(deps): bump qsv-dateparser from 0.12.1 to 0.13.0 by @dependabot in #2639
  • build(deps): bump qsv-sniffer from 0.10.3 to 0.11.0 by @dependabot in #2640
  • build(deps): bump redis from 0.29.2 to 0.29.4 by @dependabot in #2663
  • build(deps): bump redis from 0.29.4 to 0.29.5 by @dependabot in #2666
  • build(deps): bump smallvec from 1.14.0 to 1.15.0 by @dependabot in #2656
  • build(deps): bump sysinfo from 0.34.0 to 0.34.1 by @dependabot in #2637
  • build(deps): bump sysinfo from 0.34.1 to 0.34.2 by @dependabot in #2648
  • build(deps): bump titlecase from 3.4.0 to 3.5.0 by @dependabot in #2669
  • build(deps): bump tokio from 1.44.1 to 1.44.2 by @dependabot in #2662
  • applied select clippy lint suggestions
  • bumped indirect dependencies to latest version

Fixed

  • fix: select panic when idx is out of bounds #2670
  • fix: correct link to qsv-dateparser accepted date formats #2632
  • fix: reset SIGPIPE handling #2664
  • docs: fix typo it's -> its by @rzmk in #2680

Full Changelog: 3.3.0...4.0.0
