github dathere/qsv 13.0.0


[13.0.0] - 2026-01-06 🦾 "The Statistical Data-Wrangling Agent Release" 🤖

We welcome 2026 with qsv 13.0.0 - a major milestone that transforms qsv into an AI-native Agent!

This is in addition to the online AI-Chatbot for CKAN portals we released last September and the expanded describegpt command we released last month. We will continue our march towards even more AI/ML/Graph/FAIR and Data Librarian/Concierge/Advisor/Analyst capabilities across the datHere suite in the coming months, as we embark on a strategic partnership with the Open Knowledge Foundation to Strengthen Open, FAIR, AI-Ready Data Infrastructure powered by CKAN.

This release introduces first-class support for AI agents through three major new capabilities:

MCP Server - Model Context Protocol Integration

qsv now ships with a built-in Model Context Protocol (MCP) Server, enabling seamless integration with AI chatbots, starting with Claude Desktop.

  • Local Data: The "zero-copy"-inspired approach lets you wrangle very large datasets WITHOUT sending raw data; only statistical metadata is sent to Claude! This is not only good for security and privacy, it also overcomes Claude's upload size limit, saves tokens and improves performance!
  • 22 MCP Tools: 20 common qsv commands as individual tools + 1 generic tool to access the other 46 commands + 1 pipeline tool
  • Natural Language Interface: No need to remember command syntax
  • Pipeline Support: Chain multiple operations together seamlessly

See the MCP documentation for detailed setup instructions.

Claude Agent SDK Helper Utilities

New Agent Skills infrastructure provides:

  • qsv-skill-gen CLI - Generate skill definitions for AI agents
  • Parses qsv USAGE text using qsv-docopt to generate JSON skill definitions, so Agent Skills can be updated quickly as commands and options are added and modified.
  • Shell-safe example generation with proper quoting
  • Comprehensive documentation for integrating qsv into your own AI solutions!
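
To illustrate what "shell-safe example generation with proper quoting" can mean in practice, here is a minimal Python sketch. It is not how qsv-skill-gen is implemented; the helper name `build_example` and the sample arguments are hypothetical, and only the standard-library `shlex` module is assumed.

```python
import shlex


def build_example(command: str, args: list[str]) -> str:
    """Join a qsv subcommand and its arguments into one shell-safe string.

    Hypothetical helper for illustration only. Each argument is quoted with
    shlex.quote so spaces and quote characters survive a POSIX shell.
    """
    return " ".join(["qsv", command] + [shlex.quote(a) for a in args])


# An argument containing an apostrophe and a file name containing a space.
example = build_example("search", ["it's here", "my data.csv"])
print(example)

# Round trip: a POSIX shell would split the string back into the same tokens.
print(shlex.split(example))
```

The round trip through `shlex.split` is a cheap way to check that a generated example would reach the command with its arguments intact.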

moarstats - Massive Statistical Expansion

The moarstats command received substantial enhancements, adding 24+ MOAR statistical measures:

Advanced Univariate Statistics:

  • Bimodality Coefficient - Detect multimodal distributions
  • Normalized Entropy - Scaled information content measure (0-1)
  • Atkinson Index - Inequality measure with configurable epsilon parameter
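
For readers unfamiliar with these three measures, the following Python sketch computes them from their common textbook definitions. These are assumptions for illustration: the exact formulas, corrections and defaults moarstats uses may differ (for example, the bimodality coefficient below uses plain population moments rather than a sample-size correction). See STATS_DEFINITIONS.md for the definitions qsv actually uses.

```python
import math
from collections import Counter


def normalized_entropy(values):
    """Shannon entropy divided by log(k), k = number of distinct values (0-1)."""
    counts = Counter(values)
    n = len(values)
    k = len(counts)
    if k <= 1:
        return 0.0  # a constant column carries no information
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    return h / math.log(k)


def atkinson_index(values, epsilon=1.0):
    """Atkinson inequality index; values must be strictly positive.

    epsilon is the inequality-aversion parameter. epsilon == 1 uses the
    geometric-mean form; other values use the generalized-mean form.
    """
    n = len(values)
    mean = sum(values) / n
    if epsilon == 1.0:
        geo_mean = math.exp(sum(math.log(v) for v in values) / n)
        return 1.0 - geo_mean / mean
    power = 1.0 - epsilon
    ede = (sum(v ** power for v in values) / n) ** (1.0 / power)
    return 1.0 - ede / mean


def bimodality_coefficient(values):
    """(skewness^2 + 1) / kurtosis, using population moments (simplified form).

    Values above 5/9 (the value for a uniform distribution) hint at
    bimodality. Undefined for constant input (zero variance).
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2  # non-excess kurtosis
    return (skew ** 2 + 1) / kurt


print(normalized_entropy(["a", "b", "a", "b"]))
print(atkinson_index([1, 1, 1, 9], epsilon=0.5))
print(bimodality_coefficient([0, 0, 0, 0, 1, 1, 1, 1]))
```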

Bivariate Statistics:

  • Pearson's correlation - Linear correlation coefficient
  • Spearman's rank correlation - Monotonic relationship measure
  • Kendall's tau - Concordance-based correlation
  • Covariance - Joint variability measure
  • Mutual Information - Information-theoretic dependency
  • Normalized Mutual Information - Scaled mutual information (0-1)
  • Multi-dataset joins - --join-inputs for bivariate analysis ACROSS datasets
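
The bivariate measures above can be sketched in plain Python as follows. This is an illustrative sketch based on textbook definitions, not qsv's implementation: the Kendall variant shown is tau-a (no tie correction), the mutual information treats values as discrete categories, and the normalization by sqrt(H(X)·H(Y)) is one of several conventions. Which variants moarstats uses is not stated here.

```python
import math
from collections import Counter


def covariance(xs, ys):
    """Sample covariance (n - 1 denominator)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


def pearson(xs, ys):
    """Pearson's r; undefined if either input has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of pairs. O(n^2)."""
    n = len(xs)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (xs[i] - xs[j]) * (ys[i] - ys[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)


def entropy(values):
    """Shannon entropy (natural log) of a discrete sequence."""
    n = len(values)
    return -sum((c / n) * math.log(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """Mutual information (natural log) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def normalized_mutual_information(xs, ys):
    """MI scaled to 0-1 by sqrt(H(X) * H(Y)); 0.0 if either is constant."""
    hx, hy = entropy(xs), entropy(ys)
    if hx == 0 or hy == 0:
        return 0.0
    return mutual_information(xs, ys) / math.sqrt(hx * hy)


xs = [1, 2, 3, 4, 5]
print(pearson(xs, [2, 4, 6, 8, 10]), spearman(xs, [1, 8, 27, 64, 125]))
print(kendall_tau_a(xs, [5, 4, 3, 2, 1]))
```

Pearson captures only linear association, Spearman and Kendall capture monotonic association, and mutual information can pick up dependencies that are neither.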

XSD Type Mapping:

  • Automatic inference of W3C XML Schema Definition (XSD) datatypes
  • Smart XSD Gregorian date type inferencing with "quick" and "thorough" modes (#3259)
  • Support for gYear, gMonth, gDay, gMonthDay, gYearMonth validation
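
As a rough illustration of what distinguishing these Gregorian types involves, the Python sketch below matches values against the lexical forms defined in the W3C XSD specification. It is an assumption-laden sketch, not qsv's inferencing logic: the function name `infer_gregorian_type` is hypothetical, and it says nothing about how the "quick" and "thorough" modes work.

```python
import re

# Building blocks, following the XSD lexical representations.
TZ = r"(?:Z|[+-](?:(?:0[0-9]|1[0-3]):[0-5][0-9]|14:00))?"  # optional timezone
YEAR = r"-?(?:[1-9][0-9]{3,}|0[0-9]{3})"                   # at least 4 digits
MONTH = r"(?:0[1-9]|1[0-2])"
DAY = r"(?:0[1-9]|[12][0-9]|3[01])"

PATTERNS = {
    "gYearMonth": re.compile(rf"{YEAR}-{MONTH}{TZ}"),   # e.g. 2026-01
    "gYear": re.compile(rf"{YEAR}{TZ}"),                # e.g. 2026
    "gMonthDay": re.compile(rf"--{MONTH}-{DAY}{TZ}"),   # e.g. --01-06
    "gMonth": re.compile(rf"--{MONTH}{TZ}"),            # e.g. --01
    "gDay": re.compile(rf"---{DAY}{TZ}"),               # e.g. ---06
}


def infer_gregorian_type(value: str):
    """Return the XSD Gregorian type whose lexical form matches, else None.

    Hypothetical helper. Only the lexical shape is checked; a month/day
    combination such as --02-31 is not rejected by this sketch.
    """
    for name, pattern in PATTERNS.items():
        if pattern.fullmatch(value):
            return name
    return None


for sample in ["2026", "2026-01", "--01-06", "--01", "---06", "2026-01-06"]:
    print(sample, infer_gregorian_type(sample))
```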

See STATS_DEFINITIONS.md for a comprehensive list of the ~100 statistical metrics qsv compiles!


Breaking Changes

  • lens: Default behavior changed to NOT stream from stdin (use explicit flag if needed)
  • moarstats: Output now includes additional columns (xsd_type, bivariate stats)

Added

  • feat: qsv MCP server #3269
  • feat: MCP - expanded file selector for more supported tabular file formats; auto-index files larger than 10 MB #3278
  • feat: added Claude Agent Skills SDK support 🤖 #3264
  • feat: moarstats add "xsd_type" column #3242
  • feat: moarstats add Atkinson Index with configurable inequality aversion parameter, Normalized Entropy & Bimodality Coefficient #3243
  • feat: moarstats add bivariate stats #3247
  • feat: moarstats add normalized mutual info #3256
  • feat: moarstats add --force and --jobs options #3253
  • feat: moarstats add "xsd_subtype" Gregorian date data types inferencing with --xsd-gdate-scan having fast (default) and comprehensive modes #3259
  • feat: qsvdp enable join command that moarstats uses #3252
  • docs: added comprehensive stats documentation #3240

Changed

  • refactor: describegpt - consolidate JSON response parsing; cache handling; and make DuckDB & Polars error handling more consistent #3241
  • refactor: frequency reduce duplication introduced by --weight option #3236
  • perf: frequency precompute other_prefix for performance 2dc75ee
  • perf: frequency simplify apply_limits* helper functions f0b7f9c
  • perf: pivotp convert directly to PlSmallStr for performance b7dbb3f
  • refactor: optimize MCP Server for Local Access to Files #3272
  • refactor: MCP Server improvements #3274
  • refactor: MCP Server remove examples from ci tests #3277
  • refactor: MCP Server add LIFO converted cache #3280
  • refactor: MCP Server moar refactoring after tests #3282
  • perf: moarstats much faster bivariate calculation #3248
  • perf: moarstats optimize non-streaming bivariate stats compilation #3250
  • refactor: qsv Skills Agent #3267
  • deps: polars bump to rev c241260 #3276
  • build(deps): bump itoa from 1.0.16 to 1.0.17 by @dependabot[bot] in #3239
  • build(deps): bump human-panic from 2.0.4 to 2.0.5 by @dependabot[bot] in #3234
  • build(deps): bump human-panic from 2.0.5 to 2.0.6 by @dependabot[bot] in #3249
  • build(deps): bump libc from 0.2.178 to 0.2.179 by @dependabot[bot] in #3265
  • build(deps): bump redis from 1.0.1 to 1.0.2 by @dependabot[bot] in #3232
  • build(deps): bump rfd from 0.16.0 to 0.17.0 by @dependabot[bot] in #3279
  • build(deps): bump rfd from 0.17.0 to 0.17.1 by @dependabot[bot] in #3284
  • build(deps): bump serde_json from 1.0.147 to 1.0.148 by @dependabot[bot] in #3238
  • build(deps): bump serial_test from 3.2.0 to 3.3.0 by @dependabot[bot] in #3273
  • build(deps): bump serial_test from 3.3.0 to 3.3.1 by @dependabot[bot] in #3275
  • build(deps): bump tokio from 1.48.0 to 1.49.0 by @dependabot[bot] in #3266
  • build(deps): bump url from 2.5.7 to 2.5.8 by @dependabot[bot] in #3286
  • build(deps): numerous zmij bumps, from 0.1.7 to 1.0.12
  • bumped several indirect dependencies
  • applied select clippy & Codacy suggestions
  • applied several GH Copilot and Claude review suggestions

Fixed

  • fix: refresh_cpu_all() -> refresh_cpu_list(sysinfo::CpuRefreshKind::nothing())… #3261
  • fix: stats remove redundant check 0977ebf
  • fix: moarstats correct kendall_tau formula cf16543
  • fix: describegpt and util::run_qsv_cmd - add special case for sample as it expects output differently 6b6039f
  • fix: CVE-2025-66414 security vulnerability GHSA-w48q-cv73-mx4w
  • fix: RUSTSEC-2026-0001 (rkyv bump) c2d4937
  • typo: Portugese → Portuguese
  • typo: stats asummes → assumes

AI Contributors

  • @jqnatividad collaborated with and orchestrated @Copilot, Claude Code, Cursor and Gemini using various models

Full Changelog: 12.0.0...13.0.0
