A lot of work went into `sniff` to make it not just a CSV dialect detector, but a general-purpose file type detector leveraging 🪄 magic ✨, able to detect mime types even for files at URLs.
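As a rough illustration of what magic-based detection means, the sketch below matches the leading bytes of a file against a few well-known signatures. It is not how `sniff` or libmagic is implemented: the `SIGNATURES` table and the `guess_mime` helper are hypothetical names introduced here, and the text fallback is a deliberately crude heuristic.

```python
# Illustrative only: a tiny signature table, not libmagic's real database.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"PK\x03\x04", "application/zip"),
    (b"\x1f\x8b", "application/gzip"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
]


def guess_mime(head: bytes) -> str:
    """Guess a mime type from the first bytes of a file (hypothetical helper)."""
    for magic, mime in SIGNATURES:
        if head.startswith(magic):
            return mime
    # Crude fallback (an assumption of this sketch): a NUL byte suggests binary data.
    if b"\x00" in head:
        return "application/octet-stream"
    return "text/plain"


print(guess_mime(b"%PDF-1.7\n"))       # leading bytes of a PDF
print(guess_mime(b"id,name\n1,a\n"))   # leading bytes of a CSV
```

Only the first few bytes are needed, which is why this kind of check also works on a partial download from a URL.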
`sniff` can now also use the same data types as `stats` with the `--stats-types` option. This was primarily done to support metadata collection when registering CKAN resources, not only during data entry but also when checking resource links for bitrot and when harvesting metadata from other systems, so `stats` and `sniff` can be used interchangeably depending on the response-time requirement and the data quality of the data source.
For example, `sniff` can quickly infer metadata by downloading just a small sample of a very large data file DURING data entry ("Resource-first upload workflow"), with `stats` used later, when the data is actually pushed to the Datastore with Datapusher+. At that point data type inferences need to be guaranteed, so the entire file must be scanned.
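The trade-off between a quick sample and a full scan can be shown with a small, self-contained sketch. This is not qsv's inference logic: `infer_type` and `infer_columns` are hypothetical helpers, and the type names are only chosen to resemble those a stats-style report might use.

```python
import csv
import io
from itertools import islice


def infer_type(values):
    """Return the narrowest of Integer, Float, String that fits every non-empty value."""
    kind, seen = "Integer", False
    for v in values:
        if v == "":
            continue  # empty cells do not influence the inferred type
        seen = True
        try:
            int(v)
            continue
        except ValueError:
            pass
        try:
            float(v)
            kind = "Float"  # widen Integer to Float
        except ValueError:
            return "String"  # nothing is wider than String, so stop early
    return kind if seen else "NULL"


def infer_columns(csv_text, sample_rows=None):
    """Infer a type per column from the first `sample_rows` rows (None = scan all)."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = list(islice(reader, sample_rows))
    return {name: infer_type(row[i] for row in rows) for i, name in enumerate(header)}


data = "id,amount\n1,10\n2,20\n3,30.5\n4,n/a\n"
print(infer_columns(data, sample_rows=2))  # fast, but only sees the first two rows
print(infer_columns(data))                 # slower, sees every row
```

With this input the two-row sample reports `amount` as Integer, while the full scan reports String because of the `n/a` in the last row. That is the reason a sample is good enough for a quick preview at data entry but not for the types that are finally pushed to the Datastore.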
Added
- `stats`: add `--infer-boolean` option #967
- `sniff`: add `--stats-types` option #968
- `sniff`: add magic mime-type detection on Linux #970
- `sniff`: add `--user-agent` option bd0bf78
- `sniff`: add last_modified info ef68bff
Changed
- make `--envlist` option allocator-aware f3566dc
- Bump serde from 1.0.160 to 1.0.162 by @dependabot in #962
- Bump robinraju/release-downloader from 1.7 to 1.8 by @dependabot in #960
- Bump flexi_logger from 0.25.3 to 0.25.4 by @dependabot in #965
- Bump sysinfo from 0.28.4 to 0.29.0 by @dependabot in #966
- Bump jql-runner from 6.0.6 to 6.0.7 by @dependabot in #969
- Bump polars from 0.28.0 to 0.29.0 by @dependabot in #971
- apply select clippy recommendations
- cargo update bump indirect dependencies
- change MSRV to 1.69.0
- pin Rust nightly to 2023-05-07
Fixed
- `sniff`: make sniff give more consistent results #958. Fixes #956
- Bump qsv-sniffer from 0.8.3 to 0.9.1. Replaced all assert with proper error-handling. #961 a7c607a 43d7eaf
- `sniff`: fixed rowcount calculation when sniffing a URL and the entire file was actually downloaded ef68bff
Full Changelog: 0.101.0...0.102.0