Improvements
- Support for writing based on Arrow as the transfer mechanism of the data
from Python to GDAL (requires GDAL >= 3.8). This is provided through the
new pyogrio.raw.write_arrow function, or by using the use_arrow=True
option in pyogrio.write_dataframe (#314, #346).
- Add support for
fids filter to read_arrow and open_arrow, and to
read_dataframe with use_arrow=True (#304).
- Add some missing properties to
read_info, including layer name, geometry name
and FID column name (#365).
- read_arrow and open_arrow now provide
GeoArrow-compliant extension metadata,
including the CRS, when using GDAL 3.8 or higher (#366).
- The
open_arrow function can now be used without a pyarrow dependency. By
default, it will now return a stream object implementing the
Arrow PyCapsule Protocol (i.e. having an __arrow_c_stream__ method).
This object can then be consumed
by your Arrow implementation of choice that supports this protocol. To keep
the previous behaviour of returning a pyarrow.RecordBatchReader, specify
use_pyarrow=True (#349).
- Warn when reading from a multilayer file without specifying a layer (#362).
- Allow writing to a new in-memory datasource using an io.BytesIO object (#397).
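
The Arrow-related items above can be sketched roughly as follows. This is an illustration only, not taken from the pyogrio documentation: the helper names are hypothetical, consuming the stream with pa.table() assumes pyarrow >= 14, and passing both driver and layer when writing to io.BytesIO is an assumption made to stay on the safe side.

```python
import io


def supports_arrow_stream(obj) -> bool:
    """Return True if obj exposes the Arrow PyCapsule stream method."""
    return callable(getattr(obj, "__arrow_c_stream__", None))


def read_as_pyarrow_table(path, **kwargs):
    """Hypothetical helper: read a layer via open_arrow, then hand the
    stream to pyarrow. Extra keyword arguments (e.g. fids=[...]) are
    forwarded to open_arrow."""
    # Local imports keep supports_arrow_stream usable without these packages.
    import pyarrow as pa
    from pyogrio.raw import open_arrow

    with open_arrow(path, **kwargs) as (meta, stream):
        # With the default use_pyarrow=False, the stream only promises the
        # __arrow_c_stream__ method, not a pyarrow type.
        if not supports_arrow_stream(stream):
            raise TypeError("expected an Arrow PyCapsule stream")
        # Assumption: pa.table() accepts PyCapsule stream producers
        # (pyarrow >= 14). Consume the stream before the context exits.
        table = pa.table(stream)
    return meta, table


def write_with_arrow(df, path):
    """Hypothetical helper: write a GeoDataFrame using Arrow as the
    transfer mechanism (requires GDAL >= 3.8)."""
    import pyogrio

    pyogrio.write_dataframe(df, path, use_arrow=True)


def write_to_memory(df, driver="GPKG", layer="data"):
    """Hypothetical helper: write to a new in-memory datasource and
    return the raw bytes."""
    import pyogrio

    buffer = io.BytesIO()
    # There is no file extension to infer the format from, so the driver
    # is given explicitly; the layer name here is an arbitrary choice.
    pyogrio.write_dataframe(df, buffer, driver=driver, layer=layer)
    return buffer.getvalue()
```

Any other library that understands the Arrow PyCapsule Protocol could stand in for pyarrow in read_as_pyarrow_table; pyarrow is used only because it is the most common consumer.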
Bug fixes
- Fix error in write_dataframe if input has a date column and
non-consecutive index values (#325).
- Fix encoding issues on Windows for some formats (e.g. ".csv") and always write ESRI
Shapefiles using UTF-8 by default on all platforms (#361).
- Raise exception in
read_arrow or read_dataframe(..., use_arrow=True) if
a boolean column is detected due to error in GDAL reading boolean values for
FlatGeobuf / GPKG drivers (#335, #387); this has been fixed in GDAL >= 3.8.3.
- Properly ignore fields not listed in
the columns parameter when reading from
the data source without using the Arrow API (#391).
- Properly handle decoding of ESRI Shapefiles with user-provided
encoding option for read, read_dataframe, and open_arrow, and correctly encode
Shapefile field names and text values to the user-provided encoding for
write and write_dataframe (#384).
- Fix bug preventing reading from bytes or file-like objects in
read_arrow / open_arrow (#407).
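
One way to avoid the boolean-column exception noted above is to enable Arrow only when the installed GDAL includes the fix. The following is a minimal sketch, not an official recommendation: the helper names are hypothetical, and it assumes pyogrio.__gdal_version__ is a tuple of integers such as (3, 8, 5).

```python
def arrow_reads_booleans_safely(gdal_version) -> bool:
    """Return True if this GDAL version has the boolean fix (>= 3.8.3)."""
    return tuple(gdal_version) >= (3, 8, 3)


def read_with_bool_fallback(path, **kwargs):
    """Hypothetical helper: use the Arrow path only when GDAL reads
    boolean columns correctly, else fall back to the non-Arrow reader."""
    import pyogrio

    # Assumption: __gdal_version__ is a tuple of ints, e.g. (3, 8, 5).
    use_arrow = arrow_reads_booleans_safely(pyogrio.__gdal_version__)
    return pyogrio.read_dataframe(path, use_arrow=use_arrow, **kwargs)
```

The same keyword pass-through could carry the encoding option for Shapefiles, for example read_with_bool_fallback(path, encoding="cp1252"), where the path and the encoding are placeholders.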
Packaging
- The GDAL library included in the wheels is updated from 3.7.2 to GDAL 3.8.5.
Potentially breaking changes
- Using a where expression combined with a list of columns that does not include
the column referenced in the expression is not recommended and will now
return results based on driver-dependent behavior, which may include either
returning empty results (even if non-empty results are expected from the where
parameter) or raising an exception (#391). Previous versions of pyogrio incorrectly
set ignored fields against the data source, allowing it to return non-empty
results in these cases.
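
A defensive pattern for the change above is to always request the columns used in the where expression and drop them afterwards. This is a sketch under stated assumptions: both helper names are hypothetical, and the caller is expected to name the columns referenced by the expression, since nothing here parses the SQL.

```python
def columns_including(columns, where_columns):
    """Return columns plus any where_columns that are missing, in order."""
    merged = list(columns)
    for name in where_columns:
        if name not in merged:
            merged.append(name)
    return merged


def read_filtered(path, columns, where, where_columns):
    """Hypothetical helper: read with a where filter while making sure the
    filtered-on columns are part of the request."""
    import pyogrio

    df = pyogrio.read_dataframe(
        path,
        columns=columns_including(columns, where_columns),
        where=where,
    )
    # Drop the columns that were requested only to support the filter.
    extra = [name for name in where_columns if name not in columns]
    return df.drop(columns=extra)
```

For example, read_filtered(path, ["name"], "pop > 1000", ["pop"]) would request both name and pop from the data source and return only name plus the geometry; the column names are placeholders.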