github pgcentralfoundation/pgrx v0.5.0-beta.1

Pre-release, 19 months ago

This is a pilot release for 0.5.0 and no further functional changes are intended from now until the 0.5.0 release. Bikeshedding, renaming, and purely architectural/internal rearrangements may still occur, but the only things anyone should have to change would be within the scope of simple find-replace rules. Bugfixes and other non-functional changes will be merged. We do not intend to merge any additional new features.

For newer users

Code written using pgx can now use pgx::prelude::*; instead of use pgx::*; and in many cases this will make the extension compile with only a few additional explicit imports. Existing extensions may require extra effort and may want to approach this more incrementally.

Upgrading

Most of the transition should be smooth, except for certain extensions. You will need to reinstall cargo pgx first: everything regarding SQL generation changed internally, and cargo pgx is the binary that handles most of the actual SQL compilation. We now recommend using cargo install --locked cargo-pgx --version 0.5.0-beta.1 for this.

There are three major pain points you may immediately encounter: code that directly handles datums, date/time types, and set-returning or table-returning functions. But after that compiles, you will also want to address any soundness or behavior questions that might arise regarding Arrays in your code.

Handling Datums

Since 0.5.0-beta.0, pg_sys::Datum is no longer type Datum = usize;, but a "newtype": a struct wrapping a pointer, which marks it as an opaque type. This is for correctness and to better represent the unique nature of the type with respect to Postgres. A fair amount of your code may have relied on T as pg_sys::Datum or datum as T working. Certainly some code in pgx did, resulting in patterns like:

// generating fresh arrays for data
let mut datums: Vec<pg_sys::Datum> = vec![0; N];
let mut nulls = vec![false; N];

// treating a datum as a pass-by-value type
let index = datum as i32;

// treating a datum as a pointer type
let varlena = datum as *mut pg_sys::varlena;

These casts no longer compile. To make the transition easy, From<T> for pg_sys::Datum is implemented for everything it is reasonable to convert directly into it in a "by-value" fashion. Use IntoDatum for anything more complex. To get the usize hiding inside Datum out, use Datum::value(self) -> usize; to cast it to a pointer, use Datum::cast_mut_ptr::<T>(self) -> *mut T.

// (getting|putting) data (from|into) Postgres as a datum array
let mut datums = vec![pg_sys::Datum::from(0); N];
let mut nulls = vec![false; N];

// treating a datum as a pass-by-value type
let index = datum.value() as i32;

// treating a datum as a pointer type
let varlena = datum.cast_mut_ptr::<pg_sys::varlena>();

Because a pg_sys::Datum is a union of types, only certain traits that are implemented on usize are also implemented on Datum, as it was deemed safest to limit implementations to those that are also valid if Datum is a pointer or float. This can induce code to reveal any assumptions it was making.
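To make the shape of the newtype concrete, here is a minimal, self-contained sketch that depends only on the standard library. The Datum struct below is a simplified stand-in written for illustration, not the pgx definition; only the method names value and cast_mut_ptr are taken from the text above, and the helper functions round_trip_i32 and round_trip_ptr are hypothetical.

```rust
use std::ffi::c_void;

/// Simplified stand-in for `pg_sys::Datum` (an assumption for illustration):
/// an opaque, pointer-sized newtype that cannot be used with a bare `as` cast.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Datum(*mut c_void);

impl Datum {
    /// Returns the pointer-sized integer stored inside the datum.
    pub fn value(self) -> usize {
        self.0 as usize
    }

    /// Reinterprets the datum as a mutable pointer to `T`.
    pub fn cast_mut_ptr<T>(self) -> *mut T {
        self.0.cast::<T>()
    }
}

impl From<usize> for Datum {
    fn from(v: usize) -> Self {
        Datum(v as *mut c_void)
    }
}

impl From<i32> for Datum {
    fn from(v: i32) -> Self {
        // Sign-extend to pointer width before storing the bits.
        Datum::from(v as isize as usize)
    }
}

impl<T> From<*mut T> for Datum {
    fn from(p: *mut T) -> Self {
        Datum(p.cast::<c_void>())
    }
}

/// Hypothetical helper: pass-by-value round trip through the newtype.
pub fn round_trip_i32(n: i32) -> i32 {
    Datum::from(n).value() as i32
}

/// Hypothetical helper: pointer round trip through the newtype.
pub fn round_trip_ptr(p: *mut u64) -> *mut u64 {
    Datum::from(p).cast_mut_ptr::<u64>()
}
```

Because the wrapper is a struct rather than a type alias, every conversion has to go through an explicit From impl or accessor, which is what surfaces hidden assumptions at compile time.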

If you were an early upgrader to 0.5.0-beta.0: First, thanks for trying things out! Second, you may have used Datum::ptr_cast: it has the new name of cast_mut_ptr. Datum::to_void was simply dropped as it is unlikely to be what you want, and if it is, cast_mut_ptr::<c_void> works.

Date/Time types

Previously, pgx leaned on the excellent support for dates and times provided by the time crate. Unfortunately, that also meant that we ran into problems like "not representing all time values that Postgres understands". It also meant using heavy conversions when touching these types, even to just copy them out of Postgres. For example, constructing a Vec<Timestamp> from a pg_sys::ArrayType that was a handle to a varlena of pg_sys::Timestamp required a large amount of switching between Rust and Postgres during control flow. To allow fixing these performance and correctness problems, pgx now implements these types in terms of the Postgres representations in Rust code.

This also means that all the {Date,Time,Timestamp}{,WithTimeZone}::new functions are now deprecated.

For an easier transition, {,Try}{Into,From} conversions for types in the time crate are available with pgx = { version = "0.5.0-beta.1", features = ["time-crate"] }.

With thanks to @mhov, @workingjubilee, and @jsaied99 for their contributions.

Set/Table-returning functions

Functions that previously would be RETURNS SETOF or RETURNS TABLE used to return impl Iterator. Unfortunately, not being able to name the type of the iterator greatly complicated pgx 0.5.0's redesign for SQL generation led by @Hoverbear and @workingjubilee. pgx::iter::SetOfIterator and pgx::iter::TableIterator now need to be used instead to return these values. In all existing cases you should be able to simply wrap the previous expression in {SetOf,Table}Iterator::new, but for cases where you are returning a single row, TableIterator::once makes for a sweeter alternative.

For example, this:

#[pg_extern]
fn fluffy() -> impl Iterator<Item = i32> {
    vec![1, 2, 3].into_iter()
}

Becomes this:

#[pg_extern]
fn fluffy() -> SetOfIterator<'static, i32> {
    SetOfIterator::new(vec![1, 2, 3].into_iter())
}
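To illustrate why a nameable wrapper helps, here is a standard-library-only sketch of the general technique: boxing an arbitrary iterator behind a concrete struct so the return type can be written out. NamedIter is a hypothetical stand-in and is not the pgx definition of SetOfIterator or TableIterator; its new and once constructors only mirror the names mentioned above.

```rust
/// Hypothetical stand-in for a nameable iterator wrapper (NOT the pgx type).
/// Boxing the inner iterator erases its unnameable concrete type.
pub struct NamedIter<'a, T> {
    inner: Box<dyn Iterator<Item = T> + 'a>,
}

impl<'a, T> NamedIter<'a, T> {
    /// Wraps any iterator that yields `T` and lives at least as long as `'a`.
    pub fn new<I>(iter: I) -> Self
    where
        I: Iterator<Item = T> + 'a,
    {
        NamedIter { inner: Box::new(iter) }
    }

    /// Convenience constructor for the single-item case.
    pub fn once(value: T) -> Self
    where
        T: 'a,
    {
        Self::new(std::iter::once(value))
    }
}

impl<'a, T> Iterator for NamedIter<'a, T> {
    type Item = T;

    fn next(&mut self) -> Option<T> {
        // Delegate to the boxed iterator.
        self.inner.next()
    }
}

/// The return type can now be spelled out instead of `impl Iterator`.
pub fn fluffy() -> NamedIter<'static, i32> {
    NamedIter::new(vec![1, 2, 3].into_iter())
}
```

The trade-off in this sketch is one heap allocation and dynamic dispatch per iterator, in exchange for a type that macros and signatures can refer to by name.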

Array soundness issues

There were several soundness issues regarding interactions between the several ways of constructing pgx::datum::Array<'a, T>, Array::as_slice, and impl<'a, T> Drop for Array<'a, T>. While the critical soundness issue for Drop has been fixed and the resulting leak plugged, these have resulted in the deprecation of Array::over and Array::as_slice.

Array::as_slice allowed wildly incorrect slices, or viewing data Postgres had marked as "null" in the SQL sense. Instead, this function now panics if either of these conditions would be met, or returns a newly-correct slice for simple data types. Because the usually-trivial fn as_slice panicking is rarely expected, this function is deprecated.
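The following standard-library-only sketch shows one way a plain slice view can disagree with the logical contents of an array that contains nulls. PackedArray is a hypothetical, heavily simplified model (a null flag per element plus densely packed non-null values); it is an assumption made for illustration and is neither the Postgres ArrayType layout nor any pgx type.

```rust
/// Hypothetical model: null flags stored separately from packed values.
pub struct PackedArray {
    /// One flag per logical element; `true` means SQL NULL.
    nulls: Vec<bool>,
    /// Only the non-null values, in logical order.
    data: Vec<i32>,
}

impl PackedArray {
    /// Builds the packed form from a logical list of optional values.
    pub fn from_options(items: &[Option<i32>]) -> Self {
        PackedArray {
            nulls: items.iter().map(|i| i.is_none()).collect(),
            data: items.iter().flatten().copied().collect(),
        }
    }

    /// Number of logical elements, including nulls.
    pub fn len(&self) -> usize {
        self.nulls.len()
    }

    /// Null-aware iteration: yields `None` for each SQL NULL.
    pub fn iter(&self) -> impl Iterator<Item = Option<i32>> + '_ {
        let mut values = self.data.iter().copied();
        self.nulls
            .iter()
            .map(move |&is_null| if is_null { None } else { values.next() })
    }

    /// Returns a slice only when it matches the logical contents,
    /// that is, when no element is null.
    pub fn try_as_slice(&self) -> Option<&[i32]> {
        if self.nulls.iter().any(|&n| n) {
            None
        } else {
            Some(&self.data)
        }
    }
}
```

In this model, a slice over data for [1, NULL, 3] would have two elements rather than three, so an infallible as_slice has no correct answer; returning Option (or iterating over Option values) makes the null case explicit instead of panicking.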

Array::over allows constructing improper Arrays from pointers to random data that is probably controlled by Rust and not Postgres, and thus is data in a form Rust prefers, when Array is supposed to be a zero-copy type that handles ArrayType data in the form Postgres prefers. This makes it almost impossible for the underlying type to be useful for both cases without severely impacting correctness and performance.

Both expose implications about the way that Postgres represents data that are not always true. If you still want to interact with the underlying representation of a Postgres Array and you know the Array was correctly constructed, consider the new experimental pgx::array::RawArray type, created with support from bitvec. It is possible pgx::datum::Array<'a, T> may be replaced entirely in the future by a new type that exposes fewer assumptions about the internal representation.

New features and fixes

In general, PGX added significant improvements for working with uncommon use-cases or existing configurations, with thanks to @steve-chavez, @anth0nyleung, and @TimeToogo.

This includes a soundness fix for the way PGX handled symbols on macOS, thanks to @Smittyvb. This should fix a number of random crashes users were experiencing, but we can't tell for sure (by definition: UB!). Please give it a try and let us know if there are any issues remaining.

We also moved some code into a new crate: pgx-pg-config.

Better soundness

@willmurnane contributed several soundness and correctness improvements, allowing input functions to validate or reject input (now accepting Options).

BackgroundWorker functions may now panic if used from unregistered workers. Thanks to @EdMcBane for catching the previously unsound behavior.

Optional CREATE OR REPLACE FUNCTION

In 0.4.5, PGX switched to using CREATE FUNCTION by default, as this is nondestructive and more secure. However, that prevented migration beyond 0.4.4 for some users, as they relied on CREATE OR REPLACE for their extension update mechanism. You may now use #[pg_extern(create_or_replace)] to make a function that will use the CREATE OR REPLACE FUNCTION directive. Use caution and add this annotation only where necessary. It is not named unsafe_create_or_replace, but mostly because Rust's memory safety does not extend to all security concerns.

Test improvements

Thanks to @BradyBonnette, #[pg_test] is now compatible with normal test attributes like #[ignore] and #[should_panic], and it has much nicer output! We're looking to continue making improvements in this direction, so any user feedback is welcome!

Bindings and mappings added

We added support for the fdw_handler type.

For directly interfacing with Postgres C API, some new bindings were added:

Postgres 15 on the horizon

SpiOk is now #[non_exhaustive] to allow migration to Postgres 15 and beyond. We will likely add Postgres 15 support sometime shortly after the November release and also likely drop Postgres 10 near that time.
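As a reminder of what #[non_exhaustive] means for code that matches on such an enum, here is a small standard-library-only sketch. The Status enum and its variants are hypothetical and are not the actual variants of SpiOk; the point is only the wildcard arm.

```rust
/// Hypothetical status enum; the variants are illustrative only.
#[non_exhaustive]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Select,
    Insert,
    Update,
}

/// Describes a status without assuming the variant list is final.
pub fn describe(status: Status) -> &'static str {
    match status {
        Status::Select => "rows were read",
        Status::Insert => "rows were inserted",
        // Crates other than the one defining the enum are required to have
        // a wildcard arm, so variants added later do not break their build.
        _ => "some other successful outcome",
    }
}
```

If your extension matches on SpiOk without a wildcard arm, adding one is likely the only change needed here.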

Experimental features

You may have noticed the "plrust" and "postgrestd" features. These are highly experimental features for using PGX to back a very special configuration of Rust as a trusted procedural language: PL/Rust. It's best to assume these can induce unsound changes if they aren't combined with the rest of the configuration for the trusted language handler's building and running of functions.

New Contributors

Thanks!

Full Changelog: v0.5.0-beta.0...v0.5.0-beta.1
