github treeverse/lakeFS v0.40.0

3 years ago

Changelog

This is a big release for lakeFS with many notable improvements.

Some of these are breaking changes. It's always a tough decision to introduce a change that isn't backwards compatible,
but we felt that at this stage they represent a significant enough benefit to be worth it.

Going forward, our goal is to make as few of those as possible, as we near a 1.0.0 release.

Here are the most notable changes:

lakeFS is now OpenAPI 3.0 compliant ✨

The lakeFS API has been migrated from OpenAPI 2.0 to OpenAPI 3.0.

OpenAPI 3.0 includes many improvements over the previous version: cookie-based authentication, reusable query parameters, better JSON Schema support, and more.

While homegrown clients that simply use the lakeFS API as a REST interface will continue to work,
clients that relied on OpenAPI 2.0 specific behaviors will stop working.

This includes the previously recommended bravado-based client for Python. For that reason, we're also releasing an officially supported Python client:

lakeFS now ships with a native Python client ✨

It's now as simple as:

$ pip install lakefs-client~=0.40.0

And then:

import lakefs_client
from lakefs_client.client import LakeFSClient

lakefs = LakeFSClient(lakefs_client.Configuration(
    username='AKIAIOSFODNN7EXAMPLE', 
    password='...', 
    host='http://lakefs.example.com'))
    
lakefs.branches.list_branches(repository='my-repo')  # Or any other API action

This client is officially supported and distributed by the lakeFS team, and will be released in conjunction with lakeFS releases, so it should always align in capabilities with the latest lakeFS versions.

For more information, see the Python Client Documentation.

Native Spark client for exporting a commit (or set of commits) to another object store ✨

Using Apache Spark, lakeFS users can now quickly export the contents of a branch to an external location (say, an S3 bucket). Exporting committed data is parallelized across Spark workers to support copying millions of objects in minutes.

This is the first feature released based on lakeFS' Spark integration (soon to be followed by data retention for stale objects and native lakefs:// filesystem support for Spark).

For more information, see the Export Job configuration Documentation.

lakeFS standardized URIs ✨

The lakeFS CLI now supports a standardized URI in the form: lakefs://<repository>/<ref>/<path>.
Additionally, the CLI now allows setting a $LAKECTL_BASE_URI environment variable that, if set, will prefix any relative URI used.

For example, instead of:

$ lakectl diff lakefs://my-repository/my-branch lakefs://my-repository/main
$ lakectl fs ls lakefs://my-repository/my-branch/path/

It's now possible to simply do:

$ export LAKECTL_BASE_URI="lakefs://my-repository/"
$ lakectl diff my-branch main
$ lakectl fs ls my-branch/path/
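
To make the prefixing rule concrete, here is a minimal Python sketch of how a relative URI could be resolved against a base URI and then split into its parts. It is an illustration only, not lakectl's actual implementation: the names LakeFSURI, resolve and parse are hypothetical, and it assumes the base URI is simply prepended to anything that does not already start with lakefs://.

```python
from typing import NamedTuple, Optional

LAKEFS_SCHEME = "lakefs://"


class LakeFSURI(NamedTuple):
    """Parts of a lakefs://<repository>/<ref>/<path> URI (hypothetical helper type)."""
    repository: str
    ref: Optional[str]
    path: Optional[str]


def resolve(uri: str, base_uri: str = "") -> str:
    """Prefix a relative URI with base_uri; absolute lakefs:// URIs pass through.

    Assumption: plain string prefixing, mirroring the $LAKECTL_BASE_URI description.
    """
    if uri.startswith(LAKEFS_SCHEME):
        return uri
    return base_uri + uri


def parse(uri: str) -> LakeFSURI:
    """Split an absolute lakefs:// URI into repository, ref and path."""
    if not uri.startswith(LAKEFS_SCHEME):
        raise ValueError(f"not a lakeFS URI: {uri!r}")
    # At most three parts: repository, ref, and the remainder as the path.
    parts = uri[len(LAKEFS_SCHEME):].split("/", 2)
    repository = parts[0]
    if not repository:
        raise ValueError(f"missing repository in URI: {uri!r}")
    ref = parts[1] if len(parts) > 1 and parts[1] else None
    path = parts[2] if len(parts) > 2 else None
    return LakeFSURI(repository, ref, path)
```

For instance, parse(resolve("my-branch/path/", "lakefs://my-repository/")) yields repository "my-repository", ref "my-branch" and path "path/", the same target as the fully qualified URI in the first example.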

For more information, see the CLI Command Reference Documentation.

Complete UI Overhaul

The UI is now faster and more responsive, with many improvements to pagination, commit browsing and action views.

UI Screenshot

Full Feature list

  • [UI] Complete UI overhaul 💅 (#1766)
  • [Spark] Spark client that allows exporting from lakeFS to an object store ✨ (#1658)
  • [Metastore] Support metastore copy between two different hive metastores ✨ (#1704)
  • [API Gateway] BREAKING: Migrated to OpenAPI 3.0 💣 (#1667)
  • [Python SDK] Native lakeFS Python Client ✨ (#1725)
  • [Graveler] BREAKING: commit parents order for merge-commits is now [destination, source] instead of [source, destination] 💣 (#1754)
  • [CLI] BREAKING: lakefs:// URIs are now standard, replacing @ with / to denote ref 💣 (#1717)
  • [CLI] $LAKECTL_BASE_URI prefixes all lakectl URIs for a more human-friendly CLI 🥰 (#1717)
  • [CLI] Support non-seekable stdin (- arg) in "fs upload" command 🥰 (#1672)
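
For scripts that still hold old-style URIs, the breaking CLI change above (replacing @ with / to denote the ref) can be handled with a small rewrite step. The following Python sketch is an illustration under an assumption: it takes the legacy shape to be lakefs://<repository>@<ref>/<path>, and the helper name to_standard_uri is hypothetical, not part of lakectl or the Python client.

```python
import re

# Assumed legacy shape: lakefs://<repository>@<ref>/<path>
# The repository part may not contain "/" or "@", so an "@" that appears
# later in an object path is left untouched.
_LEGACY_URI = re.compile(r"^lakefs://([^/@]+)@(.*)$")


def to_standard_uri(uri: str) -> str:
    """Rewrite a legacy '@'-style URI to the standard '/'-style form.

    URIs that are already in the standard form are returned unchanged.
    """
    match = _LEGACY_URI.match(uri)
    if match is None:
        return uri
    repository, rest = match.groups()
    return f"lakefs://{repository}/{rest}"
```

Under that assumption, to_standard_uri("lakefs://my-repository@my-branch/path/") returns "lakefs://my-repository/my-branch/path/", while a URI that is already standard passes through as-is.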

Bug Fixes

  • [S3 Gateway] Avoid logging v2 sigs on failure 🔒 (#1679)
  • [Graveler] Limit length of Graveler serialization 🐞 (#1682)
  • [Graveler] Fix merge large changes performance (#1652)
  • [S3 Gateway] Handle no path for delete objects in gateway 🐞 (#1708)
  • [API Gateway] API merge message is optional 🐞 (#1710)
  • [API Gateway] Fix auth pagination 🐞 (#1755)
  • [API Gateway] List repository actions should not check branch existence 🐞 (#1743)

As always, we hang around at #help on the lakeFS Slack to assist and answer questions!
