Bacalhau 1.0 Release: Featuring Private Clusters, Octostore, and Federated Learning
Today marks the launch of Bacalhau 1.0, the general availability (GA) release of the open source distributed compute platform. The project’s mission is to revolutionize the way organizations and developers harness the power of collaborative computing, and the GA release is an important milestone toward that goal. Since launching our beta release in November, the project has seen more than 3,000 commits from more than 30 contributors, with a release every two weeks. Additionally, customers like the New Atlantis Foundation, the City of Las Vegas, and the University of Maryland are executing hundreds of thousands of jobs every month on the public network. To read more about Bacalhau and try it out for yourself, go to https://bacalhau.org/.
Background
Distributed computing has long been recognized as a powerful approach for tackling large-scale, complex problems by harnessing the collective power of devices everywhere. However, developers face significant challenges in adopting it, including inefficient resource allocation, communication bottlenecks, and high barriers to entry for non-expert users.
But the time to address these issues is now. IDC predicts that by 2025 we will have generated more than 175 zettabytes of data, 50 times more than we do today. Yet the critical insights needed to make better decisions remain hidden behind distributed devices and storage.
(Re-)Introducing the Bacalhau Project
Bacalhau was created to address these challenges head-on through a platform designed from the ground up for the distributed world. Built by core members of the Kubernetes, Kubeflow, and Amazon Kinesis communities and by employees from Google, AWS, and Microsoft, Bacalhau provides a new way to build and use globally deployed applications and data that is familiar, highly scalable, and efficient. Further, because Bacalhau is open source and Apache 2.0/MIT licensed, the community is built to foster collaboration and innovation, allowing developers from around the world to contribute their expertise and continually improve the platform.
General Availability Release of Bacalhau
The GA release of Bacalhau includes the following features:
- Running Docker & WASM jobs, with GPU support
- Multi-architecture support - Intel, Apple Silicon (M1/M2), ARMv6 & ARMv7, AMD64
- Support for 1000+ nodes
- Running 10k+ jobs simultaneously
- 100 TB processing across many files
- Simplified private cluster setup
- Reading and writing from any S3-compatible data store
- Concurrency and confidence for parallel and verifiable job execution
- Log streaming for Docker and WASM jobs
- DAG execution through Project Amplify
- Job selection hooks (against binaries, HTTP endpoints, etc.)
- Throttled allow-list networking
- Python SDK
- Airflow executors
- OpenTelemetry tracing
- Swappable verification, execution and publisher systems
- Scheduling against node labels
- Great examples for getting started, including:
  - Running Python, Pandas, R, Rust, TensorFlow, PyTorch natively (or any custom container)
  - Running Jupyter Notebooks
  - Converting a CSV to Avro or Parquet
  - Reading simultaneously across many nodes from multiple S3 Buckets
  - Querying data using DuckDB
  - Processing Oceanographic Data
  - Converting Video Files
  - Running the Dolly 2.0 model with Hugging Face
  - Using YOLOv5 for Object Detection
  - Inferring using Stable Diffusion on a GPU
  - Performing OCR
  - Doing Speech Recognition
  - Running an OpenMM Molecular Model
  - Executing a Genomics Model
  - And lots more!
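To make the "concurrency and confidence" item in the feature list above a little more concrete, here is a minimal sketch of the general idea: run the same job on several nodes (concurrency) and accept an output only if enough of them agree (confidence). This is an illustration under our own assumptions, not Bacalhau's actual verification code; the function name `accept_result` and the idea of comparing output hashes are hypothetical and chosen only for the example.

```python
from collections import Counter
from typing import Optional, Sequence


def accept_result(result_hashes: Sequence[str], confidence: int) -> Optional[str]:
    """Return the output hash reported by at least `confidence` nodes, else None.

    Hypothetical helper for illustration; not part of any Bacalhau API.
    """
    if confidence < 1:
        raise ValueError("confidence must be at least 1")
    if not result_hashes:
        return None
    # Find the most commonly reported hash and how many nodes reported it.
    winner, votes = Counter(result_hashes).most_common(1)[0]
    return winner if votes >= confidence else None


# Assumed scenario: one job executed on 3 nodes (concurrency = 3),
# each node reporting a hash of the output it produced.
hashes = ["abc123", "abc123", "ffff00"]
print(accept_result(hashes, confidence=2))  # two nodes agree, so "abc123" is accepted
print(accept_result(hashes, confidence=3))  # not all three agree, so nothing is accepted
```

Raising the confidence threshold trades throughput for stronger assurance: more nodes must redo the same work before a result is trusted.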
Long Term Mission
Our long-term goal is to transform the way developers interact with the full breadth of computing and data resources available to them. Some of the features we have on the horizon include:
- A fully distributed computation platform that can run on any device, anywhere
- A declarative pipeline that can both run the data processing and record the lineage of the data
- A highly resilient system that can schedule across latency boundaries and deliver the reliability a global deployment needs, even over spotty network connectivity
- Secure, verifiable results whose integrity and reproducibility can be confirmed forever
But you tell us! We'd love to hear about new directions we may need to include.
How to Get Involved
We're looking for help in several areas. If you're interested in helping out, please reach out to us at any of the following locations: