github psviderski/uncloud v0.19.0

This release brings easier debugging of failed deploys, new troubleshooting commands (uc machine logs and uc machine rtt), and a nightly release channel.

Read on for the full list of changes and upgrade instructions.

Show logs from a failed deploy

Changes: 8afe523, 303c8e4, 56a54ab

When uc deploy fails because a pre-deploy hook failed or a new container couldn't become healthy, it now prints the last 10 (configurable) log lines from the failed container.

This is a quality-of-life improvement that saves you from running a follow-up uc logs to figure out what went wrong.

See Failed container logs for more details.

Filter service logs by container

Change: 2045819

uc logs now accepts a SERVICE/CONTAINER form where CONTAINER is a container name, full ID, or unique ID prefix. This is handy when one replica of a service is misbehaving and you want to look at it without the noise from the others:

uc logs web/2f60
# Mix and match
uc logs api/61d57fd3428f web/2f60 db
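
To illustrate what "unique ID prefix" means here, the following is a minimal sketch of prefix resolution written as a plain shell function. It is not Uncloud's implementation: resolve_prefix and the ID 2f60a1b2c3d4 are hypothetical, and how uc itself reports an ambiguous prefix is not described in these notes.

```shell
# Hypothetical helper, not Uncloud's code: resolve a container ID prefix
# against a list of known IDs. Prints the single matching ID, or fails
# when the prefix matches no ID or more than one.
resolve_prefix() {
    prefix=$1; shift
    match=""
    count=0
    for id in "$@"; do
        case $id in
            "$prefix"*) match=$id; count=$((count + 1)) ;;
        esac
    done
    if [ "$count" -eq 1 ]; then
        printf '%s\n' "$match"
    else
        echo "prefix '$prefix' matched $count containers" >&2
        return 1
    fi
}

# 2f60 matches exactly one of the (made-up) IDs, so it resolves.
resolve_prefix 2f60 2f60a1b2c3d4 61d57fd3428f
# 6 matches both IDs, so the lookup fails.
resolve_prefix 6 61d57fd3428f 6aa0b1c2d3e4 || echo "ambiguous"
```

In practice this means a short prefix such as 2f60 works as long as no other container in the service shares it.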

Stream machine logs

⚠️ This requires both uc and the daemon to be upgraded to v0.19.0

PRs: #282, #283. Thanks to @miekg for the contribution ❤️

You can now stream logs from the systemd services that run Uncloud itself on remote machines using the new uc machine logs command. It covers three main services:

  • uncloud - the Uncloud daemon
  • uncloud-corrosion - the Corrosion service providing the distributed cluster store
  • docker - the Docker daemon

For example, stream the Uncloud daemon logs from all machines with:

uc machine logs -f uncloud

This is useful for troubleshooting Uncloud operations without having to SSH into every machine and run journalctl yourself.

See uc machine logs for more details and examples.

Round-trip time between machines

⚠️ This requires both uc and the daemon to be upgraded to v0.19.0

PR: #226. Thanks to @jabr for the contribution ❤️

The new uc machine rtt command shows the round-trip time between every pair of machines in the cluster. This gives you a real-time view of how the mesh network is performing without needing to run manual ping tests.

The data is collected from Corrosion's gossip protocol, which samples latency between peers as part of its normal operation.

$ uc machine rtt
MACHINE     PEER        MEDIAN   STDDEV
machine-1   machine-2   140ms    ±19.4ms
machine-1   machine-3   39ms     ±1.1ms
machine-2   machine-1   168ms    ±18.5ms
machine-2   machine-3   203ms    ±42.3ms
machine-3   machine-1   40ms     ±2.0ms
machine-3   machine-2   158ms    ±15.2ms

A new RTT column has also been added to uc wg show to show the median round-trip time to each WireGuard peer.
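
Because uc machine rtt prints a plain table, it can be piped into standard tools. The sketch below lists only the pairs whose median exceeds a threshold. It assumes the four-column layout shown above and medians printed in milliseconds (for example 140ms); slow_links is a hypothetical helper, and values printed in other units would need extra handling.

```shell
# Hypothetical helper: print machine pairs whose median RTT exceeds a
# threshold given in milliseconds. Reads the rtt table from stdin.
slow_links() {
    threshold_ms=$1
    awk -v max="$threshold_ms" '
        NR == 1 { next }                 # skip the header row
        {
            median = $3
            sub(/ms$/, "", median)       # "140ms" -> "140"
            if (median + 0 > max) print $1, $2, $3
        }'
}

# Usage (requires a cluster): uc machine rtt | slow_links 150
```

With the sample output above and a threshold of 150, this would print the machine-2 to machine-1, machine-2 to machine-3, and machine-3 to machine-2 rows.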

Local machine upstreams first in Caddyfile

Change: 45cf87a

The generated Caddyfile now lists upstreams from the local machine first for each service.

On its own this doesn't change routing behaviour, because Caddy's default random load balancing policy ignores upstream order. But if you pair it with the first policy (lb_policy first) in a custom Caddy config, Caddy always prefers the same-host replica and only falls back to remote machines when the local one is unhealthy.

services:
  app:
    ...
    x-caddy: |
      example.com {
          reverse_proxy {{upstreams 8000}} {
              import common_proxy
              lb_policy first
          }
          log
      }

This saves a cross-machine WireGuard hop for every request that hits Caddy on a machine that already runs a replica of the target service. This is especially useful in a multi-region setup.

Nightly builds

PR: #308. Thanks to @tonyo for the contribution ❤️

Every push to main now produces a set of nightly binaries tagged as the nightly release on GitHub.

They're great for testing upcoming unreleased changes and reporting regressions early. They might be unstable, so please don't run them in production.

Install the nightly uc CLI locally:

curl -fsS https://get.uncloud.run/install.sh | VERSION=nightly sh

Initialise a cluster or add a machine with a nightly daemon:

uc machine init --version nightly user@host
uc machine add --version nightly user@host

Improvements

  • uc machine init/add now embed the install script into the uc binary and send it to the remote machine over the existing SSH connection instead of curl | bash (fe829ef).
  • New uc ctx show command prints the name of the currently active cluster context, great for shell prompts and scripts (#317).
  • SSH connections now use -o StrictHostKeyChecking=accept-new, which silently accepts host keys on first connection but still protects against key changes later (#303).
  • The STORE column in uc images is hidden when all machines use the containerd image store, which is the default for new clusters (c38d916).
  • Image push errors now include the underlying error from the unregistry proxy, which makes it much easier to tell apart a broken image push from a broken SSH tunnel (d340659).
  • ucind now accepts a UNCLOUD_CONFIG environment variable to override the config file path for development clusters (#315).
  • New ucind cluster ls command lists the local development clusters managed by ucind with their machines (#316).
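
As a sketch of the shell prompt use case for uc ctx show, the snippet below adds the active context to a bash prompt. It assumes the command prints only the context name on stdout; uc_prompt_segment is a hypothetical helper name, and the PS1 escapes are bash-specific.

```shell
# Hypothetical helper: prints " (context)" when uc is available and
# reports an active context, and prints nothing otherwise.
uc_prompt_segment() {
    command -v uc >/dev/null 2>&1 || return 0
    ctx=$(uc ctx show 2>/dev/null) || return 0
    if [ -n "$ctx" ]; then
        printf ' (%s)' "$ctx"
    fi
}

# Single quotes defer the command substitution until the prompt is drawn,
# so the segment is re-evaluated on every prompt.
PS1='\u@\h$(uc_prompt_segment) \$ '
```

Placing this in ~/.bashrc would keep the prompt in sync after switching contexts, at the cost of running uc once per prompt.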

Bug fixes

  • Fixed containerd socket auto-detection for unregistry after a machine reboot (47199c3).
  • Fixed Caddy config being regenerated on every service change because container records were serialised non-deterministically (#111).
  • Fixed SSH control socket path on WSL2 when the runtime directory doesn't exist (#319).
  • Fixed a stale SSH ControlMaster connection sometimes causing uc machine init/add to hang (fdbffbf).

Upgrade to 0.19.0

Uncloud CLI locally

To upgrade the Uncloud CLI (uc) locally:

# Homebrew (macOS, Linux)
brew upgrade uncloud

# Install script (macOS, Linux)
curl -fsS https://get.uncloud.run/install.sh | sh

Machine daemon

To upgrade the Uncloud daemon on your machines, run the following commands on each machine:

ARCH=$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/')
curl -fsSL -o uncloudd.tar.gz https://github.com/psviderski/uncloud/releases/download/v0.19.0/uncloudd_linux_${ARCH}.tar.gz
tar -xf uncloudd.tar.gz
sudo install uncloudd /usr/local/bin/uncloudd
rm uncloudd uncloudd.tar.gz
sudo systemctl restart uncloud
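
The first line of the upgrade commands maps the uname -m value to the architecture suffix used in the download URL. The sketch below wraps the same sed expression in a hypothetical release_arch function to show what it produces. Values other than x86_64 and aarch64 pass through unchanged, and whether a matching release asset exists for them is not stated in these notes.

```shell
# Hypothetical wrapper around the sed expression from the upgrade commands:
# maps a `uname -m` value to the suffix used in the release asset name.
release_arch() {
    printf '%s\n' "$1" | sed 's/x86_64/amd64/;s/aarch64/arm64/'
}

release_arch x86_64    # prints amd64
release_arch aarch64   # prints arm64
release_arch armv7l    # prints armv7l (passed through unchanged)
```

Checking the mapped value before downloading is a cheap way to catch an unsupported machine before curl fails on a missing asset.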

Changelog

  • 4d5ca42 chore: add more info about cluster connections in error for 'machine rm'
  • f60a9ff chore: check SSH TCP forwarding for dial operations and return a friendlier error
  • 3310b47 chore: do not log too noisy 'Sent log stream heartbeat.' log line for machine logs
  • d340659 chore: enrich image push errors with errors from proxy to unregistry
  • c38d916 chore: hide STORE column in 'uc images' output if all machines use containerd image store
  • f415901 chore: make experiments a separate Go module, remove unnecessary dependencies from root module
  • 3918170 chore: set x-context for website deploy
  • 7b554cd chore: style the machine reset prompt in red and fix the padding for [y/N]
  • 208a561 chore: trigger container sync on ActionHealthStatusRunning Docker event as well
  • dd3d809 chore: use a non-registry image format for website
  • 60ff088 ci: Add go build/module caching
  • 02a318c ci: Build and publish latest (nightly) binaries (#308)
  • b1be80a ci: Run nightly builds on macos runners
  • 9cfcf49 ci: Update cache key for nightly builds
  • 45cf87a feat(caddy): order local machine upstreams first in generated Caddyfile
  • 8afe523 feat(deploy): print last logs from failed pre-deploy hook
  • 303c8e4 feat(deploy): print last logs from new container when fails to become healthy
  • f44ada0 feat(logs): 'uc machine logs' to view logs from systemd services on machines (#283)
  • 56a54ab feat(logs): include hook information in log entry formatting
  • 2045819 feat(logs): update 'uc logs' command to support filtering by service/container
  • 8d023f5 feat(rtt): 'machine rtt' command to show round-trip time between machines using gossip data (#226)
  • 6409a47 feat: add uc ctx show command to print the current cluster context (#317)
  • e8111a4 feat: auto-accept only new SSH host keys using "-o StrictHostKeyChecking: accept-new" (#303)
  • dd989a2 feat: check automatically if uc can connect via Unix socket when running on cluster machine (#296)
  • 1c0d48c feat: ucind: allow overriding the config.yaml via env var (#315)
  • 7e1b91c feat: ucind: implement cluster list command (#316)
  • 9ce624f fix(logs): print logs with zero timestamps immediately to prevent indefinite stalling
  • b1897ce fix(nightly): inject correct semver 0..0-nightly-abc1234 for nightly builds
  • 6e9acef fix: Caddy config regeneration due to non-deterministic container serialisation (fixes #111)
  • 3a79aef fix: SSH control socket path in WSL2 when runtime dir doesn't exist (fixes #319)
  • f823491 fix: another attempt to fix flaky e2e tests: bump dind and get rid of incomplete cgroups fix, pre-allocate ucind machine port, allow ucind container restarts
  • fdbffbf fix: close stale ControlMaster ssh connection for machine init/add
  • 47199c3 fix: containerd socket detection and unregistry startup on machine reboot
  • 6fb68c2 fix: correctly wait for journalctl processes to not leave zombies when streaming machine logs (#325)
  • 50057c4 fix: machine logs timestamp parsing and streaming for systemd <v255
  • c888e0a fix: probe for checking TCP forwarding over SSH (fixes #321)
  • 102421d fix: tests after DinD update
  • 8384d41 lint
  • 5b5cd44 refactor(logs): encapsulate printing errors in PrintEntry
  • b852d34 refactor(logs): update log formatting for systemd services and handle merging in the client
  • 84990ad refactor(rtt): use proto Duration, add tests
  • fe829ef refactor: embed install.sh script in uc CLI to not curl | sh and version together with CLI
  • 19c13f6 refactor: move internal logs pkg from cmd to internal/cli
  • 0f5f872 test(logs): add unit tests for ParseServiceArgs function
