github maziggy/bambuddy v0.2.4b1-daily.20260426
Daily Beta Build v0.2.4b1-daily.20260426

pre-release9 hours ago

Note

This is a daily beta build (2026-04-26). It contains the latest fixes and improvements but may have undiscovered issues.

Docker users: Update by pulling the new image:

docker pull ghcr.io/maziggy/bambuddy:daily

or

docker pull maziggy/bambuddy:daily


**Tip:** Use [Watchtower](https://containrrr.dev/watchtower/) to automatically update when new daily builds are pushed.

Added

  • Per-spool category + low-stock threshold override (#729 — minimal version) — Two new fields on the spool form: a free-text Category (with autocomplete from categories already in use, so users naturally re-use "Production" instead of accidentally typing "production" / "prod") and a per-spool Low-stock threshold (%) override that defaults to the global setting if left blank. Powers the "I want to differentiate critical spools from prototype spools and alert at different thresholds" use case from the issue without taking on the full multi-tag taxonomy + auto-apply-rules + per-tag alert system the ticket originally proposed (which would have been ~5x the work for the same underlying value). Inventory page gains a Category filter chip — only renders once at least one spool carries a category, otherwise hidden so the chip row stays uncluttered. Low-stock counts in the stat-card and the "Low Stock" filter both honour the per-spool override (so a "Production" spool with override = 90% will count as low-stock at 80% remaining even when the global threshold is 20%). 50-char cap on category, 1-99% range on threshold (0 and 100 are both rejected as footguns). 9 new backend schema-validation tests covering the field defaults, partial-update behaviour, range/length rejection; 2 new frontend tests confirming the per-spool threshold pulls in spools the global threshold misses, and that the category filter chip stays hidden until at least one spool has a category. Localised across all 8 UI languages with full translations. The full multi-tag taxonomy from the original issue isn't going forward; if demand for it grows past the current 3 thumbs-up the design can layer on top of these fields without breakage.
  • Per-event ntfy priority (#990) — ntfy supports a Priority header (1=min, 2=low, 3=default, 4=high, 5=urgent) that drives sound, visibility, and push behaviour on the receiving device, but the existing notifier sent every event at the server default — so a "50% complete" ping looked identical to "print failed" or "printer offline". The Add/Edit Notification modal now renders a per-event "ntfy Priority" section (visible only when the provider type is ntfy) listing each enabled event with its own Min / Low / Default / High / Urgent dropdown; selections persist into the provider's config.event_priorities map and the backend emits a matching Priority: N header on the ntfy POST/PUT request (including the image-attachment path). Events not explicitly mapped, malformed values, and out-of-range values (0, 6, "abc", null) all fall through to ntfy's server-side default — there is no clamping, so a misconfigured value never silently sends at the wrong urgency. Test sends (no event_type context) deliberately omit the header so the test path cannot accidentally page someone at urgent priority. Existing providers without event_priorities are untouched on upgrade. Localised across all 8 UI languages with full translations (en/de/fr/it/ja/pt-BR/zh-CN/zh-TW). 6 new backend tests covering header set on mapped event, omitted on unmapped event, omitted when no event_priorities configured, omitted when event_type is missing, ignored for out-of-range / non-numeric values, and propagated through the image-attachment PUT path.
  • Long-lived camera-stream tokens for HA / Frigate / kiosks (#1108) — The existing ?token=… camera-stream tokens expire after 60 minutes which forced home-automation integrations (Home Assistant cards, Frigate, hallway kiosks) to either refresh on a cron or run with auth disabled. New self-service "Camera API Tokens" panel under Settings → API Keys (also reachable via the existing settings search box — type "camera token" / "frigate" / "home assistant") lets any user holding camera:view mint a long-lived token they can paste once and forget. Revoke uses Bambuddy's standard styled confirmation modal (no window.confirm browser default — same pattern as the rest of the app). Tokens are scoped strictly to camera streaming (no privilege escalation surface — no other endpoint accepts them), formatted bblt_<8-char-prefix>_<32-char-secret>, and stored as a pbkdf2 hash so even a DB dump can't replay them; the plaintext is shown to the user exactly once in a copy-to-clipboard modal (with a document.execCommand('copy') fallback for plain-HTTP LAN deployments where navigator.clipboard is gated by the secure-context requirement). Hard 365-day max — the issue's expire_in: 0 (never) is explicitly rejected because an irrevocable infinite token is a footgun-by-design; UI defaults to 90 days, the cap is enforced both client-side (input clamp) and server-side (validation guard). Owners can revoke their own tokens; admins additionally see an "All users" view for leak triage and can revoke anyone's. The /camera/stream?token=… auth dependency tries the existing 60-min ephemeral row first (no behaviour change for the common browser case) and falls through to the long-lived path, so the SPA's existing camera flow is unaffected. Indexed lookup_prefix keeps verify O(1) per token even on large installs — pbkdf2 only runs against the one candidate row that matches the prefix, never the whole table. New long_lived_tokens table (separate from auth_ephemeral_tokens because the lifecycle is different — user-owned, named, revocable, hashed; and separate from api_keys because that one is for global webhooks with no user FK and a different permission shape). 15 unit tests covering create-validation/scope/expiry rules, verify happy/garbage/expired/revoked/scope-mismatch/prefix-collision paths, list-by-user vs list-all, idempotent revoke; 14 integration tests covering the create-once-then-listing-hides-plaintext contract, the 365-day cap, the auth gate, owner-vs-admin revoke ownership rules, and that the long-lived token verifies through the same camera-stream auth dependency the route uses (and that revoke immediately invalidates it). 6 frontend tests covering list render, empty state, create-then-shown-once flow, days-input clamp, revoke-with-confirm, and revoke-cancelled paths. New cameraTokens.* keys across all 8 locales (English fully translated; the seven other locales seeded with English copies pending native translation, matching the project's existing flow for newly-added user-facing features).
  • Tailscale integration for virtual printers (builds on #1070 by legend813) — Opt-in per-VP Tailscale toggle brings each virtual printer into the tailnet, so it's reachable from any tailnet device over a private WireGuard tunnel without port forwarding or public exposure. When enabled, Bambuddy provisions a Let's Encrypt cert for the VP's MagicDNS hostname via tailscale cert and the MQTT/FTPS listeners serve it. Slicer-side caveat worth knowing up front: both Bambu Studio and OrcaSlicer only accept IP addresses (not hostnames) in the Add Printer dialog, so the LE cert's hostname validation doesn't apply — users still need the Bambuddy CA imported into the slicer, same as LAN mode. The practical benefit here is the private tunnel (remote access without DDNS / port forwarding / public exposure), not cert-import elimination. Default is opt-out (toggle off) so users without Tailscale don't see cert-provisioning attempts or log noise. When a user flips the toggle on a host without a working Tailscale binary, the backend returns 409 tailscale_not_available and the UI reverts + surfaces a specific toast pointing at the setup steps (install Tailscale → tailscale uptailscale set --operator=<user> → enable HTTPS in the tailnet admin console). Docker image now ships the tailscale CLI pre-installed; users wire up by uncommenting the /var/run/tailscale/tailscaled.sock volume mount in docker-compose.yml. The MagicDNS hostname is surfaced on the VP card with a copy-to-clipboard button (modern navigator.clipboard in secure contexts, document.execCommand fallback for plain-HTTP contexts with textarea cleanup in finally). Cert renewal runs daily in-process and restarts only the affected VP's TLS listeners. New i18n keys virtualPrinter.tailscaleDisabled.{title,description} + virtualPrinter.toast.{tailscaleNotAvailable,copyFailed} across all 8 locales with full translations. 3 new backend integration tests for the 409 guard, 2 unit tests for the _cancel_restart_task self-await guard, 4 unit tests for the settings-dedupe migration, and 3 new frontend tests for the clipboard fallback path. Thanks to legend813 for the original opt-out toggle PR that this was built on top of.
  • Library Trash Bin + Admin Bulk Purge + Auto-Purge (#1008) — Library files now move to a trash bin on delete instead of being hard-deleted from disk, with a configurable retention window (default 30 days) before a background sweeper permanently removes them. Admins get a new "Purge old" action on the File Manager that shows a live preview of count + total size before moving every file older than N days (with an opt-in toggle for never-printed files, on by default) into the trash in one shot. A new Auto-purge setting in Settings → File Manager runs the same purge automatically on a 24-hour cadence when enabled — files still go to Trash first so the retention window remains the safety net; default-off so existing installs don't surprise anyone. Both the per-user delete flow and the admin bulk purge go through the same trash — regular users see and manage their own trashed files; admins see everyone's. External (linked) files bypass trash and keep the original hard-delete behaviour since their bytes aren't under Bambuddy's control. New library:purge permission gates the admin operations; retention is adjustable inline on the Trash page for admins. Adds nullable deleted_at column on library_files with an index (dialect-aware migration: DATETIME on SQLite, TIMESTAMP on PostgreSQL, since raw DATETIME is SQLite-only syntax); every LibraryFile query site now routes through a new LibraryFile.active() classmethod so trashed rows can't leak into listings, print dispatch, MakerWorld dedupe, or stats. 17 new backend integration tests + 8 new frontend component/page tests; localised across all 8 UI languages. Thanks to cadtoolbox for the proposal and the follow-up answers that tightened the spec.
  • Archive Auto-Purge (#1008 follow-up) — Settings → Archives now has an auto-purge toggle plus a Purge archives now action on the Archives page header (next to Upload 3MF, mirroring File Manager's placement) that hard-deletes print archives not printed within a configurable window (default 365 days, min 7, max 10 years) with the same live-preview modal as the library purge. Reprinting an archive reuses the row and updates its completed_at, so the purge honours the most recent print completion — a two-year-old archive you reprinted yesterday is not eligible for deletion. Unlike the library trash, archives are hard-deleted: print history is a decaying timeline, so there is no trash bin intermediate; download or favourite anything you want to keep first. The sweeper runs on the same 15-minute scheduler as the library trash but throttles actual purge runs to once per 24h so a tight tick cadence doesn't churn the DB. Each purged archive goes through the existing safety-checked ArchiveService.delete_archive path so the 3MF, thumbnail, timelapse, source 3MF, F3D, and photo folder are all cleaned up together with the DB row. Gated by a new dedicated archives:purge permission (Administrators group by default, backfilled on upgrade); 9 new backend integration tests; localised across all 8 UI languages.
  • MakerWorld Integration — Paste any makerworld.com/models/… URL on the new MakerWorld sidebar page to pull the full model metadata, plate list, creator/license info, and per-plate images, then one-click Save or Save & Slice in Bambu Studio / OrcaSlicer per plate. Closes the last workflow gap for LAN-only users who still had to keep the Bambu Handy app installed solely to send MakerWorld models to their printers. Reuses the existing Bambu Cloud login token for download authentication — no separate OAuth flow, no companion browser extension, no cookie paste. LibraryFile now tracks source_type + source_url, so re-importing the same plate dedupes to the existing library entry. Search / browse-catalogue is intentionally out of scope because MakerWorld's public search endpoint isn't reachable from a server-originated request; the URL-paste flow covers the actual discovery pattern (Reddit / YouTube / shared links).
    Endpoint route (non-obvious, ~1 day of reverse engineering)Pr0zak/YASTL#51 documented that makerworld.com-hosted design-service endpoints are cookie-gated (Cloudflare WAF serves a generic "Please log in to download models" to any non-browser bearer request), but the same backend is exposed unblocked at api.bambulab.com. The working path turned out to be GET https://api.bambulab.com/v1/iot-service/api/user/profile/{profileId}?model_id={alphanumericModelId} with Authorization: Bearer <cloud_token> — a different service (iot-service, not design-service) and a different host, accepting the same bearer the user already signs in with. Response carries a 5-minute-TTL presigned S3 URL (s3.us-west-2.amazonaws.com/…?at=…&exp=…&key=…). The modelId query param is the alphanumeric identifier (e.g. US2bb73b106683e5) that only appears in the design response body, not the integer designId from the /models/{N} URL — so the import flow fetches design metadata first, reads modelId, then calls iot-service. S3 presigned URLs must be fetched with urllib.request (not httpx / curl_cffi) because the signature is computed over the exact query-string bytes and any normalising encoder breaks it with SignatureDoesNotMatch 400s (YASTL#52 describes the same issue). Every other published reverse-engineering project we evaluated (schwarztim/bambu-mcp, kata-kas/MMP) solved the gating by shipping "paste your browser cookie" flows; reusing the existing Bambu Cloud bearer is a substantially cleaner UX and the only fully-automated path.
    UI and UX features — per-plate picker with inline Save / Save & Slice in Bambu Studio / OrcaSlicer buttons, Import all to batch-import every plate sequentially, folder picker on the page (default: auto-created top-level "MakerWorld" folder), image gallery lightbox per plate (keyboard ←/→/Esc), two-column sticky layout with Recent imports sidebar (last 10 MakerWorld imports), per-plate inline follow-up actions after import (View in File Manager / Open in Bambu Studio / Open in OrcaSlicer / Remove from library), per-plate delete via the standard Bambuddy confirm modal (no browser confirm()), elapsed-time + phase label ("Resolving … 3 s", "Downloading … 18 s") during the synchronous import POST so users see progress on large 3MFs, URL-change detection that drops the preview when the pasted URL diverges from the resolved one (fixes a class of "I thought I was importing model B but got A" dedupe confusion), rich error toasts per-phase, and the slicer-open path reuses Bambuddy's existing token-embedded library download (/library/files/{id}/dl/{token}/{filename}) so the handoff works even with auth enabled. Localised across all eight UI languages.
    Security hardening — the MakerWorld description HTML is user-authored and goes through DOMPurify.sanitize() before dangerouslySetInnerHTML. <img> tags inside summaries are rewritten to route through Bambuddy's /makerworld/thumbnail proxy so the SPA's img-src 'self' data: blob: CSP stays unwidened. Thumbnail proxy now uses follow_redirects=False (the host-allowlist guarantee is only meaningful on the initial URL — a 302 to 169.254.169.254 would otherwise bypass it). The 3MF CDN fetch sends only User-Agent — the Bambu Cloud bearer is never forwarded to the CDN. S3 presigned-URL fetch uses a urllib.request opener with a no-op HTTPRedirectHandler for the same reason. Filenames from MakerWorld responses are os.path.basename'd before persisting, so a malicious name: "../../evil.3mf" cannot surface a path-traversal string into the DB / UI (on-disk storage uses a UUID filename regardless). New routes respect the MAKERWORLD_VIEW (resolve / recent-imports / status) and MAKERWORLD_IMPORT (import) permissions. SSRF guard on downloads rejects any host that isn't makerworld.bblmw.com, public-cdn.bblmw.com, or a .amazonaws.com subdomain.
    Test coverage — 46 unit tests for services/makerworld.py (header shape, API base, get_design/get_design_instances/get_profile, get_profile_download 200/401/403/404/no-token, download_3mf SSRF rejection of 4 hostile hosts, S3 path delegation, CDN path with minimal headers, size-cap, _download_s3_urllib happy/redirect/size/network paths, fetch_thumbnail with follow_redirects=False); 19 route tests (/resolve, /import with folder autocreation + explicit folder + dedupe + filename basename + profile_id response, /recent-imports with empty-list / ordering / pydantic shape / limit clamping, _canonical_url unit); 12 frontend tests (button labels, slicer-name interpolation, URL-change detection, inline post-import actions, Recent imports rendering, DOMPurify <script> strip).

Changed

  • AMS slot "Assign to inventory spool" picker now lists every spool, including RFID-tagged Bambu Lab ones (#1133) — The picker that opens from <FilamentHoverCard> / SpoolBuddy's slot-action sheet had two stacked filters that together blocked a real workflow: (1) AssignSpoolModal only listed spools whose tag_uid AND tray_uuid were both null, hiding any Bambu Lab spool that had been auto-created from RFID or scanned via SpoolBuddy NFC; (2) FilamentHoverCard rendered its inventory section (assign + unassign affordances) only when the slot's vendor was not Bambu Lab, so even if you fixed the picker the button to open it wasn't visible on a BL slot. The use case both filters blocked: a user who has a Bambu Lab spool sitting in their inventory but doesn't want to scan it via SpoolBuddy NFC each time and just wants to pick it from the list. Both gates are gone now: the modal lists every spool that isn't already taken by a different (printer / ams_id / tray_id) tuple, and the hover-card inventory section renders for every vendor including Bambu Lab. The AMS-vs-external-slot distinction in the modal also collapsed — external slots (amsId 254/255) used to be the only path that allowed picking a tagged spool, and that special-case is now redundant. Empty slots (<EmptySlotHoverCard> in Bambuddy, slotActionPicker.tray === null in SpoolBuddy) lost their assign affordance entirely: a physically empty slot has no spool to attach an inventory record to, and offering the action there only led to users assigning the wrong spool to a slot the printer hadn't actually loaded yet — assignment now requires a loaded slot. The i18n.inventory.noManualSpools key (whose copy talked specifically about "manually added spools") was renamed to inventory.noAvailableSpools with new copy ("No spools available. Add a spool to your inventory or unassign one from another slot first.") since the empty-state premise changed; localised across all 8 languages with full translations. 5 net-new frontend tests in __tests__/components/FilamentHoverCard.test.tsx (assign/unassign buttons render for vendor: 'Bambu Lab', non-BL vendors unchanged, EmptySlotHoverCard renders no assign affordance, configure button still works on empty slots) plus the existing AssignSpoolModal.test.tsx "filters out BL spools" expectation was inverted to match the new contract and the empty-state test reworked to exercise the only remaining trigger (every spool taken by another slot).
  • Inventory: "Delete Tag" button renamed to "Clear RFID Tag" (#729 follow-up) — The reporter mistook the button for a taxonomy-tag delete (it actually clears the RFID tag UID/UUID off the spool record so the row can be re-attached to a different physical spool). Renaming it to "Clear RFID Tag" + the success toast to "RFID tag cleared" removes the ambiguity. No behaviour change. Localised across all 8 UI languages with full translations.
  • Nozzle icon on the dual-nozzle status card (#1115) — the dual-nozzle active-extruder card on the printer status bar was the only card in that row without a theme icon (the Nozzle/Bed/Chamber temperature cards all carry a thermometer icon), which left the row looking visually uneven on H2D / H2S / H2C. Adds a small schematic nozzle icon (filament body + heater block + tip) above the L/R diameter labels, styled in amber-400 to match the card's active-extruder accent. SVG design contributed by m4rtini2.
  • Settings page: permission-gated instead of admin-only — the Settings sidebar entry has always been visible to any user holding settings:read, but the route guard required admin role, so a non-admin with settings:read would see the entry, click it, and get silently redirected back to the dashboard. The route guard now matches the sidebar: any user with settings:read can open the page, and the individual tabs / cards continue to enforce their own per-feature permissions (users:read, groups:update, oidc:*, etc. — many of them admin-only, some not). Group editor routes moved to permission-based guards too (groups:create for /groups/new, groups:update for /groups/:id/edit), so permission delegation works end-to-end. Admins retain full access since admins implicitly hold every permission.

Added

  • Per-request trace ID column on every log line, plumbed through HTTP access log + application logs + response headers — Builds on the new uvicorn-access-log-into-bambuddy.log change below: the access line tells you who called an endpoint, but until now there was no way to tie that line to the application records emitted on the server side while handling that request. A new FastAPI middleware (trace_id_middleware in main.py, sourced from backend.app.core.trace) stamps each request with a fresh 8-char hex ID (or honours a sane inbound X-Trace-Id header for cross-system correlation), stores it in a ContextVar so any code in the request's call stack can read it, echoes it on the response as X-Trace-Id, and a new TraceIDFilter injects it into every LogRecord so the format string [%(trace_id)s] resolves to the right ID for the right request. ContextVars (rather than request.state) are the right plumbing here because asyncio copies the current context into every asyncio.create_task, so background work spawned from inside a request inherits the trace ID without explicit threading; the logging filter has no access to the FastAPI request object regardless. Records emitted outside any request scope (startup, MQTT callbacks, scheduler) get a stable - placeholder so the column stays visually aligned and missing values are obvious in grep. Inbound X-Trace-Id is hard-validated against a strict whitelist ([A-Za-z0-9_-]+, max 64 chars) before being honoured — a hostile or buggy caller cannot smuggle log-injection payloads (newlines, control chars, megabyte blobs) into bambuddy.log via the trace-ID column; values that fail the gate silently trigger a freshly minted server-side ID rather than failing the request. Middleware is decorated AFTER auth_middleware on purpose: Starlette stacks app.middleware decorators LIFO so the last-decorated runs first inbound, making trace stamp the OUTERMOST layer — auth log lines and every record emitted on the way down to and back from the route handler all carry the same ID. Output now looks like 2026-04-26 09:51:39,152 INFO [uvicorn.access] [a4f3b1e7] 192.168.1.42:54812 - "POST /api/v1/printers/1/print/stop HTTP/1.1" 200 paired with the route handler's 2026-04-26 09:51:39,158 INFO [bambu_mqtt] [a4f3b1e7] [SERIAL] Sent stop print command — one grep a4f3b1e7 away from the full causality chain. 30 new tests across tests/unit/test_trace.py (placeholder when no request scope, filter copies ContextVar value onto records, ID propagates into spawned tasks via asyncio context copy, concurrent requests don't leak IDs into each other, generator produces unique hex IDs, hostile payloads rejected by validator, max-length boundary, dash/underscore variants accepted) plus tests/integration/test_trace_middleware.py (X-Trace-Id header echoed on response, body and header IDs match, each request gets a unique ID, generator format stays short hex, safe inbound IDs honoured, hostile inbound IDs replaced, overlong inbound IDs replaced, ContextVar reset cleanly after request).

Fixed

  • Reprint-from-archive failed with 0500_4003 SD R/W errors after a stuck dispatch, fixable only by restarting the container (#1136) — Reported by smandon: reprinting from archives sometimes fails immediately with MicroSD R/W exception errors, with the printer's MQTT push referencing a 3MF file from a different unrelated archive (WARIO_Wall_decor_-_NO_AMS.3mf while the user was actually trying to print Cable_Organiser_Cable_Clip.3mf). Once it starts happening, every subsequent reprint hits the same error until the container is restarted. Root cause traced from his support package log to paho-mqtt's client-side QoS 1 queue: when the printer's command channel goes half-broken (telemetry still flowing, publishes silently dropped — same #887/#936 pattern), Bambuddy's 15s dispatch deadline expires (background_dispatch.py:993) and calls force_reconnect_stale_session(). That function was force-closing the underlying socket so paho's auto-reconnect would kick in — but the same mqtt.Client instance, same client_id, and same in-process QoS 1 queue stayed alive across the reconnect. Any unacked publish from the broken session — typically the just-sent project_file for the new archive — got replayed verbatim on the new connection. And because the in-process queue accumulates across multiple stuck dispatches within one Python process, by the second or third stuck reprint there were several stale project_file/resume/stop/clean_print_error commands queued up and replaying together. The printer received the flood, tried to load whichever stale path the firmware latched onto last, found a file that no longer existed on its SD card → 0500_4003. Container restart was the only thing that fixed it because it was the only thing that wiped paho's in-process queue. Replaced the socket-close with a context-aware reconnect: force_reconnect_stale_session() and check_staleness() now go through a routing helper _reset_client_for_reconnect() that picks the right teardown strategy based on caller context. Async-context callers (the dispatch deadline path — background_dispatch.py:993 — which is the actual #1136 trigger, plus FastAPI route handlers via check_staleness) get the hard-reset path: client.disconnect() (broker sees DISCONNECT and drops the session immediately, since clean_session=True), client.loop_stop() (kills the paho network thread, taking its QoS 1 queue with it), nulls out self._client, and calls self.connect() to construct a fresh mqtt.Client with an incremented client_id. New connection starts genuinely empty, no replay possible. Paho-network-thread callers (the developer-mode probe and ams_filament_setting zombie detection inside _update_state, lines ~2604 and ~2623) keep the socket-close fallback — calling loop_stop() from inside the network thread would self-join and deadlock, so the safe pattern there remains "close the socket and let paho's own loop detect it and auto-reconnect on the same client". Theoretical queue replay is still possible on those paths but #1136 specifically traced through the dispatch path, and the legacy socket-close has been battle-tested for the zombie paths since #887. Routing decision is made via asyncio.get_running_loop() — paho's callback thread has no loop, every legitimate hard-reset caller does. 7 regression tests across two new test classes: TestForceReconnectRouting (3 tests pinning the sync-context → socket-close fallback, async-context → hard-reset path with mock-stubbed connect(), and the state-disconnected broadcast firing once on either path) and TestHardResetClientDirect (3 tests pinning the helper directly: old client receives disconnect() + loop_stop(), _client reference cleared, failing disconnect() doesn't propagate so the await chain in background_dispatch.py doesn't break). Existing TestZombieSessionDetection::test_two_timeouts_force_reconnect and TestDeveloperModeProbeTimeout::test_second_timeout_forces_reconnect updated to assert the socket-close path (matching their paho-thread context), preserving the legacy contract. All 2179 backend unit tests pass. Thanks to smandon for the precise reproduction logs that made this diagnosable from a single support package.
  • logs/bambuddy.log was silently dropping records from named child loggers — When the trace-ID column was added to the log format (%(trace_id)s), the TraceIDFilter was attached to the root logger. Per Python's logging semantics, a filter on a Logger only fires for records that originate at that logger — records propagated up from child loggers (every backend.app.* module — most of the application) never trigger it. Result: child-logger records arrived at the file handler with no trace_id attribute, the formatter raised KeyError: 'trace_id', and Handler.handleError printed to stderr and dropped the record. bambuddy.log ended up with INFO/DEBUG records appearing only "partially" — exactly the records emitted directly through logging.info(...) (root logger) or uvicorn.access (which had its own explicit filter attachment) made it; everything else was discarded. Moved _trace_id_filter from root_logger.addFilter() to console_handler.addFilter() + file_handler.addFilter() — handler-level filters fire for every record the handler receives, regardless of which logger emitted it. The filter's own docstring already said "Attach to the file handler (or any handler whose format string references %(trace_id)s)" — the implementation was just wrong. New regression test in test_trace.py::TestFilterMustBeAttachedToHandlerNotLogger pins the contract: a child logger emits a record, propagation reaches the handler-level filter, the formatter sees a populated trace_id field, and the line is written. Existing 23 trace tests keep passing unchanged. Restart-shutdown recursion in journalctl was also a side effect — every shutdown log line was raising the formatter ValueError, which got caught and logged… raising again, forever, until the lifespan exit unwound; the new placement breaks the cycle since records now format cleanly.
  • User-cancelled prints surfaced as "1 problem" on the printer card AND were archived as "Layer shift" failures — Cancelling a print left the printer card stuck on a permanent "1 problem" badge, and stamped the resulting archive entry with failure_reason="Layer shift" — a fake firmware-fault label in the print history. Affects every Bambu printer that emits a cancel-sequence HMS — the user surfaced it on an H2D where the firmware emits both 0300_400C ("The task was canceled.") and the not-in-the-public-wiki 0C00_001B echo as part of the cancel sequence. Four compounding causes, all fixed together. (1) The direct stop endpoint never set the user-stopped flag. POST /printers/{id}/print/stop (backend/app/api/routes/printers.py) sent the MQTT stop command but didn't call mark_printer_stopped_by_user(), so when the printer reported "failed" via MQTT the on_print_complete override (main.py:2558) couldn't reclassify it as "cancelled". The same flag was being set from POST /print-queue/{id}/stop, which is why queue-driven cancels mostly worked but printer-card cancels didn't. The direct endpoint now mirrors the queue path. (2) The HMS → failure_reason heuristic was way too broad. Old code mapped any module 0x0C HMS to "Layer shift" (main.py:3072), but module 0x0C is "Motion Controller" — covers cameras, visual markers, the BirdsEye assembly and the cancel-sequence HMS the firmware emits during a user-cancel. Real layer-shift codes actually live in module 0x03 (0300_4057, 0300_4068, 0300_800C). The same module-only heuristic was also being used to auto-label "Filament runout" (any 0x07) and "Clogged nozzle" (any 0x05), so the same false-positive class existed on those branches. Replaced the broad module heuristic with a curated short-code → reason map (_HMS_FAILURE_REASONS, 23 specific HMS codes from the real wiki); anything not in that map leaves failure_reason=None rather than guessing. Also extracted the logic into a pure function derive_failure_reason(status, hms_errors) so it's unit-testable without the full archive pipeline. (3) Cancel-echo HMS codes were polluting state.hms_errors. Even with (1) and (2) fixed, the printer card kept showing "1 problem" because the firmware kept reporting 0300_400C ("The task was canceled.") in subsequent MQTT pushes — and bambu_mqtt._update_state was happily appending it to state.hms_errors, where the frontend's filterKnownHMSErrors accepted it as a valid known code (it IS in ERROR_DESCRIPTIONS — just describing a user action, not a fault). Added a parse-time filter (_HMS_USER_ACTION_CODES = {"0300_400C", "0500_400E"}) that drops these short codes before they ever enter the state, mirroring the suppression main.py:_HMS_NOTIFICATION_SUPPRESS was already doing for notifications. The card pip, the "X problem" badge, the modal, and any other consumer of hms_errors all get consistent behavior automatically. (4) Frontend counted gcode_state="FAILED" without HMS as a problem. Even with (1)–(3) fixed, the printer card still showed "1 problem" because the H2D's gcode_state sits at FAILED after a cancel until the next print starts, and PrintersPage.tsx:940 (header badge) + classifyPrinterStatus (line 1028) + BulkPrinterToolbar.tsx:102 all unconditionally bumped the error bucket on case 'FAILED'. Real failures attach an HMS error; user-cancels don't — so FAILED-without-HMS now buckets as finished (same operator meaning: print ended, plate may need clearing) and only escalates to error when there's an active known HMS. Same change applied across all three call sites for consistency. 20 regression tests total across three files: test_failure_reason_derivation.py (11 tests pinning the cancel-sequence HMS pair to NOT yield "Layer shift", unknown module-0x0C → None, real layer-shift/runout/clog codes still classify, int-vs-hex code-format tolerance, status="cancelled" symmetric with "aborted"), test_bambu_mqtt.py::TestHMSUserActionFiltering (4 tests pinning 0300_400C/0500_400E filtering on both hms[] and print_error parse paths, real layer-shift 0300_4057 still passes through, mid-cancel concurrent real-fault keeps the real one and drops only the echo), and PrintersPageBucketing.test.ts (5 tests pinning FAILED-without-HMS → finished, FAILED-with-known-HMS → error, FAILED-with-only-unknown-HMS → finished, FINISH baseline unchanged, disconnected stays offline). Existing stale state on running printers clears on the next MQTT push that includes an hms key (printer firmware re-sends the list, parser filters it out, badge clears). Users with a stuck badge can also click the HMS modal "Clear" button to clear immediately via MQTT command.
  • Settings → API Keys: deleted key stayed on screen until manual reload — the delete-key mutation marked the ['api-keys'] query stale via queryClient.invalidateQueries, which in v5 should also refetch active queries — but in practice the deleted row remained visible until the user reloaded the page. Switched the mutation's onSuccess to queryClient.setQueryData so the deleted key is filtered out of the cache synchronously the moment the API confirms; no refetch round-trip required, no chance for an invalidation→refetch race to leave the UI stale. Create-path keeps invalidateQueries since that one was working correctly. New SettingsPage.test.tsx test "removes a deleted key from the list without a page reload" pins the synchronous-removal contract.
  • SpoolBuddy AMS page: re-assigning a just-unassigned spool sometimes showed an empty picker (#1133 follow-up) — Reported live during the rollout of the #1133 picker change: unassigning a Bambu PLA Metal spool from SpoolBuddy and re-opening the picker showed "no spools available" — the just-freed spool was missing. The investigation surfaced four distinct causes that all needed addressing for the picker to stay correct, plus a deployment-side cause that prevented any of the fixes from reaching the live kiosk. (1) Dual cache-key shapes for spool assignments: SpoolBuddyAmsPage keys by ['spool-assignments', selectedPrinterId] while the shared AssignSpoolModal keys by ['spool-assignments'], and SpoolBuddyAmsPage.unassignMutation.onSuccess only invalidated the printerId-keyed one, leaving the modal's unkeyed cache stale. Both invalidate calls (mutation success + modal-close handler) now hit both keys; collapsing the two key shapes into one is intentionally deferred since the dual-key pattern predates this change and shows up in 6 components. (2) Toggle wasn't a real escape hatch: the existing "Show all spools" toggle's label said it would help when a spool was hidden but only bypassed the material/profile filter, not the assignment-elsewhere gate. It now bypasses BOTH filters, making it a real escape hatch (the backend's assign_spool is upsert-per-(printer/ams/tray), so picking a currently-taken spool just creates a second assignment row — foot-gun for normal flows but exactly the recovery path this toggle is for). (3) Cross-component cache pollution: ['inventory-spools'] was used as a query key by 5+ components calling getSpools() with different includeArchived arguments — React Query treated them as one query and served whichever response landed first, so a SpoolBuddy component priming the cache with getSpools(false) could hide spools from the modal that wasn't yet present at that fetch time. The modal now uses its own dedicated key ['inventory-spools', 'assign-modal'] + getSpools(true) so it's never at the mercy of someone else's cache state. (4) Empty-state had no diagnostic surface: when the picker showed "No spools available" there was no way to tell why — was the fetch empty? Were spools archived? All assigned elsewhere? A small counter X fetched · Y archived · Z assigned to other slots now renders in the empty state so future reports of this kind are immediately answerable from a screenshot rather than requiring devtools digging. (5) Browser holding stale JS forever: index.html was being served without Cache-Control headers, so Chromium's heuristic-cache freshness window kept the OLD HTML "fresh" for days across browser restarts. The OLD HTML referenced an OLD content-hashed bundle, which was also still in disk cache, so the kiosk kept running pre-deploy JS no matter how many times its Chromium was restarted or cache-cleared — the persistent profile would re-seed the cache from disk on next start. Backend now sends Cache-Control: no-cache, must-revalidate on both / and the SPA catch-all that serve index.html; service worker CACHE_NAME bumped from bambuddy-v25 to bambuddy-v26 so any client that does eventually re-fetch sw.js invalidates its CacheStorage; and spoolbuddy/install/install.sh now generates the kiosk launcher with --user-data-dir=/tmp/spoolbuddy-kiosk-userdata plus a pre-launch rm -rf so every kiosk restart starts from a clean slate (the kiosk has no per-user state worth persisting — auth token is in the URL query, not a stored cookie). 6 net-new tests across AssignSpoolModal.test.tsx (toggle escape-hatch behavior) and tests/integration/test_static_html_cache_headers.py (Cache-Control directive on root + SPA catch-all routes, no leak onto API routes). Reproduced end-to-end on an H2D + dual AMS + SpoolBuddy display: unassign Bambu PLA Metal Iridium Gold Metallic from slot B4 → reopen picker → spool now visible without browser intervention.
  • Plate-clear button stayed visible after the API cleared awaiting_plate_clear outside the printer-card click path (#1128) — awaiting_plate_clear is a Bambuddy-side flag, not a printer-side one, so toggling it does not produce an MQTT push from the printer. Commit 4e86e8c added the flag to the printer_status payload so MQTT-driven broadcasts (e.g. when a print finishes and on_print_complete sets the flag to True alongside a state transition to FINISH) carry it correctly. The reverse transition didn't get the same treatment: POST /printers/{id}/clear-plate mutated PrinterManager._awaiting_plate_clear and persisted to the DB, but emitted no printer_status WebSocket update — and the in-main.py status-change broadcaster's status_key deduplication intentionally excludes Bambuddy-side flags, so even a coincidentally-arriving MQTT push wouldn't reflect the change. The "Mark plate as cleared" button on the printer card disappeared "immediately" after a click only because the React Query cache was being optimistically updated client-side; clearing the flag through any other route (an admin script, a second tab, an automation hitting the endpoint directly, the scheduler at print_scheduler.py:1844 when dispatching the next queued print) silently left every UI subscriber but the originating tab stale until a coincidental status refresh. Centralised the broadcast in PrinterManager.set_awaiting_plate_clear itself rather than at each call site, so every current AND future caller is covered without remembering to wire it up: a new _broadcast_status_change(printer_id) private coroutine is scheduled alongside the existing _persist_awaiting_plate_clear whenever the flag flips under a running event loop. The broadcast lazy-imports ws_manager to keep printer_manager.py clean of application-layer infra at module-import time, short-circuits when get_status returns None (printer disconnected — the next reconnect produces a fresh push anyway), and swallows ws_manager.send_printer_status failures so the persistence path can complete even if the WS layer is temporarily unavailable. The same hook is now in place for any other Bambuddy-side flag that gets added to printer_state_to_dict later — they'll all need to broadcast their own changes for the same reason. 8 new regression tests in test_printer_manager_status_broadcast.py: schedules-on-True/False/loop-running/no-loop/loop-stopped contracts, _broadcast_status_change happy path with payload assertion, skip-when-no-state, swallow-WS-errors, and an end-to-end live-loop test that fires set_awaiting_plate_clear(False) and asserts a broadcast lands with awaiting_plate_clear: false in the payload. Existing 24 tests in test_scheduler_clear_plate.py continue to pass unchanged because they instantiate PrinterManager() without attaching a loop (sync unit-test path) — the new _schedule_async call short-circuits on the same loop check the existing persistence call already used. Thanks to EdwardChamberlain for the precise root-cause analysis (down to the exact line and the suggested ws_manager.send_printer_status() fix).
  • Uvicorn HTTP access log was missing from bambuddy.log, leaving rogue server-state changes untraceable — When an HTTP endpoint that mutates server state fires unexpectedly (the canonical example: a print spontaneously stopping mid-job because something hit POST /printers/{id}/print/stop), the only on-disk trail was Bambuddy's own application log — which by design only records the outbound MQTT publish (Sent stop print command), not the inbound HTTP call that triggered it. The result was an unsolvable mystery on 2026-04-26: prints stopping with no preceding Bambuddy-side log line, no way to identify the caller, and the rotated container stdout already gone by the time the support pack was generated. Root cause: uvicorn ships its access logger with propagate=False by default, so the existing RotatingFileHandler attached to root never received those records. main.py now attaches the same file handler directly to logging.getLogger("uvicorn.access") and applies a new WriteRequestsOnlyFilter (backend/app/core/logging_filters.py) that keeps POST / PUT / PATCH / DELETE and drops GET / HEAD / OPTIONS. Status polls, camera streams, snapshot fetches, websocket upgrades, and CORS preflights account for the bulk of access traffic on a running install and none of them can change server state on their own — dropping them keeps bambuddy.log focused on lines that matter for incident triage without churning the 5 MB rotation window faster than it's useful. Filter anchors on the " +verb+ pattern uvicorn's format string guarantees, so a literal "POST" substring inside a URL (e.g. GET /api/posts/POST_123) cannot false-match. The filter lives in its own module so the test suite can import it without pulling in main.py's entire startup graph. 13 new tests in test_logging_filters.py cover all four write verbs being kept, GET/HEAD/OPTIONS being dropped, two URL-contains-verb-substring false-match guards, empty/unrelated-line/idempotency edge cases. Output now looks like 2026-04-26 09:23:14,690 INFO [uvicorn.access] 192.168.1.42:54812 - "POST /api/v1/printers/1/print/stop HTTP/1.1" 200 — one grep "POST.*stop" away from "who triggered this".
  • Spool auto-assign hit IntegrityError on Postgres when AMS pushes arrived in quick succession — Bambu MQTT can deliver two ams_data push frames for the same printer ~30 ms apart (observed on H2D + dual AMS at K-profile-load / RFID-read boundaries). Each frame triggers on_ams_change in backend/app/main.py, whose auto-assign block reads (printer_id, ams_id, tray_id), decides "no existing assignment", and INSERTs via auto_assign_spool — and the two callbacks raced in their respective sessions, both deciding to insert, with the second commit losing on spool_assignment_printer_id_ams_id_tray_id_key. SQLite's WAL serial-write semantics had been silently swallowing the race for ~7 weeks since the spool-assignment feature shipped (latent in ec82092b); when optional Postgres support landed in 610431d6 and asyncpg started allowing true concurrent transactions, it surfaced as WARNING [main] RFID spool auto-assign failed: ... duplicate key value violates unique constraint ...; DETAIL: Key (printer_id, ams_id, tray_id)=(1, 0, 0) already exists. Added a per-printer asyncio.Lock (_ams_assignment_locks keyed by printer_id) wrapping the auto-assign critical section so two callbacks for the same printer serialise — by the time the second one's session runs select(SpoolAssignment).where(...), the first's commit is visible and the early-return "existing assignment" branch fires instead of a duplicate INSERT. The Spoolman sync block further down in the same callback intentionally stays OUTSIDE the lock — it's network-bound and idempotent, so serialising it would block subsequent AMS callbacks for the duration of a remote roundtrip. Per-printer scope keeps unrelated printers fully parallel: one printer's slow assignment never blocks another's. The auto-unlink block above the assign block isn't wrapped because its DELETE/UPDATE operations don't have the same constraint surface; the assign-block lock is sufficient because the second callback's select will see the first's committed state. 5 new regression tests in test_ams_assignment_lock.py cover same-printer-same-lock identity, different-printers-different-lock isolation, second acquirer waits for first inside the lock (proves serialisation), different printers run truly in parallel under a held lock (proves per-printer scope), and an auto-cleanup fixture resets the module-level dict between tests so cross-test loop affinity bugs can't surface.
  • Camera TLS proxy logged "Unhandled exception in client_connected_cb" when ffmpeg dropped its half of the connection mid-stream under uvloop — The bidirectional forwarders inside services/camera.py::create_tls_proxy._handle (the OpenSSL TLS shim added in #661 so Bambu's RTSPS handshake works around Debian GnuTLS hardening) caught (ConnectionError, OSError, asyncio.CancelledError) on writes, but uvloop's UVStream.write raises a plain RuntimeError from UVHandle._ensure_alive when the underlying handle is already closed. asyncio's default selector loop reports the same situation as ConnectionResetError, so the bug only surfaced on uvloop deployments — and only at the moment the client (typically ffmpeg or a snapshot-capture subprocess) tore down its socket while the proxy was mid-flush. The RuntimeError slipped past the except tuple, escaped the forwarder coroutine, and asyncio's client_connected_cb task-exception handler logged a noisy multi-line traceback ending in RuntimeError: unable to perform operation on <TCPTransport closed=True ...>; the handler is closed. Added RuntimeError to the except tuple in both _fwd_to_server and _fwd_to_client (the latter being the actual frame in the bug report — server→client is where buffered TLS chunks land after the client has gone). The forwarders are intentionally fire-and-forget on tear-down; once either peer drops, both halves of the proxy should exit quietly and the existing dst.close() in the finally block already handles cleanup. No functional regression possible — the connection is already dead by the time the exception fires; this only changes whether asyncio logs an "Unhandled exception" trace for it. 2 new regression contract tests in test_camera_tls_proxy.py use inspect.getsource to assert both forwarder closures' except clauses include RuntimeError, since the closures are nested inside _handle and extracting them just for testability would require a pure-cosmetic refactor of the proxy.
  • Background-dispatch reported "Print started successfully" when the printer never actually transitioned (#1134, follow-up to #1042) — The int32 task_id modulo fix that was the original root cause of #1042 is verified working in the reporter's most recent support pack (the published task_id values are well below 2^31-1 and match the int(time.time() * 1000) % 2_147_483_647 formula exactly). The remaining residual — "the UI reports despatch success which is slightly misleading" — was a real second bug class: the post-dispatch watchdog _verify_print_response in services/background_dispatch.py was fire-and-forget. It would correctly detect that the printer never transitioned (e.g. P1S sitting in gcode_state: FAILED with HMS 0300_400C "task was canceled", a half-broken MQTT session, an SD card error, or any other pre-print blocker), log a did not respond to print command within 15s warning, force-reconnect the MQTT session — and then return without touching the dispatch job state. The dispatch job had already been marked successful on the optimistic MQTT-publish-acknowledged path, so the UI carried on showing "Print started successfully" while the printer sat idle. The watchdog now returns a bool and is awaited inline by both call sites (_run_reprint_archive at line 687, _run_print_library_file at line 860); on False (timeout) the call sites raise a RuntimeError carrying a user-actionable message ("Printer did not acknowledge print command — state still {pre_state}. Check the printer for a pending error (HMS code, plate-clear prompt, SD card) and try again."), which routes through the existing _mark_job_finished(failed=True, …) path so the dispatch UI shows a real failure toast and the library-file flow's freshly-created archive is db.rollback()'d (no orphan rows for prints that never started). The watchdog now also accepts subtask_id advancing past the captured pre_subtask_id as a definitive "command landed" signal — same as the queue-side watchdog at print_scheduler.py:1992 (#1078) — so slow H2D FINISH→PREPARE transitions (~50 s observed) don't false-fail when the printer has clearly accepted the project_file but is still in FINISH. Default timeout raised from 15 s to 90 s to match the queue-side watchdog (#967 / #1078) and give the same headroom on both dispatch paths. Brief mid-window MQTT disconnects (get_status() is None for one tick) now keep polling instead of immediately failing — matches what the queue watchdog already does and avoids false-failing on transient telemetry gaps. The existing force_reconnect_stale_session recovery is preserved on the timeout path. 8 new regression tests in test_background_dispatch_watchdog.py cover state-change pickup, subtask_id-change pickup with state still FINISH (the H2D case), neither-signal-changed timeout + force-reconnect, pre_subtask_id=None backwards-compat, post-dispatch subtask_id=None not counting as a change (avoids false-pass on transient reconnect), brief disconnect not short-circuiting the window, persistent disconnect for the full window returning False, and a contract test that the default timeout is 90 s. Thanks to EdwardChamberlain for the detailed retest with logs that pinpointed the watchdog's no-propagation gap.
  • Bambu RFID auto-match created duplicate inventory rows for Quick-Add and non-Bambu-branded spools (#918) — find_matching_untagged_spool is supposed to attach a Bambu RFID UID to a pre-existing manually-logged spool of the same material/color so users who log inventory before scanning don't end up with a duplicate row on first AMS read. Two bugs in the matcher meant it almost never worked for the actual reporting workflow: (1) the subtype filter was strict — when the AMS tray reports tray_sub_brands="PLA Basic" the matcher required Spool.subtype = 'Basic' exactly, so any Quick-Add row (Quick-Add only requires material, leaving subtype=NULL) was excluded and duplicated on first AMS read. (2) the docstring claimed it filtered on brand but the WHERE clause didn't, so a same-color Polymaker untagged spool would silently acquire a Bambu Lab tray UUID, leaving the user with brand="Polymaker" but a Bambu UUID — silent data corruption. Both bugs are addressed in the same query: subtype now prefers an exact match but accepts a NULL-subtype row as fallback (with a CASE in ORDER BY so an exact match still wins when both exist), and brand is now restricted to "contains 'bambu' (case-insensitive)" or NULL — matching 'Bambu' (the form's DEFAULT_BRANDS value), 'Bambu Lab' (the catalog value), 'BambuLab', 'bambu lab', etc., while rejecting any explicitly-named third-party brand. 6 new regression tests in test_spool_tag_matcher.py cover the NULL-subtype fallback, exact-subtype-wins-over-NULL ordering, non-Bambu brand rejection, NULL brand acceptance, all four Bambu brand spelling variants, and the full Quick-Add scenario (brand=NULL + subtype=NULL). The broader UI proposals in #918 (manual override / merge / disambiguation prompt) are intentionally out of scope — once the matcher works, the duplicate-on-RFID complaint that motivated those proposals goes away. Thanks to ViridityCorn for the report and pointing at the right function, and to Arn0uDz for confirming with a 20-spool repro.
  • Swagger UI link in Settings → API Keys rendered a blank page — the global CSP applied by security_headers_middleware set script-src 'self' and style-src 'self' 'unsafe-inline' https://fonts.googleapis.com, which blocked both the inline <script> that boots Swagger and the cdn.jsdelivr.net URL that ships swagger-ui-bundle.js / swagger-ui.css. FastAPI's /docs page therefore loaded a 1 KB shell with no JS executed, leaving an empty white page. The middleware now emits a docs-scoped CSP for /docs, /redoc, and /docs/oauth2-redirect that allows https://cdn.jsdelivr.net for scripts + styles, the FastAPI/Redoc favicon hosts for images, and 'unsafe-inline' for the Swagger boot script — every other route keeps the unchanged stricter SPA policy.
  • Camera stream second viewer fails / kicks the first off (#1089) — Most Bambu Lab printers only allow one concurrent camera connection (RTSP socket on X1/H2/P2, port-6000 chamber-image socket on A1/P1), but GET /printers/{id}/camera/stream opened a fresh upstream per viewer keyed on a per-request stream_id. Two browser tabs / two dashboard cards → the second viewer either failed silently or kicked the first one off. New services/camera_fanout.py::MjpegBroadcaster owns a single upstream per printer and fans pre-formatted MJPEG chunks out to N subscriber queues; new viewers tap the existing connection. When the last subscriber leaves, the upstream stays alive for a 5 s grace window so a tab refresh or "open in new tab" doesn't pay an ffmpeg/RTSP reconnect, then tears down cleanly. Per-subscriber queues are bounded (depth 4) so a slow viewer drops frames for itself rather than blocking the broadcaster — live video, old frames have no value. Stop endpoint and app-shutdown both call into the broadcaster's force-shutdown path so subscribers wake up via an upstream-gone sentinel instead of hanging on queue.get(). External-camera path is unchanged (user-supplied MJPEG/RTSP servers handle multi-viewer themselves). The upstream uses a deterministic {printer_id}-fanout stream id so every existing prefix-match in cleanup_orphaned_streams, camera_status, the snapshot fall-through in main.py, and the stop endpoint continues to find it without changes. Two follow-up correctness fixes from the audit pass: (1) _stream_start_times[printer_id] is now set with setdefault() so /camera/status reports the SHARED upstream's age — previously each new viewer overwrote it, making stream_uptime jump backward whenever a second viewer attached; (2) the route now retries subscribe() once on RuntimeError to close a tiny race where the grace teardown can flip the broadcaster to stopped between the registry lookup and the subscribe call (the retry forces the registry to mint a fresh broadcaster). Detach log line shows the post-unsubscribe count returned atomically by unsubscribe() — no more two viewers leaving simultaneously both reporting subscribers=0. Permission gates unchanged: /camera/stream still requires the existing token (minted by POST /camera/stream-token with CAMERA_VIEW); /camera/stop still requires CAMERA_VIEW; the broadcaster is internal infra with no FastAPI surface. 13 unit tests for the broadcaster (single subscriber, multi-subscriber-shares-one-pump, slow-subscriber-doesn't-block-fast, grace-window teardown, grace-cancelled-on-rejoin, force-shutdown sentinel, iter_subscriber exits on upstream-gone and on client-disconnect, registry replaces stopped broadcasters, subscribe() raises on stopped broadcaster, unsubscribe() returns post-removal count atomically across concurrent leavers, double-unsubscribe is idempotent, and the route's force-shutdown-then-fresh-subscribe retry path) plus 2 new integration tests on the stop endpoint covering the deterministic fan-out stream id and the shutdown_broadcaster wiring. Thanks to swheettaos for the diagnosis and broadcaster sketch.
  • Uploads to writable external folders silently landed in internal storage (#1112) — LibraryFolder has an external_readonly flag, so the model already distinguishes writable from read-only external mounts, but POST /library/files rejected only the read-only branch and then unconditionally wrote to get_library_files_dir() with a UUID-scoped filename. The resulting LibraryFile row linked back to the external folder via folder_id, so the file showed up in the Bambuddy UI and could be printed, but the bytes physically lived in archive/library/files/ and never touched the mount — invisible from any other machine accessing the same NAS/SMB share. New _resolve_upload_destination() helper detects writable external targets and writes through to <external_path>/<filename> (keeping the original filename so the file is recognisable on the mount), with guards for missing/inaccessible path (400), non-writable mount (400), pre-existing filename on the mount (409 — no silent overwrite; the user is expected to rename and retry, matching how scan treats external files as externally-owned bytes), and a resolve + relative_to path-traversal guard on the joined destination. DB row now matches what scan produces: is_external=True, file_path=<absolute external path>, so the existing download / delete / dedupe paths work unchanged (to_absolute_path already fast-paths is_absolute() inputs, and external-file deletion already bypasses trash and only drops the DB row + internal thumbnail). POST /library/files/extract-zip is now rejected against any external folder (not just read-only) with a clear "extract the ZIP on the external mount and run Scan" message — the nested-subfolder creation path would need to mkdir on the mount and create matching is_external=True LibraryFolder rows, which is a separate design round, and the Scan flow already handles that shape. 7 new integration tests cover: bytes land on the mount; DB row has is_external=True + absolute file_path; filename collision → 409 with prior bytes preserved; vanished external path → 400; path-traversal filename never escapes the external dir; extract-zip into writable external rejected with the Scan hint; root uploads unchanged.
  • Queue item stuck at "printing" when print failed before reaching RUNNING (#1111) — Dispatching a file sliced for the wrong nozzle size (or any other pre-print error: AMS fault, wrong plate, nozzle not installed, etc.) left the queue item stuck at status="printing" forever, blocking every subsequent pending item for that printer (check_queue seeds busy_printers from any row in 'printing' state and skips further dispatches for those printer IDs). Completion detection in BambuMQTTClient._process_message required the print to have reached RUNNING — either via _previous_gcode_state == "RUNNING" or the _was_running fallback — but a nozzle-mismatch failure transitions the printer IDLE → PREPARE → FAILED without ever entering RUNNING, so neither branch matched and on_print_complete never fired. The diagnostic log line at bambu_mqtt.py:2690 ("State is FAILED but completion NOT triggered: prev=PREPARE, was_running=False") confirmed the path. Completion now also fires on FAILED from a pre-print state (PREPARE or SLICING) — restricted to those two so a stale FAILED on first connection (prev=None) still can't accidentally advance an unrelated queue item. Additionally, when a queue item transitions to failed the handler in main.py now populates error_message from the printer's current HMS error list, rendered via the existing backend/app/services/hms_errors.py lookup table (e.g. [0500_4038] The nozzle diameter in sliced file is not consistent with the current nozzle setting. This file can't be printed.) — previously error_message was left NULL, so users saw "failed" with no hint at the cause. 5 new unit tests in TestPrePrintFailureCompletion cover PREPARE→FAILED and SLICING→FAILED firing, IDLE→FAILED and initial-FAILED not firing (boot-time safety), and HMS errors being passed through in the callback payload; 6 new tests in test_hms_error_summary.py cover the error-message formatter (known-code lookup, unknown-code fallback, multi-error join, malformed-entry tolerance, all-malformed → None, empty → None). Thanks to MartinNYHC for the report.
  • Tailscale cert-renewal restart silently failed mid-way (follow-up to #1070) — The daily renewal path creates an asyncio.Task to restart VP services with the new cert. Inside that task, stop_server() / stop_proxy() call _cancel_restart_task(), which cancelled+awaited the currently-running task (itself). The self-await raised RuntimeError, got caught by the broad exception handler, but the cancel flag was still set — so the next await in stop_server raised CancelledError and aborted the restart partway through. The VP kept running the OLD expired cert until the process was manually restarted, silently defeating the feature. _cancel_restart_task now checks asyncio.current_task() and skips the cancel+await when the caller IS the restart task itself. Two new regression tests cover the self-cancel and outside-cancel paths.
  • Settings table filled with duplicate rows on legacy SQLite installs — pre-UNIQUE-constraint databases stored the settings.key column without a unique index, so the seed loop's INSERT OR IGNORE silently degraded to a plain INSERT and every systemctl restart bambuddy added another row of advanced_auth_enabled / smtp_auth_enabled. After a handful of restarts, scalar_one_or_none() in is_advanced_auth_enabled and similar sites blew up with MultipleResultsFound, 500'ing the login flow. run_migrations now dedupes (keeps MIN(id) per key) and creates the missing ix_settings_key unique index before the seed loop runs. Postgres installs were unaffected. 4 new regression tests cover legacy-with-dupes, legacy-already-clean (idempotent), and fresh-install (no-op) paths.
  • Virtual printer card's Tailscale FQDN copy button failed on HTTPnavigator.clipboard.writeText is only available in secure contexts (HTTPS / localhost). When Bambuddy is reached over plain HTTP via a LAN or Tailscale IP, the clipboard API is blocked and the copy button silently failed with a generic "Failed to update settings" toast. Added a legacy document.execCommand('copy') fallback via a hidden textarea for non-secure contexts; the textarea is removed in a finally block so it doesn't leak into the DOM on exception paths. New virtualPrinter.toast.copyFailed i18n key across all 8 locales for the rare case where both paths fail.
  • Install script failed for first-time users — three separate permission issues in install/install.sh stopped the native installer mid-way: (a) download_bambuddy chowned the empty install dir to the service user BEFORE running git clone as the current user → permission denied on .git; (b) setup_virtualenv created the venv as the service user but then ran pip install --upgrade pip as the current user → permission denied writing venv/bin/pip; (c) build_frontend would have hit the same pattern on npm ci. All three now route through sudo -u "$SERVICE_USER" (or sudo -H -u for npm so HOME is set correctly for the npm cache). The git-clone fix runs as root then chowns the tree. macOS path unchanged (no service user there).
  • H2C dual-nozzle detection missed post-2026 serial batches (#1105) — Bambu has started shipping H2C units with a new serial prefix (31B8B… observed on a January 2026 unit) instead of the legacy 094… shared by the H2D/H2C/H2S family. The K-profile edit flow (backend/app/api/routes/kprofiles.py) and the delete-K-profile MQTT path (backend/app/services/bambu_mqtt.py::delete_kprofile) branch on serial prefix to pick the dual-nozzle command format, so units with the new prefix were silently falling into the single-nozzle branch and getting the wrong K-profile payload shape. Added 31B8B (5-char match covering the model code + revision bytes, leaving the revision-letter slot free to iterate) alongside the existing 094 and 20P9 prefixes; runtime paths that auto-detect dual-nozzle from device.extruder.info were already prefix-agnostic. New regression test test_h2c_new_prefix_uses_dual_nozzle_format in test_bambu_mqtt.py. Thanks to m4rtini2 for the report.
  • Spoolman iframe silently blank on HTTPS Bambuddy with HTTP Spoolman (#1096) — Users behind an HTTPS reverse proxy (Traefik / Nginx / Caddy) pointing the Spoolman URL at plain HTTP saw the Filament tab render as a blank page with only a console-side Mixed Content warning. CSP was fine (the #1054 fix already allowed frame-src http:), but browsers enforce mixed-content blocking independently of CSP — an HTTP iframe inside an HTTPS parent is always blocked. Bambuddy can't technically fix this (the browser is correct to refuse), so instead of the silent blank frame the Filament page now detects the protocol mismatch (window.location.protocol === 'https:' plus Spoolman URL starting with http://) and renders an inline warning card explaining the root cause, pointing users at the right fix (put Spoolman behind the same HTTPS reverse proxy and update the Spoolman URL in Settings), and offering an "Open Spoolman in a new tab" button as an immediate workaround — a standalone tab isn't subject to mixed-content rules. Localised across all 8 UI languages. Thanks to jsapede for the report.
  • Reprint-from-Archive left created_by_id as NULL (#730 follow-up) — 0.2.4b1 fixed user attribution for Direct Print / File Manager / Library prints, but the reprint path was still unattributed on the archive row. Reprint intentionally reuses the source archive (to avoid duplicate rows — see register_expected_print), so an archive auto-created from a printer-initiated print with no known user stayed created_by_id=NULL forever, even after multiple reprints by authenticated Bambuddy users. Print Log got the reprinter's username correctly (via _print_user_info), but the Statistics per-user filter — which reads archive.created_by_id — kept showing the archive as unassigned. Fix in main.py's print-complete handler: when the archive has no created_by_id and a print-session user is set (which reprint always sets via set_current_print_user), back-fill the archive's attribution. Never overwrites an existing attribution — the original uploader keeps ownership; NULL archives are the only ones touched. Thanks to 3823u44238 for the detailed retest that caught this.
  • Settings: failed-save toast looped forever when the user lacked settings:update — the Settings page runs a debounced auto-save effect that fires PATCH /settings whenever localSettings diverges from the last server snapshot. When a delegated user with settings:read but not settings:update toggled a control, the effect fired PATCH, got 403, and kept re-firing every ~500 ms producing an endless stream of identical "Failed to save" toasts. Gated at three points so the mutation is never attempted without permission: (1) the updateSetting callback — every onChange path — shows one settings.toast.noPermissionUpdate toast and short-circuits before diverging localSettings; (2) the debounced-save effect safety-nets the same check in case any call site bypassed updateSetting; (3) the language <select> was a fire-and-forget direct api.updateSettings call that always flashed a success toast regardless of outcome — it now goes through updateMutation with the same permission guard. New settings.toast.noPermissionUpdate key added across all 8 locales with full translations (not English-fallback).
  • Groups: edits to custom-group permissions appeared lost on reopen (#1083) — creating a custom group and reopening the editor showed the correct permissions, but after editing that group's permissions and saving, reopening the editor within ~1 minute displayed the pre-edit snapshot as if the save had failed. The backend PATCH /api/v1/groups/{id} was persisting correctly (now covered by four new integration tests in test_groups_api.py, including a direct DB read after update); the issue was purely in the frontend React Query cache — GroupEditPage.onSuccess invalidated ['groups'] (the list) but left the ['group', id] detail cache stale, and with the app-wide 60 s staleTime the next mount served the cached pre-update body instead of refetching. onSuccess now primes the ['group', id] detail cache with the PATCH response body so the next mount hits fresh data immediately without a round-trip. Create-path invalidates ['group'] for symmetry. Regression test in GroupEditPage.test.tsx verifies the detail cache contains the updated permissions after save.
  • Setup: re-enabling auth could 422 on a password the form no longer needs — after disabling authentication and re-enabling it (common when switching between local auth and LDAP, or recovering from a bad config), the setup form still sends admin_password in the body even though the backend route ignores it when an admin user already exists. The SetupRequest Pydantic schema enforced password complexity (uppercase + lowercase + digit + special char) unconditionally, so any existing password that predated the complexity rule — or a legitimate LDAP-mode placeholder — triggered 422 Value error, Password must contain at least one special character before the route body could decide to ignore the field. Complexity validation has moved out of the schema and into the route body, scoped to the branch that actually creates a new local admin. Re-enabling auth with an existing admin (or any LDAP user) now accepts whatever the form sends; fresh first-time setup still rejects weak passwords with a clear 400. Two regression tests added in test_auth_api.py: weak password rejected at setup when creating the first admin, weak/placeholder password accepted when an admin already exists.
  • Queue: batch (quantity>1) double-dispatched onto the same printer — scheduling an ASAP print with quantity > 1 could end up with two queue items in 'printing' status for the same printer, surfaced in the logs as BUG: Multiple queue items in 'printing' status for printer N. The scheduler's in-memory busy_printers set was seeded empty each tick and only populated after _start_print succeeded in the current iteration, so on the next tick (30 s later) _is_printer_idle() read the printer's live MQTT state — which on H2D / P1 series lags several seconds behind the print command and still reported IDLE / FINISH — and dispatched the second batch item onto the already-running printer. check_queue() now queries PrintQueueItem for status='printing' rows and seeds busy_printers with their printer IDs before iterating pending items, so any printer with an outstanding dispatched job is excluded regardless of what MQTT currently reports. Regression covered in test_phantom_print_hardening.py (TestBusyPrinterSeedingFromPrintingItems): seeding query returns printers with 'printing' rows only, returns empty when none exist, and end-to-end check_queue() does not call _start_print for a pending item whose printer already has a 'printing' row even when _is_printer_idle() is forced True.
  • Queue: active-item progress bar flashed 100% before dropping to 0% — immediately after a queue item was dispatched, the per-item progress bar on the Queue page showed 100% (or whatever the prior print's final mc_percent was) for the few seconds between dispatch and the printer's MQTT state transitioning to RUNNING. Frontend QueuePage.tsx read status.progress directly from the printer's live MQTT snapshot, which carries over the last reported value from the previous print until the new one starts ticking. The progress bar, remaining time, ETA, and layer counter are now gated on status.state being RUNNING or PAUSE; in any other state (including FINISH from the prior print, IDLE, or PREPARE while heating) the bar renders at 0% with no stale ETA/layer values.
  • "Open in Slicer" fails on Windows / Linux for any filename containing spaces or special characters (#1059) — clicking "Open in Slicer" from the File Manager or Archives page produced one of three symptoms depending on the file: .3mf files opened Bambu Studio / OrcaSlicer but the app showed "Importing to Bambu Studio failed. Please download the file and open it manually" (the file on disk was 0 bytes); .stl files greyed the button out; .step couldn't be previewed at all. The protocol-handler URL emitted by frontend/src/utils/slicer.ts for OrcaSlicer (orcaslicer://open?file=<URL>) and Windows/Linux Bambu Studio (bambustudio://open?file=<URL>) was built by plain string concatenation with no encodeURIComponent() — the macOS bambustudioopen://<URL> branch was already encoding correctly, which is why macOS users didn't see this. A stale comment block in the file claimed the browser preserves the URL in the query string so no encoding is needed; that's true for the browser-to-OS handoff but ignores that the slicer itself calls url_decode() on the received query (BS post_init() calls url_decode then split_str; OrcaSlicer's Downloader regex-extracts then url_decode). Any already-percent-encoded character in the download URL — most commonly %20 from filenames with spaces, which Bambuddy's archive paths produce naturally — decoded to a literal space and the slicer's subsequent HTTP GET came back 0 bytes or 404. All three URL forms now encodeURIComponent() the file URL, so the slicer sees the correctly-encoded URL after its own url_decode. The comment block is corrected to document the actual invariant. Regression test in slicer.test.ts feeds the exact issue reproduction URL (Toothpick%20Launcher%20Print-in-Place.3mf) and asserts %2520 appears in the generated orcaslicer:// href — so any future refactor that drops the encoding fails CI. Thanks to jsapede for the double-encoding diagnosis and AllanonBrooks and lunaticds for the original reports.

Don't miss a new bambuddy release

NewReleases is sending notifications on new releases.