github teng-lin/notebooklm-py v0.4.1

[0.4.1] - 2026-05-11

Compatibility note. Despite a few additive items (notebooklm auth refresh CLI, keepalive= constructor argument on NotebookLMClient, NOTEBOOKLM_REFRESH_CMD env var, two new dataclass fields), 0.4.1 is shipped as a patch release because the dominant work — and the reason to ship now — is auth/cookie stability remediation. Bumping to v0.5.0 would force the long-deferred removal of v0.3-era deprecated APIs (see Stability) earlier than scheduled; we'd rather keep that change isolated from the auth-keepalive work. All additive items are backward compatible — existing code keeps working without changes.

Added

  • notebooklm auth refresh CLI command - One-shot keepalive that opens a session, triggers the layer-1 SIDTS rotation poke against accounts.google.com, persists the rotated cookies to storage_state.json, and exits. Designed to be scheduled by the OS (launchd / systemd / cron / Task Scheduler / k8s CronJob) so that an idle profile does not go stale between user-driven calls. Pairs naturally with --quiet, which limits cron output to errors. Requires file/profile-backed authentication: it explicitly refuses to run when NOTEBOOKLM_AUTH_JSON is set (no writable backing store). See docs/troubleshooting.md for per-OS scheduler recipes (#336).
  • Periodic keepalive task on NotebookLMClient - Long-lived clients (agents, workers, multi-hour async with blocks) can opt into a background task that periodically POSTs to RotateCookies to drive __Secure-1PSIDTS rotation, then immediately persists the rotated cookies to storage_state.json so a crash doesn't lose them. Disabled by default; pass keepalive=<seconds> to NotebookLMClient(...) or NotebookLMClient.from_storage(...) to enable. Values below keepalive_min_interval (default 60 s) are clamped up to that floor. The loop logs transient errors at DEBUG and continues; cancellation on __aexit__ is clean. Persistence runs off-loop via asyncio.to_thread so the loop never blocks on disk I/O. Closes the gap left by the per-call layer-1 poke for clients that never re-call fetch_tokens (#297, #312, #341).
  • Auto-refresh on auth expiry - fetch_tokens now optionally runs a user-provided shell command when a Google session cookie has expired, reloads cookies from the same storage path, and retries once. Opt in by setting the NOTEBOOKLM_REFRESH_CMD environment variable to a command that rewrites storage_state.json (e.g. a sync script reading from a cookie vault). Refresh commands receive NOTEBOOKLM_REFRESH_STORAGE_PATH and NOTEBOOKLM_REFRESH_PROFILE so profile-aware scripts can target the active auth file. Covers every CLI entry point without changing the public API. Retry guards prevent refresh loops (#336).
  • examples/refresh_browser_cookies.py - Sample NOTEBOOKLM_REFRESH_CMD script that re-extracts cookies from a live local browser via notebooklm login --browser-cookies. Provides a recovery path for unattended automation when the in-process keepalive isn't enough (idle gaps, force-logout, password change).
  • Source.created_at and GenerationStatus.url public dataclass fields - Source.created_at is now populated for both nested and deeply-nested response paths. GenerationStatus.url is now populated by poll_status for media artifact types (audio, video, infographic, slide-deck PDF) so callers can stream the asset as soon as the status flips to ready (#349, #356).
  • ALLOWED_COOKIE_DOMAINS extended for sibling Google products - The browser-cookie import path now accepts cookies from Google's sibling product domains, restoring --browser-cookies flows for users whose active Google session lives on a sibling surface rather than notebooklm.google.com directly (#362).
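
The periodic keepalive behavior described above (interval clamping, errors logged and skipped, off-loop persistence, clean cancellation) can be sketched as follows. This is a minimal illustration assuming Python 3.9+ for asyncio.to_thread, not the library's implementation: clamp_interval, keepalive_loop, and the rotate / persist callables are hypothetical stand-ins for the RotateCookies POST and the storage_state.json write.

```python
import asyncio
import logging
from typing import Awaitable, Callable

log = logging.getLogger("keepalive_sketch")


def clamp_interval(requested: float, floor: float = 60.0) -> float:
    """Raise too-small intervals up to the floor (60 s by default)."""
    return max(requested, floor)


async def keepalive_loop(
    rotate: Callable[[], Awaitable[None]],  # hypothetical: POSTs to RotateCookies
    persist: Callable[[], None],            # hypothetical: blocking write to disk
    interval: float,
    floor: float = 60.0,
) -> None:
    """Rotate and persist forever; only task cancellation stops the loop."""
    delay = clamp_interval(interval, floor)
    while True:
        await asyncio.sleep(delay)
        try:
            await rotate()
            # Run the blocking disk write in a worker thread so the
            # event loop is never stalled by file I/O.
            await asyncio.to_thread(persist)
        except Exception as exc:
            # CancelledError derives from BaseException, so cancellation
            # is not swallowed here; only ordinary failures are.
            log.debug("keepalive tick failed: %s", exc)
```

Because the handler catches Exception rather than BaseException, cancelling the task on exit propagates normally while a failed tick is logged and the loop carries on.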

Fixed

  • Cookies could silently go stale under sustained use - fetch_tokens now POSTs to https://accounts.google.com/RotateCookies (Chrome's dedicated unsigned rotation endpoint) before hitting notebooklm.google.com to drive __Secure-1PSIDTS / __Secure-3PSIDTS rotation. Empirically validated against both DBSC-bound (Playwright-minted) and unbound (Firefox-imported) profiles. RPC traffic against notebooklm.google.com alone does not appear to trigger rotation, so a keepalive that hit only NotebookLM could still let the cookies go stale without warning. The rotated Set-Cookie lands in the live httpx jar and is persisted via save_cookies_to_storage() along the fetch_tokens_with_domains / AuthTokens.from_storage paths. A 60 s mtime guard rate-limits the layer-1 poke: the POST is skipped when storage was recently rotated. Failures log at DEBUG and never abort token fetch. Disable with NOTEBOOKLM_DISABLE_KEEPALIVE_POKE=1 (e.g. networks that block accounts.google.com). Closes #312 (#345, #346).
  • Concurrent RotateCookies poke stampede - The 60 s mtime guard only debounces sequential invocations; under asyncio.gather fan-out, parallel CLI loops, or MCP worker pools, all callers see the same stale storage_state.json mtime and stampede the POST. Three layered protections inside _poke_session: a per-event-loop, per-storage-path async lock registry plus a sync state lock for in-process dedup (an asyncio.gather of 10 fires exactly one POST), a non-blocking LOCK_EX | LOCK_NB flock on the new .storage_state.json.rotate.lock sentinel for cross-process dedup (parallel CLI loops / MCP workers skip silently when another process is rotating), and a failure-stampede protection where the timestamp updates regardless of POST outcome — so a 15 s timeout against a hung accounts.google.com doesn't let 10 fanned-out callers each wait the full timeout. The layer-2 keepalive loop now calls the bare _rotate_cookies directly (it's already self-paced via keepalive_min_interval) and NOTEBOOKLM_DISABLE_KEEPALIVE_POKE continues to disable both layers (#347, #348).
  • Notebook.sources_count parsed but never surfaced - The sources_count field on the public Notebook dataclass is now populated from data[1] on both LIST and GET notebook shapes; previously it always read as 0 regardless of actual source count (#350).
  • Artifact.url unpopulated for media artifacts - The url field on the public Artifact dataclass is now populated for media types (audio, video, infographic; slide-deck exposes the PDF URL — use download_slide_deck(output_format="pptx") for PPTX) so callers no longer need to drop down to download_* to obtain the asset URL (#349, #356).
  • Cross-process and refresh-path save races - Close lifecycle and refresh-path saves now serialize correctly with the keepalive writer; concurrent writers no longer overwrite each other's rotated cookies (#344).
  • Keepalive ↔ close serialization; stop mutating caller Auth - The keepalive task no longer races with __aexit__, and no longer mutates the Auth instance the caller passed in. Callers that share an Auth across multiple clients now get the isolation the API documented (#343).
  • Snapshot keepalive cookie jar; normalize explicit storage_path - The keepalive task now snapshots the live httpx jar before writing (avoiding torn writes when an RPC is mid-flight); an explicit storage_path= argument to NotebookLMClient is normalized onto the Auth instance so the keepalive task writes to the file the caller actually pointed at (#342).
  • Per-domain cookie scoping on file upload - File-upload requests now send only cookies whose Domain attribute applies to the upload host, instead of the full jar. Prevents upload rejection when the jar mixes cookies for google.com, notebooklm.google.com, and googleusercontent.com (#373, #374).
  • Two-tier cookie validation pre-flight - Auth loaders now distinguish "missing-but-recoverable" from "fatal" cookie states before attempting an RPC, surfacing clearer errors and avoiding doomed requests against Google's identity surface (#372).
  • Preserve cookie attributes on load - Domain, Path, Secure, HttpOnly, and SameSite attributes round-trip through storage load, restoring behaviors that depended on cross-host scoping (#365, #368).
  • Unify flat-cookie selection across loaders - Legacy flat-cookie and modern Playwright storage shapes now share a single selection contract; subtle mismatches between the two paths are eliminated (#375, #376).
  • Tolerate non-numeric / out-of-range timestamp values on dataclasses - Notebook.created_at, Source.created_at, and Artifact.created_at now catch TypeError, ValueError, OSError, and OverflowError from datetime.fromtimestamp and resolve to None instead of raising on edge-case server responses (#357).
  • examples/refresh_browser_cookies.py --profile placement - The example invoked ... login --browser-cookies <b> --profile <p> but --profile is a top-level Click option and was rejected after login (Error: No such option: --profile). Now invokes ... --profile <p> login --browser-cookies <b> and works end-to-end against profile-backed storage.
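
Two of the rotation guards described above, the 60 s mtime check and the non-blocking cross-process lock on the sentinel file, can be sketched roughly as follows. This is POSIX-only (it relies on fcntl.flock) and is an illustration rather than the code inside _poke_session: recently_rotated, lock_path_for, try_rotate, and the do_post callable are hypothetical names, and the in-process async lock registry and the failure-timestamp handling are left out.

```python
import fcntl
import time
from pathlib import Path
from typing import Callable

GUARD_SECONDS = 60.0  # matches the 60 s mtime guard in the notes


def recently_rotated(storage: Path, guard: float = GUARD_SECONDS) -> bool:
    """True if the storage file was modified within the guard window."""
    try:
        return time.time() - storage.stat().st_mtime < guard
    except FileNotFoundError:
        return False


def lock_path_for(storage: Path) -> Path:
    """Sentinel next to the storage file, e.g. .storage_state.json.rotate.lock."""
    return storage.with_name("." + storage.name + ".rotate.lock")


def try_rotate(storage: Path, do_post: Callable[[], None]) -> bool:
    """Run do_post at most once across processes; return True if we ran it."""
    if recently_rotated(storage):
        return False  # sequential debounce: someone rotated recently
    with open(lock_path_for(storage), "a") as lock_file:
        try:
            # Non-blocking exclusive lock: fail fast instead of queueing.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return False  # another process is rotating; skip silently
        try:
            do_post()  # hypothetical: POST, then persist the rotated cookies
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
    return True
```

Skipping instead of waiting on the lock is what keeps parallel workers from queueing up behind a slow rotation: whoever loses the race simply relies on the cookies the winner persists.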

Infrastructure

  • Consolidated URL extraction - _extract_artifact_url, per-type extractors (audio/video/infographic/slide-deck), and _is_valid_artifact_url moved to types.py. Readiness checks, Artifact.url, GenerationStatus.url, and the download paths now share one URL-selection contract: mp4 quality-4 > any mp4 > first valid URL for video. SourcesAPI.get_fulltext fixed for YouTube fulltext URLs at metadata[5][0] along the way (#349, #356).
  • Removed redundant ArtifactsAPI URL helpers - Private _is_valid_media_url and _find_infographic_url shim methods removed; tests now exercise the canonical types.py helpers (#358).
  • E2E --profile pytest flag - pytest --profile <name> scopes the E2E notebook ID cache to a named profile, so parallel multi-profile test runs don't collide on the cached notebook fixture (#340).
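
The video URL-selection order named above (mp4 quality-4 > any mp4 > first valid URL) can be illustrated with a small sketch. The Candidate shape, its mime and quality fields, and the is_valid_url check are assumptions made for the example; the actual response layout and the helpers in types.py are not shown in these notes.

```python
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    url: str
    mime: str     # hypothetical field, e.g. "video/mp4"
    quality: int  # hypothetical field; 4 stands in for "quality-4"


def is_valid_url(url: str) -> bool:
    """Placeholder validity check; the real rule may be stricter."""
    return url.startswith("https://")


def pick_video_url(candidates: Sequence[Candidate]) -> Optional[str]:
    """Apply the three-tier preference; return None if nothing is valid."""
    valid = [c for c in candidates if is_valid_url(c.url)]
    # Tier 1: an mp4 at quality 4.
    for c in valid:
        if c.mime == "video/mp4" and c.quality == 4:
            return c.url
    # Tier 2: any mp4.
    for c in valid:
        if c.mime == "video/mp4":
            return c.url
    # Tier 3: the first valid URL of any type.
    return valid[0].url if valid else None
```

Keeping the preference order in one function is the point of the consolidation: readiness checks, URL fields, and downloads can then all agree on which URL counts as the asset.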
