github teng-lin/notebooklm-py v0.6.0

Install: pip install notebooklm-py==0.6.0 · PyPI

Breaking changes

⚠ BREAKING — exception hierarchy symmetry restored.

SourceNotFoundError and ArtifactNotFoundError now inherit from RPCError
in addition to their respective domain bases (SourceError,
ArtifactError), restoring symmetry with NotebookNotFoundError which has
mixed in RPCError since the 0.5.x series. Combined with the new
NotFoundError umbrella (see Added below), the class declarations are
now:

class NotebookNotFoundError(NotFoundError, RPCError, NotebookError): ...
class SourceNotFoundError(NotFoundError, RPCError, SourceError): ...        # new RPCError mixin in 0.6.0
class ArtifactNotFoundError(NotFoundError, RPCError, ArtifactError): ...    # new RPCError mixin in 0.6.0

Migration. Code that catches the broad RPCError before a more
specific SourceNotFoundError / ArtifactNotFoundError clause now routes
"not found" through the broad branch instead of falling through to the
specific one. Reorder your except clauses so the more specific exceptions
come first.

The example below uses client.sources.get_fulltext(...), which raises
SourceNotFoundError for a missing source. (client.sources.get(...)
returns None and does not raise, so it doesn't demonstrate the change.)

# BEFORE — in 0.5.x this layout worked: SourceNotFoundError was NOT an
# RPCError, so it fell through the broad `except RPCError` to the specific
# handler. In 0.6.0 the broad handler catches it first, leaving the
# specific `except SourceNotFoundError` clause unreachable.
try:
    fulltext = await client.sources.get_fulltext(notebook_id, source_id)
except RPCError as e:        # ← in 0.6.0 this also catches SourceNotFoundError
    handle_rpc_failure(e)
except SourceNotFoundError:  # ← in 0.6.0 this branch becomes unreachable
    handle_missing_source()

# AFTER — put the specific exception first so the broad branch only sees
# other RPC failures.
try:
    fulltext = await client.sources.get_fulltext(notebook_id, source_id)
except SourceNotFoundError:
    handle_missing_source()
except RPCError as e:
    handle_rpc_failure(e)

Code that catches SourceNotFoundError / ArtifactNotFoundError directly,
or catches via the domain bases (SourceError, ArtifactError), or via the
shared NotebookLMError base, continues to behave exactly as before. Only
the RPCError-before-specific ordering is affected.

SourceNotFoundError.__init__ and ArtifactNotFoundError.__init__ also
now accept keyword-only method_id / raw_response parameters (forwarded
to the RPCError parent), matching the NotebookNotFoundError signature.
All positional call sites remain source-compatible.
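
To see the except-ordering effect in isolation, the sketch below rebuilds the hierarchy with local stand-in classes. Nothing here is imported from notebooklm-py, the real classes carry more state, and the assumption that NotFoundError derives directly from NotebookLMError is ours. It only illustrates which except clause wins under each ordering.

```python
# Local stand-ins mirroring the 0.6.0 declarations above (illustrative stubs,
# not the library's real classes).
class NotebookLMError(Exception): ...
class RPCError(NotebookLMError): ...
class SourceError(NotebookLMError): ...
class NotFoundError(NotebookLMError): ...  # assumed base; see lead-in
class SourceNotFoundError(NotFoundError, RPCError, SourceError): ...


def broad_first() -> str:
    """0.5.x-style layout: the broad clause is listed before the specific one."""
    try:
        raise SourceNotFoundError("missing source")
    except RPCError:
        return "rpc"          # wins in 0.6.0, because SourceNotFoundError is an RPCError
    except SourceNotFoundError:
        return "not-found"    # unreachable with the 0.6.0 hierarchy


def specific_first() -> str:
    """Migrated layout: the specific clause is listed first."""
    try:
        raise SourceNotFoundError("missing source")
    except SourceNotFoundError:
        return "not-found"
    except RPCError:
        return "rpc"


print(broad_first())     # rpc
print(specific_first())  # not-found
```

Python evaluates except clauses top to bottom and takes the first one whose class matches, which is why only the ordering matters here and not the position of RPCError in the base-class list.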

  • notebooklm source stale <ID> now follows the standard CLI exit-code convention by default. Exit 0 indicates the freshness check succeeded (regardless of whether the source is fresh or stale); exit 1 indicates an error. Previously the command used an inverted predicate (0 = stale, 1 = fresh) so the shell idiom if notebooklm source stale ID; then refresh; fi worked naturally. Migration: scripts that depended on the inverted predicate can opt back into the legacy semantics with the new --exit-on-stale flag (if notebooklm source stale --exit-on-stale ID; then refresh; fi). Scripts written for the new default should branch on the JSON stale/fresh fields or stdout text. See docs/cli-exit-codes.md for the full rationale + the new Exit code semantics summary.
  • NotebookLMClient.rpc_call(...) no longer accepts source_path, _is_retry, or operation_variant — the three kwargs deprecated in v0.5.0 (docs/improvement.md §7.4, docs/deprecations.md) were removed after one MINOR cycle. The public escape hatch's primary contract (client.rpc_call(method, params)) is unchanged and the default-shape call keeps working with no migration. Migration:
    • Keyword callers: drop the removed kwarg from the call. The previous default-shape behavior (source_path="/", _is_retry=False, operation_variant=None) is now what every call gets unconditionally — source_path was a leaky internal seam, _is_retry was an internal retry-loop flag, and operation_variant is part of the mutating-RPC idempotency registry. Calls that genuinely needed a non-"/" source_path or a specific operation_variant were already on the wrong layer; build a typed method on a sub-client instead, or open an issue describing the workflow.
    • Positional callers (rare): the positional order of the remaining parameters is (method, params, allow_null, *, disable_internal_retries=...), so a previously-positional source_path / _is_retry argument now binds to a different parameter slot. A pre-cut client.rpc_call(method, params, "/", True) (which passed source_path="/", allow_null=True) becomes client.rpc_call(method, params, allow_null=True) after the cut — switch to keyword arguments for allow_null to avoid this footgun.
    • There is no public replacement for the removed internal-only kwargs (_is_retry, operation_variant); they were never part of the supported surface in the first place.
  • source add --url rejects internal hosts by default (SSRF guard). localhost, 127.0.0.1, RFC-1918, and link-local URLs — and any non-http(s) scheme — are now refused before ingestion. Migration: pass the new --allow-internal flag to ingest an internal http(s) URL intentionally (the scheme allowlist still applies). Full detail in Security below (#1114).
  • source CLI --json output shape changed. source get --json now emits the bare kind value ("type": "url") instead of the leaked Python enum repr ("type": "SourceType.URL"), and source fulltext --json emits a fixed {source_id, title, kind, content, url, char_count} payload instead of a raw asdict(SourceFulltext) dump. Migration: --json consumers parsing source get's type field, or relying on extra fulltext keys, must update. Full detail in Fixed below (#1129).
  • Post-parse CLI validation errors exit 1 (was 2) and print a JSON envelope on stdout under --json. For download flag conflicts, generate validation, research wait --cited-only, and ask --new + --conversation-id, a --json invocation now emits {"error": true, "code": "VALIDATION_ERROR", ...} on stdout and exits 1 instead of Click's stderr usage text + exit 2. Text-mode behavior is unchanged. Migration: automation parsing these --json failures should branch on exit 1 + the JSON body. Full detail in Changed below (ADR-015; #1112, #1115, #1117).
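
The positional rpc_call footgun described above can be reproduced with a stand-in function that has the same positional order as the 0.6.0 signature. This is a hedged sketch: the function below is a synchronous local stub that only echoes its bindings, the defaults are assumed, and "SOME_METHOD" is a placeholder, not a real method ID.

```python
from typing import Any


def rpc_call(method: str, params: list[Any], allow_null: bool = False, *,
             disable_internal_retries: bool = False) -> dict[str, Any]:
    """Stand-in with the 0.6.0 positional order; returns how arguments bound."""
    return {
        "method": method,
        "allow_null": allow_null,
        "disable_internal_retries": disable_internal_retries,
    }


# A leftover positional source_path now lands in the allow_null slot.
shifted = rpc_call("SOME_METHOD", [], "/")
print(shifted["allow_null"])  # prints "/", a truthy value, so null results get allowed

# The old four-positional form no longer fits the signature at all.
try:
    rpc_call("SOME_METHOD", [], "/", True)
    four_positional_ok = True
except TypeError:
    four_positional_ok = False
print(four_positional_ok)  # False

# Keyword form: unambiguous regardless of positional order.
safe = rpc_call("SOME_METHOD", [], allow_null=True)
print(safe["allow_null"])  # True
```

The three-positional case is the dangerous one because it fails silently; passing allow_null by keyword removes the ambiguity.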

Added

  • notebooklm source stale --exit-on-stale flag — opt-in back-compat for the legacy inverted-predicate exit codes (0 = stale, 1 = fresh). The default behavior is now the standard CLI convention (see Breaking changes above); pass --exit-on-stale to keep if notebooklm source stale --exit-on-stale ID; then refresh; fi shell idioms working.
  • Exit code semantics summary section in docs/cli-exit-codes.md. A normative one-line table — 0 = succeeded as documented, 1 = failed or queried target not found, 2 = Click parser-time error — backing the convention every command obeys outside the documented intentional exceptions. Cross-references the existing tables and ADR-015.
  • NotFoundError cross-domain umbrella exception. Catch NotFoundError to handle any "resource not found" case across notebooks, sources, and artifacts in one except clause — replacing except (NotebookNotFoundError, SourceNotFoundError, ArtifactNotFoundError):. NotebookNotFoundError, SourceNotFoundError, and ArtifactNotFoundError all inherit from NotFoundError. The umbrella itself is additive; the asymmetric inheritance noted on its original introduction has been resolved in the same release — all three subclasses also mix in RPCError (see Breaking changes above for the except-ordering migration).
  • notebooklm notebook delete --json (#1167). notebook delete was the last delete command (and the only list / create / metadata sibling) without a JSON envelope — passing --json crashed with No such option. It now emits the typed success/cancel envelope, refuses to prompt in --json mode (requiring --yes, else a VALIDATION_ERROR envelope + exit 1), and surfaces context_cleared: true when the deleted notebook was the active context (#1193).
  • notebooklm skill install --dry-run / --no-clobber / --force (#1109). Project-scope installs now classify each target as create / up-to-date / overwrite. A target that would be overwritten with different content exits 1 and lists the conflicts unless --force (overwrite) or --no-clobber (skip differing, still create missing) is passed; --dry-run previews intended writes without touching disk. Writes go through an atomic temp-file + os.replace so a crash can't leave a partial SKILL.md. User scope keeps the historical always-overwrite behavior (the new flags error when paired with --scope user).
  • GenerationStatus.is_removed + status="removed" (#1168). A delisted or quota-removed artifact now reports status="removed" (is_removed=True) instead of a synthesized "failed", so callers can distinguish a transient list omission from a server-marked FAILED artifact. is_failed stays False for a removal; is_rate_limited still treats a quota-worded removal as retryable, and CLI exit behavior is unchanged (#1195).
  • Structured media-timeout diagnostics. When an accepted media task (audio / video / cinematic-video / infographic / slide-deck) stays queued or running past the --wait / wait_for_completion budget, the artifact APIs now raise a typed timeout exception that preserves the last poll-status transition and media-not-ready metadata (also surfaced in --json) instead of a bare timeout (#1094).
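
The skill install entry above mentions classifying each target as create / up-to-date / overwrite and writing through a temp file plus os.replace. A minimal sketch of that general pattern follows; the helper names classify and atomic_write are hypothetical and this is not the project's actual implementation.

```python
import os
import tempfile
from pathlib import Path


def classify(target: Path, new_content: str) -> str:
    """Return 'create', 'up-to-date', or 'overwrite' for a planned write."""
    if not target.exists():
        return "create"
    if target.read_text(encoding="utf-8") == new_content:
        return "up-to-date"
    return "overwrite"


def atomic_write(target: Path, content: str) -> None:
    """Write to a temp file in the same directory, then swap it into place."""
    target.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(content)
        # os.replace swaps the file in one step, so readers see either the
        # old file or the new one, never a partial write.
        os.replace(tmp_name, target)
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise


with tempfile.TemporaryDirectory() as workdir:
    skill = Path(workdir) / "SKILL.md"
    states = [classify(skill, "v1")]      # file does not exist yet
    atomic_write(skill, "v1")
    states.append(classify(skill, "v1"))  # same content on disk
    states.append(classify(skill, "v2"))  # differing content
print(states)  # ['create', 'up-to-date', 'overwrite']
```

Creating the temp file in the target's own directory keeps the final os.replace on one filesystem, which is what makes the swap atomic.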

Changed

  • Media --wait default timeouts raised. generate audio --wait now defaults to 1200 s (#1140) and the video / cinematic-video wait defaults were increased to match empirical generation durations (#1088, #1094), so long generations no longer time out before the artifact is ready under default settings. docs/ now documents the media wait budgets and the manual artifact wait recovery path.
  • notebooklm doctor exits 1 when any check fails (#1160). It previously built status: "fail" entries but always exited 0, so CI health checks, set -e scripts, and monitoring probes read a broken install as green. Overall health is now computed from the final check states (after any --fix) and the process exits 1 if any check still fails (warnings stay non-fatal). The exit happens after the payload/table is emitted, so machine-readable --json output is unaffected; doctor profile JSON errors are also now wrapped in the typed envelope (#1179, #1146).
  • Post-parse CLI validation errors emit the typed JSON envelope under --json (ADR-015). download flag conflicts (--force + --no-clobber, --latest + --earliest, --all + --artifact), generate validation (cinematic --format / --style conflicts, invalid --language / NOTEBOOKLM_HL), research wait --cited-only without --import-all, and ask --new + --conversation-id now route through {"error": true, "code": "VALIDATION_ERROR", ...} on stdout and exit 1 under --json, instead of Click's parser bypassing the envelope to exit 2 with usage text on stderr. Text-mode behavior (usage text, exit 2) is unchanged. Flagged under Breaking changes above for --json automation (#1112, #1115, #1117).
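
For automation that consumes --json output, the exit-code and envelope rules above suggest a branching order like the one sketched here. The function name and the returned labels are hypothetical; only the exit codes and the "error" / "code" envelope keys come from these notes.

```python
import json


def classify_cli_result(returncode: int, stdout: str) -> str:
    """Map a --json invocation's exit code and stdout to a coarse outcome."""
    if returncode == 0:
        return "ok"
    if returncode == 2:
        # Click parser-time error: usage text on stderr, no JSON envelope.
        return "usage-error"
    try:
        payload = json.loads(stdout)
    except json.JSONDecodeError:
        return "error"
    if (
        isinstance(payload, dict)
        and payload.get("error") is True
        and payload.get("code") == "VALIDATION_ERROR"
    ):
        return "validation-error"
    return "error"


envelope = '{"error": true, "code": "VALIDATION_ERROR"}'
print(classify_cli_result(1, envelope))  # validation-error
print(classify_cli_result(2, ""))        # usage-error
print(classify_cli_result(0, "{}"))      # ok
```

Checking the exit code before parsing stdout avoids treating Click's usage failures, which never emit an envelope, as malformed JSON.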

Fixed

  • notebooklm artifact delete <id> --json now requires --yes before deleting (#1197). Without --yes, the command emits the typed VALIDATION_ERROR envelope, includes "deleted": false, exits 1, and leaves the artifact untouched, matching the other destructive delete commands.
  • HTML file uploads now fail client-side with a clear validation error (#1127). notebooklm source add ./article.html and client.sources.add_file(..., "article.html") previously reached NotebookLM's upload endpoint as text/html and surfaced a cryptic upstream 400 Bad Request. The upload pipeline now rejects .html / .htm / .xhtml / .xht / HTML MIME uploads before registering a source, with guidance to convert the page to .txt, .md, or .pdf.
  • notebooklm source fulltext -o FILE no longer silently overwrites existing files (#1173). Existing output paths now auto-rename by default (FILE -> FILE (2), etc.); pass --force to overwrite intentionally or --no-clobber to fail when the path already exists.
  • sources.list() raises on a malformed GET_NOTEBOOK response under strict-decode (the default) (#1159). A drifted or error-enveloped response was previously folded into an empty list, so a sync script could conclude every source had vanished and re-add them all. The hand-rolled list-shape checks now honor NOTEBOOKLM_STRICT_DECODE (logging the drift warning, then raising RPCError); a genuinely empty notebook (a None sources slot) still returns []. Set NOTEBOOKLM_STRICT_DECODE=0 for the legacy warn-and-return-[] fallback (#1178).
  • client.rpc_call(..., allow_null=True) raises on method-ID drift and anti-bot walls (#1158). The decoder gated its entire null-handling block behind not allow_null, so opt-in null callers (CREATE_ARTIFACT, GENERATE_MIND_MAP, DELETE_SOURCE, GET_SUGGESTED_REPORTS, …) silently received None when Google rotated a method ID or served a redirect / anti-bot page. An absent RPC ID (drift) and a body with no RPC frames (anti-bot wall) now always raise; only a present-but-null wrb.fr frame returns None. Null-result error messages now embed the discovered found_ids (#1176).
  • Auth-refresh replay no longer re-issues non-idempotent writes (#1157). After a mid-flight auth error (HTTP 401/403, or an auth-shaped decoded RPCError) on a probe-then-create method (CREATE_NOTEBOOK, CREATE_ARTIFACT, CREATE_NOTE, ADD_SOURCE, SHARE_NOTEBOOK, GENERATE_MIND_MAP), the refresh-and-retry path could duplicate the resource, invite email, or generation quota when the error landed after the server committed the write. Both replay paths (the AuthRefreshMiddleware 401/403 leg and the RpcExecutor decode-time leg) now honor the effective disable_internal_retries classification and propagate the original auth error so the caller's probe-then-create wrapper can disambiguate a commit-lost write (#1177).
  • client.notes.create raises when CREATE_NOTE returns no usable note id (#1162). It previously fell through to a success-shaped Note(id="") that was never finalized via UPDATE_NOTE or persisted server-side, so any later operation keyed on the empty id silently misbehaved. It now raises RPCError, matching the sibling add_source / notebooks.create paths (#1186).
  • Stale authed-POST envelope rebuilt after a 401 → refresh → 429 → retry flow (#1096). The terminal freshness guard's snapshot-equality short-circuit could POST the pre-refresh URL / headers / body against the refreshed cookie jar; the envelope is now rebuilt from a freshly captured auth snapshot on every terminal attempt (byte-identical on the happy path, load-bearing on the post-refresh retry).
  • NotebookLMClient.close(drain=True) no longer hangs on in-flight artifact polls (#1161). Registered drain hooks (which cancel polls parked in operation_scope) now fire before the drain wait, so close() short-circuits a pending poll instead of blocking up to the poll's own 300 s timeout (#1182).
  • Kernel.open() closes the httpx client if the open-time cookie snapshot raises (#1163). A failure while capturing the open snapshot previously propagated with a live, never-closed client (Python skips __aexit__ after a failed __aenter__), leaking the connection pool. open() now aclose()s the partial client and resets it so a retry rebuilds cleanly (#1187).
  • RPC concurrency semaphore gains the loop-affinity guard + close→reopen reset its siblings already have (#1169). The per-client max_concurrent_rpcs semaphore was the only loop-bound primitive without an affinity guard or reset, so reopening a capped client on a different event loop reused the stale semaphore and could raise "bound to a different event loop" or mispark waiters on Python 3.10/3.11 (masked on 3.12+). It is now guarded by the bound-loop assertion and discarded on any bound-loop change (#1196).
  • New conversations are serialized per notebook (#1144). Concurrent chat.ask() calls with no conversation_id against the same notebook are serialized so they no longer race to create duplicate server-side conversations.
  • Auth-refresh lock released if the lock-wait metric raises (#1164). await_refresh recorded the lock-wait metric between acquire() and the try, so a metric-side exception left the auth-refresh lock held forever, deadlocking every subsequent refresh. The metric call moved inside the try / finally: release(), matching the sibling update_auth_tokens (#1188).
  • Source-upload registration fails closed on an unparseable source id (#1143). The resumable-upload path now raises instead of silently accepting a response it can't parse a source id from, while still tolerating the legacy filename-first row shapes.
  • Artifact-generation defaults and null responses hardened (#1063, #1088). Omitting infographic options on the Python client.artifacts.generate_* calls now sends concrete visual defaults (matching the CLI) instead of producing a null CREATE_ARTIFACT result, and a null artifact-generation response is now classified as ArtifactFeatureUnavailableError.
  • source command --json output shape corrected and stabilized (#1129). source get --json previously leaked the Python enum repr ("type": "SourceType.URL") and now emits the bare kind value ("type": "url"); source fulltext --json now emits a fixed {source_id, title, kind, content, url, char_count} payload instead of a raw asdict(SourceFulltext) dump, and its -o envelope gains a kind field. --json consumers that parsed source get's type field or relied on extra fulltext keys must update (flagged under Breaking changes above). Shared serializers keep the shape consistent across the source subcommands going forward.
  • notebooklm source add - (stdin) rejects a non-text --type. Piping content from stdin with an explicit non-text source type now fails with a clear validation error instead of mis-routing the content.
  • notebooklm agent show routes errors to stderr (#1175) so they no longer pollute stdout.
  • Auth-error classification hardened (#1142) — empty RPC code labels no longer slip past the auth-error matcher.
  • Malformed batchexecute chunk records are now counted (#1141) rather than silently dropped, so the client.metrics surface reflects partial-response drift.
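
The auth-refresh lock fix (#1164) above is an instance of a general asyncio pattern: anything that can raise after acquire() belongs inside the try / finally that releases the lock. The sketch below uses hypothetical stand-in functions, not the library's await_refresh, to show the difference.

```python
import asyncio


class MetricError(RuntimeError):
    """Stand-in for a failure inside a metrics backend."""


def record_lock_wait() -> None:
    # Hypothetical metric hook that always fails, to trigger the bug.
    raise MetricError("metrics backend unavailable")


async def refresh_buggy(lock: asyncio.Lock) -> None:
    await lock.acquire()
    record_lock_wait()  # raises before the try, so release() never runs
    try:
        pass  # the token refresh would happen here
    finally:
        lock.release()


async def refresh_fixed(lock: asyncio.Lock) -> None:
    await lock.acquire()
    try:
        record_lock_wait()  # now covered by the finally below
    finally:
        lock.release()


async def main() -> tuple[bool, bool]:
    buggy_lock, fixed_lock = asyncio.Lock(), asyncio.Lock()
    for refresh, lock in ((refresh_buggy, buggy_lock), (refresh_fixed, fixed_lock)):
        try:
            await refresh(lock)
        except MetricError:
            pass
    # Report whether each lock is still held after the failure.
    return buggy_lock.locked(), fixed_lock.locked()


held = asyncio.run(main())
print(held)  # (True, False)
```

With the buggy layout the lock stays held after the exception, so any later acquire() would wait forever; the fixed layout always releases it.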

Removed

  • NotebookLMClient.rpc_call(source_path=...), NotebookLMClient.rpc_call(_is_retry=...), NotebookLMClient.rpc_call(operation_variant=...) — see Breaking changes above. The corresponding DeprecationWarning emitters in client.py and the tests/unit/test_rpc_call_public_surface.py warning-surface tests were retired in the same change.

Security

  • SSRF guard on source add --url (#1114). The prefix-only startswith(("http://", "https://")) check was replaced with a structural urlsplit parse + scheme allowlist (http / https only) plus a private / loopback / link-local IP guard and a localhost-literal guard. Behavior change: http://localhost, http://127.0.0.1, RFC-1918 hosts, and http://169.254.169.254 are now rejected by default — pass the new --allow-internal flag to ingest an internal URL intentionally (the scheme allowlist still applies). DNS is never resolved at validation time. Flagged under Breaking changes above.
  • Resumable upload URLs validated and redacted (#1130). The server-returned upload session / cancel URLs are validated before use and redacted in error and log output so a credentialed upload URL can't leak.
  • Artifact download allowlist validated by hostname (#1172). Download host-allowlisting now parses the URL hostname structurally instead of matching a string prefix, closing a bypass where a crafted URL (including encoded-slash hosts, hardened further in #1199) could satisfy a prefix check while pointing at an untrusted host.
  • httpx / urllib3 logs redacted for library consumers (#1166). configure_logging() now attaches a logger-level RedactingFilter to the httpx and urllib3 loggers at import, so a consumer who enables httpx DEBUG (e.g. logging.basicConfig(level=logging.DEBUG)) no longer sees the session id in ?f.sid=... request lines. Pure defense-in-depth — no handler is added, so consumers who never enable those loggers see no behavior change (#1191).
  • Bare CSRF / session-id token values redacted in logs (#1165). The scrubber now redacts bare SNlM0e (CSRF) and FdrFJe (session-id) WIZ_global_data markers, the csrf= form alias, and standalone AF1_QpN- CSRF tokens — credential-equivalent shapes that previously passed through scrub_secrets() unredacted (#1189).
  • Playwright login subprocess output sanitized (#1111). ensure_chromium_installed now strips ANSI control sequences and redacts inherited environment-variable secret values (including JSON-nested leaves such as NOTEBOOKLM_AUTH_JSON) from captured subprocess stderr/stdout before surfacing install diagnostics (meta-audit G4).
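
The SSRF guard on source add --url (#1114) combines a structural parse, a scheme allowlist, and IP-literal checks without resolving DNS. A minimal sketch of that approach with the standard library is shown below. The function name is_url_allowed and its exact rules are assumptions for illustration; the project's real guard may cover more cases.

```python
import ipaddress
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def is_url_allowed(url: str, *, allow_internal: bool = False) -> bool:
    """Hypothetical validator: scheme allowlist plus internal-host checks."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL, e.g. an unbalanced IPv6 bracket
    if parts.scheme not in ALLOWED_SCHEMES:
        return False  # the scheme allowlist applies even with allow_internal
    host = (parts.hostname or "").rstrip(".")
    if not host:
        return False
    if allow_internal:
        return True
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return True  # a DNS name; this sketch never resolves it
    return not (ip.is_private or ip.is_loopback or ip.is_link_local)


print(is_url_allowed("https://example.com/page"))                  # True
print(is_url_allowed("http://169.254.169.254/latest/meta-data"))   # False
print(is_url_allowed("http://127.0.0.1/", allow_internal=True))    # True
print(is_url_allowed("file:///etc/passwd", allow_internal=True))   # False
```

Because no DNS lookup happens at validation time, a public hostname that resolves to an internal address passes a check like this one; the guard only covers literal internal hosts.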
