Documentation consolidation:
- Merge docs/contributing/*.md into docs/development.md
- Merge docs/reference/internals/*.md into docs/rpc-development.md
- Move rpc-ui-reference.md to docs/rpc-reference.md
- Consolidate examples/ into docs/examples/ (6 files total)
- Remove getting-started.md (content in README)
- Remove docs/README.md (navigation implicit)

Cleanup:
- Remove AGENTS.md (redundant with CLAUDE.md)
- Remove RELEASING.md (merged into docs/development.md)
- Remove .gemini/ and .github/copilot-instructions.md
- Remove investigation files and artifacts
- Add gitignore for auto-generated CLAUDE.md files

Version bump: 0.1.4 → 0.2.0 (new features per stability.md)

Final structure:
docs/
├── cli-reference.md     # User docs
├── python-api.md
├── configuration.md
├── troubleshooting.md
├── stability.md
├── development.md       # Contributor docs (merged)
├── rpc-development.md   # RPC docs (merged)
├── rpc-reference.md
├── examples/            # Consolidated examples
└── designs/

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
IMPORTANT: Follow documentation rules in CONTRIBUTING.md - especially the file creation and naming conventions.
Project Overview
notebooklm-py is an unofficial Python client for Google NotebookLM that uses undocumented RPC APIs. The library enables programmatic automation of NotebookLM features including notebook management, source integration, AI querying, and studio artifact generation (podcasts, videos, quizzes, etc.).
Critical constraint: This uses Google's internal batchexecute RPC protocol with obfuscated method IDs that Google can change at any time. All RPC method IDs in src/notebooklm/rpc/types.py are undocumented and subject to breakage.
Development Commands
# Create/recreate venv with uv (recommended - relocatable venvs)
uv venv .venv
uv pip install -e ".[all]"
playwright install chromium
# Activate virtual environment
source .venv/bin/activate
# Run all tests (excluding e2e by default)
pytest
# Run with coverage
pytest --cov
# Run e2e tests (requires authentication)
pytest tests/e2e -m e2e
# CLI testing
notebooklm --help
Pre-Commit Checks (REQUIRED before committing)
IMPORTANT: Always run these checks before committing to avoid CI failures:
# Format code with ruff
ruff format src/ tests/
# Check for linting issues
ruff check src/ tests/
# Type checking with mypy
mypy src/notebooklm --ignore-missing-imports
# Run tests
pytest
Or use this one-liner:
ruff format src/ tests/ && ruff check src/ tests/ && mypy src/notebooklm --ignore-missing-imports && pytest
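Where a shell one-liner is awkward (for example on a platform without a POSIX shell), the same sequence could be sketched in Python. This is a minimal sketch and not part of the repository: CHECKS, run_command, and run_checks are hypothetical names, and the runner is injectable so the stop-at-first-failure behaviour (the equivalent of &&) can be exercised without invoking the real tools.

```python
import subprocess
from typing import Callable, Sequence

# The four checks from the one-liner above, in the same order.
CHECKS: list[list[str]] = [
    ["ruff", "format", "src/", "tests/"],
    ["ruff", "check", "src/", "tests/"],
    ["mypy", "src/notebooklm", "--ignore-missing-imports"],
    ["pytest"],
]


def run_command(command: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(command)).returncode


def run_checks(runner: Callable[[Sequence[str]], int] = run_command) -> int:
    """Run each check in order and stop at the first non-zero exit code."""
    for command in CHECKS:
        code = runner(command)
        if code != 0:
            return code
    return 0
```

Calling run_checks() with no arguments runs the real tools; passing a fake runner makes the ordering logic testable in isolation.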
Architecture
Layered Design
CLI Layer (cli/)
↓
Client Layer (client.py, _*.py APIs)
↓
Core Layer (_core.py)
↓
RPC Layer (rpc/)
- RPC Layer (src/notebooklm/rpc/):
  - types.py: All RPC method IDs and enums (source of truth)
  - encoder.py: Request encoding
  - decoder.py: Response parsing
- Core Layer (src/notebooklm/_core.py):
  - HTTP client management
  - RPC call abstraction
  - Request counter handling
- Client Layer (src/notebooklm/client.py, _*.py):
  - NotebookLMClient: Main async client with namespaced APIs
  - _notebooks.py, _sources.py, _artifacts.py, etc.: Domain APIs
- CLI Layer (src/notebooklm/cli/):
  - Modular Click commands
  - session.py, notebook.py, source.py, generate.py, etc.
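The way the lower three layers stack can be illustrated with a toy model. This is a sketch under stated assumptions, not the library's code: RPCMethod, encode_request, Core, SketchClient, and fake_transport are hypothetical names, the method ID is a placeholder, and the JSON encoding stands in for the real batchexecute wire format, which is not reproduced here.

```python
import asyncio
import json
from enum import Enum
from typing import Any, Awaitable, Callable

Titles = list[str]
Transport = Callable[[str], Awaitable[str]]


# --- RPC layer: method IDs and request encoding (ID is a placeholder) ---
class RPCMethod(str, Enum):
    LIST_NOTEBOOKS = "placeholder_list_id"


def encode_request(method: RPCMethod, params: list[Any]) -> str:
    """Serialize a call as JSON; a stand-in for the real encoder."""
    return json.dumps([method.value, params])


# --- Core layer: owns the transport and the request counter ---
class Core:
    def __init__(self, transport: Transport) -> None:
        self._transport = transport
        self.request_count = 0

    async def rpc_call(self, method: RPCMethod, params: list[Any]) -> Any:
        self.request_count += 1
        raw = await self._transport(encode_request(method, params))
        return json.loads(raw)


# --- Client layer: namespaced domain API built on the core ---
class NotebooksAPI:
    def __init__(self, core: Core) -> None:
        self._core = core

    async def list(self) -> Titles:
        return await self._core.rpc_call(RPCMethod.LIST_NOTEBOOKS, [])


class SketchClient:
    def __init__(self, transport: Transport) -> None:
        self.core = Core(transport)
        self.notebooks = NotebooksAPI(self.core)


async def fake_transport(body: str) -> str:
    """Stand-in for HTTP: always answers with two notebook titles."""
    return json.dumps(["Notebook A", "Notebook B"])


def demo() -> tuple[Titles, int]:
    client = SketchClient(fake_transport)
    titles = asyncio.run(client.notebooks.list())
    return titles, client.core.request_count
```

Each layer only talks to the one directly beneath it, so a changed method ID is confined to the RPC layer and the transport can be swapped for a fake in tests.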
Key Files
| File | Purpose |
|---|---|
| client.py | Main NotebookLMClient class |
| _core.py | HTTP and RPC infrastructure |
| _notebooks.py | client.notebooks API |
| _sources.py | client.sources API |
| _artifacts.py | client.artifacts API |
| _chat.py | client.chat API |
| rpc/types.py | RPC method IDs (source of truth) |
| auth.py | Authentication handling |
| cli/ | CLI command modules |
Repository Structure
src/notebooklm/
├── __init__.py # Public exports
├── client.py # NotebookLMClient
├── auth.py # Authentication
├── types.py # Dataclasses
├── _core.py # Core infrastructure
├── _notebooks.py # NotebooksAPI
├── _sources.py # SourcesAPI
├── _artifacts.py # ArtifactsAPI
├── _chat.py # ChatAPI
├── _research.py # ResearchAPI
├── _notes.py # NotesAPI
├── rpc/ # RPC protocol layer
│   ├── types.py # Method IDs and enums
│   ├── encoder.py # Request encoding
│   └── decoder.py # Response parsing
└── cli/ # CLI implementation
    ├── __init__.py
    ├── helpers.py # Shared utilities
    ├── session.py # login, use, status, clear
    ├── notebook.py # list, create, delete, rename
    ├── source.py # source add, list, delete
    ├── artifact.py # artifact commands
    ├── generate.py # generate audio, video, etc.
    ├── download.py # download commands
    ├── chat.py # ask, configure, history
    └── note.py # note commands
API Patterns
Client Usage
# Correct pattern - uses namespaced APIs
async with await NotebookLMClient.from_storage() as client:
    notebooks = await client.notebooks.list()
    await client.sources.add_url(nb_id, url)
    result = await client.chat.ask(nb_id, question)
    status = await client.artifacts.generate_audio(nb_id)
CLI Structure
Commands are organized as:
- Top-level: login, use, status, clear, list, create, ask
- Grouped: source add, artifact list, generate audio, download video, note create
Testing Strategy
- Unit tests (tests/unit/): Test encoding/decoding, no network
- Integration tests (tests/integration/): Mock HTTP responses
- E2E tests (tests/e2e/): Real API, require auth, marked @pytest.mark.e2e
E2E Test Status
- ✅ Notebook operations (list, create, rename, delete)
- ✅ Source operations (add URL/text/YouTube, rename)
- ✅ Download operations (audio, video, infographic, slides)
- ⚠️ Artifact generation may fail due to rate limiting
Common Pitfalls
- RPC method IDs change: Check network traffic and update rpc/types.py
- Nested list structures: Params are position-sensitive. Check existing implementations.
- Source ID nesting: Different methods need [id], [[id]], [[[id]]], or [[[[id]]]]
- CSRF tokens expire: Use client.refresh_auth() or re-run notebooklm login
- Rate limiting: Add delays between bulk operations
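Two of the pitfalls above, source ID nesting and rate limiting, lend themselves to small helpers. The following is an illustrative sketch only: nest and run_paced are hypothetical names that do not come from the library, and the right nesting depth for any given RPC method still has to be taken from the existing implementations.

```python
import asyncio
from typing import Any, Awaitable, Callable, Iterable


def nest(value: Any, depth: int) -> Any:
    """Wrap value in `depth` levels of single-element lists."""
    if depth < 0:
        raise ValueError("depth must be non-negative")
    for _ in range(depth):
        value = [value]
    return value


async def run_paced(
    items: Iterable[Any],
    operation: Callable[[Any], Awaitable[Any]],
    delay_seconds: float = 1.0,
) -> list[Any]:
    """Apply an async operation to each item, sleeping between calls."""
    results = []
    for index, item in enumerate(items):
        if index > 0:
            # Pause before every call except the first.
            await asyncio.sleep(delay_seconds)
        results.append(await operation(item))
    return results
```

For example, nest("abc", 3) yields [[["abc"]]], and run_paced could wrap a bulk loop over URLs so that consecutive requests are spaced out by delay_seconds.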
Documentation
All docs use lowercase-kebab naming in docs/:
- docs/cli-reference.md - CLI commands
- docs/python-api.md - Python API reference
- docs/configuration.md - Storage and settings
- docs/troubleshooting.md - Known issues
- docs/development.md - Architecture, testing, releasing
- docs/rpc-development.md - RPC capture and debugging
- docs/rpc-reference.md - RPC payload structures
When to Suggest CLI vs API
- CLI: Quick tasks, shell scripts, LLM agent automation
- Python API: Application integration, complex workflows, async operations
Pull Request Workflow (REQUIRED)
After creating a PR, you MUST monitor and address feedback:
1. Monitor CI Status
# Check CI status (repeat until all pass)
gh pr checks <PR_NUMBER>
Wait for all checks to pass. If any fail, investigate and fix.
2. Check for Review Comments
# Get review comments
gh api repos/teng-lin/notebooklm-py/pulls/<PR_NUMBER>/comments \
--jq '.[] | "File: \(.path):\(.line)\nComment: \(.body)\n---"'
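If the comments need further processing in a script, the jq filter above can be mirrored in Python. This is a hedged sketch: format_review_comments is a hypothetical helper, and it assumes the payload is the JSON array returned by the comments endpoint, with the path, line, and body fields that the filter already references.

```python
import json


def format_review_comments(payload: str) -> str:
    """Render review comments in the same layout as the jq filter."""
    comments = json.loads(payload)
    blocks = []
    for comment in comments:
        line = comment.get("line")
        # jq prints a missing line number as "null"; do the same here.
        line_text = "null" if line is None else str(line)
        blocks.append(
            f"File: {comment['path']}:{line_text}\n"
            f"Comment: {comment['body']}\n"
            "---"
        )
    return "\n".join(blocks)
```

The payload could be captured by running the gh api command without the --jq option and passing its standard output to this function.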
3. Address Feedback
For each review comment (especially from gemini-code-assist):
- Read and understand the feedback
- Make the suggested fix if it improves the code
- Commit with a descriptive message referencing the feedback
- Push and re-check CI
- Reply to the review thread confirming the fix:
gh api repos/teng-lin/notebooklm-py/pulls/<PR_NUMBER>/comments/<COMMENT_ID>/replies \
  -f body="Addressed in commit <SHA>: <brief description>"
4. Verify Final State
# Ensure PR is ready to merge
gh pr view <PR_NUMBER> --json state,mergeStateStatus,mergeable
Important: Do NOT consider a PR complete until:
- All CI checks pass
- All review comments are addressed
- mergeStateStatus is CLEAN
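The three completion criteria can be combined into one check. The sketch below is an assumption-laden illustration: blocking_reasons is a hypothetical function, it expects the JSON printed by the gh pr view command from step 4, and the CI result and the count of unaddressed comments are passed in by the caller because they come from the earlier steps.

```python
import json


def blocking_reasons(
    pr_view_json: str, checks_passed: bool, unaddressed_comments: int
) -> list[str]:
    """Return the reasons a PR is not yet complete; empty means done."""
    pr = json.loads(pr_view_json)
    reasons = []
    if not checks_passed:
        reasons.append("CI checks have not all passed")
    if unaddressed_comments > 0:
        reasons.append(f"{unaddressed_comments} review comment(s) unaddressed")
    status = pr.get("mergeStateStatus")
    if status != "CLEAN":
        reasons.append(f"mergeStateStatus is {status}, expected CLEAN")
    return reasons
```

An empty list means all three conditions hold; otherwise each entry names one condition that still needs work.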