v3.2.0 — Video Extraction, Word Support, Pinecone Adaptor
Theme: Video source support, Word document support, Pinecone adaptor, and quality improvements. 94 files changed, +23,500 lines since v3.1.3. 2,540 tests passing.
🎬 Video Extraction Pipeline
Complete video extraction system that converts YouTube videos and local video files into AI-consumable skills.
- `skill-seekers video --url <youtube-url>`: New CLI command for video scraping
- `skill-seekers create <youtube-url>`: Auto-detects YouTube URLs
- Transcript extraction: 3-tier fallback (YouTube API → yt-dlp → faster-whisper)
- Visual OCR: Multi-engine ensemble (EasyOCR + pytesseract) for code frames
- Panel detection: Splits IDE screenshots into independent sub-sections
- Code timeline: Tracks code evolution across frames with edit history
- Two-pass AI enhancement: Cleans OCR noise using transcript context
- GPU auto-detection: `skill-seekers video --setup` detects CUDA/ROCm/CPU and installs correct PyTorch
- 197 tests covering models, metadata, transcript, visual, OCR, and CLI
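The 3-tier transcript fallback can be pictured as an ordered chain in which each tier is tried only when the previous one fails or returns nothing. The sketch below is a minimal illustration and not the project's actual implementation: `first_transcript`, the `Fetcher` type, and the three stand-in fetchers are hypothetical names, and the real tiers would call the YouTube API, yt-dlp, and faster-whisper instead of returning canned values.

```python
from typing import Callable, Optional, Sequence

# Hypothetical fetcher type: takes a video URL or path and returns transcript
# text, or None when that tier has nothing to offer.
Fetcher = Callable[[str], Optional[str]]


def first_transcript(source: str, tiers: Sequence[tuple[str, Fetcher]]) -> tuple[str, str]:
    """Try each tier in order; return (tier_name, transcript) from the first success."""
    errors = []
    for name, fetch in tiers:
        try:
            text = fetch(source)
        except Exception as exc:  # a failing tier must not abort the chain
            errors.append(f"{name}: {exc}")
            continue
        if text:
            return name, text
        errors.append(f"{name}: empty transcript")
    raise RuntimeError("all transcript tiers failed: " + "; ".join(errors))


# Stand-in fetchers for illustration only (assumptions, not real API calls).
def from_youtube_api(source: str) -> Optional[str]:
    raise ConnectionError("captions unavailable")


def from_yt_dlp(source: str) -> Optional[str]:
    return None  # pretend no subtitle track was found


def from_whisper(source: str) -> Optional[str]:
    return "transcribed audio text"


TIERS = [
    ("youtube-api", from_youtube_api),
    ("yt-dlp", from_yt_dlp),
    ("faster-whisper", from_whisper),
]

tier, transcript = first_transcript("https://example.com/watch?v=demo", TIERS)
print(tier, "->", transcript)
```

Ordering the tiers from cheapest to most expensive means local speech-to-text only runs when no existing captions can be fetched.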
📄 Word Document (.docx) Support
- `skill-seekers word --docx <file>`: Full pipeline (mammoth → HTML → sections → SKILL.md)
- `skill-seekers create document.docx`: Auto-detects .docx files
- Smart code detection: Identifies monospace paragraphs as code blocks
- Install: `pip install skill-seekers[docx]`
🌲 Pinecone Vector Database Adaptor
- `skill-seekers package output/ --format pinecone --upload`: Direct Pinecone upload
- Full CRUD operations with namespace support
- OpenAI and Sentence Transformers embedding support
- Batch upsert with configurable batch sizes
- 764 tests for comprehensive coverage
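Batch upsert with a configurable batch size amounts to slicing the vector list into fixed-size groups and sending one request per group. The sketch below shows only the slicing step, under assumptions: `batched` is a hypothetical helper, the record shape is illustrative, and the adaptor's real batching code may differ.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Illustrative records; real vectors would carry embeddings and metadata.
vectors = [{"id": f"chunk-{i}", "values": [0.0, 0.1, 0.2]} for i in range(250)]

# Each batch would be handed to the vector store's upsert call in turn.
batch_sizes = [len(batch) for batch in batched(vectors, batch_size=100)]
print(batch_sizes)  # [100, 100, 50]
```

Smaller batches keep individual requests under payload limits, while larger ones reduce round trips.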
🐛 Bug Fixes
- 6 OCR quality fixes: Skip webcam frames, clean IDE decorations, fix duplicate lines, filter UI junk
- 15 video pipeline fixes: Timeout handling, MCP integration, filename collisions, dependency management
- Issue #300: Selector fallback & dry-run link discovery (ReactFlow found 20+ pages, was 1)
- Issue #301: `setup.sh` macOS fix
- RAG chunking crash: Fixed `AttributeError: output_dir`
- Chunk overlap auto-scaling: Scales to `max(50, chunk_tokens // 10)`
- Reference file limits removed: No more caps on GitHub issues, releases, or code blocks
- See CHANGELOG.md for full details
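To make the chunk overlap rule concrete, the formula from the notes can be evaluated for a few chunk sizes. The function name `auto_overlap` is hypothetical; only the expression `max(50, chunk_tokens // 10)` comes from the release notes.

```python
def auto_overlap(chunk_tokens: int) -> int:
    """Overlap is 10% of the chunk size (integer division), floored at 50 tokens."""
    return max(50, chunk_tokens // 10)


for size in (256, 512, 1024, 2048):
    print(size, auto_overlap(size))
# 256 50
# 512 51
# 1024 102
# 2048 204
```

The floor of 50 applies to any chunk size below 510 tokens. Above that, the overlap grows with the chunk size.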
📦 Install / Upgrade
pip install --upgrade skill-seekers
# With video support
pip install skill-seekers[video]
skill-seekers video --setup # Auto-detect GPU, install deps
# With Word support
pip install skill-seekers[docx]
# With Pinecone
pip install skill-seekers[pinecone]
# Everything
pip install skill-seekers[all]
Full Changelog: https://github.com/yusufkaraaslan/Skill_Seekers/blob/main/CHANGELOG.md