wildlifechorus/condenseit v2.0.0 on GitHub

What's Changed

Semantic embeddings - articles are encoded as vectors and scored by cosine similarity to a decay-weighted centroid of your liked/disliked articles. Embeddings are cached in SQLite per URL + content hash + model, so re-runs do not re-embed known articles. Configure via embedding_provider (ollama / openrouter / off), embedding_model, and embedding_preference_weight in Admin > Digest.
LLM topic enrichment - article summarization calls now also extract topics, entities, and a novelty score (1-5) in the same request, so at no extra cost. These build a topic preference profile from your ratings: liked topics are boosted, disliked topics penalised. The weight is controlled by topic_score_weight.
LLM reranker - one LLM call per digest re-orders the top-K candidates against a compact profile narrative. The LLM relevance score is blended with the classical score (llm_rerank_blend). The rerank reason appears in the "Why ranked here?" panel on every card.
Cold-start bootstrap - Preferences page accepts a plain-text interest description. The LLM derives initial keywords, synonyms, and a profile summary that seed the engine before any ratings exist.
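
To make the embedding score and the reranker blend concrete, here is a minimal sketch in plain Python. It is not taken from the condenseit source: the half-life decay, the signed treatment of disliked articles, the linear blend formula, and all function names are assumptions chosen for illustration. Only the general idea (cosine similarity to a decay-weighted centroid, then blending with an LLM relevance score via llm_rerank_blend) comes from the notes above.

```python
import math

def decay_weighted_centroid(ratings, half_life_days=30.0):
    """Build a preference centroid from rated articles.

    ratings: list of (vector, sign, age_days), where sign is +1 for a liked
    article and -1 for a disliked one. The half-life decay is an assumption.
    """
    dim = len(ratings[0][0])
    centroid = [0.0] * dim
    total_weight = 0.0
    for vector, sign, age_days in ratings:
        weight = 0.5 ** (age_days / half_life_days)  # older ratings count less
        total_weight += weight
        for i, value in enumerate(vector):
            centroid[i] += sign * weight * value
    return [c / total_weight for c in centroid]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def blend(classical_score, llm_score, llm_rerank_blend):
    """Linear interpolation between the two scores (assumed formula)."""
    return (1.0 - llm_rerank_blend) * classical_score + llm_rerank_blend * llm_score

# One liked and one disliked article, both rated today.
profile = decay_weighted_centroid([([1.0, 0.0], +1, 0.0), ([0.0, 1.0], -1, 0.0)])
liked_like = cosine([1.0, 0.0], profile)      # positive: resembles the liked article
disliked_like = cosine([0.0, 1.0], profile)   # negative: resembles the disliked one
```

With a blend of 0 the classical score is used unchanged, and with a blend of 1 the LLM score fully replaces it; how the real engine normalises the two scores before blending is not stated in these notes.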

Reddit to Lemmy auto-conversion - Reddit sources are transparently routed through Lemmy.world RSS on save, because Reddit's API is blocked on most VPS IPs. The display badge remains "Reddit"; the original subreddit is recorded in extra_json.
YouTube channel ID resolver - new GET /api/sources/youtube/resolve endpoint resolves channel handles and URLs to channel IDs via yt-dlp.
Podcast sources - RSS-based podcast collector with per-episode summaries.
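
As a rough illustration of the Reddit conversion on save, the sketch below rewrites a subreddit URL to a Lemmy.world community feed and keeps the original subreddit in extra_json. Several things here are assumptions rather than the project's actual behaviour: the feed URL pattern, the idea that the Lemmy community shares the subreddit's name, and the field names display_type, feed_url, and original_subreddit.

```python
import json
import re
from urllib.parse import urlparse

# Assumed feed pattern for a Lemmy community; verify against your instance.
LEMMY_FEED = "https://lemmy.world/feeds/c/{community}.xml"

def convert_reddit_source(url: str) -> dict:
    """Map a Reddit subreddit URL to a hypothetical stored source record."""
    match = re.match(r"^/r/([A-Za-z0-9_]+)", urlparse(url).path)
    if match is None:
        raise ValueError(f"not a subreddit URL: {url}")
    subreddit = match.group(1)
    return {
        "display_type": "reddit",  # the badge shown in the UI stays "Reddit"
        "feed_url": LEMMY_FEED.format(community=subreddit.lower()),
        "extra_json": json.dumps({"original_subreddit": subreddit}),
    }

source = convert_reddit_source("https://www.reddit.com/r/SelfHosted/")
```

A same-named Lemmy community will not exist for every subreddit, so a real implementation would need to handle an empty or missing feed.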

429 retry with backoff - _chat() retries up to 3 times on rate-limit responses, honouring the server's Retry-After header and falling back to a 5s / 15s / 30s ladder when the header is absent. A single rate-limit hit no longer aborts the whole digest.
Parallel summarization - article LLM calls are now parallelised via ThreadPoolExecutor (default 4 workers, configurable via summarize_workers). DB writes remain sequential on the main thread.
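
The two reliability changes above can be sketched together. This is an illustrative outline, not the actual _chat() implementation: RateLimited, retry_delay, call_with_retry, and summarize_all are hypothetical names, and the sketch only handles the numeric (seconds) form of Retry-After, not the HTTP-date form.

```python
import time
from concurrent.futures import ThreadPoolExecutor

FALLBACK_DELAYS = (5.0, 15.0, 30.0)  # seconds, used when Retry-After is absent

class RateLimited(Exception):
    """Hypothetical error raised by the caller's HTTP layer on a 429."""
    def __init__(self, retry_after=None):
        super().__init__("HTTP 429")
        self.retry_after = retry_after  # raw header value, or None

def retry_delay(attempt, retry_after):
    """Prefer the server's Retry-After value; otherwise use the ladder."""
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # unparseable header (e.g. an HTTP-date): fall back
    return FALLBACK_DELAYS[min(attempt, len(FALLBACK_DELAYS) - 1)]

def call_with_retry(send, max_retries=3, sleep=time.sleep):
    """Call send(); on RateLimited, wait and retry up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return send()
        except RateLimited as exc:
            if attempt == max_retries:
                raise  # retries exhausted: surface the error
            sleep(retry_delay(attempt, exc.retry_after))

def summarize_all(articles, summarize, save, workers=4):
    """Run summarize() in worker threads; call save() on the calling thread."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so writes stay sequential.
        for article, summary in zip(articles, pool.map(summarize, articles)):
            save(article, summary)
```

Passing sleep as a parameter keeps the retry loop testable without real waiting. Keeping save() on the calling thread mirrors the note that DB writes remain sequential, which avoids sharing a SQLite connection across threads.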

Settings page: semantic embeddings, topic enrichment, and LLM reranker sections with live weight sliders
Preferences page: top liked/disliked LLM topics, embedding status panel, cold-start bootstrap textarea
DigestCard: topic tags, novelty badge, and reranker reason in the "Why ranked here?" breakdown

llm.openrouter_model: qwen/qwen3.5-flash-02-23 (replaces openai/gpt-4o-mini - cheaper, comparable quality)
llm.openrouter_pick_cheapest: false (use the model you configured, not whatever is cheapest that day)
exclude_keywords pre-seeded with "Community Forum", "promotional code", "promotional campaign"