github wildlifechorus/condenseit v2.0.0
one month ago

What's Changed

AI-powered ranking (three independent opt-in layers)

  • Semantic embeddings - articles are encoded as vectors and scored by cosine similarity to a decay-weighted centroid of your liked/disliked articles. Embeddings are cached in SQLite per URL + content hash + model, so re-runs do not re-embed known articles. Configure via embedding_provider (ollama / openrouter / off), embedding_model, and embedding_preference_weight in Admin > Digest.
  • LLM topic enrichment - article summarization calls now extract topics, entities, and a novelty score (1-5) at no extra cost. These build a topic preference profile from your ratings: liked topics are boosted, disliked topics penalised. Weight controlled by topic_score_weight.
  • LLM reranker - one LLM call per digest re-orders the top-K candidates against a compact profile narrative. The LLM relevance score is blended with the classical score (llm_rerank_blend). The rerank reason appears in the "Why ranked here?" panel on every card.
  • Cold-start bootstrap - Preferences page accepts a plain-text interest description. The LLM derives initial keywords, synonyms, and a profile summary that seed the engine before any ratings exist.
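As a rough illustration of the embedding and reranker layers above, the sketch below scores an article by cosine similarity to a decay-weighted centroid and then blends in an LLM relevance score. This is not taken from the condenseit source. The exponential half-life decay, the signed (+1 liked / -1 disliked) weighting, the linear blend, and every function and parameter name are assumptions made for the example.

```python
import math

def decay_weighted_centroid(rated, half_life_days=30.0):
    """Build a preference centroid from rated articles.

    rated: list of (vector, rating, age_days), rating +1 (liked) or -1 (disliked).
    Assumption: older ratings decay exponentially with a fixed half-life.
    """
    dim = len(rated[0][0])
    centroid = [0.0] * dim
    total_weight = 0.0
    for vector, rating, age_days in rated:
        weight = 0.5 ** (age_days / half_life_days)
        for i, x in enumerate(vector):
            centroid[i] += rating * weight * x
        total_weight += weight
    if total_weight == 0.0:
        return centroid
    return [c / total_weight for c in centroid]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def blend(classical_score, llm_score, blend_weight):
    """Assumed linear blend: 0.0 keeps the classical score, 1.0 uses only the LLM score."""
    return (1.0 - blend_weight) * classical_score + blend_weight * llm_score

# Usage: one liked and one disliked article, both rated today.
profile = decay_weighted_centroid([([1.0, 0.0], +1, 0), ([0.0, 1.0], -1, 0)])
similar_to_liked = cosine([1.0, 0.0], profile)      # positive
similar_to_disliked = cosine([0.0, 1.0], profile)   # negative
```

In this sketch a disliked article pulls the centroid away from its direction, so candidates resembling it get a negative similarity. The real engine may instead keep separate liked and disliked centroids.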

Sources

  • Reddit-to-Lemmy auto-conversion - Reddit sources are transparently routed through Lemmy.world RSS on save (Reddit's API is blocked on most VPS IPs). The display badge remains "Reddit"; the original subreddit is recorded in extra_json.
  • YouTube channel ID resolver - new GET /api/sources/youtube/resolve endpoint resolves channel handles and URLs to channel IDs via yt-dlp.
  • Podcast sources - RSS-based podcast collector with per-episode summaries.

OpenRouter reliability and speed

  • 429 retry with backoff - _chat() retries up to 3 times on rate-limit responses, honouring the server's Retry-After header and falling back to a 5s / 15s / 30s delay ladder when the header is absent. A single rate-limit hit no longer aborts the whole digest.
  • Parallel summarization - article LLM calls are now parallelised via ThreadPoolExecutor (default 4 workers, configurable via summarize_workers). DB writes remain sequential on the main thread.
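The two behaviours above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's _chat() implementation. RateLimited, call_with_retry, summarize_all and their parameters are hypothetical names. Only the 5s / 15s / 30s ladder, the 3 retries, the default of 4 workers, and the sequential DB writes come from the notes.

```python
import time
from concurrent.futures import ThreadPoolExecutor

FALLBACK_DELAYS = (5, 15, 30)  # seconds, used when no Retry-After value is given

class RateLimited(Exception):
    """Hypothetical error raised by the transport on an HTTP 429 response."""
    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds from Retry-After, or None

def call_with_retry(send, sleep=time.sleep):
    """Call send(); on a rate limit, wait and retry up to 3 times."""
    for fallback in FALLBACK_DELAYS:
        try:
            return send()
        except RateLimited as exc:
            # Prefer the server's hint; otherwise step down the fallback ladder.
            delay = exc.retry_after if exc.retry_after is not None else fallback
            sleep(delay)
    return send()  # last attempt: a further RateLimited propagates to the caller

def summarize_all(articles, summarize, save, workers=4):
    """Run summarize() in worker threads, then save results one at a time."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, whatever the completion order.
        summaries = list(pool.map(summarize, articles))
    for article, summary in zip(articles, summaries):
        save(article, summary)  # sequential writes on the calling thread
```

Keeping the writes out of the worker threads sidesteps sharing a SQLite connection across threads, which is presumably why the release keeps DB writes on the main thread.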

Admin UI

  • Settings page: Semantic embeddings, topic enrichment, and LLM reranker sections with live weight sliders
  • Preferences page: top liked/disliked LLM topics, embedding status panel, cold-start bootstrap textarea
  • DigestCard: topic tags, novelty badge, and reranker reason in the "Why ranked here?" breakdown

Defaults updated for new installs

  • llm.openrouter_model: qwen/qwen3.5-flash-02-23 (replaces openai/gpt-4o-mini - cheaper, comparable quality)
  • llm.openrouter_pick_cheapest: false (use the model you configured, not whatever is cheapest that day)
  • exclude_keywords pre-seeded with "Community Forum", "promotional code", "promotional campaign"

Commits since 1.5.1

  • 179dc71 feat: add AI ranking layers, Lemmy sources, and OpenRouter hardening
  • e16a1a3 feat(sources): add podcast source support
  • ec8da51 fix(items): close detail after marking read
  • 3df8298 docs: fix stale and incorrect content across docs and nginx template
