github danieldotnl/ha-multiscrape v9.0.0
v9.0.0 🏗️ The big architectural overhaul

Pre-release, 5 hours ago

👋 A quick note

I run The Smart Home Newsletter, a weekly curated digest for smart home enthusiasts.

Each issue highlights the most relevant and interesting smart home articles, news, and projects from the past week.

Since you're here, you're clearly into home automation, so you might genuinely enjoy the newsletter.

👉 Subscribe at smarthomenewsletter.com


🏗️ The story behind v9.0.0

It's been a while (v8.0.5 shipped back in February 2025), and a lot has happened under the hood since then. This release is less about flashy new features and more about a deep overhaul of how Multiscrape works internally, setting the stage for everything that comes next.

A new architecture

When Multiscrape started, it was a humble little integration. As features piled on (form authentication, cookies, multiple sensors per config, XML/JSON/HTML scraping, notifications), the internals quietly grew tangled. So this release is the result of a multi-phase architectural refactor:

  • The Scraper class got a Strategy pattern: HTML, JSON, and XML scraping are now cleanly separated strategies instead of branching if/else soup. Adding new content types is finally pleasant.
  • A new HttpSession unifies all HTTP handling across the integration. Cookies, form submits, retries, and authentication all flow through one consistent layer instead of three subtly-different ones.
  • FormAuthenticator was extracted from HttpSession to restore the Single Responsibility Principle: form-based login is now a self-contained, testable component.
  • A typed ScrapeContext replaces the old loose variable-passing system, giving sensors, triggers, and templates a single, predictable place to read state from.
  • A ScraperRegistry replaces the brittle index-based discovery that used to power sensor wiring, eliminating a whole class of "wrong sensor got the wrong data" bugs.
  • The entity base class was modernized to align with current Home Assistant core conventions (TimestampDataUpdateCoordinator, proper typing, etc.), which means fewer surprises on HA upgrades.
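
To make the Strategy bullet concrete, here is a minimal sketch of the idea, not Multiscrape's actual code. Every name in it (ParseStrategy, JsonStrategy, XmlStrategy, STRATEGIES, scrape) and both selector formats are assumptions made up for illustration. HTML is left out so the example runs on the standard library alone.

```python
import json
import xml.etree.ElementTree as ET
from typing import Protocol


class ParseStrategy(Protocol):
    """Hypothetical interface: pull one value out of raw response text."""

    def extract(self, content: str, selector: str) -> str: ...


class JsonStrategy:
    def extract(self, content: str, selector: str) -> str:
        # Illustrative selector: a dot-separated key path such as "device.temp".
        node = json.loads(content)
        for key in selector.split("."):
            node = node[key]
        return str(node)


class XmlStrategy:
    def extract(self, content: str, selector: str) -> str:
        # Illustrative selector: an ElementTree path such as "./device/temp".
        element = ET.fromstring(content).find(selector)
        if element is None or element.text is None:
            raise ValueError(f"no match for {selector!r}")
        return element.text


# A lookup table stands in for the if/else chain; a new content type is one more entry.
STRATEGIES: dict[str, ParseStrategy] = {
    "json": JsonStrategy(),
    "xml": XmlStrategy(),
}


def scrape(content_type: str, content: str, selector: str) -> str:
    return STRATEGIES[content_type].extract(content, selector)
```

The caller never branches on content type; it only picks a strategy by key, which is what makes adding a format a local change.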

None of this changes the YAML you write (it's all internals), but it means future features (and bug fixes) land faster, safer, and with much better test coverage.
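
As a rough illustration of why a registry is sturdier than index-based discovery, the sketch below looks scrapers up by a stable name instead of by list position. It is a guess at the general shape, not the real ScraperRegistry: the method names, the string stand-ins for scraper objects, and the "weather" and "energy" keys are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ScraperRegistry:
    """Hypothetical registry keyed by a stable name rather than list position."""

    _scrapers: dict[str, object] = field(default_factory=dict)

    def register(self, name: str, scraper: object) -> None:
        if name in self._scrapers:
            raise ValueError(f"scraper {name!r} already registered")
        self._scrapers[name] = scraper

    def get(self, name: str) -> object:
        return self._scrapers[name]


registry = ScraperRegistry()
registry.register("weather", "weather-scraper")  # strings stand in for scraper objects
registry.register("energy", "energy-scraper")

# Index-based wiring depends on load order; a reordered list hands back the wrong scraper.
by_index = ["weather-scraper", "energy-scraper"]
by_index.reverse()
```

After the reorder, `by_index[1]` points at the weather scraper, while `registry.get("energy")` still returns the energy one.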

Bug fixes that were lurking

The refactor surfaced a few real bugs that have now been squashed:

  • XML scraping was silently dropping data in some configurations. Fixed.
  • Auth failures were not being handled gracefully, sometimes leaving stale state behind. Fixed, with cookie clearing on auth failure.
  • The notification flow had a backwards dependency chain that occasionally caused triggers not to fire. Fixed.
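
The cookie-clearing fix can be pictured roughly like this. It is a simplified sketch under assumptions, not the integration's HttpSession or FormAuthenticator: the Session class, AuthError, and the submit_form callable are invented for the example, and cookies are modelled as a plain dict.

```python
from dataclasses import dataclass, field
from typing import Callable


class AuthError(Exception):
    """Raised by the hypothetical login callable when credentials are rejected."""


@dataclass
class Session:
    cookies: dict[str, str] = field(default_factory=dict)
    authenticated: bool = False

    def login(self, submit_form: Callable[[], dict[str, str]]) -> None:
        try:
            self.cookies.update(submit_form())
            self.authenticated = True
        except AuthError:
            # Drop cookies from any earlier login so no stale state survives the failure.
            self.cookies.clear()
            self.authenticated = False
            raise
```

Re-raising after the cleanup keeps the failure visible to the caller, while the next login attempt starts from an empty cookie jar.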

Tests, tests, tests

Test coverage went from "we have some" to comprehensive. There are now extensive tests for schema validation, HTTP session behavior, cookie persistence, form-submit flows, XML scraping, and end-to-end integration scenarios. This is why a refactor of this scope is shipping with confidence.

Modernization

Multiscrape is now Python 3.14 compatible: it is tested in CI against the Python version Home Assistant currently ships with, so it stays in lockstep with HA core.

Why a pre-release?

Because this is a lot of internal change. Everything passes tests and existing configurations should work unchanged, but with a refactor this deep, I want brave early adopters to kick the tires before this becomes the default for everyone.


📋 Changes

🏛️ Architecture & Refactoring

  • Refactor Scraper class using the Strategy pattern (#552) (#568)
  • Unify HTTP handling with HttpSession (#550) (#561)
  • Extract FormAuthenticator from HttpSession to restore SRP (#567)
  • Simplify variable system with typed ScrapeContext (#551) (#563)
  • Replace index-based discovery with ScraperRegistry (#554) (#572)
  • Modernize entity base class for HA core alignment (#557) (#560)
  • Improve typing and use TimestampDataUpdateCoordinator (#558)
  • Architecture Refactoring Plan documentation (#535)

🐛 Bug Fixes

  • Fix XML scraping data loss, auth failure handling, and clear cookies on auth failure (#577)
  • Fix notification flow by removing backwards dependency chain (#553) (#571)
  • Fix HttpSession review issues from PR #561 (#564)
  • Fix two bugs surfaced before architectural refactoring (#556) (#559)

🧪 Testing

  • Phase 1: Comprehensive testing infrastructure improvements (#533)
  • Improve test coverage for schema validation and HTTP session (#556) (#573)
  • Add end-to-end integration tests for form submit flow
  • Add integration tests for cookie persistence and form variables

⬆️ Dependencies & Tooling

  • Upgrade project to Python 3.13 (#531)
  • Update dependency Python to 3.14 (#526)
  • Update CI test matrix to Python 3.14 (#566)
  • Update pip to v26 [SECURITY] (#546)
  • Update pytest-homeassistant-custom-component to v0.13.326 (#575) and many earlier bumps
  • Update ruff to v0.15.12 (#576) and many earlier bumps
  • Update colorlog to v6.10.1 (#529)
  • Update actions/checkout to v6 (#537), actions/setup-python to v6.2.0 (#543), github/codeql-action to v4 (#527)
  • Update release-drafter/release-drafter action to v7 (#570)
  • Update devcontainer GitHub CLI feature to v1.1.0 (#544)

📚 Documentation

  • Add quick note about the newsletter to README
  • Enhance README with Home Assistant version sensors, advanced split-configuration example, and minor formatting (#549)
