github kobotoolbox/kpi 2.026.07

9 hours ago

What's changed

Features (24)
  • massEmails: exclude trashed users from email lists (#6615)

    Ensure users who cannot log in do not receive emails.

    Filter out any user who is already in the trash bin or who has been
    explicitly set to not active (as distinguished from users who we
    determine to be inactive by a lack of activity).

  • massEmails: exclude users who have submitted from inactive emails (#6618)

    Exclude users who have recently made submissions to any project from
    receiving inactive user emails.

    Previously we only counted users as active if they submitted to their
    own projects, but not projects owned by others.
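
    A minimal sketch of the rule described above, not the actual kpi
    implementation. The SubmissionRecord type, the inactive_recipients()
    helper and the 365-day window are all hypothetical and exist only
    for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class SubmissionRecord:
    submitted_by: str      # username of the person who submitted (hypothetical field)
    project_owner: str     # username of the project owner (hypothetical field)
    submitted_at: datetime


def inactive_recipients(usernames, submissions, now, window=timedelta(days=365)):
    """Return usernames with no submission to any project inside the window."""
    cutoff = now - window
    # Count a submission regardless of who owns the project it went to.
    recently_active = {
        s.submitted_by for s in submissions if s.submitted_at >= cutoff
    }
    return [name for name in usernames if name not in recently_active]


now = datetime(2026, 1, 1)
records = [SubmissionRecord("alice", "bob", datetime(2025, 12, 1))]
print(inactive_recipients(["alice", "carol"], records, now))  # ['carol']
```

    Here alice submitted only to a project owned by bob, yet she is
    still treated as active and left off the inactive-user list.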

  • processing: handle new subsequences API on frontend (#6657)

  • processing: handle non-blocking UI for NLP answers (#6671)

    Allow users to enter QA answers quickly while they are saved in the
    background.

  • processing: handle "in_progress" status (#6677)

  • processing: toast on saving transcripts and translations (#6681)

  • processing: handle failure errors (#6678)

  • qa: rename qual to manual_qual (#6555)

    Rename the "qual" action to "manual_qual."

  • qual: show most recently created qual answers instead of most recently accepted (#6575)

    Display the most recent QA answers in the data table and exports rather
    than the most recently accepted.

  • qual: add new automatic QA action (#6567)

    Update advanced features API to allow requesting LLM answers to QA
    questions.

    Add a new "automated_chained_qual" action to the advanced features API
    endpoints. It is currently only a stub and will return canned answers
    rather than actually hitting an LLM.

  • submissions: support rootUuid as {id} parameter for data detail endpoints (#6660)

    Allow data detail endpoints to be accessed using rootUuid as the primary
    identifier.

    This feature adds support for using rootUuid as the {id} parameter
    when accessing data detail endpoints. This makes it possible to retrieve
    a submission directly by its root UUID. The change improves flexibility
    and consistency when working with submission identifiers, without
    altering existing behavior for clients that continue to use the original
    primary key format.
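
    One way a detail endpoint could accept either identifier is
    sketched below. This is an assumption-laden illustration, not the
    kpi code: build_lookup() is hypothetical, and the "_id" and
    "meta/rootUuid" field names are assumed here.

```python
def build_lookup(id_param: str) -> dict:
    """Map the {id} path parameter to a query filter (hypothetical helper)."""
    # Purely numeric values keep the original primary-key behavior.
    if id_param.isascii() and id_param.isdigit():
        return {"_id": int(id_param)}
    # Anything else is treated as a root UUID (assumed field name).
    return {"meta/rootUuid": id_param}


print(build_lookup("123"))
print(build_lookup("123e4567-e89b-12d3-a456-426614174000"))
```

    Clients that keep sending the numeric primary key take the first
    branch, so their behavior is unchanged.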

  • subsequences: add model and new endpoints for advanced actions (#6492)

  • subsequences: show supplemental columns in data table (#6523)

    Add supplemental NLP columns to data table.

    This is just for adding the columns to the data table. They may not be
    populated correctly. If an NLP action is enabled, there will be a column
    for it, even if there are presently no responses.

  • subsequences: stop using _advanced_features field (#6503)

  • subsequences: implement get_output_fields and transform_data_for_output for QualAction (#6504)

    Add implementation of get_output_fields() and
    transform_data_for_output() in QualAction.

    This update enables qualitative analysis results to appear correctly in
    exports or the table view.
    The new logic:

    • Defines the output fields for each qualitative question (including
      labels, types, and choices).
    • Converts stored qualitative results into export-ready values,
      including expanding choice UUIDs into readable label objects.

  • subsequences: migrate old advanced_features (#6545)

  • subsequences: migrate old SubmissionExtras to SubmissionSupplemental (#6422)

  • subsequences: allow hiding of QA questions (#6550)

  • subsequences: add OpenAPI schema for advanced features (#6547)

    Add OpenAPI schema for the /api/v2/assets/{uid_asset}/advanced-features/ endpoint.

    The API schema output files and the generated Orval types have been
    updated with the schema details for the action parameters in the
    QuestionAdvancedFeature model.

  • subsequences: show again in formpack exports (#6561)

  • subsequences: do not allow users to un-accept automatic NLP responses (#6628)

    Do not allow users to un-accept an automatic NLP response.

  • subsequences: add locale field to NLP actions documentation (#6620)

    Updated the API documentation to explicitly include the locale
    parameter for NLP actions.

  • subsequences: do not allow translation of deleted transcripts (#6649)

    This PR implements a new validation rule within the subsequence
    processing flow. It ensures that a translation action cannot proceed if
    its source transcription (Manual or Automatic) has been explicitly
    deleted, regardless of whether a previously accepted version exists in
    the history.

  • viewer: improve user-agent parsing for Enketo and Collect (#6780)

    The user-agent string displayed in submission metadata is now more
    human-readable for ODK Collect, Kobo Collect, and Enketo submissions.

    Previously, submissions from ODK Collect, Kobo Collect, or Enketo would
    show a generic browser/OS string instead of a meaningful identifier.
    This change adds dedicated parsing logic.

Bug Fixes (72)
  • ci: pin pip<25.3 to restore compatibility with pip-tools 7.x (#6435)

    Fixes a CI installation issue caused by an incompatibility between pip 25.3 and pip-tools 7.x.

  • datacollectors: remove links on group delete (#6650)

  • dev: fix formpack version in dependencies files (#6616)

    Ran the pip-compile script to update the formpack version to the
    latest commit.

  • drawer: icon size (#6654)

  • formbuilder: update tooltips under some formbuilder buttons (#6582)

    Some Formbuilder header buttons had incorrect tooltips. This PR
    updates them so that each is relevant to its button.

  • frontend: ensure useOrganizationAssumed assumption holds (#6608)

    Fix an intermittent crash of the website at the data table route.

  • hub: [breaking] fix hub merge conflict (#6687)

    Fixes a migration conflict in the hub app

  • languages: unauthorize languages endpoint (#6699)

    Allow anonymous users to access the languages endpoint

  • logging: prevent duplicate logs by disabling propagation to root logger (#6808)

    Celery workers were emitting every log record twice, polluting logs and
    making debugging harder.

    When a logger is configured with propagate: True (the default), log
    records are passed up the logger hierarchy all the way to the root
    logger. Because Django's logging setup attaches a handler both on the
    named logger and on the root logger, each record was processed twice —
    once by the named handler and once by the root handler — resulting in
    every log line appearing duplicated.

    Setting propagate: False on console_logger stops records from
    bubbling up, so each log record is handled exactly once.
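
    The effect can be reproduced with the standard library alone. The
    sketch below is not the kpi settings file: only the logger name
    console_logger and the propagate flag come from the note above,
    while the handler name and the count_emitted() helper are
    illustrative.

```python
import io
import logging
import logging.config


def count_emitted(propagate: bool) -> int:
    """Configure logging and return how many lines one record produces."""
    stream = io.StringIO()
    logging.config.dictConfig(
        {
            "version": 1,
            "disable_existing_loggers": False,
            "handlers": {
                # Illustrative handler writing to an in-memory stream.
                "console": {"class": "logging.StreamHandler", "stream": stream},
            },
            # The same handler is attached to the root logger...
            "root": {"handlers": ["console"], "level": "INFO"},
            "loggers": {
                # ...and to the named logger.
                "console_logger": {
                    "handlers": ["console"],
                    "level": "INFO",
                    "propagate": propagate,
                },
            },
        }
    )
    logging.getLogger("console_logger").info("hello")
    return len(stream.getvalue().splitlines())


print(count_emitted(True))   # 2
print(count_emitted(False))  # 1
```

    With propagation enabled the record reaches the handler twice, once
    through the named logger and once through the root logger.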

  • openAPI: improve schema for assets list response (#6622)

  • openApi: use proper query param name (#6664)

  • openapi: fix advancedfeaturesresponse schema (#6624)

  • openapi: advanced feature response action prop should be an enum (#6626)

  • openapi: allow nullable value for transcription and translation in DataSupplementPayload (#6629)

  • openapi: update DataSupplementResponse to use manual_qual (#6634)

  • openapi: fix nested array in advanced features response (#6635)

  • openapi: fix dataresponse schema (#6625)

  • openapi: add query parameter to AssetsDataListParams (#6639)

  • openapi: update qual to be manual_qual in all of the subsequences schema (#6637)

  • openapi: add and explain deleted option (#6636)

  • openapi: openapi qual params missing option (#6645)

  • openapi: update /api/v2/advanced-features PATCH API schema (#6642)

    Add action and question_xpath to the advanced-features PATCH API
    OpenAPI schema.

  • openapi: fix data supplement response qualitative items data schema (#6640)

  • processing: optimistically update automated transx (#6662)

  • processing: assorted tiny polish and code cleanup (#6663)

  • processing: deselect qa question after acting on it (#6666)

  • processing: saving indicator in analysis tab (#6670)

  • processing: prev/next arrows for edited submissions (#6672)

  • processing: preselect latest translation (automatic or manual) (#6668)

  • processing: prefix "generated" to dates correctly (#6673)

  • processing: confirm on deleting transcript/translation (#6674)

  • processing: uuid prefix messed up caching keys (#6675)

  • processing: isMutating count includes itself (#6676)

  • processing: update audio player source when media url change (#6682)

  • processing: type error on analysis tab (#6683)

  • processing: display _dateAccepted timestamp if present (#6684)

  • processing: avoid a flicker in translation tab (#6688)

  • processing: disable edit button for anonymous users (#6695)

  • processing: don't flicker auto transcribe approval (#6694)

  • processing: content loading fails when audio question is in group (#6726)

    Fix an issue where content failed to load in the processing view
    when the audio question is in a group.

  • qual: prevent overriding answers with failures (#6583)

    Prevent LLM failures from overriding existing QA answers.

  • sidebar: settings styling and options sorting (#6696)

  • style: better disable state for NumberInput, RadioInput and TagsInput (#6698)

  • subsequences: return schema if not migrated (7299c28)

  • subsequences: fix nlp actions in data table (#6530)

    Ensure transcriptions and translations are displayed in the data table.

    Only accepted transcriptions/translations will be displayed.

  • subsequences: better error from creation of features with incorrect params (#6548)

    Validate params before creating new advanced features.

  • subsequences: update background processing to support new _data structure (#6549)

    Restore background and NLP processing by reading values from the new
    _data field.

    This fix updates the background processing logic to support the new data
    structure where value, language, and status (when present) are now
    nested under a _data dictionary. Some automated NLP actions were
    broken because they were still looking for these fields at the top
    level, where they can no longer exist.
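
    A small sketch of reading from the nested structure. The
    read_field() helper and the sample record are hypothetical; only
    the _data key and the value, language and status names come from
    the description above.

```python
def read_field(version: dict, key: str, default=None):
    """Read a field from the nested _data dictionary of one version."""
    # Fall back to an empty dict when _data is absent or None.
    data = version.get("_data") or {}
    return data.get(key, default)


# Hypothetical version record in the new layout.
version = {"_data": {"value": "Hello", "language": "en", "status": "complete"}}

print(read_field(version, "value"))                     # Hello
print(read_field({"value": "top-level only"}, "value")) # None
```

    The second call shows why the old code broke: a lookup that only
    considers _data no longer sees values stored at the top level, and
    vice versa.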

  • subsequences: generate accurate OpenAPI schemas (#6535)

  • subsequences: avoid setting _dateAccepted when deleting an action result (#6551)

    Prevent _dateAccepted from being added during deletion of an action
    result.

    This fix corrects the subsequences logic so that _dateAccepted is not
    set when an action result is deleted. Previously, the deletion path
    could incorrectly mark the result as accepted by adding _dateAccepted,
    which conflicted with the intended semantics of a removal. The updated
    behavior ensures that deletion strictly removes the result without
    recording any acceptance metadata, keeping action histories consistent
    and accurate.

    Preview steps

    Use the snippet provided in the Linear task description.
    On refactor-subsequences-2025, _dateAccepted is added to the
    version. With this PR, it is not present.
    Other actions (manual_translation, automatic_* and qual) give the
    same results.

  • subsequences: remove SubsequencesExtras reference and rename model (#6584)

  • subsequences: fix 500 error when value is missing in supplement data (#6611)

    Gracefully handle missing value fields in failed
    transcription/translation entries instead of crashing with a 500 error
    when loading the data API endpoint.

  • subsequences: return consistent _supplementData for list and detail endpoints (#6617)

  • subsequences: fix OpenAPI schemas with dynamic properties (#6614)

  • subsequences: throw error when deleting a null value in supplement API (#6607)

    This PR adds validation to prevent setting value: null on non-existent
    transcriptions and translations in the submission supplement API.

  • subsequences: fix data attachment schema (#6623)

  • subsequences: ensure deleted actions are removed from _supplementalDetails in data endpoint (#6619)

    Fix inconsistency where deleted actions were still visible in
    _supplementalDetails.

    This fix addresses an inconsistency in the data endpoint where deleting
    an action did not fully remove it from the _supplementalDetails
    property. As a result, clients could still see stale action data even
    after the action was deleted. The cleanup logic has been corrected so
    that deletions are properly reflected everywhere the supplement data is
    exposed, ensuring the data endpoint always returns an accurate and
    up-to-date view.

    Note that this PR does not handle the case where the user requests a
    translation after deleting a transcription. Currently an empty string
    is sent for translation. This will be addressed in a future PR
    (tracked in Linear).

  • subsequences: require default in labels (#6632)

  • subsequences: [breaking] enforce UUID format in JSON Schema validation (#6641)

    Validate subsequence identifiers strictly as UUIDs to prevent invalid
    data from being accepted.

    This change enforces proper UUID format validation in the JSON Schema
    used by subsequences. Previously, invalid or malformed identifiers could
    pass validation, leading to inconsistent data.
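
    One way to express such a constraint is a JSON Schema "pattern" on
    the identifier, sketched here with the standard library only.
    Whether kpi uses a pattern or a format keyword is not stated above,
    so the schema fragment, UUID_PATTERN and is_uuid() are all
    assumptions made for illustration.

```python
import re

# Canonical 8-4-4-4-12 hexadecimal layout (assumed to be the accepted form).
UUID_PATTERN = (
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)

# Hypothetical schema fragment for an identifier property.
UUID_SCHEMA = {"type": "string", "pattern": UUID_PATTERN}


def is_uuid(value) -> bool:
    """Check a value the way the schema fragment above would."""
    return isinstance(value, str) and re.search(UUID_PATTERN, value) is not None


print(is_uuid("123e4567-e89b-12d3-a456-426614174000"))  # True
print(is_uuid("not-a-uuid"))                            # False
```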

  • subsequences: exclude XML content-type for Orval while preserving dual schemas for Swagger UI (#6656)

    Keep both JSON and XML schemas in OpenAPI for documentation, but ensure
    Orval generates types from the JSON schema only.

  • subsequences: neighboring submission query same-time handling (#6651)

    Ensures users can navigate all submissions in the processing view even
    if several submissions share the same submission time.
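
    A common way to make previous/next navigation stable when
    timestamps collide is to add a unique tie-breaker to the sort key.
    The sketch below illustrates that idea under assumptions; the
    Submission type and neighbors() helper are hypothetical, and the
    actual query used by kpi may differ.

```python
from typing import List, NamedTuple, Optional, Tuple


class Submission(NamedTuple):
    id: int
    submission_time: str  # ISO 8601 string, so lexical order is time order


def neighbors(
    submissions: List[Submission], current_id: int
) -> Tuple[Optional[int], Optional[int]]:
    """Return (previous_id, next_id) around the current submission."""
    # Sorting by (time, id) gives a total order even when times are equal.
    ordered = sorted(submissions, key=lambda s: (s.submission_time, s.id))
    index = next(i for i, s in enumerate(ordered) if s.id == current_id)
    prev_id = ordered[index - 1].id if index > 0 else None
    next_id = ordered[index + 1].id if index < len(ordered) - 1 else None
    return prev_id, next_id


same_time = "2026-01-01T10:00:00"
subs = [Submission(3, same_time), Submission(1, same_time), Submission(2, same_time)]
print(neighbors(subs, 2))  # (1, 3)
```

    Sorting on the timestamp alone would leave the relative order of
    the three submissions undefined, so some of them could become
    unreachable through the arrows.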

  • subsequences: typo in URLs (d9f50f6)

  • subsequences: fix translation blocking to allow transcript replacement (#6686)

    Prevents new translations from being created when a transcript has been
    deleted, while ensuring they remain available if a new transcript is
    provided later.

    This PR introduces a robust validation rule for transcriptions and
    translations to ensure that deleted data is never accidentally used for
    new translations.

    The system now uses timestamp-based arbitration logic. It compares
    the most recent time a transcript was deleted against the most recent
    time a transcript was accepted across all sources (Manual or Automatic).

    • If the latest event is a deletion: The system considers the question
      to have no valid transcript and will block any new translation requests.
    • If the latest event is an acceptance: Even if a deletion occurred
      previously, the system recognizes that the data has been replaced and
      allows translations to proceed normally using the newest valid text.

    This ensures data integrity by preventing the use of ghost transcripts
    while maintaining a flexible workflow for users who need to delete or
    re-transcribe their audio.
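
    The arbitration described above can be sketched as a comparison of
    two timestamps. This is a simplified illustration, not the kpi
    code: can_translate() and its arguments are hypothetical, and
    blocking on an exact tie is an assumption.

```python
from datetime import datetime
from typing import Optional


def can_translate(
    last_accepted: Optional[datetime], last_deleted: Optional[datetime]
) -> bool:
    """Decide whether a new translation may be requested.

    Both arguments are the most recent event times across all transcript
    sources (manual or automatic); None means the event never happened.
    """
    if last_accepted is None:
        return False  # no transcript was ever accepted
    if last_deleted is None:
        return True   # nothing was ever deleted
    # The latest event wins; a tie is treated as a deletion (assumption).
    return last_accepted > last_deleted


earlier = datetime(2026, 1, 1, 9, 0)
later = datetime(2026, 1, 1, 10, 0)
print(can_translate(last_accepted=earlier, last_deleted=later))  # False
print(can_translate(last_accepted=later, last_deleted=earlier))  # True
```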

  • subsequences: fix null bucket name infinite processing (#6697)

    This PR fixes an issue where automatic transcription and translation
    subsequences could remain indefinitely in the "in_progress" state when
    GS_BUCKET_NAME was not configured.

    The backend now detects missing Google Cloud Storage configuration early
    and returns a clear failed status instead of triggering async processing
    and background polling. This prevents infinite retries caused by
    configuration errors and makes failures explicit and actionable.
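
    The early check can be pictured as a guard placed before any
    background work is queued. In this sketch only GS_BUCKET_NAME and
    the "in_progress" and failed statuses come from the description
    above; start_automatic_action(), the settings dict, the error text
    and the enqueue callback are hypothetical.

```python
from typing import Callable, Dict


def start_automatic_action(settings: Dict[str, str], enqueue: Callable[[], None]) -> dict:
    """Fail fast when storage is not configured, otherwise queue the job."""
    if not settings.get("GS_BUCKET_NAME"):
        # Report the failure immediately instead of polling forever.
        return {"status": "failed", "error": "GS_BUCKET_NAME is not configured"}
    enqueue()
    return {"status": "in_progress"}


queued = []
print(start_automatic_action({}, lambda: queued.append("job")))
print(start_automatic_action({"GS_BUCKET_NAME": "bucket"}, lambda: queued.append("job")))
print(queued)  # ['job']
```

    Only the second call queues work, so a misconfigured server never
    leaves a subsequence waiting on a job that cannot succeed.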

  • subsequences: make translations always synchronous (#6762)

    Fixes an issue where if an automatic translation took too long, users
    would be taken back to the translation provider disclaimer screen with
    no explanation.

  • translations: flicker on translation operations (#6693)

  • usage: remove misleading "last update" text (#6665)

    Remove misleading "Last update: " text from usage page.

  • align migrations with production state (mostly no-op) (#6598)

  • fix supplement unit tests (7e601f7)

  • linter (19e1d0f)

  • fix merge artifacts (038fa91)

  • do not create asset version where migrating advanced features schema (3857c71)

  • fix sidebar displays not filtering out transcript (72953b3)

Documentation (2)
  • openApi: explain workaround and link Orval issue (#6554)
  • subsequences: update README and API docs (#6658)

    Improve documentation of advanced features.

Build & Dependencies (2)
  • api: [breaking] remove django-rest-framework frontend files (#6736)

    DRF's browsable HTML API, which is no longer used by our application,
    has been replaced with a dependency-free, standalone HTML renderer to
    address security vulnerabilities associated with outdated third-party
    libraries like Bootstrap.

  • docker: build node app in a separate docker build stage (#6498)

Testing (3)
  • subsequences: fix broken unit tests (#6491)
  • subsequences: port old unit tests (#6448)
  • subsequences: replace hardcoded UUIDs with named constants (#6646)

Security (2)
  • deps: bump lodash from 4.17.21 to 4.17.23 in the minor-and-patch group across 1 directory (#6652)
  • deps-dev: bump webpack from 5.101.3 to 5.105.0 in the minor-and-patch group across 1 directory (#6700)

Refactor (9)
  • addOns: decouple addOns from plans page (#6574)

  • dataCollectors: cleanup (#6612)

  • drawer: remove dead code and migrate to TS (#6601)

    Internal code cleanup around the left Sidebar.

  • formBuilder: migrate editableForm and connected files to TypeScript (#6588)

    Internal refactor of piece(s) of code that connects Form Builder
    (Backbone x CoffeeScript app) with the rest of the UI (React x
    JavaScript x TypeScript app).

  • qual: rename automatic qual action (#6593)

  • subsequences: rename "automated" to "automatic" (#6446)

  • subsequences: update advanced features to use UniqueConstraint (#6534)

  • subsequences: refactor data table for consistency (#6559)

    Updates the /data endpoint to return the answers to QA questions in a
    different format.

  • subsequences: [breaking] clean up codebase, restructure API, and improve backend logic (#6511)

    This PR cleans up the subsequences codebase by removing unused/broken
    code, restructuring the API with better endpoints, simplifying backend
    logic, updating the frontend to use the new API structure, and adding
    comprehensive documentation.

    This comprehensive refactoring addresses multiple aspects of the
    subsequences system to improve code quality, API design, and overall
    maintainability. The primary focus has been on cleaning up accumulated
    technical debt while modernizing the architecture for better long-term
    sustainability.

    Code Cleanup and Backend Simplification

    The refactoring begins with a thorough cleanup of the subsequences
    codebase, removing significant amounts of dead code, unused functions,
    and broken implementations that were no longer serving any purpose.
    Complex backend logic has been simplified and streamlined, reducing
    unnecessary complexity in data processing workflows. The code quality
    improvements include better type annotations, enhanced error handling,
    and improved maintainability patterns throughout the entire module.

    Frontend Integration and User Experience

    The frontend components have been thoroughly updated to integrate
    seamlessly with the restructured API endpoints, ensuring that users
    experience no disruption in functionality while benefiting from improved
    performance. The processing workflows for transcription, translation,
    and qualitative analysis have been enhanced with better error handling
    and more informative user feedback.

    Documentation and Migration Support

    A comprehensive README has been added to the subsequences module,
    providing detailed documentation about the new API structure, usage
    patterns, and clear guidelines for developers. The documentation
    includes a complete API reference with accurate parameter descriptions
    and response formats, along with practical examples showing how to
    transition from old to new endpoints. This documentation serves as both
    a reference for current development and a guide for external integrators
    who need to update their implementations.

Styling (3)
Chores (2)
  • deps: revert formpack to commit e43428525 (refactored subsequences app) (#6807)

    The previous formpack bump introduced a regression; this reverts
    formpack to the commit that includes the refactored subsequences
    Django app.

  • mfa: skip unit test about mfa and token authentication (#6759)

Revert (3)
  • revert mistakenly committed hack (06d7213)
  • revert "fix(qual): prevent overriding answers with failures " (#6590)
  • revert "Revert "fix(qual): prevent overriding answers with failures "" (#6592)

Other (113)
  • Make NumberDoubler action class work (3a6d582)
  • send number_doubler to formpack in super hacky way (23d3b22)
  • Add WIP edits to subsequences README (b8eb530)
  • yay (fba697e)
  • wip (678e8ec)
  • Make unit tests pass again after merging main (6f1a982)
  • Add grievances to README.md (91ad964)
  • Start drafting new README based on what we want

    …and with less tiptoeing around what's already there (24a07b4)

  • Begin rewriting manual transcription action (3421691)
  • Continue rewriting manual transcription action (5cd5896)
  • Create fresh subsequences directory, and move…

    previous work to subsequences__old (4b85d2f)

  • Remove unused load_params() (5091c64)
  • Add preliminary manual transcription tests (81d4d6c)
  • Move new work to subsequence__new instead, restore previous Django app (7d53d2a)
  • Update revise_field to support new structure (249abad)
  • More manual transcription tests, tweaks to revise_field (dc067cc)
  • typo (4bfdc04)
  • wip (aaa17f1)
  • Use data schema to build result schema (c03ee57)
  • Make result schema more dynamic (9f51715)
  • more (36e864b)
  • even more (e82b0ec)
  • Comment out timezone detection in "utc_datetime_to_simplified_iso8601" (5602de6)
  • Move result_schema to base class (ac0ae75)
  • clean (24cfa0d)
  • add example data for manual translations (9b012f7)
  • Add stripped-down SubmissionExtras model (97a5b7e)
  • WIP new viewset (17ea9e2)
  • add subsequences router thing to process incoming…

    data for actions. doesn't do much yet, though, because you can't save a
    useful advanced_schema into any asset because of the outdated
    ADVANCED_FEATURES_PARAMS_SCHEMA (2f5c531)

  • note that submission uuid will be removed from POST data (c2ed328)
  • PoC action-generated asset-level params schema…

    and saving action data into (for now) the old SubmissionExtras model (371801a)

  • Make result_schema an abstract method (8772830)
  • Add usage limit check to action class (91628ca)
  • Add comments (36f1449)
  • More comments (7fe16c5)
  • WIP new endpoint (3f79f0e)
  • add forgotten base.py (7529ada)
  • continue cleanup from forgetting to add base.py (e28efb2)
  • rename _schema to _actionConfigs; expect…

    submission_uuid as argument from nested view instead of POST data (36ecc7b)

  • add method for retrieving supplemental data at…

    the submission level, and fix method for storing it (c8a9c65)

  • Add submission arg to tests for revise_field() (7ef30fb)
  • Replace the lookup field of data endpoint with "submission_id_or_root_uuid" (01248d1)
  • Fix handle_incoming_data(), again… (121341a)
  • Make handle_incoming_data() return something, …

    effectively the same thing as if retrieve_supplemental_data() were
    called immediately afterwards (6289e4a)

  • Draft documentation (8561cda)
  • Validate the entire submission supplement (5a303b5)
  • Add forgotten file…again (0eef859)
  • Draft data supplement endpoint documentation (5d65175)
  • Refactor 'routers' logic into new proxy model (598b000)
  • Update exceptions import (fe9811d)
  • Warn about deprecation; clean a few things (178589b)
  • add drf-spectacular schema and documentation (54daa42)
  • Replace "transcript" with "value" to be consistent with other actions (976e7b5)
  • Draft manual_translation (2ee23e2)
  • Make BaseAction.revise_data support lists (3f701bb)
  • Make SubmissionSupplement.revise_data support lists (9d9e46d)
  • Shuffle (fa636b8)
  • Rename subsequences to subsequences__old and…

    subsequences__new to subsequences (8787be2)

  • Rip out old subsequences references (725d15d)
  • Get data API working minimally (ba36142)
  • Take teeny, tiny step toward reconnecting formpack (122e0d3)
  • Clean up (8f39d01)
  • Lint and format (d73dd89)
  • Stop mutating incoming data (4612f4c)
  • Add FIXME for revise_data() bug (7dedf27)
  • Add forgotten staticmethod decorator (06df33a)
  • Remove uuid: prefix in revise_data() (7222f8b)
  • Kill a little bit more old subsequence django app (f1a5fe1)
  • Make unit tests for refactored subsequence pass (7c806f0)
  • Unit tests, unit tests everywhere!!! (b4c8b46)
  • Introduce LookupConfig dataclass, remove "item_reference_property" (5172321)
  • Prepare and arbitrate supplemental data for output (b5f28e0)
  • Update formpack requirement for new supplemental…

    format (9945e56)

  • WIP Draft automatic translation with Google (8eba576)
  • Use Base class for language related actions (5e9700b)
  • Make unit tests support dateAccepted (75e42b6)
  • Update formpack requirement for qualitative…

    analysis simplification (89c486c)

  • Deduplicate in supplemental_output_fields (43afcf9)
  • Add support for locale (4e61470)
  • Improved process flow for automated actions (f567a4f)
  • Make automatic process an intern call in revise_data (ad18e8d)
  • Unit tests for automatic Google transcription (3b1122d)
  • add more base classes and mixins (3250c21)
  • Update docstrings to make base classes and mixins purpose more obvious (7d2f5be)
  • Refactor for automatic external services (858d0c3)
  • Make Automatic Google Translation a real thing (d5713fd)
  • Fix rootUuid with suffix (4215c6d)
  • linter (0dbf074)
  • Add unit tests for automatic_google_translation (72f3395)
  • Add validation unit tests on automatic google translation (0e32221)
  • Create new README (98f551d)
  • Update validation unit tests (1112b75)
  • replace Automatic with Automated (f92da1d)
  • Change content JSON structure from _revisions to _version (d7ac2ca)
  • Comments (3202533)
  • Add comments and draft logic for background updates (2fd5770)
  • Correct a few typos (fe7c88e)
  • Add a unique identifier for each version (ef08781)
  • Add dependency and more comments (39dd383)
  • Correct a few typos (e81416c)
  • Fixed typo (7ee999e)
  • WIP - With Celery (9d3680c)
  • Save errors when Google Timeout is reached (8a78a77)
  • Refactor dependencies system (b7134ed)
  • Persist action dependency (d70fc40)
  • Test Celery is triggered when task is in progress (0f29748)
  • Fix dependency field on error (1beb46f)
  • Update README (4d6abfc)
  • Update README (b42f23d)
  • migrate advanced_features and submission supplements (f550303)
  • Reactivate limits (0a92a24)
  • Add schemas for qualitative analysis and nest action data within _data attribute for each version (548008b)
  • Correct name of patched method (7023d27)

Full Changelog: https://github.com/kobotoolbox/kpi/compare/2.026.03f..2.026.07
