transformerlab/transformerlab-app v0.11.0

Installer + Settings

  • A new Settings UI allows you to configure third-party LLM services like OpenAI and Anthropic. These services can be used throughout the app for tasks like judging or data generation.
  • For debugging purposes, you can view all active engine tasks in Settings → Jobs.
  • The Computer Tab now includes a scrollbar when you have more than 3 GPUs, so you can see all your GPUs (must be nice!).
  • Fixed a bug where a .python-version file in your home directory would interfere with uv's Python version selection.
  • The new AI Providers tab in Settings lets you add API keys for OpenAI, Anthropic, and custom providers:
    • API keys now persist across app restarts
    • Custom Model Provider APIs can be configured here and used for evaluations or generation (see the sketch below)
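
For reference, the custom providers configured here are typically OpenAI-compatible endpoints. Below is a minimal sketch of what calling one looks like with the openai Python client; the endpoint, key, and model name are placeholders, not values from the app:

    from openai import OpenAI

    # Placeholder endpoint and key: substitute your provider's values
    client = OpenAI(
        base_url="https://my-provider.example.com/v1",  # any OpenAI-compatible API
        api_key="sk-...",                               # the key saved in Settings
    )

    # Example "LLM as judge" style request
    response = client.chat.completions.create(
        model="my-model",
        messages=[{"role": "user", "content": "Rate this answer from 1 to 5: ..."}],
    )
    print(response.choices[0].message.content)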

Datasets

  • We improved how datasets are displayed, particularly those containing very large fields
  • Datasets requiring a config_name now let you specify the config during download (see the example below)
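
For context, this matches how the Hugging Face datasets library handles configs; a minimal example (the dataset and config names are just illustrative):

    from datasets import load_dataset

    # Some datasets ship multiple configs and fail without an explicit config_name
    ds = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
    print(ds[0])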

Documents

  • Fixed bugs affecting document deletion and updates for files stored in folders
  • Added the ability to sort documents by name
  • Improved the visual design of the Documents page

Evals

  • Charting
    • Visualize individual evals using line, bar, or radar graphs
    • Compare multiple evals using interactive charts
    • Generate detailed comparative reports between evals
    • Enhanced chart visuals with improved legends and formatting
  • Comparing Evals
    • Compare results across multiple eval runs
    • Generate detailed side-by-side comparisons for each test case, with downloadable CSV reports
    • View summary charts comparing all selected eval jobs
    • Display line graphs with flexible axis options: view by metric or by compared job
  • New Eval Plugins:
    • Basic Evals Plugin: Perform fundamental checks on model outputs, including JSON validation and bullet list detection (sketched after this list)
      • Create custom evaluations using regex patterns or exact string matching
    • Red Teaming Plugin: Test LLM vulnerabilities using DeepEval's capabilities across multiple categories: Bias, Misinformation, PII Leakage, Personal Safety, Toxicity, Robustness, Unauthorized Access, Illegal Activity, Graphic Content, and Intellectual Property Infringement
    • Common EleutherAI Harness Plugin: Run key benchmarks using a streamlined version of the EleutherAI evaluation harness (see the example after this list)
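
To illustrate the kind of checks the Basic Evals plugin runs, here is a minimal sketch in plain Python; the function names are hypothetical, not the plugin's actual API:

    import json
    import re

    def is_valid_json(output: str) -> bool:
        # JSON validation: does the model output parse as JSON?
        try:
            json.loads(output)
            return True
        except json.JSONDecodeError:
            return False

    def contains_bullet_list(output: str) -> bool:
        # Bullet list detection: any line starting with -, *, or •
        return any(re.match(r"\s*[-*•]\s+", line) for line in output.splitlines())

    def matches_pattern(output: str, pattern: str) -> bool:
        # Custom evaluation via a regex pattern (exact string matching is the
        # degenerate case of an escaped pattern)
        return re.search(pattern, output) is not None

    print(is_valid_json('{"score": 5}'))                 # True
    print(contains_bullet_list("- first\n- second"))     # True
    print(matches_pattern("Score: 5", r"Score:\s*\d"))   # True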
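The harness plugin builds on EleutherAI's lm-evaluation-harness; here is a rough sketch of invoking that library directly (the model and task choices are illustrative, and the plugin's own configuration may differ):

    import lm_eval

    # Run a single benchmark task against a small Hugging Face model
    results = lm_eval.simple_evaluate(
        model="hf",
        model_args="pretrained=EleutherAI/pythia-160m",
        tasks=["lambada_openai"],
        batch_size=8,
    )
    print(results["results"]["lambada_openai"])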

Training

  • Multi-GPU support
    • Multi-GPU training support for the Llama Trainer via a dedicated plugin
    • GRPO training with multi-GPU configuration via a separate plugin
  • Pre-training
    • New pre-training plugin using Nanotron for LLMs, initially supporting the LlamaForCausalLM config, with planned expansion to MoE and Mamba (state-space) models
  • Embedding Models: Fine-tune embedding models with a wide range of dataset formats and loss functions (see the sketch below)
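
As a rough illustration of embedding fine-tuning, here is a sketch using the sentence-transformers library; the base model, pair-style dataset, and loss are assumptions, not the plugin's exact defaults:

    from datasets import Dataset
    from sentence_transformers import (
        SentenceTransformer,
        SentenceTransformerTrainer,
        losses,
    )

    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

    # One common dataset type: (anchor, positive) pairs
    train_dataset = Dataset.from_dict({
        "anchor": ["What is the capital of France?", "How do planes fly?"],
        "positive": ["Paris is the capital of France.", "Wings generate lift."],
    })

    # A common loss for pair data; other dataset types use other losses
    loss = losses.MultipleNegativesRankingLoss(model)

    trainer = SentenceTransformerTrainer(
        model=model,
        train_dataset=train_dataset,
        loss=loss,
    )
    trainer.train()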

Backend

  • SQLAlchemy Integration
    • Started migrating database access to SQLAlchemy, enabling future ORM capabilities and Alembic migrations (see the sketch after this list)
  • Testing Infrastructure
    • Implemented foundational unit tests using pytest
  • Docker:
    • Updated CPU and GPU Dockerfiles in the main branch
    • Published v0.10.2 CPU and GPU images to Transformer Lab's Docker Hub
    • Updated the docker-compose file for direct execution
    • Verified CPU Docker images on Mac, Ubuntu, and Windows
  • Fixed an issue with importing local models
  • Enhanced API Security
    • Hardened API endpoints in preparation for future security audits
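
For a sense of what the SQLAlchemy migration enables, here is a minimal declarative-ORM sketch; the Job table and its columns are hypothetical, not the app's actual schema:

    from sqlalchemy import String, create_engine
    from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, Session

    class Base(DeclarativeBase):
        pass

    class Job(Base):
        __tablename__ = "jobs"  # hypothetical table, not the real schema
        id: Mapped[int] = mapped_column(primary_key=True)
        status: Mapped[str] = mapped_column(String(32))

    engine = create_engine("sqlite:///example.db")
    Base.metadata.create_all(engine)  # Alembic would manage this as versioned migrations
    with Session(engine) as session:
        session.add(Job(status="queued"))
        session.commit()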
