Installer + Settings
- A new Settings UI allows you to configure third-party LLM services like OpenAI and Anthropic. These services can be used throughout the app for tasks like judging or data generation.
- For debugging purposes, you can view all active engine tasks in Settings → Jobs.
- The Computer Tab now includes a scrollbar when you have more than 3 GPUs, so you can see all your GPUs (must be nice!).
- Fixed a bug where having a .python-version file in your home directory would interfere with uv's Python version selection.
- The new AI Providers tab in Settings lets you add API keys for OpenAI, Anthropic, and custom providers:
  - API keys now persist between app startups
  - Custom Model Provider APIs can be configured here and used for evaluations or generation
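If you want to test a custom provider outside the app, here is a minimal sketch of the pattern these provider settings typically wrap, assuming the provider exposes an OpenAI-compatible endpoint; the base URL and model name below are hypothetical placeholders, not Transformer Lab APIs.

```python
# Minimal sketch: calling an OpenAI-compatible custom provider.
# The base_url and model name are hypothetical placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://my-provider.example.com/v1",  # hypothetical endpoint
    api_key="sk-...",  # the key you would paste into the AI Providers tab
)

response = client.chat.completions.create(
    model="my-custom-model",  # hypothetical model name
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.choices[0].message.content)
```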
Datasets
- Improved how datasets are displayed, particularly those containing very large fields
- Datasets requiring a config_name now allow you to specify the config during download
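For context, this follows the Hugging Face datasets convention, where some datasets define multiple configs and loading one requires naming it explicitly:

```python
# Loading a dataset that requires a config name (Hugging Face datasets).
from datasets import load_dataset

# wikitext defines several configs; the second argument selects one.
ds = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
print(ds[0])
```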
Documents
- Fixed bugs affecting document deletion and updates for files stored in folders
- Added the ability to sort documents by name
- Improved the visual design of the Documents page
Evals
- Charting
  - Visualize individual evals using line, bar, or radar graphs
  - Compare multiple evals using interactive charts
  - Generate detailed comparative reports between evals
  - Enhanced chart visuals with improved legends and formatting
- Comparing Evals
  - Compare results across multiple eval runs
  - Generate detailed side-by-side comparisons for each test case, with downloadable CSV reports
  - View summary charts comparing all selected eval jobs
  - Display line graphs with flexible axis options: view by metric or by compared job
- New Eval Plugins:
  - Basic Evals Plugin: Perform fundamental checks on model outputs, including JSON validation and bullet list detection
    - Create custom evaluations using regex patterns or exact string matching (see the sketch after this list)
  - Red Teaming Plugin: Test LLM vulnerabilities using DeepEval's capabilities across multiple categories: Bias, Misinformation, PII Leakage, Personal Safety, Toxicity, Robustness, Unauthorized Access, Illegal Activity, Graphic Content, and Intellectual Property Infringement
  - Common EleutherAI Harness Plugin: Run key benchmarks through a streamlined version of the EleutherAI LM Evaluation Harness
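To illustrate the kind of checks the Basic Evals Plugin performs, here is a minimal, self-contained sketch of JSON validation, regex matching, and exact string matching; the function names are hypothetical, not the plugin's actual API.

```python
# Minimal sketch of basic eval checks; function names are hypothetical,
# not the Basic Evals Plugin's actual API.
import json
import re


def is_valid_json(output: str) -> bool:
    """Pass if the model output parses as JSON."""
    try:
        json.loads(output)
        return True
    except json.JSONDecodeError:
        return False


def matches_regex(output: str, pattern: str) -> bool:
    """Pass if the output contains a match for the given regex pattern."""
    return re.search(pattern, output) is not None


def exact_match(output: str, expected: str) -> bool:
    """Pass only if the output equals the expected string (ignoring edge whitespace)."""
    return output.strip() == expected.strip()


print(is_valid_json('{"answer": 42}'))   # True
print(matches_regex("Total: 42", r"\d+"))  # True
print(exact_match("  yes ", "yes"))        # True
```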
Training
- Multi-GPU support
  - Multi-GPU training support for Llama Trainer via a dedicated plugin
  - GRPO training support with multi-GPU configuration in a separate plugin
- Pre-training
  - New pre-training plugin using Nanotron for LLMs, initially supporting the LlamaForCausalLM config, with planned expansion to MoE and Mamba (state-space) models
- Embedding Models: Fine-tune embedding models with comprehensive dataset types and loss functions
  - Supports all dataset types and loss functions documented at https://sbert.net/docs/sentence_transformer/loss_overview.html
  - Models are automatically stored locally and imported into Transformer Lab
  - Support for FP16 and BF16 options
  - Additional loss modifiers for model size and performance optimization
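For reference, here is a minimal sketch of the kind of sentence-transformers training run this feature wraps, assuming the v3 trainer API; the base model, toy data, and output path are placeholders.

```python
# Minimal sentence-transformers fine-tuning sketch; model, data, and
# output_dir are placeholders, not Transformer Lab defaults.
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Toy anchor/positive pairs in the format MultipleNegativesRankingLoss expects.
train_dataset = Dataset.from_dict({
    "anchor": ["What is the capital of France?", "Who wrote Hamlet?"],
    "positive": ["Paris is the capital of France.", "Hamlet was written by Shakespeare."],
})

# One of the loss functions from the sbert.net loss overview.
loss = MultipleNegativesRankingLoss(model)

args = SentenceTransformerTrainingArguments(
    output_dir="./finetuned-embeddings",  # placeholder path
    num_train_epochs=1,
    bf16=True,  # or fp16=True, matching the FP16/BF16 options above
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```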
Backend
- SQLAlchemy Integration
  - Started migrating database access to SQLAlchemy, enabling future ORM capabilities and Alembic migrations (see the sketch after this list)
- Testing Infrastructure
  - Implemented foundational unit tests using pytest
- Docker:
  - Updated CPU and GPU Dockerfiles in the main branch
  - Published v0.10.2 CPU and GPU images to the Transformer Lab Docker Hub
  - Updated the docker-compose file for direct execution
  - Verified CPU Docker images on Mac, Ubuntu, and Windows
  - Fixed a local model import issue
- Enhanced API Security
  - Hardened API endpoints in preparation for future security audits
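As an illustration of what the SQLAlchemy migration enables, here is a minimal sketch of a declarative ORM model and session in SQLAlchemy 2.0 style; the Job table and its fields are hypothetical, not Transformer Lab's actual schema.

```python
# Minimal SQLAlchemy 2.0-style sketch; the Job model is hypothetical,
# not Transformer Lab's actual schema.
from sqlalchemy import String, create_engine, select
from sqlalchemy.orm import DeclarativeBase, Mapped, Session, mapped_column


class Base(DeclarativeBase):
    pass


class Job(Base):
    __tablename__ = "jobs"

    id: Mapped[int] = mapped_column(primary_key=True)
    status: Mapped[str] = mapped_column(String(32), default="queued")


engine = create_engine("sqlite:///example.db")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Job(status="running"))
    session.commit()
    # Declarative models make queries like this straightforward, and
    # Alembic can autogenerate migrations from the same metadata.
    running = session.scalars(select(Job).where(Job.status == "running")).all()
    print([job.id for job in running])
```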