Highlights
- AirLLM for MLX: run models larger than your available GPU memory!
- FastChat server: Gemma 3 support, and Evaluate Logprobs now works
- Embeddings: correctly uses selected embedding model, several fixes to training plugin
- Ollama Server is now the default server plugin for users without a GPU or MLX
Specific additions to Evaluations:
- Custom evals: create your own evals using custom Python code
- Red Teaming Plugin: generate detailed reports of potential vulnerabilities in your LLMs
- GEval: enter a plain-language description and DeepEval's LLM-as-Judge creates an evaluation metric for you automatically
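
A custom eval is ordinary Python code that scores model outputs against expected answers. As a rough illustration only (the function name and signature below are hypothetical, not Transformer Lab's actual plugin API), a simple exact-match metric might look like:

```python
# Hypothetical custom eval sketch: the name and signature are illustrative,
# not the real plugin interface.
def exact_match_eval(predictions, references):
    """Return the fraction of predictions that exactly match the reference,
    ignoring case and surrounding whitespace."""
    if not predictions:
        return 0.0
    matches = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return matches / len(predictions)

score = exact_match_eval(["Paris", " berlin"], ["paris", "Rome"])
# One of two predictions matches, so score == 0.5
```

Real custom evals would plug a scoring function like this into the evaluations workflow and report the score per run.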
Full GitHub notes
- Add GEval Widget by @deep1401 in #324
- Change Process Embeddings to use in-built embeddings function by @deep1401 in #323
- Change string comparison to exact equals to prevent collision issues by @deep1401 in #325
- Fixes reset to default embedding and adds embedding to experiment json on new setup by @deep1401 in #327
- Remove ipc for darkmode (it wasn't used) by @aliasaria in #326
- Remove under construction message from visualize logprobs by @deep1401 in #328
- Add option for Code return type in the widget by @deep1401 in #332
- Fix/screen-size-adjustments-for-cloud by @aliasaria in #331
- I don't think preload-cloud was actually doing anything by @aliasaria in #330
- Divide evals menu in dataset-based and model-based by @deep1401 in #334
- Add Azure OpenAI option in settings and enable in ModelProviderWidget by @deep1401 in #338
Full Changelog: v0.11.1...v0.11.2