github blob42/Instrukt v0.6.2
🚀 v0.6.2 - coding assistant with local embeddings from any code base

latest release: v0.6.3
15 months ago

This release brings a major update to document index and embedding
management 📚.

Note that AI agents still rely on OpenAI for now. The next milestone will bring support for local LLMs.

Index management UI:

  • Using custom and local embedding functions.
  • Implemented DirectoryLoader for scanning and indexing any source code base, with preliminary support for language parsing (currently Python and JavaScript) using LanguageParser from langchain.
  • Indexing and searching multiple directories.
  • Scanning and filtering content with custom glob.
  • CUDA/CPU embeddings with SentenceTransformers.
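
As an illustration of scanning a directory and filtering its content with a custom glob, here is a minimal sketch using only the Python standard library. It is not Instrukt's DirectoryLoader: the function name `scan_directory` and its `include`/`exclude` parameters are assumptions made for this example.

```python
from fnmatch import fnmatch
from pathlib import Path


def scan_directory(root, include="*.py", exclude=()):
    """Hypothetical helper: list files under `root` whose name matches
    `include` and whose relative path matches none of `exclude`."""
    root = Path(root)
    matches = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # The include glob is tested against the file name only.
        if not fnmatch(path.name, include):
            continue
        # Exclude globs are tested against the path relative to the root.
        relative = path.relative_to(root).as_posix()
        if any(fnmatch(relative, pattern) for pattern in exclude):
            continue
        matches.append(path)
    return matches
```

For example, `scan_directory("src", include="*.py", exclude=("tests/*",))` would keep Python files while skipping anything under `tests/`.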

Conversation UI:

  • Many improvements to the UX giving more space to agent outputs.
  • Fuzzy select and edit source documents used for retrieval Q&A.
    The selection is sent on stdin to the program set in $SELECTOR.
  • Edit the agent reply or input using an external $EDITOR.
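
The $SELECTOR and $EDITOR hand-offs described above follow a common Unix pattern, sketched below with the standard library. This is not Instrukt's actual code: the function names and the `fzf` and `vi` fallbacks are assumptions for illustration, and the release notes only state that the selection is piped to $SELECTOR and that editing goes through $EDITOR.

```python
import os
import shlex
import subprocess
import tempfile


def pick_with_selector(candidates, default_cmd="fzf"):
    """Pipe one candidate per line to $SELECTOR and return the lines it prints.
    The `fzf` fallback is an assumption, not something Instrukt documents."""
    command = shlex.split(os.environ.get("SELECTOR", default_cmd))
    result = subprocess.run(
        command,
        input="\n".join(candidates),
        stdout=subprocess.PIPE,
        text=True,
        check=False,
    )
    return [line for line in result.stdout.splitlines() if line]


def edit_in_editor(text, default_cmd="vi"):
    """Write `text` to a temporary file, open it in $EDITOR, return the result.
    The `vi` fallback is likewise an assumption."""
    command = shlex.split(os.environ.get("EDITOR", default_cmd))
    with tempfile.NamedTemporaryFile("w", suffix=".md", delete=False) as handle:
        handle.write(text)
        path = handle.name
    try:
        subprocess.run(command + [path], check=False)
        with open(path) as handle:
            return handle.read()
    finally:
        os.unlink(path)
```

Any program that reads candidates on stdin and prints the chosen lines on stdout can serve as the selector in this sketch.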

Full changelog 0.6.2

Added

  • output_parsers.parser_lib.get_rich_md to sanitize agent output to markdown
  • wip: REPL input modes and auto completion
  • wip: Programming assistant agent
  • Progress bar for async events on tools
  • select and edit source documents used for retrieval Q&A with ctrl+p
  • Retrieval Q&A with agents returns source documents
  • ProgressBar protocol and wrapper to use Textual progress bar in a thread safe way
    and hook into tqdm update events.
  • Edit input using an external $EDITOR with ctrl+e
  • Capture and redirect logging/output to Instrukt console widget.
  • Generate sphinx doc for online and offline reading from within the app.
  • wip: offline doc reader: jump to anchors (Textualize/textual#2941)
  • Help screen for common keybindings with ?
  • UI: reusable action bar widget.
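
The ProgressBar protocol and thread-safe wrapper mentioned in this list could look roughly like the sketch below. All names here are hypothetical and do not come from Instrukt's source. The idea is that worker threads never touch the widget directly; they hand each update to a `schedule` callable that runs it on the UI thread. In a Textual app that callable would plausibly be `App.call_from_thread`, but that wiring is an assumption and is not shown.

```python
import threading
from typing import Callable, Protocol


class ProgressBarProtocol(Protocol):
    """Minimal interface the wrapper expects from a progress bar."""

    def advance(self, amount: float = 1) -> None: ...


class ThreadSafeProgress:
    """Forward progress updates from worker threads to a UI-owned bar."""

    def __init__(self, bar: ProgressBarProtocol, schedule: Callable[..., object]):
        self._bar = bar
        # `schedule(fn, *args)` must arrange for fn(*args) to run on the UI thread.
        self._schedule = schedule

    def advance(self, amount: float = 1) -> None:
        # Never call the bar here: only enqueue the call for the UI thread.
        self._schedule(self._bar.advance, amount)


class RecordingBar:
    """Stand-in for a real widget, used only to exercise the wrapper."""

    def __init__(self) -> None:
        self.progress = 0.0

    def advance(self, amount: float = 1) -> None:
        self.progress += amount
```

A tqdm hook would then call `ThreadSafeProgress.advance` on each update event instead of drawing to the terminal.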

Index management:

  • Embeddings: added bge-base embeddings option
  • Scan and load multiple PDFs from a directory under a single collection
  • AutoDirLoader: scans and loads a directory, auto-detects file types and assigns
    the appropriate splitter based on the detected content type.
  • link (patch) the index console progress bar to tqdm updates
  • progress bar for loading, splitting and indexing files
  • added a local file system path selection UI
  • chromadb: save/restore used embedding function. You can have multiple indexes using
    different embedding functions.
  • Choose to use local embeddings when creating index.
  • TODO: detect local embeddings when loading an index.
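
To make the AutoDirLoader idea concrete, the sketch below groups files by a detected type and assigns each group a splitter. It is a simplification under stated assumptions: detection here uses only the file suffix, and the splitter labels and the `assign_splitters` name are hypothetical, not Instrukt's API.

```python
from collections import defaultdict
from pathlib import Path

# Hypothetical mapping from file suffix to a splitter label.
SPLITTER_BY_SUFFIX = {
    ".py": "python",
    ".js": "js",
    ".pdf": "pdf",
    ".md": "markdown",
}


def assign_splitters(paths, default="text"):
    """Group paths by the splitter chosen for their (suffix-based) type."""
    groups = defaultdict(list)
    for path in map(Path, paths):
        splitter = SPLITTER_BY_SUFFIX.get(path.suffix.lower(), default)
        groups[splitter].append(path)
    return dict(groups)
```

Files with an unknown suffix fall back to a generic text splitter in this sketch, so a mixed directory can still be indexed under a single collection.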

Changed

  • Refactored the prompt input and console output for better UX.
  • Bumped textual to version v0.34.0 (TextLog -> RichLog)
  • Improved key bindings
  • Memory mixin to handle retrieval answers with source documents
  • Retrieval uses MMR search algorithm by default
  • Improved form validation
  • Upgraded dependencies
  • ChromaDB: no more manual call to persist
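
For readers unfamiliar with MMR (maximal marginal relevance), the following pure-Python sketch shows the general technique: each pick balances relevance to the query against similarity to documents already selected. It is not the code Instrukt or its vector store runs; the function names and the `lambda_mult` default are assumptions for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(query, docs, k=2, lambda_mult=0.5):
    """Return indices of up to `k` documents chosen by maximal marginal relevance."""
    selected = []
    candidates = list(range(len(docs)))
    while candidates and len(selected) < k:

        def score(i):
            relevance = cosine(query, docs[i])
            # Penalize similarity to the closest already-selected document.
            redundancy = max(
                (cosine(docs[i], docs[j]) for j in selected), default=0.0
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With `lambda_mult=1.0` the function reduces to plain similarity ranking; lower values trade relevance for diversity among the returned source documents.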

Fixed

  • index: async delete indexes: ensure deletion happens after index is loaded.
  • improved IPython dev console: avoid term repaints until end of session.
  • Fast preloading of messages when switching agent tab.
  • Chroma: share a single client for all indexes
  • Explicit dependency on sentence-transformers library for local embeddings.
  • Many fixes related to dependency updates
