github hiyouga/LLaMA-Factory v0.2.1
v0.2.1: Variant Models, NEFTune Trick


New features

  • Support NEFTune trick for supervised fine-tuning by @anvie in #1252
  • Support loading datasets in the sharegpt format (see data/readme for details)
  • Support generating multiple responses in demo API via the n parameter
  • Support caching the pre-processed dataset files via the cache_path argument
  • Better LLaMA Board (pagination, controls, etc.)
  • Support push_to_hub argument #1088
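
NEFTune, the headline feature of this release, adds uniform random noise to the token embeddings during supervised fine-tuning, with a magnitude scaled by a hyperparameter alpha. A minimal pure-Python sketch of the noise rule (the actual implementation operates on framework tensors inside the model's embedding layer; function name and list-of-lists representation here are illustrative only):

```python
import math
import random

def neftune_noise(embeddings, alpha=5.0):
    """Apply NEFTune-style uniform noise to token embeddings.

    embeddings: seq_len x hidden_dim list of lists of floats.
    alpha: noise scale hyperparameter from the NEFTune paper.
    """
    seq_len = len(embeddings)
    hidden_dim = len(embeddings[0])
    # Noise magnitude shrinks with sequence length and hidden size:
    # alpha / sqrt(seq_len * hidden_dim)
    mag = alpha / math.sqrt(seq_len * hidden_dim)
    return [[x + random.uniform(-mag, mag) for x in row] for row in embeddings]
```

The noise is applied only at training time; inference uses the clean embeddings.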

New models

  • Base models
    • ChatGLM3-6B-Base
    • Yi (6B/34B)
    • Mistral-7B
    • BlueLM-7B-Base
    • Skywork-13B-Base
    • XVERSE-65B
    • Falcon-180B
    • Deepseek-Coder-Base (1.3B/6.7B/33B)
  • Instruct/Chat models
    • ChatGLM3-6B
    • Mistral-7B-Instruct
    • BlueLM-7B-Chat
    • Zephyr-7B
    • OpenChat-3.5
    • Yayi (7B/13B)
    • Deepseek-Coder-Instruct (1.3B/6.7B/33B)

New datasets

  • Pre-training datasets
    • RedPajama V2
    • Pile
  • Supervised fine-tuning datasets
    • OpenPlatypus
    • ShareGPT Hyperfiltered
    • ShareGPT4
    • UltraChat 200k
    • AgentInstruct
    • LMSYS Chat 1M
    • Evol Instruct V2
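
Several of the new SFT datasets above ship conversations in the sharegpt format mentioned under New features. A sketch of what one record in that format commonly looks like, and how it splits into prompt/response pairs (field names follow the widespread sharegpt convention; the exact schema accepted by this project is documented in data/readme):

```python
import json

# One record in the common sharegpt convention: a list of turns,
# each tagged with a speaker ("from") and its text ("value").
record = json.loads("""
{
  "conversations": [
    {"from": "human", "value": "What is NEFTune?"},
    {"from": "gpt", "value": "A trick that adds noise to embeddings during fine-tuning."}
  ]
}
""")

# Pair alternating human/gpt turns into (prompt, response) examples.
pairs = [
    (record["conversations"][i]["value"], record["conversations"][i + 1]["value"])
    for i in range(0, len(record["conversations"]) - 1, 2)
]
```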

Bug fixes
