github ConardLi/easy-dataset 1.3.5
[1.3.5] 2025-05-21

If GitHub downloads are slow, you can download from the cloud drive instead: https://pan.quark.cn/s/194b7eedf16e

🔧 Fixes

  1. Dataset confirmation/saving failures
    → Fixed issues with dataset saving due to permission errors or network fluctuations, improving operational stability.
  2. Filter criteria stop working after text block modification
    → Resolved synchronization issues where filter conditions (e.g., labels, status) failed to update after text block edits.
  3. Default API error for SiliconFlow
    → Corrected the default API endpoint and authentication parameters for SiliconFlow to ensure proper model invocation.
  4. Missing labels in custom-format dataset exports
    → Restored label fields in custom exports to preserve complete metadata during data export.

⚡ Optimizations

  1. Windows installation path customization
    → Added a path selection feature during installation, allowing users to specify a directory instead of forcing C:\ by default.
  2. Alpaca dataset export configuration
    • Field selection: The question can now be placed in either the instruction field or the input field, to suit different model training needs.
    • Custom instruction: The instruction content can be entered or edited manually, for more flexible data generation.
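
As an illustration of the two Alpaca export options above, here is a minimal sketch of how a question/answer pair could map onto an Alpaca-style record (instruction / input / output). The function name `to_alpaca` and the parameters `question_field` and `custom_instruction` are hypothetical and are not taken from the easy-dataset codebase; the actual export logic may differ.

```python
import json


def to_alpaca(question, answer, question_field="instruction", custom_instruction=""):
    """Build one Alpaca-style record from a question/answer pair.

    question_field (hypothetical option) selects where the question goes:
      - "instruction": question fills `instruction`, `input` stays empty
      - "input": question fills `input`, `instruction` takes the custom text
    """
    if question_field == "instruction":
        return {"instruction": question, "input": "", "output": answer}
    if question_field == "input":
        return {"instruction": custom_instruction, "input": question, "output": answer}
    raise ValueError("question_field must be 'instruction' or 'input'")


# Question stored in the instruction field
rec_a = to_alpaca("What is a text block?", "A chunk of the source document.")

# Question stored in the input field, with a manually written instruction
rec_b = to_alpaca(
    "What is a text block?",
    "A chunk of the source document.",
    question_field="input",
    custom_instruction="Answer the question concisely.",
)

print(json.dumps([rec_a, rec_b], ensure_ascii=False, indent=2))
```

Putting the question in `input` with a shared custom instruction can be useful when every sample in the dataset follows the same task prompt.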
