Prebuilt binaries, default configuration: android-ndk-r25c, xcode 13.4.1, ubuntu-20.04, ubuntu-22.04, vs2015, vs2017, vs2019, vs2022, emscripten-3.1.28
file | content | arch |
---|---|---|
ncnn-full-source.zip | full source code including all submodule code | |
ncnn-android.zip | android static/shared libraries | armeabi-v7a + arm64-v8a + x86 + x86_64 |
ncnn-android-vulkan.zip | android static/shared libraries, with GPU support | armeabi-v7a + arm64-v8a + x86 + x86_64 |
ncnn-apple.zip | apple xcframework, ios + ios-simulator + macos + mac-catalyst, with and w/o bitcode | armv7 + arm64 + arm64e + i386 + x86_64 |
ncnn-apple-vulkan.zip | apple xcframework, ios + ios-simulator + macos + mac-catalyst, with GPU support, with and w/o bitcode | arm64 + arm64e + x86_64 |
ncnn-ios.zip | ios static library, with and w/o bitcode | armv7 + arm64 + arm64e |
ncnn-ios-vulkan.zip | ios static library, with GPU support, with and w/o bitcode | arm64 + arm64e |
ncnn-ios-simulator.zip | ios simulator static library, with and w/o bitcode | i386 + x86_64 + arm64 |
ncnn-ios-simulator-vulkan.zip | ios simulator static library, with GPU support, with and w/o bitcode | x86_64 + arm64 |
ncnn-macos.zip | macos static library | x86_64 + arm64 |
ncnn-macos-vulkan.zip | macos static library, with GPU support | x86_64 + arm64 |
ncnn-mac-catalyst.zip | mac catalyst static library, with and w/o bitcode | x86_64 + arm64 |
ncnn-mac-catalyst-vulkan.zip | mac catalyst static library, with GPU support, with and w/o bitcode | x86_64 + arm64 |
ncnn-ubuntu.zip | ubuntu linux static/shared libraries, with GPU support, model conversion tools | x86_64 |
ncnn-windows.zip | windows static/shared libraries, with GPU support, model conversion tools | x86 + x64 + arm + arm64 |
ncnn-webassembly.zip | webassembly static library | wasm32 + simd + threads + simd-threads |
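Implement all binaryop explicit broadcasting rule types
avx2/avx512 optimization of the weight transform for x86 direct convolution
x86 int8 direct convolution supports arbitrary elempack, with sse2/xop/avx2/avx512/vnni optimizations
ppc64 power8/power9 vsx toolchain support, compiler checks, and intrinsic translation optimizations (@JeremyRand)
Update glslang, enable the VK_KHR_cooperative_matrix extension and optimizations
Fix model weight loading for pyncnn custom layers
c_api adds Mat border/layer_to_index api (@Mek101)
VkCompute::submit_and_wait can now return an error value (@Upliner)
Fix the too many microtasks issue when compiling with older clang versions
Fix cpuid function compatibility with clang-cl (@charlescao460)
Fix the c++17 build issue with newer protobuf versions
Fix the recursive sleep call error on older compilers (@whyb)
Check for loongarch lasx extension support at build time and enable it automatically
Clean up the multiheadattention arm optimization code
binaryop supports the one-dimensional outer axis broadcasting rule while keeping the old compatible behavior
benchncnn supports specifying a custom model and input via command-line arguments (@tpoisonooo)
Link the required system libraries for static builds on macos (@Baiyuetribe)
Change the video memory allocation strategy on amd integrated GPUs to prefer device-only memory, fixing allocation failures when a large video memory size is set in the bios
onnx2ncnn prints an error message when it encounters an unsupported transpose type (@huoshuai-dot)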
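pnnx supports multi-operator to multi-operator graph transformation
pnnx adds conversion of torch.round/trunc/fill/index_put/to/type_as/topk/fmod/cross/t/maximum/minimum
pnnx fuses the chinese-clip/sam-image-encoder attention structures
pnnx fuses F.scaled_dot_product_attention
pnnx eliminates redundant expand/expand_as/type_as
pnnx fixes the weight transform error when optimizing fp16 models
pnnx fixes out-of-bounds access with negative shape indices (@Justin62628)
pnnx fixes the permission error when running the converted py file (@zhenjiaguo)
pnnx automatically adds reshape after converting to ncnn global pooling
pnnx converts convolutions with padding modes other than zeros to ncnn
pnnx converts 2-D nn.Linear to ncnn gemm
pnnx converts torch.stack to ncnn concat+reshape
pnnx converts torch.t to ncnn permute (@XiaBing992)
pnnx converts logsigmoid/log_softmax to ncnn sigmoid/softmax+log (@lrw04)
pnnx fixes the type information of slice_copy outputs
pnnx fixes int64 conversion overflow in expressions
pnnx fixes ghost nodes left after reshape expression elimination
pnnx folds scalar-like weights of shape 1 when fusing expressions
pnnx expression fusion supports max/min
pnnx improves output node connection detection when the graph contains inplace operations, enabling more constant folding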
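Add ncnn glsl extension documentation, including a Chinese version (@whyb)
Fix errors in the faq document (@KYShek)
Improve the wording of the cmake hint when looking for vulkan (@zchrissirhcz)
Update details of the vs2017 build steps (@brcarry)
Add Intel oneAPI build steps (@mizu-bai)
Update the loongarch ci toolchain, add loongarch lsx coverage
Update python ci versions, add the python-3.12 package
Update rpi3b+/rpi4b test data
Update huawei kunpeng 920 test data (@MobtgZhang)
Add 3A6000 and TH1520 gpu test data (@Rabenda)
Add RDK X3 Module test data (@LJoson)
Fix the ios simulator gpu badge (@732857315)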
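Several of the pnnx conversions listed above rewrite one PyTorch operator as a chain of simpler ncnn operators (torch.stack as concat+reshape, logsigmoid/log_softmax as sigmoid/softmax+log). The following is a minimal pure-Python sketch of the identities behind those rewrites, not pnnx's actual implementation. All function names are hypothetical, and the stack case is only illustrated for equal-length 1-D inputs stacked along a new leading axis.

```python
import math


def softmax(xs):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def log_softmax_reference(xs):
    # Direct definition: x_i - log(sum_j exp(x_j)).
    m = max(xs)
    lse = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - lse for x in xs]


def log_softmax_decomposed(xs):
    # The decomposed form: softmax followed by an elementwise log.
    return [math.log(p) for p in softmax(xs)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def logsigmoid_reference(x):
    # Direct definition: -log(1 + exp(-x)).
    return -math.log1p(math.exp(-x))


def logsigmoid_decomposed(x):
    # The decomposed form: sigmoid followed by log.
    return math.log(sigmoid(x))


def concat(rows):
    # Concatenate 1-D vectors end to end into one flat vector.
    flat = []
    for row in rows:
        flat.extend(row)
    return flat


def reshape_2d(flat, k, n):
    # View a flat vector of k * n elements as k rows of n elements.
    if len(flat) != k * n:
        raise ValueError("size mismatch")
    return [flat[i * n:(i + 1) * n] for i in range(k)]


def stack_reference(rows):
    # Stacking k vectors of length n along a new leading axis gives k x n.
    return [list(row) for row in rows]


def stack_decomposed(rows):
    # The decomposed form: concat followed by reshape.
    return reshape_2d(concat(rows), len(rows), len(rows[0]))
```

The decomposed forms are mathematically equal to the direct definitions. For inputs of large magnitude, the separate log step can lose precision or underflow where a fused formulation would not, which is a general property of the decomposition rather than a statement about how ncnn behaves.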
New Contributors
- @charlescao460 made their first contribution in #4738
- @Justin62628 made their first contribution in #4765
- @zhenjiaguo made their first contribution in #4801
- @732857315 made their first contribution in #4836
- @KYShek made their first contribution in #4837
- @JeremyRand made their first contribution in #4807
- @Upliner made their first contribution in #4828
- @Mek101 made their first contribution in #4855
- @brcarry made their first contribution in #4872
- @Rabenda made their first contribution in #4894
- @Baiyuetribe made their first contribution in #4859
- @XiaBing992 made their first contribution in #4940
- @lrw04 made their first contribution in #4925
Full Changelog: 2023051...2023081