Tencent/ncnn 20220420 on GitHub

Prebuilt binaries, default configuration, built with: android-ndk-r21d, xcode 12.4, ubuntu-18.04, ubuntu-20.04, vs2015, vs2017, vs2019, vs2022, emscripten-2.0.8

file	content	arch
ncnn-full-source.zip	full source code including all submodules
ncnn-android.zip	android static/shared library	armeabi-v7a + arm64-v8a + x86 + x86_64
ncnn-android-vulkan.zip	android static/shared library, with GPU support	armeabi-v7a + arm64-v8a + x86 + x86_64
ncnn-ios.zip	ios static library, with and without bitcode	armv7 + arm64 + arm64e + i386 + x86_64
ncnn-ios-vulkan.zip	ios static library, with GPU support, with and without bitcode	arm64 + arm64e + x86_64
ncnn-macos.zip	macos static library	x86_64 + arm64
ncnn-macos-vulkan.zip	macos static library, with GPU support	x86_64 + arm64
ncnn-ubuntu.zip	ubuntu linux static/shared library, with GPU support, model conversion tools	x86_64
ncnn-windows.zip	windows static/shared library, with GPU support, model conversion tools	x86 + x86_64
ncnn-webassembly.zip	webassembly static library	wasm32 + simd + threads + simd-threads

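conv vulkan im2col+sgemm optimization
conv vulkan winograd43 optimization
conv vulkan implicit gemm optimization
deconv vulkan sgemm+col2im optimization
conv/deconv vulkan local memory optimization
conv vulkan direct convolution unroll optimization
improved the conv vulkan winograd23/winograd43 selection strategy
fused the pad/crop before and after conv vulkan winograd into the transform
innerproduct vulkan optimization by splitting into two stages
completed conv 1x1 vulkan support for arbitrary packing
completed conv 3x3 winograd vulkan support for arbitrary packing
conv/deconv vulkan pack4 nvidia tensorcore optimization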
x86 sse/avx math function optimization (Yoh-Z)
unaryop x86 optimization (Yoh-Z)
floor/ceil/abs x86 sse optimization (MouriNaruto)
convolution/convolutiondepthwise/innerproduct/padding/pooling/interp/eltwise/crop/reshape/slice/hardsigmoid/swish/binaryop/clip/relu/sigmoid/unaryop x86 avx512 optimization
conv sgemm avx512 optimization
conv3x3 winograd avx512 optimization
deconvolution/deconvolutiondepthwise x86 direct deconvolution implementation
softmax x86 sse/avx/avx512 optimization
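quantize/dequantize/requantize mips msa optimization
conv int8/convdw int8/innerproduct int8 mips msa optimization
multiheadattention arm neon optimization (EdVince)
softmax arm neon optimization
factored the conv3x3 winograd transform out into reusable functions
x86 f16c instruction set detection and dispatch
removed the largely unused avx2-fp16 related code
simpleomp now allows up to 32 microtask arguments
added loongson mmi header and build support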
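added deconv1d and deconv3d, with the corresponding pnnx conversion
fixed avx512 compiler flags for older gcc versions
fixed sigmoid x86 returning nan for very large input values
fixed the "unlocked pool allocator destroyed too early" error in convdw during gpu inference
avoided possible floating-point exceptions during mips msa inference
avoided division by zero when batchnorm loads its parameters
updated modelwriter for the new operators
added the reflect type to copy_make_border
enabled fp16 on mali g31/g52
fixed build issue with the armhf toolchain
global pooling now forces fp32 accumulation to avoid nan
fixed dlsym of getauxval failing on some android systems
fixed tanh compatibility with newer moltenvk versions
factored out the vulkan activation functions, with include support implemented for glsl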
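fixed unit test build failure on armv7 (jasonZhang892)
fixed conv3x3 winograd matrix comments (MouriNaruto)
fixed typos in how-to-build, updated the jetson-nano build documentation (tpoisonooo)
updated the ios build documentation (mirrorsysu)
miscellaneous comment and code cleanup, and compiler warning fixes (tpoisonooo)
fixed word capitalization in the readme (YoungSx)
updated the glslang library list in use-ncnn-with-own-project
ci: added msvc arm/arm64 targets
ci: added linux loongarch target
ci: updated the windows matrix and added the vs2022 target
fixed vs2019 packaging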
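added yolov5_pnnx example
added nanodetplus_pnnx example
reduced post-processing time in the yolov5 example (UNeedCryDear)
fixed box position issue in yolov5.py (hariag)
updated the ls2k1000 benchmark data
pnnx supports converting torch unbind/ones/ones_like/full/full_like/randn_like/empty/empty_like/addmm
pnnx supports torch 1.11.0
ncnn model files converted by pnnx are saved in fp16
pnnx links pthread on linux, fixed the windows minmax build issue
pnnx: added a cmake option for the static msvc crt
fixed the ncnn conversion of pnnx hardtanh parameters
fixed the pnnx macos dynamic library load path issue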
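
As an illustration of the reflect border type mentioned in the copy_make_border entry above, here is a minimal 1-D sketch of reflection padding in plain C++. It does not use the ncnn API: `reflect_index` and `pad_reflect_1d` are hypothetical helpers written for this note. It also assumes the mirror excludes the edge sample (the convention of PyTorch's reflection padding); whether ncnn follows exactly this convention should be checked against its padding layer.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of ncnn): map an index that may fall outside
// [0, n) back into the source by mirroring around the first or last sample.
// The edge sample itself is not repeated (assumed convention).
// Valid only when the overshoot is smaller than n, so one reflection suffices.
static int reflect_index(int i, int n)
{
    if (i < 0) return -i;
    if (i >= n) return 2 * (n - 1) - i;
    return i;
}

// Hypothetical helper (not part of ncnn): pad a 1-D signal with `left` and
// `right` reflected samples.
static std::vector<float> pad_reflect_1d(const std::vector<float>& src, int left, int right)
{
    const int n = static_cast<int>(src.size());
    // Reflection needs at least pad + 1 source samples on each side.
    assert(n > 0 && left >= 0 && right >= 0 && left < n && right < n);

    std::vector<float> dst(left + n + right);
    for (int x = 0; x < left + n + right; x++)
    {
        dst[x] = src[reflect_index(x - left, n)];
    }
    return dst;
}
```

Under these assumptions, padding `{1, 2, 3, 4}` with two samples on each side gives `{3, 2, 1, 2, 3, 4, 3, 2}`. A 2-D version would apply the same index mapping independently to rows and columns.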

New Contributors

@MouriNaruto made their first contribution in #3591
@YoungSx made their first contribution in #3655
@hariag made their first contribution in #3656
@EdVince made their first contribution in #3667
@mirrorsysu made their first contribution in #3696
@jasonZhang892 made their first contribution in #3710
@UNeedCryDear made their first contribution in #3649

Full Changelog: 2022021...2022042

Tencent/ncnn 20220420 android ios macos linux windows webassembly prebuilt libraries 20220420 7600270 on GitHub