Tencent/ncnn 20220216 on GitHub

Prebuilt binaries, default configuration, built with android-ndk-r21d, xcode 12.4, ubuntu-18.04, ubuntu-20.04, vs2015, vs2017, vs2019, emscripten-2.0.8

file	content	arch
ncnn-full-source.zip	full source code including all submodule code
ncnn-android.zip	android static/shared libraries	armeabi-v7a + arm64-v8a + x86 + x86_64
ncnn-android-vulkan.zip	android static/shared libraries, with GPU support	armeabi-v7a + arm64-v8a + x86 + x86_64
ncnn-ios.zip	ios static library, with and w/o bitcode	armv7 + arm64 + arm64e + i386 + x86_64
ncnn-ios-vulkan.zip	ios static library, with GPU support, with and w/o bitcode	arm64 + arm64e + x86_64
ncnn-macos.zip	macos static library	x86_64 + arm64
ncnn-macos-vulkan.zip	macos static library, with GPU support	x86_64 + arm64
ncnn-ubuntu.zip	ubuntu linux static/shared libraries, with GPU support, model conversion tools	x86_64
ncnn-windows.zip	windows static/shared libraries, with GPU support, model conversion tools	x86 + x86_64
ncnn-webassembly.zip	webassembly static library	wasm32 + simd + threads + simd-threads

conv sgemm pack4/pack1to4/pack4to1 x86 sse2/avx optimization
conv3x3s1 winograd pack4/pack4to1 x86 sse2/avx optimization
conv int8 gemm pack8to4/pack8to1/pack1to8 x86 xop/avx2/avx512-vnni/avx-vnni optimization
conv3x3s1 int8 winograd pack8to4/pack8to1 x86 xop/avx2/avx512-vnni/avx-vnni optimization
scale x86 avx optimization (Yoh-Z)
interp x86 avx optimization (Yoh-Z)
conv pack arm neon optimization
x86 avx512 infrastructure
Enable x86 avx512 compilation and runtime detection by default
Decouple x86 fma from avx2
x86 cpu instruction set detection that does not depend on libgcc
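Support convolution with dynamic weights
Fix possible illegal-instruction errors caused by Mat member functions not being inlined
Fix possible illegal-instruction errors caused by function object instances not being inlined
Fix an error in the unit test comparison function (yyuzhong)
binaryop/unaryop/reduction support 4-dimensional input
Add the Tile layer and conversion of torch.repeat
Add the MatMul layer and conversion of torch.matmul
armv8.2 dot code is compiled so that it can be selected at runtime
Support the sw_64 platform (wzyforgit)
Add a cmake option for the c-api
Add a default mat constructor to the c-api (tpoisonooo)
Simplify the binaryop function object code (tpoisonooo)
Fix incorrect interp nearest results with unconventional scale_factor values
Simplify the forward_n parameter types for c-api custom layers
Remove the warning about falling back to sse2 when compiling without avx2 (kagurazakakotori)
Use _mm_cvtsi128_si64 in 64-bit builds to reduce memory access (kagurazakakotori)
Fix errors in the low-level op api documentation (FeiGeChuanShu)
Fix the missing doffset parameter in the crop test (xh-liu-tech)
Fix arm convolution pack1to4 int8 weight reordering (cmdbug)
Simplify the platform-specific macros in get_current_time (cmdbug)
Fix incorrect results when compiling for armv7 without neon
Add the c906 v223 toolchain (zchrissirhcz)
Add the answer for the second qq technical discussion group (LJoson)
python ci disables building tools and examples
ci disables LTO for shared library builds
ci updates to swiftshader-20220211
Remove travis ci and the related readme entries (proydakov)
Add yolo-fastest model benchmark (dog-qiuqiu)
Update raspberry pi/jetson-nano and other benchmark data from Q-engineering
Add zynq-7020/z8350/n5105 to benchmark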
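pnnx supports converting torch dequantize/quantize_per_tensor/quantized.linearrelu/argmax/argmin/clone/normal/expand/var/amax/amin/logsumexp/prod/sum/arange/matmul/zeros_like/expand_like/deformconv2d/roialign/norm/stack/repeat/zeros/roll/remainder
pnnx automatically removes dropout operators
pnnx automatically removes pad with no pads and noop arithmetic expressions
pnnx constant folding
pnnx converts 4-dimensional constant data
pnnx supports models exported with the half data type
pnnx to ncnn conversion removes trailing reshape/permute
pnnx fuses conv1d-bn and convtranspose1d-bn
pnnx merges a full set of selects along a single dimension into unbind
pnnx ensures operator names are unique
Fix a crash in pnnx to ncnn conversion when an expression cannot be expanded
pnnx to ncnn conversion supports F.pad with negative pads
pnnx to ncnn conversion fuses transpose-matmul
pnnx to ncnn conversion adds rank-raising and rank-lowering reshapes before and after pooling1d/2d/3d to emulate how nn.MaxPool1d/2d/3d handles data without a batch dimension
pnnx shape command-line arguments can specify the input type
pnnx automatically locates the pytorch installation directory (Yutyrannus)
pnnx ci automatically copies dll files (Yutyrannus)
Add usage instructions for the pnnx command-line tool (ling0322)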
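
The changelog entry about adding reshapes before and after pooling can be illustrated with a small sketch. The idea, as that entry describes it, is to raise the rank of an input that has no batch dimension before pooling and to lower it again afterwards. The code below is not pnnx or ncnn code. It is a pure-Python illustration on nested lists, the function names `max_pool1d_batched` and `max_pool1d_unbatched` are hypothetical, and it assumes a stride equal to the kernel size with no padding.

```python
def max_pool1d_batched(x, kernel_size):
    """Max-pool the last axis of a [batch][channel][width] nested list.

    Assumption for this sketch: stride == kernel_size and no padding.
    """
    return [
        [
            [
                max(row[i:i + kernel_size])
                for i in range(0, len(row) - kernel_size + 1, kernel_size)
            ]
            for row in sample
        ]
        for sample in x
    ]


def max_pool1d_unbatched(x, kernel_size):
    """Pool a [channel][width] input by wrapping it in a batch of one."""
    batched = [x]  # raise rank: (C, W) -> (1, C, W)
    pooled = max_pool1d_batched(batched, kernel_size)
    return pooled[0]  # lower rank: (1, C, W') -> (C, W')


# Two channels, width 4, no batch dimension.
result = max_pool1d_unbatched([[1, 3, 2, 5], [4, 0, 7, 6]], kernel_size=2)
print(result)
```

Here `result` is `[[3, 5], [4, 7]]`: each channel is pooled independently, and the temporary batch dimension does not appear in the output.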

New Contributors

@wzyforgit made their first contribution in #3421
@dog-qiuqiu made their first contribution in #3470
@xh-liu-tech made their first contribution in #3475
@ling0322 made their first contribution in #3487
@kagurazakakotori made their first contribution in #3527
@LJoson made their first contribution in #3532
@Yoh-Z made their first contribution in #3540
@yyuzhong made their first contribution in #3556

Full Changelog: 2021120...2022021

Tencent/ncnn 20220216 android ios macos linux windows webassembly prebuilt libraries 20220216 6b2495c on GitHub
