- Residual block fusion optimization for the cudnn backend, which depends on `custom_winograd=true`. Enabled by default only for networks with up to 384 filters in fp16 mode and never in fp32 mode. The default can be overridden with `--backend-opts=res_block_fusing=false` to disable (or `=true` to enable).
- New experimental cuda backend without a cudnn dependency (`cuda-auto`, `cuda` and `cuda-fp16` are available).
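
As a rough illustration of the two items above, the sketch below builds the corresponding command lines. It assumes the engine binary is invoked as `lc0`, that the backend is chosen with `--backend=`, and that the fp16 cudnn backend is named `cudnn-fp16`; none of these is stated in these notes, so adjust them to your installation. The commands are printed rather than executed, so the snippet runs without a GPU.

```shell
# Assumption: the engine binary is on PATH as `lc0`; these notes do not name it.
ENGINE="lc0"

# Force residual block fusion off (option string taken from the notes above).
# Assumption: `cudnn-fp16` is the name of the fp16 cudnn backend.
FUSION_OFF="$ENGINE --backend=cudnn-fp16 --backend-opts=res_block_fusing=false"

# Select the experimental cuda backend in fp16 mode.
CUDA_FP16="$ENGINE --backend=cuda-fp16"

# Print the commands instead of running them.
echo "$FUSION_OFF"
echo "$CUDA_FP16"
```

Replacing `res_block_fusing=false` with `res_block_fusing=true` would request fusion even where it is off by default, for example in fp32 mode.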