New Features:
Implement FMA
SIMD operations are optimized with FMA instructions to reduce the number of operations and increase accuracy. Fewer CPU instructions are generated. All matrix and related vector operations are optimized.
FMA must be enabled for these optimizations (with the -mfma flag on GCC and Clang, /arch:AVX2 with /O1 or /O2 on MSVC).
- optimize mat4 SSE operations with FMA
- optimize mat3 SSE operations with FMA
- optimize mat2 SSE operations with FMA
- optimize affine mat SSE operations with FMA
- optimize vec4 muladd and muladds operations with FMA
New glmm functions (SSE + NEON):
- glmm_vhadd() - broadcast-ed hadd
- glmm_fmadd(a, b, c) - fused multiply add
- glmm_fnmadd(a, b, c) - fused negative multiply add
- glmm_fmsub(a, b, c) - fused multiply sub
- glmm_fnmsub(a, b, c) - fused negative multiply sub
New glmm functions (AVX):
- glmm256_fmadd(a, b, c) - fused multiply add AVX
- glmm256_fnmadd(a, b, c) - fused negative multiply add AVX
- glmm256_fmsub(a, b, c) - fused multiply sub AVX
- glmm256_fnmsub(a, b, c) - fused negative multiply sub AVX
- glm_mat4_scale_avx(mat4 m, float s) - scale matrix with scalar (if AVX enabled)
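As a rough scalar reference for what scaling a matrix with a scalar computes, the sketch below multiplies every element in place. It assumes the AVX version produces the same element-wise result. The name ref_mat4_scale is hypothetical, and a plain float[4][4] stands in for cglm's mat4.

```c
/* Hypothetical scalar reference: multiply all 16 elements by s, in place.
   float[4][4] stands in for cglm's mat4 here. */
static void ref_mat4_scale(float m[4][4], float s) {
  for (int col = 0; col < 4; col++) {
    for (int row = 0; row < 4; row++) {
      m[col][row] *= s;
    }
  }
}
```

Since a 256-bit AVX register holds eight floats, an AVX implementation can cover the same 16 elements with two vector multiplies instead of sixteen scalar ones.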
CMake
- #183: add CMake interface library target (thanks to @legends2k)
Use as header-only library with your CMake project
This requires no building or installation of cglm.
- Example:
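cmake_minimum_required(VERSION 3.8.2)
project(<Your Project Name>)
add_executable(${PROJECT_NAME} src/main.c)
# Link the executable against the header-only interface target
target_link_libraries(${PROJECT_NAME} PRIVATE
  cglm_headers)
# Pull in cglm's CMake targets without building them by default
add_subdirectory(external/cglm/ EXCLUDE_FROM_ALL)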