Agner's CPU blog
Agner's CPU blog • VMOVHLPS missing from instruction tables?
Agner's CPU blog • Performance implications of mixing AVX-512 and AVX2 code on modern Intel CPUs.
Agner's CPU blog • Advanced Vector Extensions 3 (AVX-3) - Practical Performance Considerations and Optimizations.
Agner's CPU blog • converting between 16bit unsigned and 32bit signed data using VCL (v2)
Agner's CPU blog • What's new about Zen 5 and Arrow Lake?
Agner's CPU blog • Testp and github
Agner's CPU blog • Is using BSF instruction instead of using GNU C __builtin_ctz inefficient?
Agner's CPU blog • Efficiency of array<Vec32uc, 8> vs. ContainerV<Vec32uc, 8>
Agner's CPU blog • Suggestion: Stop using "vector" for computer science
Agner's CPU blog • Testp Question