indii.org / blog
A static for loop in C++
Improving C++ Code Coverage with Gcov, Gcovr and Doxide
Limit clock speed and memory speed on an Nvidia GPU
Matrix Multiplication On GPU: Part 3, Coding for Speed
Matrix Multiplication On GPU: Part 2, Tiling
Matrix Multiplication on GPU: Faster than Nvidia, Sometimes
Gradients of Softmax and Logsumexp
C++: Pattern Matching Template Types
C++: Overloading the Spaceship Operator, A Recipe
C++: Check if a type is an instantiation of a given class template