RSS.Social

Ash's Blog

Posts

Fork Union: Beyond OpenMP in C++ and Rust?

Calling CUDA in 3000 Words

The Longest Nvidia PTX Instruction

Hiding x86 Port Latency for 330 GB/s/core Reductions 🫣

Parsing JSON in C & C++: Singleton Tax

10x Faster C++ String Split, 16 Years Later 👴🏻

The Next 31 Years of Developing Unum

Understanding SIMD: Infinite Complexity of Trivial Problems 🔥

5x Faster Set Intersections: SVE2, AVX-512, & NEON 🤐

35% Discount on Keyword Arguments in Python 🐍

NumPy vs BLAS: Losing 90% of Throughput

The Painful Pitfalls of C++ STL Strings 🧵

USearch Molecules: 28 Billion Chemical Embeddings on AWS ⚗️

Binding a C++ Library to 10 Programming Languages 🔟

Python, C, Assembly - 2'500x Faster Cosine Similarity 📐

GCC Compiler vs Human - 119x Faster Assembly 💻🆚🧑‍💻

Accelerating JavaScript arrays by 10x for Vector Search 🏹

Our CPython bindings got 5x faster without PyBind11 🐍

SciPy distances... up to 200x faster with AVX-512 & SVE 📏

Combinatorial Stable Marriages for DBMS Semantic Joins 💍

StringZilla: 5x faster strings with SIMD & SWAR 🦖

Abusing Vector Search for Texts, Maps, and Chess ♟️

Counting Strings in C++: 30x Throughput Difference 💬

We went through life with a smile 💔

Mastering C++ with Google Benchmark ⏱️

Failing to Reach DDR4 Bandwidth 🚌

Crushing CPUs with 879 GB/s Reductions in CUDA

Apple to Apple Comparison: M1 Max vs Intel 🍏

Hyperscaler Shopping List: 2022 Data Center Tech Frenzy ☁️

Only 1% of Software Benefits from SIMD Instructions

Artsakh Must Be Independent 🗺️

The 7 Sins of Turkish Autocracy 🇹🇷

Armenia, Azerbaijan, Turkey. Who's the Aggressor? ⚔️

Come to Armenia 🇦🇲

Positive Outlook on the COVID-19 Crisis 😷

Building AI Safely

What's Wrong with WWDC 2016 Keynote?

Hey, I'm Ash!

Talks & Lectures