Chris McCormick
Output Latent Spaces in Multihead Attention
Reading and Writing with Projections
The Inner Workings of Multihead Latent Attention (MLA)
Patterns and Messages - Part 6 - Vocabulary-Based Analysis
Patterns and Messages - Part 5 - The Residual Stream
Patterns and Messages - Part 4 - Attention as a Dynamic Neural Network
Patterns and Messages - Part 3 - Alternative Decompositions
Patterns and Messages - Part 2 - Token Communication
Patterns and Messages - Part 1 - The Missing Subscript
Patterns and Messages: A New Framing of Transformer Attention
The Inner Workings of DeepSeek-V3
How Reasoning Works in DeepSeek-R1
Continuing Pre-Training on Raw Text
Fine-Tuning Llama 3 for Sentence Classification
QLoRA and 4-bit Quantization
Colab GPUs Features & Pricing
Summarizing Long PDFs with ChatGPT
Choosing a Sampler for Stable Diffusion
Classifier-Free Guidance (CFG) Scale
Steps and Seeds in Stable Diffusion
How Stable Diffusion Works
How img2img Diffusion Works
What You Can Reasonably Expect from Stable Diffusion
Combining Categorical and Numerical Features with Text in BERT
How To Build Your Own Question Answering System
2020 NLP and NeurIPS Highlights
How to Apply BERT to Arabic and Other Languages
Smart Batching Tutorial - Speed Up BERT Training
GPU Benchmarks for Fine-Tuning BERT
Domain-Specific BERT Models
Existing Tools for Named Entity Recognition
Trivial BERsuiT - How much trivia does BERT know?
Question Answering with a Fine-Tuned BERT
BERT Research - Ep. 1 - Key Concepts & Sources
GLUE Explained: Understanding BERT Through Benchmarks
Matrix Operations in NumPy vs. Matlab
XLNet Fine-Tuning Tutorial with PyTorch
BERT Fine-Tuning Tutorial with PyTorch
BERT Word Embeddings Tutorial
The Inner Workings of word2vec
Applying word2vec to Recommenders and Advertising
Product Quantizers for k-NN Tutorial Part 2
Product Quantizers for k-NN Tutorial Part 1
k-NN Benchmarks Part I - Wikipedia
Concept Search on Wikipedia
Getting Started with mlpack
Word2Vec Tutorial Part 2 - Negative Sampling
DBSCAN Clustering
Interpreting LSI Document Similarity
Word2Vec Resources
Word2Vec Tutorial - The Skip-Gram Model
Google's trained Word2Vec model in Python
Latent Semantic Analysis (LSA) for Text Classification Tutorial
Jeff Dean Keynote at GTC 2015
Notes on PageRank
Matrix Multiplication with cuBLAS Example
RBFN Tutorial Part II - Function Approximation
Document Clustering Example in SciKit-Learn
MinHash Tutorial with Python Code
Experiences Renting GPU Instances
Understanding the DeepLearnToolbox CNN Example
What is an L2-SVM?
Fast Euclidean Distance Calculation with Matlab Code
Gaussian Mixture Models Tutorial and MATLAB Code
Intuition Behind Whitening Image Patches
Mahalanobis Distance
Deep Learning Tutorial - Convolutional Neural Networks
Deep Learning Tutorial - Self-Taught Learning & Deep Networks
Deep Learning Tutorial - Softmax Regression
Deep Learning Tutorial - PCA and Whitening
Deep Learning Tutorial - Sparse Autoencoder
Stanford Deep Learning Tutorial
Gradient Descent Derivation
Kernel Regression
Stereo Vision Tutorial - Part I
AdaBoost Tutorial
OpenCV HOG Detector: Result Clustering
RBF Network MATLAB Code
Radial Basis Function Network (RBFN) Tutorial
The Gaussian Kernel
K-Fold Cross-Validation, With MATLAB Code
HOG Descriptor in MATLAB
HOG Person Detector Tutorial
Gradient Vectors
SVM Tutorial - Part I
Stanford Machine Learning - Lecture 6
Stanford Machine Learning - Lecture 5
Stanford Machine Learning - Lecture 3
Stanford Machine Learning - Lecture 2
Stanford Machine Learning - Lecture 1
Stanford Machine Learning Course
UCF Lecture 6 - Optical Flow
Canny Edge Detector
Laplacian Of Gaussian (Marr-Hildreth) Edge Detector
Gaussian Filter
Filter Masks
Image Derivative
UCF Lecture 01 - Introduction To Computer Vision
UCF Computer Vision Lecture Series
OpenCV SIFT Tutorial
OpenCV Setup in Visual Studio 2010
Hand Pose Recognition With Microsoft Kinect and CogniMem V1KU