Naomi Saphra
How to visualize training dynamics in neural networks
PolyPythias: Stability and Outliers across Fifty Language Model Pre-Training Runs
Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon
The AI Researcher's Guide to a Non-Boring Bluesky Feed
Distributional Scaling Laws for Emergent Capabilities
Sometimes I am a Tree: Data drives fragile hierarchical generalization
Attribute Diversity Determines the Systematicity Gap in VQA
Benchmarks as Microscopes: A Call for Model Metrology
Causation Does Not Imply Correlation: A Study of Circuit Mechanisms and Model Behaviors
ChatGPT Doesn't Trust Chargers Fans: Guardrail Sensitivity in Context
Dynamic Masking Rate Schedules for MLM Pretraining
Fast Forwarding Low-Rank Training
First Tragedy, then Parse: History Repeats Itself in the New Era of Large Language Models
Loss in the Crowd: Hidden Breakthroughs in Language Model Training
Mechanistic?
Sudden Drops in the Loss: Syntax Acquisition, Phase Transitions, and Simplicity Bias in MLMs
TRAM: Bridging Trust Regions and Sharpness Aware Minimization
Transcendence: Generative Models Can Outperform The Experts That Train Them
Understanding biological active sensing behaviors by interpreting learned artificial agent policies
The Parable of the Prinia's Egg: An Allegory for AI Science
Delays, Detours, and Forks in the Road: Latent State Models of Training Dynamics
Interpretability Creationism
Linear Connectivity Reveals Generalization Strategies
Shapley Interactions for Complex Feature Attribution
State-of-the-art generalisation research in NLP: a taxonomy and review
Towards out-of-distribution generalization in large-scale astronomical surveys: robust networks learn similar representations
Learning Transductions to Test Systematic Compositionality
One Venue, Two Conferences: The Separation of Chinese and American Citation Networks
The MultiBERTs: BERT Reproductions for Robustness Analysis
Against Monodomainism
A Non-Linear Structural Probe
LSTMs Compose---and Learn---Bottom-Up
Pareto Probing: Trading Off Accuracy for Complexity
Understanding Privacy-Related Questions on Stack Overflow
What Does a Coder Do If They Can't Type?
Carbon AI and the Concentration of Computational Work
Sparsity Emerges Naturally in Neural Language Models
Understanding Learning Dynamics Of Language Models with SVCCA
Model Scheduling
DyNet: The Dynamic Neural Network Toolkit
Evaluating Informal-Domain Word Representations with UrbanDictionary
AMRICA: an AMR Inspector for Cross-language Alignments
A framework for (under)specifying dependency syntax without overloading annotators
An Algerian Arabic-French Code-Switched Corpus
Understanding Objects in Detail with Fine-grained Attributes
Understanding Latent Dirichlet Allocation