Statistical Modeling, Causal Inference, and Social Science
If you’re interested in the Box-Cox power transformation . . .
Seven-parameter drift-diffusion pdfs and cdfs now in Stan
A slew of improvements to NUTS
“We conclude that apparent effects of growth mindset interventions on academic achievement are likely attributable to inadequate study design, reporting flaws, and bias.”
StanCon 2026 registration and abstract submission are now open
Is there a through line from B.S. numbers in junk science to B.S. numbers coming from the government?
Survey Statistics: divine probabilities
“it has been argued that current chatbots may pose a risk of amplifying delusional thinking in vulnerable users, due to their tendency to sycophantic and overly validating behaviour”
More on school reform, this time New Orleans
My new class this spring: POLS 4280, Rationalizing the World: The Hopes and Disappointments of American Social Science from 1900 to the Present
The life of the artist “is a constant–and constantly losing–battle to keep at bay, on one side, the permanent shortfall of physical and mental abilities in the context of the perfection that art strives to be, and, on the other side, the inevitable arrival of silence and death.”
This guy hates sociology.
An idea for getting approximately calibrated 50% subjective probability ranges
The Sapolsky Sanction
When the numbers don’t look right, check them! (Mississippi education update)
From Bayesian inference to LLMs (Steve Bronder’s 2025 CppCon talk)
Gerd Gigerenzer on the legacy of Daniel Kahneman
Survey Statistics: probability samples vs epsem samples vs SRS samples
Donald Trump and Jeffrey Epstein in a remake of The Big Clock
Effective sample size depends on the quantity
How much of “Mississippi’s education miracle” is an artifact of selection bias?
Sermet Pekin’s open-source project that discovers blogs through recursive network exploration
What’s it like to be the child of a white-collar criminal?
Flagging when the prior distribution is informative
Larry Summers, Ken Starr, Jeffrey Epstein, and everyone else
Effective sample size
The three funniest items on the Kroger recall list
The purpose of science vs. the purpose of scientists
Survey Statistics: quantity vs quality
Problems with the so-called gender equality paradox
Some thoughts on empirical distributions of z-scores
Who has the lowest Erdos-Bacon-Epstein number?
The signal-to-noise ratio in statistics
“The Limits of Ethical AI”
Who wants kid vax mandate?
“What do you think is the ideal number of children for a family to have?” Two different statistical measurement challenges arise from this one question on the General Social Survey.
Three meta-principles of statistics: the information principle, the methodological attribution problem, and different applications demand different philosophies
StatRetro: The twitter feed that spits out our old blog posts, one at a time, every 8 hours
This guy’s mad about fake research, and he should be. Research incompetence, research fraud, and the promotion of fraudulent or incompetent work . . . these are not victimless crimes.
Survey Statistics: sampling the sample
Under what sort of systematic reporting errors will science be self-correcting, or not? And do gardening programs reduce obesity?
The Aristocrats! (Found poetry in the email archive)
A Borgesian blog idea (and nothing to do with forking paths)
Sociology of science: What does it take for erroneous or fraudulent claims to take hold?
The density of fraud
The fifth anniversary of a viral histogram
How is it that this problem, with its 21 data points, is so much easier to handle with 1 predictor than with 16 predictors?
The Democrats were lucky that the election was last week and not next week.
The value of close reading: Larry Summers edition
Noted economist likes to talk about demographics but he doesn’t know the actual facts.
“Belief in the law of small numbers” as a way to understand the replication crisis and silly researchers who continue to cite discredited behavioral research
Survey Statistics: weights and MRP for voters
From the three branches of government to the bidirectional nature of legal reasoning in a way that is similar to how statistics works, and should work, in the real world
The acupuncture paradox and its resolution
What intro stats textbook to use?
“Science and Religious Dogmatism”
MSc and PhD programs in statistics at the University of British Columbia
Conflicting statistical evidence on the long-term effects on children of being whacked by their parents
The model underlying R-hat and a Bayesian estimator
If Cuomo had been able to run against Mamdani head-to-head, would he have won?
The theoretical appeal of the Cuomo non-party campaign for mayor
Discrepancies between polls and election outcomes in 2025
Predictive Modelling for Football Analytics is available!
Survey Statistics: continued struggles with equivalent weights
Polls & Betting odds & Nonsampling errors & Win probabilities & Vote margins
The Netherlands Food and Consumer Product Authority is looking for an applied statistician with expertise in Bayesian statistics or causal inference
Donald Trump and Joe McCarthy
An economist reports that the cost data from Medicare are “completely irrelevant. It is clearly measures of net costs that matter, but only gross costs (analogous to sticker prices) are provided. Similar issues arise with recent requirements for hospital price transparency.”
Hilarious Ted Talk bio: “he sold the second most expensive picture at his first exhibition — without really being able to paint.”
Studying sex ratios is just a lot harder than you think: effects are tiny and variation is large.
The last time it seemed that the country was coming apart
The WAR war and the electoral benefits of running more moderate candidates for political office
No, this is not “the most unpredictable race for mayor that New York City has seen in decades”
Survey Statistics: Blue Rose Research is hiring!
The stated purpose of a program is not always the same as its real purpose
Reading like it’s 1937
Hey Philip Larkin, I don’t get what you were saying here!
Even the easiest data requests can require some effort
Assistant professor positions at USI in Lugano
David Owen writes about hearing aids
Was Admiral Poindexter a terrorist? (Who’s in charge of your prediction market?)
Here’s a statistics research project for you: Is the skewness of the distribution of the empirical correlation coefficient asymptotically proportional to the correlation?
Survey Statistics: individualism doesn’t work
Sabbatical and pre-faculty positions at Flatiron Institute in NYC
Reanalysis of that Nobel prizewinning study of patents and innovation
Bayesian probability, like frequentist probability, is a model-based activity that is mathematically anchored by physical randomization at one end and calibration to a reference set at the other
Collective sensemaking event in NYC, October 26
Aversive statistical methods explain differences in “dark” publication in PNAS across subject areas
The war on data, 2025 edition
Separating the whack from the chaff in critiques of decision theory
The importance of essentialism in children’s and adults’ conceptions of the world
Generalizing Treatment Effects from Trials to EHR Populations (Qixuan Chen’s talk this Tues morning)
“All Our Default Models Are Wrong: Causal inference for varying treatment effects”: my talk this Saturday morning in Ottawa
Questions about statistical claims in paper from recent Nobel prize winners; some general challenges in trying to understand nonlinear patterns using quadratic regression
Survey Statistics: MRPW
Stockholm Syndrome
StanCon 2026 in Uppsala, Sweden
“The Impossible Man”: Patchen Barss’s biography of Roger Penrose
This is what a degree in cannabis studies will get ya
7 reasons to use Bayesian inference!
Columbia fake U.S. News statistics update: They paid $9 million and are still, bizarrely, refusing to admit misreporting of data, even though everybody knows they misreported data.
The worst research papers I’ve ever published
Prior distributions for regression coefficients
Selection bias in junk science: Which junk science gets a hearing?
Aki looking for a doctoral student to develop Bayesian workflow
Survey Statistics: struggles with equivalent weights
Historical American Political Finance Data at the National Archives
“300 Paintings”
When rich people believe, or pretend to believe, stupid things (tennis edition)
Uncanny academic valley: Brian Wansink as proto-chatbot
Unusual consulting request
It’s a JAX, JAX, JAX, JAX World
Adding noise to the data to reduce overfitting . . . How does that work?
“It’s horrible that they’re sucking young researchers into this vortex. It’s Gigo and Gresham all the way down.”
Yes, your single vote really can make a difference! (in Canada)
Survey Statistics: beyond balancing
“Dangerous Fictions” and the norm of entertainment
Behind-the-Scenes Seminar on social science this Fri 3 Oct
Game theory corner: did Eric Adams play his hand well? (It’s a little like Murder on the Orient Express, it’s a little like The Sting.)
“Veridical (truthful) Data Science”: Another way of looking at statistical workflow
In music, literature, and technical writing, the relation of large-scale structure to the local action
A Selective History of Political Polling and Election Forecasting
The Dodgers are hiring
“On the poor statistical properties of the P-curve meta-analytic procedure”
More on the decline and fall of Steven Levitt
Survey Statistics: Fat Bear Week
Bridging prediction and intervention in social systems
World’s greatest 404 page
Protecting data from the public and ourselves
It’s JAMA time, baby! Junk science presented as public health research
Who gets listed first on a collaborative article or book?
Monty Hall and generative modeling: Drawing the tree is the most important step
When thinking about causal inference, mechanistic or process models are important. I think that the association of “causal” with black-box models leads to lots of problems.
Condition numbers for HMC and the funnel
More Howl, after Allen Ginsberg for the AI-headed hipsters
“Why probability probably doesn’t exist (but it is useful to act like it does)”
The Miami Marlins are hiring
Hey, Nature magazine! Reputation is a two-way street.
Survey Statistics: random sampling is not leaving
Stats and ML postdoc and permanent hiring season officially open at Flatiron
Softverse: Auto-compute Citations to Software From Replication Files
The Desperation of Causal Inference in Ecology
Princeton Consumer Research reports a 93.94% success rate . . . not quite as good as Harvard, which gets you to “statistically indistinguishable from 100%”!
Helen DeWitt says, “programming occupies a place similar to that of literacy in mediaeval England.”
BDA3 for free
Draw it with your eyes closed: the art of the statistics assignment
That external validity question: How to think about a 3-year UBI study?
Hot social science topics 20 years ago and hot social science topics now
Howl, after Allen Ginsberg (for the AI-headed hipsters)
Going beyond naive individualistic models of social science
Survey Statistics: Imputation II
Show, don’t tell: ChatGPT 5 marginalizing Gelman’s measurement error model in Stan
Hypertext as constructed and hypertext as read
You learn about possible plagiarism in a literary work. How does that affect your view of it? (The A. J. Finn story)
This post is not about Newt Gingrich and Fox news, nor is it about Michio Kaku and string theory.
Weighting of evidence and conflict of interest at the FDA and elsewhere
Experimentation and thinking at the level of a program of experiments
Generate but verify: Reconciling the evidence utility of chatbots in many settings with chatbots’ evident lack of understanding
Blogging’s a great way to express your ideas.
“Assembling an unbiased jury”?