Statistical Modeling, Causal Inference, and Social Science

Posts

If you’re interested in the Box-Cox power transformation . . .

Seven-parameter drift-diffusion pdfs and cdfs now in Stan

“We conclude that apparent effects of growth mindset interventions on academic achievement are likely attributable to inadequate study design, reporting flaws, and bias.”

StanCon 2026 registration and abstract submission are now open

Is there a through line from B.S. numbers in junk science to B.S. numbers coming from the government?

Survey Statistics: divine probabilities

“it has been argued that current chatbots may pose a risk of amplifying delusional thinking in vulnerable users, due to their tendency to sycophantic and overly validating behaviour”

More on school reform, this time New Orleans

My new class this spring: POLS 4280, Rationalizing the World: The Hopes and Disappointments of American Social Science from 1900 to the Present

The life of the artist “is a constant–and constantly losing–battle to keep at bay, on one side, the permanent shortfall of physical and mental abilities in the context of the perfection that art strives to be, and, on the other side, the inevitable arrival of silence and death.”

This guy hates sociology.

An idea for getting approximately calibrated 50% subjective probability ranges

The Sapolsky Sanction

When the numbers don’t look right, check them! (Mississippi education update)

From Bayesian inference to LLMs (Steve Bronder’s 2025 CppCon talk)

Gerd Gigerenzer on the legacy of Daniel Kahneman

Survey Statistics: probability samples vs epsem samples vs SRS samples

Donald Trump and Jeffrey Epstein in a remake of The Big Clock

Effective sample size depends on the quantity

How much of “Mississippi’s education miracle” is an artifact of selection bias?

Sermet Pekin’s open-source project that discovers blogs through recursive network exploration

What’s it like to be the child of a white-collar criminal?

Flagging when the prior distribution is informative

Larry Summers, Ken Starr, Jeffrey Epstein, and everyone else

Effective sample size

The three funniest items on the Kroger recall list

The purpose of science vs. the purpose of scientists

Survey Statistics: quantity vs quality

Problems with the so-called gender equality paradox

Some thoughts on empirical distributions of z-scores

Who has the lowest Erdos-Bacon-Epstein number?

The signal-to-noise ratio in statistics

“The Limits of Ethical AI”

Who wants kid vax mandate?

“What do you think is the ideal number of children for a family to have?” Two different statistical measurement challenges arise from this one question on the General Social Survey.

Three meta-principles of statistics: the information principle, the methodological attribution problem, and different applications demand different philosophies

StatRetro: The twitter feed that spits out our old blog posts, one at a time, every 8 hours

This guy’s mad about fake research, and he should be. Research incompetence, research fraud, and the promotion of fraudulent or incompetent work . . . these are not victimless crimes.

Survey Statistics: sampling the sample

Under what sort of systematic reporting errors will science be self-correcting, or not? And do gardening programs reduce obesity?

The Aristocrats! (Found poetry in the email archive)

A Borgesian blog idea (and nothing to do with forking paths)

Sociology of science: What does it take for erroneous or fraudulent claims to take hold?

The density of fraud

The fifth anniversary of a viral histogram

How is it that this problem, with its 21 data points, is so much easier to handle with 1 predictor than with 16 predictors?

The Democrats were lucky that the election was last week and not next week.

The value of close reading: Larry Summers edition

Noted economist likes to talk about demographics but he doesn’t know the actual facts.

“Belief in the law of small numbers” as a way to understand the replication crisis and silly researchers who continue to cite discredited behavioral research

Survey Statistics: weights and MRP for voters

From the three branches of government to the bidirectional nature of legal reasoning in a way that is similar to how statistics works, and should work, in the real world

The acupuncture paradox and its resolution

What intro stats textbook to use?

“Science and Religious Dogmatism”

MSc and PhD programs in statistics at the University of British Columbia

Conflicting statistical evidence on the long-term effects on children of being whacked by their parents

The model underlying R-hat and a Bayesian estimator

If Cuomo had been able to run against Mamdani head-to-head, would he have won?

The theoretical appeal of the Cuomo non-party campaign for mayor

Discrepancies between polls and election outcomes in 2025

Predictive Modelling for Football Analytics is available!

Survey Statistics: continued struggles with equivalent weights

Polls & Betting odds & Nonsampling errors & Win probabilities & Vote margins

The Netherlands Food and Consumer Product Authority is looking for an applied statistician with expertise in Bayesian statistics or causal inference

Donald Trump and Joe McCarthy

An economist reports that the cost data from Medicare are “completely irrelevant. It is clearly measures of net costs that matter, but only gross costs (analogous to sticker prices) are provided. Similar issues arise with recent requirements for hospital price transparency.”

Hilarious Ted Talk bio: “he sold the second most expensive picture at his first exhibition — without really being able to paint.”

Studying sex ratios is just a lot harder than you think: effects are tiny and variation is large.

The last time it seemed that the country was coming apart

The WAR war and the electoral benefits of running more moderate candidates for political office

No, this is not “the most unpredictable race for mayor that New York City has seen in decades”

Survey Statistics: Blue Rose Research is hiring!

The stated purpose of a program is not always the same as its real purpose

Reading like it’s 1937

Hey Philip Larkin, I don’t get what you were saying here!

Even the easiest data requests can require some effort

Assistant professor positions at USI in Lugano

David Owen writes about hearing aids

Was Admiral Poindexter a terrorist? (Who’s in charge of your prediction market?)

Here’s a statistics research project for you: Is the skewness of the distribution of the empirical correlation coefficient asymptotically proportional to the correlation?

Survey Statistics: individualism doesn’t work

Sabbatical and pre-faculty positions at Flatiron Institute in NYC

Reanalysis of that Nobel prizewinning study of patents and innovation

Bayesian probability, like frequentist probability, is a model-based activity that is mathematically anchored by physical randomization at one end and calibration to a reference set at the other

Collective sensemaking event in NYC, October 26

Aversive statistical methods explain differences in “dark” publication in PNAS across subject areas

The war on data, 2025 edition

Separating the whack from the chaff in critiques of decision theory

The importance of essentialism in children’s and adults’ conceptions of the world

Generalizing Treatment Effects from Trials to EHR Populations (Qixuan Chen’s talk this Tues morning)

“All Our Default Models Are Wrong: Causal inference for varying treatment effects”: my talk this Saturday morning in Ottawa

Questions about statistical claims in paper from recent Nobel prize winners; some general challenges in trying to understand nonlinear patterns using quadratic regression

Survey Statistics: MRPW

Stockholm Syndrome

StanCon 2026 in Uppsala, Sweden

“The Impossible Man”: Patchen Barss’s biography of Roger Penrose

This is what a degree in cannabis studies will get ya

7 reasons to use Bayesian inference!

Columbia fake U.S. News statistics update: They paid $9 million and are still, bizarrely, refusing to admit misreporting of data, even though everybody knows they misreported data.

The worst research papers I’ve ever published

Prior distributions for regression coefficients

Selection bias in junk science: Which junk science gets a hearing?

Aki looking for a doctoral student to develop Bayesian workflow

Survey Statistics: struggles with equivalent weights

Historical American Political Finance Data at the National Archives

“300 Paintings”

When rich people believe, or pretend to believe, stupid things (tennis edition)

Uncanny academic valley: Brian Wansink as proto-chatbot

Unusual consulting request

It’s a JAX, JAX, JAX, JAX World

Adding noise to the data to reduce overfitting . . . How does that work?

“It’s horrible that they’re sucking young researchers into this vortex. It’s Gigo and Gresham all the way down.”

Yes, your single vote really can make a difference! (in Canada)

Survey Statistics: beyond balancing

“Dangerous Fictions” and the norm of entertainment

Behind-the-Scenes Seminar on social science this Fri 3 Oct

Game theory corner: did Eric Adams play his hand well? (It’s a little like Murder on the Orient Express, it’s a little like The Sting.)

“Veridical (truthful) Data Science”: Another way of looking at statistical workflow

In music, literature, and technical writing, the relation of large-scale structure to the local action

A Selective History of Political Polling and Election Forecasting

The Dodgers are hiring

“On the poor statistical properties of the P-curve meta-analytic procedure”

More on the decline and fall of Steven Levitt

Survey Statistics: Fat Bear Week

Bridging prediction and intervention in social systems

World’s greatest 404 page

Protecting data from the public and ourselves

It’s JAMA time, baby! Junk science presented as public health research

Who gets listed first on a collaborative article or book?

Monty Hall and generative modeling: Drawing the tree is the most important step

When thinking about causal inference, mechanistic or process models are important. I think that the association of “causal” with black-box models leads to lots of problems.

Condition numbers for HMC and the funnel

More Howl, after Allen Ginsberg for the AI-headed hipsters

“Why probability probably doesn’t exist (but it is useful to act like it does)”

The Miami Marlins are hiring

Hey, Nature magazine! Reputation is a two-way street.

Survey Statistics: random sampling is not leaving

Stats and ML postdoc and permanent hiring season officially open at Flatiron

Softverse: Auto-compute Citations to Software From Replication Files

The Desperation of Causal Inference in Ecology

Princeton Consumer Research reports a 93.94% success rate . . . not quite as good as Harvard, which gets you to “statistically indistinguishable from 100%”!

Helen DeWitt says, “programming occupies a place similar to that of literacy in mediaeval England.”

BDA3 for free

Draw it with your eyes closed: the art of the statistics assignment

That external validity question: How to think about a 3-year UBI study?

Hot social science topics 20 years ago and hot social science topics now

Howl, after Allen Ginsberg (for the AI-headed hipsters)

Going beyond naive individualistic models of social science

Survey Statistics: Imputation II

Show, don’t tell: ChatGPT 5 marginalizing Gelman’s measurement error model in Stan

Hypertext as constructed and hypertext as read

You learn about possible plagiarism in a literary work. How does that affect your view of it? (The A. J. Finn story)

This post is not about Newt Gingrich and Fox news, nor is it about Michio Kaku and string theory.

Weighting of evidence and conflict of interest at the FDA and elsewhere

Experimentation and thinking at the level of a program of experiments

Generate but verify: Reconciling the evident utility of chatbots in many settings with chatbots’ evident lack of understanding

Blogging’s a great way to express your ideas.

“Assembling an unbiased jury”?