Biased and Inefficient
Interviewing your laptop
AIC and combined discrete/continuous models
Single Transferable Vote and blocs
Terminology: asymptotically unbiased
Included-variable bias
Ihaka lectures 2025
Two-stage least squares
Strata and clusters
Iris classification: the next generation
Coupling simulations and the "reparametrisation trick"
Ordinal data: taking transformation invariance seriously
The piranha and the polypill
Collinearity four more times
Brute force and ignorance
Two approaches to approximating sums of chisquareds
National Land Transport Plan graphs
Ihaka Lectures
The missing test in survey regression models
Another way to not sample without replacement
A Bayesian t-test, again
Stage vs phase
Estimator vs estimate
Automatic transformation of standard errors?
S3 method dispatch on other arguments
Crossvalidation in complex survey data
Choosing frame weights in dual-frame surveys
Another update on non-transitive dice
Multiple frame sampling
Importance weights
Assumptions
Quantitative graphics?
Symbolically nested
Factors as factors
Small-area estimates by smoothing direct estimates
New in the survey package
Ordinal outcomes: the LOCT DOOR
Recurrent events: increased susceptibility or latent risk?
Asymptotics for linear mixed models
Why do the Rao-Scott tests have good size?
How good is the leading eigenvalue approximation to quadratic forms?
Why not REML?
Sparse correlation and the Central Limit Theorem
svy2lme: the preprint
Linear mixed models with pairwise likelihood
Benchmark Archaeology
Quoting and requoting
Blank-cheque inheritance and statistical methods objects
Pairwise likelihood and cluster sizes
New in the survey package
Ranks in survey data
Class imbalance: bug or feature?
Which infinite sequence?
The fourth-root thing
Determinant of correlation matrix
Sandwiches and aggregation
When is population mean rank a thing?
Checking proportionality of odds
Linkage and multiple imputation
Pairwise and joint independence
A short note on effect sizes
The sandwich and the t-test
Bus pruning
Improving a graph
Code archaeology: polynomial distributed lags
A plug-in uniform law of large numbers
Looking back
Tracking down a Real Data Set(tm)
ASCII and beyond — will it play in Peoria?
Tidying rimu
Combining a survey and other data
Self-promotion: an actual multiwave two-phase design
Getting strings into code in base R
Design |> Data: Ihaka Lectures 2022
Tables with zeroes
stringsAsFactors=do_you_feel_lucky
Nine and sixty ways
Comparing tests for generalised linear models in survey data
Optimal design for raking/AIPW estimation
Per capita, in mice
Top posts from 2021
Is it binary?
Crossed clustering and parallel invention
Score tests: surprisingly annoying
Ordinal data, metadata, and models
Pictures of code are not code
Wellington buses
The New Oil
Maintenance of Headway
Subsets and subpopulations in survey inference
What's new in the survey package
Not all strictly monotone functions are additive
Housing unaffordability hexmaps
Generalisability, prediction, and causation
A modest proposal for matrix multiplication
Phobos and Deimos and public speaking
Two-phase sampling notation
Co-linearity
They're back!
Emma Lathen e-books
Top posts in 2020
Planning a new data management course
Neyman Allocation, only exact
You will probably not be eaten by a grue
When the sky didn't fall
MOAR survey regression models
A Bayesian t-test?
Weights in statistics
Sourdough happens
New in the survey package
Changing strata mid-stream
Mapping NZ cases of COVID-19
Not cross buns
Quadratic trend tests in survey package
The Ihaka Lectures, Episode 4
Survey package news
Multifactor interventions and interactions
Computer says no
What is 'Data Science Practice'?
How many giraffes?
Hexmaps for NZ District Health Boards
Some things I don’t like about the Oxford-Munich Code of Conduct
How to review a book
(What’s up with the brackets?)
Why isn't rimu tidy?
A package for multiple-response data
Adding new functions to the survey package
Denominator degrees of freedom in svyglm
Wald, score, LRT: the picture
Analysing the mouse microbiome autism data
Confidence intervals: not a very strong property
Design degrees of freedom: brief note
Mean People Tweet
The Reeferendum
Local asymptotic minimax, and nearly-true models
Survey package update
That’s for remembrance
Handling ‘plausible values’ in surveys
Progress on linear mixed models for surveys
Hypergraph network meta-analysis
The school climate strike
Normal horizontiles
Displaying bus punctuality
Absolutely no warranty?
What have I got against the Shapiro-Wilk test?
How do you tell what packages to trust?
Recognising when you don’t know
Two quick survey items
Another way to see why mixed models in survey data are hard:
The Ihaka Lectures 3: Rise of the Machine Learners
Bayesian Surprise — the Shiny app
What are packages for?
svycontrast
Finding principal components without even looking?
Come work with us
Progress on svy2lme
Survey package update
The Kiwi PRNG
How to write a racist AI in R without really trying
Journalism and cyber-bullying
What can data science add to statistics education?
ISCB/ASC talk
Leaflet and buses
Testing probability distribution generators
Quoting and macros in R
e-bike: the reboot
Interlingual
Spell my name with a ‘v’
Statistical software matters
Survey analysis in SQL
New blog home
Biased and Inefficient
Graduation
svylme
Small p hacking
Chebyshev’s inequality and ‘UCL’
Why pairwise likelihood?
Faster generalised linear models in largeish data
Useful debugging trick
More tests for survey data
The Ihaka Lectures
As far as it goes
breakInNamespace
e-bike-onomics
Statistics on pairs
How to add chi-squareds
Secret Santa collisions
When all U-shaped curves look the same to you
Means of maximums
Haere mai, statistical computing folks
A genome analogy
Bayesian surprise
Visual design of diagnostics
Causes and counterfactuals
Wilcoxon and polymath: another update
The bus bot
Psychoactive substances and Peter Dunne
Tail bounds under sparse correlation
Information and control
Probabilities not bounded away from zero
Two-day course: survival analysis
A possibly unsurprising bootstrap observation
Stupid word games
Pipeable survey analysis in R
Peer review and community endorsement
A ‘polymath’ project on the Wilcoxon test?
Value of a degree
Prerequisites
Come work with us
Flat Earthers
Why I like the Convolution Theorem
Case-control efficiency
Order and quotient topologies
“Meritocracy” and “public good”
Hearing things
The Ihaka Lectures
When the bootstrap doesn’t work
Te Reo Māori in schools
Case-control sampling and pseudo-Rsquareds
A bus-watching bot
Mature and premature optimisation
Fixing an infelicity in ‘leaps’
Learning the Monty Hall problem
The ‘iris’ data
Making survey statistics boring and inefficient
Brief quake summary for overseas people
Changes in turnout and preference
Cuts to ‘Growing Up in New Zealand’
Terms to eschew
Large quadratic forms
The hard problem of AI and other stories
Come work with us
On permuting all the things
The lithium-powered space bike
“The” multiple comparisons problem
Like a crossword
Simulations and modes of convergence
Etymology
A modest proposal: Lazy Ambiguous Single Transferable Vote
One scoRe years
How do we prove the Central Limit Theorem?
Computing the (simplest) sandwich estimator incrementally
Are there any news?
Size matters
Sufficiently advanced technology
The Great Kiwi Cherry Ripe Scandal
Mostly dead
Artistic verisimilitude
The conservative Bonferroni correction
Trace estimators and impact factors
A gene for celibacy?
Truthy and Sciency
Coding linear splines
Cheap tricks
Two cheers for crowdfunding
No-one’s forcing you to read the Herald
Stochastic SVD
Is it that time of day?
Another view of the ‘nearly true’ model
What does ‘design-consistent’ even mean?
Circumspice
Superfood sourcing
The Muntab Question Strikes Back
Potential energy and kinetic energy
Case-control estimation is more complicated than you think
A simple probability problem
The Muntab Question
Serious tongue-twister
Poetry visualisation
Should SPRINT have stopped?
Prefiltering very large numbers of tests
Double robustness
Convergent evolution and NZ Bird of the Year
NZ Flag Referendum pseudorandom numbers
Oranges and lemons
(high-dimensional) Space is Big.
Good reasons for assuming a spherical cow
Net Reclassification Index: surprisingly weird.
A conservation tragedy
Colour names from XKCD in R
Fox fails statistics; does NYT?
JSM2015: notes on Seattle from an ex-resident
Pianos, heaps, and ethics of randomisation
Te Wiki o Te Reo Māori
stringsAsFactors = <sigh>
Pi day
A much-needed gap
Countermatching
Zero-inflated Poisson from complex samples
Call me, Ishmael
Superefficiency
Precise answers, but not necessarily to the right question
What’s the right proof of the Continuous Mapping Theorem?
Eppur si muove
Pharmacy ethics
Paper helicopters at a science fair
What does measurability mean?
How hard did you look: equivalence and non-inferiority
Clinically proven ingredients
Science and statistical inference
Assumptions and testing
A transitive test is a test for a univariate parameter
New header picture
Tomato, tomato
Different questions can have different answers
Variation explained and log transformation
How not to treat Ebola
Citations: credit or blame
What science should everyone know?
It depends on what you mean by 'cost'
This is just to say
A people set apart
Miasma and Contagion
Semiparametric efficiency and nearly-true models
Broman's Socks and the Nature of Scientific Reporting
Is it good or bad when confounding adjustment makes no difference?
On dialect
Rhetorical sensitivity analysis
O necessary sinpi
Taking meta-analysis heterogeneity seriously
Survey package update
Feynman and the Suck Fairy
Herd Immunity simulations
Monotonicity and smoothness
Anchoring bias
Randomisation without consent
Einstein, Wikiquote, and fact checking
My likelihood depends on your frequency properties
Chemical nerdview
This is a wug. Now you have two of them.
At risk of vanishing
Moving the goalposts?
From labhacks: the $25 scrunchable scientific poster
A diversity of gifts, but the same spirit
Interaction: 'real' and statistical
Barren proxies
Google completions and sexism
Do you know where it's been?
Rock, paper, scissors, Wilcoxon test
Today we have shaming of prats
Auckland's top news story
Statins and the causal Markov property
PBRF consultation response consultation
An absolutely minimal way to increase invited speaker diversity
What I said on StatsChat only shorter and with more swearing
On the persistence of variation in horn size among Soay sheep
A layperson's view of a science communication problem
SPEED sessions at JSM 2013
In defense of theory
Some failure modes of statistics research talks
Graphs and counterfactuals
Welfare as an addictive drug
Big data linear models
Sparse linear systems and calibration of weights
Problems with faithfulness and the causal Markov property (II)
Problems with faithfulness and the causal Markov property (I)
Upcoming talks and stuff
Two simple notes on error in regression models
When is Bayesian introductory statistics better?
My Setup
Hello World
Talks in the near future
Lorem Ipsum