Biased and Inefficient

Posts

Stage vs phase, again

Are predictive models enough?

New in the survey package

Do predictive models need to be causal?

Gauss is Not Mocked

Simulation and CLT

Horses or Zebras?

Does svyglm use robust standard errors?

Laws and Orders

Interviewing your laptop

AIC and combined discrete/continuous models

Single Transferable Vote and blocs

Terminology: asymptotically unbiased

Included-variable bias

Ihaka lectures 2025

Two-stage least squares

Strata and clusters

Iris classification: the next generation

Coupling simulations and the "reparametrisation trick"

Ordinal data: taking transformation invariance seriously

The piranha and the polypill

Collinearity four more times

Brute force and ignorance

Two approaches to approximating sums of chisquareds

National Land Transport Plan graphs

Ihaka Lectures

The missing test in survey regression models

Another way to not sample without replacement

A Bayesian t-test, again

Stage vs phase

Estimator vs estimate

Automatic transformation of standard errors?

S3 method dispatch on other arguments

Crossvalidation in complex survey data

Choosing frame weights in dual-frame surveys

Another update on non-transitive dice

Multiple frame sampling

Importance weights

Assumptions

Quantitative graphics?

Symbolically nested

Factors as factors

Small-area estimates by smoothing direct estimates

New in the survey package

Ordinal outcomes: the LOCT DOOR

Recurrent events: increased susceptibility or latent risk?

Asymptotics for linear mixed models

Why do the Rao-Scott tests have good size?

How good is the leading eigenvalue approximation to quadratic forms?

Why not REML?

Sparse correlation and the Central Limit Theorem

svy2lme: the preprint

Linear mixed models with pairwise likelihood

Benchmark Archaeology

Quoting and requoting

Blank-cheque inheritance and statistical methods objects

Pairwise likelihood and cluster sizes

New in the survey package

Ranks in survey data

Class imbalance: bug or feature?

Which infinite sequence?

The fourth-root thing

Determinant of correlation matrix

Sandwiches and aggregation

When is population mean rank a thing?

Checking proportionality of odds

Linkage and multiple imputation

Pairwise and joint independence

A short note on effect sizes

The sandwich and the t-test

Bus pruning

Improving a graph

Code archaeology: polynomial distributed lags

A plug-in uniform law of large numbers

Looking back

Tracking down a Real Data Set(tm)

ASCII and beyond — will it play in Peoria?

Tidying rimu

Combining a survey and other data

Self-promotion: an actual multiwave two-phase design

Getting strings into code in base R

Design |> Data: Ihaka Lectures 2022

Tables with zeroes

stringsAsFactors=do_you_feel_lucky

Nine and sixty ways

Comparing tests for generalised linear models in survey data

Optimal design for raking/AIPW estimation

Per capita, in mice

Top posts in 2020

Planning a new data management course

Neyman Allocation, only exact

You will probably not be eaten by a grue

When the sky didn't fall

MOAR survey regression models

A Bayesian t-test?

Weights in statistics

Sourdough happens

New in the survey package

Changing strata mid-stream

Mapping NZ cases of COVID-19

Not cross buns

Quadratic trend tests in survey package

The Ihaka Lectures, Episode 4

Survey package news

Multifactor interventions and interactions

Computer says no

What is 'Data Science Practice'?

How many giraffes?

Hexmaps for NZ District Health Boards

Some things I don’t like about the Oxford-Munich Code of Conduct

How to review a book

(What’s up with the brackets?)

Why isn't rimu tidy?

A package for multiple-response data

Adding new functions to the survey package

Denominator degrees of freedom in svyglm

Wald, score, LRT: the picture

Analysing the mouse microbiome autism data

Confidence intervals: not a very strong property

Design degrees of freedom: brief note

Mean People Tweet

The Reeferendum

Local asymptotic minimax, and nearly-true models

Survey package update

That’s for remembrance

Handling ‘plausible values’ in surveys

Progress on linear mixed models for surveys

Hypergraph network meta-analysis

The school climate strike

Normal horizontiles

Displaying bus punctuality

Absolutely no warranty?

What have I got against the Shapiro-Wilk test?

How do you tell what packages to trust?

Recognising when you don’t know

Two quick survey items

Another way to see why mixed models in survey data are hard:

The Ihaka Lectures 3: Rise of the Machine Learners

Bayesian Surprise — the Shiny app

What are packages for?

svycontrast

Finding principal components without even looking?

Come work with us

Progress on svy2lme

Survey package update

The Kiwi PRNG

How to write a racist AI in R without really trying

Journalism and cyber-bullying

What can data science add to statistics education?

ISCB/ASC talk

Leaflet and buses

Testing probability distribution generators

Quoting and macros in R

e-bike: the reboot

Interlingual

Spell my name with a ‘v’

Statistical software matters

Survey analysis in SQL

New blog home

Biased and Inefficient

Graduation

svylme

Small p hacking

Chebyshev’s inequality and ‘UCL’

Why pairwise likelihood?

Faster generalised linear models in largeish data

Useful debugging trick

More tests for survey data

How to add chi-squareds

Secret Santa collisions

When all U-shaped curves look the same to you

Means of maximums

Haere mai, statistical computing folks

A genome analogy

Bayesian surprise

Visual design of diagnostics

Causes and counterfactuals

Wilcoxon and polymath: another update

The bus bot

Psychoactive substances and Peter Dunne

Tail bounds under sparse correlation

Information and control

Probabilities not bounded away from zero

Two-day course: survival analysis

A possibly unsurprising bootstrap observation

Stupid word games

Pipeable survey analysis in R

Peer review and community endorsement

A ‘polymath’ project on the Wilcoxon test?

Why I like the Convolution Theorem

Case-control efficiency

Order and quotient topologies

“Meritocracy” and “public good”

Hearing things

The Ihaka Lectures

When the bootstrap doesn’t work

Te Reo Māori in schools

Case-control sampling and pseudo-Rsquareds

A bus-watching bot

Mature and premature optimisation

Fixing an infelicity in ‘leaps’

Learning the Monty Hall problem

The ‘iris’ data

Making survey statistics boring and inefficient

Brief quake summary for overseas people

Changes in turnout and preference

Cuts to ‘Growing Up in New Zealand’

Terms to eschew

Large quadratic forms

The hard problem of AI and other stories

Come work with us

On permuting all the things

The lithium-powered space bike

“The” multiple comparisons problem

Like a crossword

Simulations and modes of convergence

Etymology

A modest proposal: Lazy Ambiguous Single Transferable Vote

One scoRe years

How do we prove the Central Limit Theorem?

Computing the (simplest) sandwich estimator incrementally

Are there any news?

Size matters

Sufficiently advanced technology

The Great Kiwi Cherry Ripe Scandal

Mostly dead

Artistic verisimilitude

The conservative Bonferroni correction

Trace estimators and impact factors

A gene for celibacy?

Truthy and Sciency

Coding linear splines

Cheap tricks

Two cheers for crowdfunding

No-one’s forcing you to read the Herald

Stochastic SVD

Is it that time of day?

Another view of the ‘nearly true’ model

What does ‘design-consistent’ even mean?

Circumspice

Superfood sourcing

The Muntab Question Strikes Back

Potential energy and kinetic energy

Case-control estimation is more complicated than you think

A simple probability problem

The Muntab Question

Serious tongue-twister

Poetry visualisation

Should SPRINT have stopped?

Prefiltering very large numbers of tests

Double robustness

Convergent evolution and NZ Bird of the Year

NZ Flag Referendum pseudorandom numbers

Oranges and lemons

(high-dimensional) Space is Big.

Good reasons for assuming a spherical cow

Net Reclassification Index: surprisingly weird.

A conservation tragedy

Colour names from XKCD in R

Fox fails statistics; does NYT?

JSM2015: notes on Seattle from an ex-resident

Pianos, heaps, and ethics of randomisation

Te Wiki o Te Reo Māori

stringsAsFactors = <sigh>

Pi day

A much-needed gap

Countermatching

Zero-inflated Poisson from complex samples

Call me, Ishmael

Superefficiency

Precise answers, but not necessarily to the right question

What’s the right proof of the Continuous Mapping Theorem?

Eppur si muove

Pharmacy ethics

Paper helicopters at a science fair

What does measurability mean?

How hard did you look: equivalence and non-inferiority

Clinically proven ingredients

Science and statistical inference

Assumptions and testing

A transitive test is a test for a univariate parameter

New header picture

Tomato, tomato

Different questions can have different answers

Variation explained and log transformation

How not to treat Ebola

Citations: credit or blame

What science should everyone know?

It depends on what you mean by 'cost'

This is just to say

A people set apart

Miasma and Contagion

Semiparametric efficiency and nearly-true models

Broman's Socks and the Nature of Scientific Reporting

Is it good or bad when confounding adjustment makes no difference?

On dialect

Rhetorical sensitivity analysis

O necessary sinpi

Taking meta-analysis heterogeneity seriously

Survey package update

Feynman and the Suck Fairy

Herd Immunity simulations

Monotonicity and smoothness

Anchoring bias

Randomisation without consent

Einstein, Wikiquote, and fact checking

My likelihood depends on your frequency properties

Chemical nerdview

This is a wug. Now you have two of them.

At risk of vanishing

Moving the goalposts?

From labhacks: the $25 scrunchable scientific poster

A diversity of gifts, but the same spirit

Interaction: 'real' and statistical

Barren proxies

Google completions and sexism

Do you know where it's been?

Rock, paper, scissors, Wilcoxon test

Today we have shaming of prats

Auckland's top news story

Statins and the causal Markov property

PBRF consultation response consultation

An absolutely minimal way to increase invited speaker diversity

What I said on StatsChat only shorter and with more swearing

On the persistence of variation in horn size among Soay sheep

A layperson's view of a science communication problem

SPEED sessions at JSM 2013

In defense of theory

Some failure modes of statistics research talks

Graphs and counterfactuals

Welfare as an addictive drug

Big data linear models

Sparse linear systems and calibration of weights

Problems with faithfulness and the causal Markov property (II)

Problems with faithfulness and the causal Markov property (I)

Upcoming talks and stuff

Two simple notes on error in regression models

When is Bayesian introductory statistics better?

My Setup

Hello World

Talks in the near future

Lorem Ipsum