The Shape of Code
Predicting reports of new faults by counting past reports
Advertised prices of desktop computers during the 1990s
Maximum Adds per second for 1950s/early 1960s computers
Chinese research in software engineering
70% of new software engineering papers on arXiv are LLM related
Identifier names chosen to hold the same information
Analysis of some C/C++ source file characteristics
Relative performance of computers since the 1990s
Investigating an LLM generated C compiler
Algorithm complexity and implementation LOC
Dennard scaling a necessary condition for Moore’s law
Public documents/data on the internet sometimes disappears
Number of calls to/from functions vs function length
Formal methods and LLM generated mathematical proofs
Distribution of small project completion times
Modelling time to next reported fault
My 2025 in software engineering
Programming Punched card machines
Naming convergence in a network of pairwise interactions
Christmas books for 2025
Lifetime of coding mistakes in the Linux kernel
Decline in downloads of once popular packages
Occurrence of binary operator overloading in C++
Fifth anniversary of Evidence-based Software Engineering book
Best tool for measuring lots of source code
Distribution of method chains in Java and Python
Finding links between gcc source code and the C Standard
Modeling the distribution of method sizes
Early research on economies of scale for computer systems
Data+code for book: The New C Standard
Distribution of integer literals in text/speech and source code
ISO C++ committee has a new chief sheep herder
Percentage of methods containing no reported faults
Halstead/McCabe: a complicated formula for LOC
Half-life of Open source research software projects
Positive and negative descriptions of numeric data
Predicted impact of LLM use on developer ecosystems
Impact of developer uncertainty on estimating probabilities
A process to find and extract data-points from graphs in pdf files