RSS.Social

Erik Bernhardsson

Posts

It's hard to write code for computers, but it's even harder to write code for humans

Predicting solar eclipses with Python

Simple sabotage for software

What I have been working on: Modal

We are still early with the cloud: why software development is overdue for a change

σ-driven project management: when is the optimal time to give up?

Storm in the stratosphere: how the cloud will be reshuffled

What is the right level of specialization? For data teams and anyone else.

Building a data team at a mid-stage startup: a short story

Software infrastructure 2.0: a wishlist

What's Erik up to?

Giving more tools to software engineers: the reorganization of the factory

Developer experience as a competitive advantage

Mortality statistics and Sweden's "dry tinder" effect

How to set compensation using commonsense principles

Never attribute to stupidity that which is adequately explained by opportunity cost

How to hire smarter than the market: a toy model

What can startups learn from Koch Industries?

We're hiring at Better

Buffet lines are terrible, but let's try to improve them using computer simulations

Miscellaneous unsolicited (and possibly biased) career advice

Modeling conversion rates using Weibull and gamma distributions

Why software projects take longer than you think: a statistical model

Headcount goals, feature factories, and when to hire those mythical 10x people

Data architecture vs backend architecture

The hacker's guide to uncertainty estimates

I don't want to learn your garbage query language

Business secrets from terrible people

New approximate nearest neighbor benchmarks

Missing the point about microservices: it's about testing and deploying independently

Interviewing is a noisy prediction problem

Waiting time, load factor, and queueing theory: why you need to cut your systems a bit of slack

Lessons from content marketing myself (aka blogging) for five years

New benchmarks for approximate nearest neighbors

I'm looking for data engineers

Books I consumed in 2017

Plotting author statistics for Git repos using Git of Theseus

Toxic meeting culture

Learning from users faster using machine learning

Annoy 1.10 released, with Hamming distance and Windows support

Why conversion matters: a toy model

On the Equifax breach and how to really prevent identity theft

The number of letters in the word for each number

The software engineering rule of 3

Machine, Platform, Crowd

Google diversity memo, global warming, Pascal's wager, and other stuff

Fun with trigonometry: the world's most twisted coastline

Optimizing for iteration speed

Blogroll

Conversion rates – you are (most likely) computing them wrong

The mathematical principles of management

The eigenvector of "Why we moved from language X to language Y"

Why I went into the mortgage industry

Language pitch

Functional programming is the libertarianism of software engineering

The half-life of code & the ship of Theseus

Are data sets the new server rooms?

Pareto efficiency

State drift

When machine learning matters

Subway waiting math

Approximate nearest news

What is your motivation?

Dollar cost averaging

Why organizations fail

NYC subway math

Exploding offers are bullshit

Meta-blogging

Iterate or die

My issue with GPU-accelerated deep learning

Some more font links

Analyzing 50k fonts using deep neural networks

I believe in the 10x engineer, but...

Books I read in 2015

More MCMC – Analyzing a small dataset with 1-5 ratings

There is no magic trick

Installing TensorFlow on AWS

Looking for smart people

MCMC for marketing data

Interview with a Data Scientist: Erik Bernhardsson

Nearest neighbors and vector models – epilogue – curse of dimensionality

Nearest neighbors and vector models – part 2 – algorithms and data structures

Nearest neighbor methods and vector models – part 1

Presentations about Spotify music recommendations

Antipodes

Software Engineers and Automation

coin2dice

Benchmark of Approximate Nearest Neighbor libraries

More Luigi alternatives

3D in D3

The hardest challenge about becoming a manager

The lane next to you is more likely to be slower than yours

Better precision and faster index building in Annoy

Annoy – now without Boost dependencies and with Python 3 Support

Ping the world

Black Box Machine Learning in the Cloud

It's called Berkson's paradox!

Norvig's claim that programming competitions correlate negatively with being good on the job

Pinterest open sources Pinball

The relationship between commit size and commit message size

My favorite management failures

Leaving Spotify

Scala Data Pipelines for Music Recommendations

Everything I learned about technical debt

I already found the best gifs

A brief history of Hadoop at Spotify

Luigi Presentation @ NYC Data Science, Dec 16, 2014

Luigi talk tomorrow

Deep learning for… Go

Deep learning for… chess (addendum)

Deep learning for... chess

Optimizing things: everything is a proxy for a proxy for a proxy

Luigi conquering the world

Annoying blog post

The Filter Bubble is Silly and you Can't Guess What Happened Next

Detecting corporate fraud using Benford's law

Running Theano on EC2

In defense of false positives (why you can't fail with A/B tests)

Recurrent Neural Networks for Collaborative Filtering

Where do locals go in NYC?

How to build up a data team (everything I ever learned about recruiting)

The power of ensembles

MLConf 2014

Music recommendations using cover images (part 1)

Luigi success

Welcome Echo Nest!

Momentum strategies

Ratio metrics

Benchmarking nearest neighbor libraries in Python

More recommender algorithms

Microsoft's new marketing strategy: give up

Bagging as a regularizer

Model benchmarks

statself.com

Implicit data and collaborative filtering

Vote for our SXSW panel!

What's up with music recommendations?

3D

2D embedding of 5k artists = WIN

Delivering Music Recommendations

ML+Hadoop at NYC Predictive Analytics

HubSpot's Picture Shows how to Maintain Monocultures in the 21st Century

More Luigi: Presentation from OSCON

Optimizing over multinomial distributions

More Luigi!

hdfs2cass

NoDoc

Wikiphilia

Spotify's Discovery page

Fermat's principle

Snakebite

Stuff that bothers me: “100x faster than Hadoop”

Presentation about Luigi

Being data driven

Annoy

More Luigi!

ML at Twitter

I'm featured in Mashable

Slides from NYC Machine Learning talk

NYC Machine Learning meetup

Momentum and mean reversion might just be volatility bias

Calculating cosine similarities using dimensionality reduction

Tumblr's awesome project names

A neat little trick with time decay

Luigi: complex pipelines of tasks in Python

About

Domains for sale

Home

Open source

Top posts