Yanir Seroussi – AI/ML Engineering Consultant

Posts

Beyond good vibes: Securing AI agents by design

Posting into the void – with guardrails

Data moats, stealthy AI, and more: AI Con 2024 notes

Don't build AI, build with AI

In praise of inconsistency: Ditching weekly posts

Data, AI, humans, and climate: Carving a consulting niche

Juggling delivery, admin, and leads: Monthly biz recap

AI hype, AI bullshit, and the real deal

Giving up on the minimum viable data stack

Keep learning: Your career is never truly done

First year lessons from a solo expertise biz in Data & AI

AI/ML lifecycle models versus real-world mess

Your first Data-to-AI hire: Run a lovable process

Learn about Dataland to avoid expensive hiring mistakes

Exploring an AI product idea with the latest ChatGPT, Claude, and Gemini

Stay alert! Security is everyone's responsibility

Five team-building mistakes, according to Patty McCord

Is your tech stack ready for data-intensive applications?

Dealing with endless data changes

AI ain't gonna save you from bad data

The rules of the passion economy

Startup data health starts with healthy event tracking

How to avoid startups with poor development processes

Plumbing, Decisions, and Automation: De-hyping Data & AI

Adapting to the economy of algorithms

Question startup culture before accepting a data-to-AI role

Probing the People aspects of an early-stage startup

Business questions to ask before taking a startup data role

Mentorship and the art of actionable advice

Assessing a startup's data-to-AI health

AI does not obviate the need for testing and observability

LinkedIn is a teachable skill

My experience as a Data Tech Lead with Work on Climate

The data engineering lifecycle is not going anywhere

Artificial intelligence, automation, and the art of counting fish

Atomic Habits is full of actionable advice

Questions to consider when using AI for PDF data extraction

Two types of startup data problems

Avoiding AI complexity: First, write no code

Building your startup's minimum viable data stack

The three Cs of indie consulting: Confidence, Cash, and Connections

Nudging ChatGPT to invent books you have no time to read

Future software development may require fewer humans

Substance over titles: Your first data hire may be a data scientist

New decade, new tagline: Data & AI for Impact

Psychographic specialisations may work for discipline generalists

The power of parasocial relationships

Positioning is a common problem for data scientists

Transfer learning applies to energy market bidding

Supporting volunteer monitoring of marine biodiversity with modern web and data tools

Our Blue Machine is changing, but we are not helpless

You don't need a proprietary API for static maps

Lessons from reluctant data engineering

Artificial intelligence was a marketing term all along – just call it automation

The lines between solo consulting and product building are blurry

Google's Rules of Machine Learning still apply in the age of large language models

My rediscovery of quiet writing on the open web

The Minimalist Entrepreneur is too prescriptive for me

Revisiting Start Small, Stay Small in 2023 (Chapter 2)

Revisiting Start Small, Stay Small in 2023 (Chapter 1)

Email notifications on public GitHub commits

The rule of thirds can probably be ignored

Using YubiKey for SSH access

Making a TIL section with Hugo and PaperMod

You can't save time

Was data science a failure mode of software engineering?

How hackable are automated coding assessments?

Remaining relevant as a small language model

ChatGPT is transformative AI

Causal Machine Learning is off to a good start, despite some issues

The mission matters: Moving to climate tech as a data scientist

Building useful machine learning tools keeps getting easier: A fish ID case study

Analysis strategies in online A/B experiments: Intention-to-treat, per-protocol, and other lessons from clinical trials

Use your human brain to avoid artificial intelligence disasters

Migrating from WordPress.com to Hugo on GitHub + Cloudflare

My work with Automattic

Some highlights from 2020

Many is not enough: Counting simulations to bootstrap the right way

Software commodities are eating interesting data science work

A day in the life of a remote data scientist

Bootstrapping the right way?

Hackers beware: Bootstrap sampling may be harmful

The most practical causal inference book I’ve read (is still a draft)

Reflections on remote data science work

Defining data science in 2018

Advice for aspiring data scientists and other FAQs

State of Bandcamp Recommender, Late 2017

My 10-step path to becoming a remote data scientist with Automattic

Exploring and visualising Reef Life Survey data

Customer lifetime value and the proliferation of misinformation on the internet

Ask Why! Finding motives, causes, and purpose in data science

If you don’t pay attention, data can drive you off a cliff

Is Data Scientist a useless job title?

Making Bayesian A/B testing more accessible

Diving deeper into causality: Pearl, Kleinberg, Hill, and untested assumptions

The rise of greedy robots

Why you should stop worrying about deep learning and deepen your understanding of causality instead

The joys of offline data collection

This holiday season, give me real insights

The hardest parts of data science

Migrating a simple web application from MongoDB to Elasticsearch

Miscommunicating science: Simplistic models, nutritionism, and the art of storytelling

The wonderful world of recommender systems

You don’t need a data scientist (yet)

Goodbye, Parse.com

Learning about deep learning through album cover classification

Deep learning resources

Hopping on the deep learning bandwagon

First steps in data science: author-aware sentiment analysis

My divestment from fossil fuels

My PhD work

The long road to a lifestyle business

Learning to rank for personalised search (Yandex Search Personalisation – Kaggle Competition Summary – Part 2)

Is thinking like a search engine possible? (Yandex search personalisation – Kaggle competition summary – part 1)

Automating Parse.com bulk data imports

Stochastic Gradient Boosting: Choosing the Best Number of Iterations

SEO: Mostly about showing up?

Fitting noise: Forecasting the sale price of bulldozers (Kaggle competition summary)

BCRecommender Traction Update

What is data science?

Greek Media Monitoring Kaggle competition: My approach

Applying the Traction Book’s Bullseye framework to BCRecommender

Bandcamp recommendation and discovery algorithms

Building a recommender system on a shoestring budget (or: BCRecommender part 2 – general system layout)

Building a Bandcamp recommender system (part 1 – motivation)

How to (almost) win Kaggle competitions

Data’s hierarchy of needs

Kaggle competition tips and summaries

Kaggle beginner tips

About Yanir: AI/ML Engineering Consultant

Book a free fifteen-minute call

Causal inference resources

Free Guide: Data-to-AI Health Check for Startups & Scaleups

Helping climate & nature tech scaleups succeed with AI/ML engineering

Speaking engagements by Yanir: AI/ML Engineering Consultant

Stay in touch