When I went looking for examples of large-scale machine learning system design, a cursory search did not yield much. However, I knew that aggregating the occasional blog post and conference talk could make up for the scarcity of open examples, at least for me, and hopefully for you too!

This is a living list: you can submit examples you want listed to j at rubinovitz dot com.



Meet Michelangelo: Uber’s Machine Learning Platform
HDFS, Spark, Samza, Cassandra, MLlib, XGBoost, and TensorFlow


Market Insights

Helping Guests Make Informed Decisions with Market Insights
Hive, Spark

Delivering Insights to Hosts

How we deliver insights to hosts – Airbnb Engineering & Data Science – Medium
Redis, unnamed key-value store

Real-time Alert Infrastructure

BinaryAlert: Real-time Serverless Malware Detection
Serverless AWS infrastructure

Risk Analysis

Measuring Transactional Integrity in Airbnb’s Distributed Payment Ecosystem
Hadoop HDFS, Druid, Hive, S3, Airflow


Recommendation (2013)

System Architectures for Personalization and Recommendation
AWS, Hadoop, Hive, Pig, MySQL, Cassandra, EVCache


Deep Learning Research Infrastructure

Infrastructure for Deep Learning
Kubernetes, AWS

Scaling Kubernetes to 2500 nodes
Kubernetes, AWS