2. Table of Contents
Yahoo Confidential & Proprietary
▪Scale at Yahoo!
▪Scale and Performance
▪Features - Usability
▪High availability and Load Balancing
▪Customer Asks
3. Scale at Yahoo!
▪Busiest cluster
› 1 million+ workflows per month
› 45 - 55K workflows per day
› 40 - 50K coord actions per day
› 800 - 900 coordinators (5m, 15m, 30m, hourly, daily and weekly)
› 30 - 40 bundles
▪Most complex bundle - 230 coordinators
▪Most complex workflow - 85 forks
▪Video Transcoding - 100-300 workflows per min
4. Scale and Performance
▪ Database
› CLOB to BLOB to compress and store inline
› Remove unnecessary hadoop config stored in protoActionConf
› Select only needed columns instead of loading whole row
› Partition tables by created time (in Oracle)
▪ Other
› Huge improvements to materialization of coordinator actions
› Reduce Launcher overhead
• Merge the number of small files created per action to one sequence file
• Launcher libraries shipped only once to HDFS
• Uber mode launcher with Hadoop 2.x
› Synchronously execute commands without queueing to speed up action transition
› Automatically killing abandoned coordinator job
› gzip compression for Rest API
5. Features - Usability
▪UI improvements
› Active Jobs, Custom Global Filters, Child Jobs for Pig/Hive actions
▪Faster log streaming with more filters
▪Updating coordinator definition on the fly
▪Rerun workflows without having to specify all properties again
▪Mark coordinator and actions as ignored
▪Sharelib Enhancements
› Update on the fly without failing jobs
› Command to list different sharelib available
› Specify directories using metafile instead of single share lib directory
6. High Availability and Load Balancing
▪HCat integration
▪SLA
▪Sharelib
▪Server-server authentication
▪Distributed sequence
7. Customer Asks
▪Coordinator dependency management
› Ability to view dependencies and rerun part of a pipeline
▪Better error handling and automatic retries
▪Ability to Suspend/Turn off SLA alerting
▪One-click launcher log viewing
▪Zero downtime