Abstract: At Netflix, we design our systems and deployment processes to help the service survive both catastrophic events, like zone and regional outages, and less catastrophic events, like network latency and random instance death. This system has previously been described as "dream devops". In our data centers we had monolithic systems and centralized operations. When we moved to the cloud, we fully embraced distributed services and the devops model. Now, with experience, we've uncovered real-world challenges with the devops model and, as a result, have embraced more effective hybrid approaches. More specifically, how do we reconcile local agility and ownership with the achievement of system-wide objectives, such as the overall quality and reliability of a large-scale distributed environment? Topics will include our software lifecycle, from code check-in to automated machine image baking to deployment, monitoring, and alerting, and how Netflix uses self-service tools to enable our developers to maintain maximum code velocity.
via DevNation 2014 - Jeremy Edberg - How Netflix Uses DevOps for Reliability and Developer Velocity.