As part of QCon London's architecture qualities track (which is also being run at the upcoming QCon San Francisco, Nov 7-9), eBay Technical Fellow Dan Pritchett presented on how to achieve operational manageability in large-scale systems: "You're confident that your software will handle horizontal scale to thousands of servers. But how about your operational team? Have you also architected for managing that large collection of servers?" Pritchett presented lessons learned at eBay and led a discussion on how to ensure your transactional scalability doesn't ignore your architecture's manageability.
Watch:
Dan Pritchett on Operational Manageability (42 min)
Dan's session covers a number of issues, including:
- Scaling Fallacies
- The Role of the Ops Team
- Designing for Operations
- Managing Configuration
- Deploying on the Fly
- Design to Monitor
- Design for Failure
- Dependencies and Failures
- Power and Software
- Green Operations
- Disaster Planning
- Active/Active Deployments
- Grid Considerations
Dan Pritchett is a Technical Fellow at eBay, and is involved in solving some of the more challenging engineering problems found anywhere on the web. His engineering career spans 25 years and includes research on relational databases, designing geographic map software, building email products, and creating scalable web applications. Dan also wrote an article on InfoQ on the challenges of latency and was interviewed by Martin Fowler and Floyd Marinescu.