This hands-on guide shows developers and systems administrators familiar with Hadoop how to install, use, and manage cloud-born clusters efficiently. You’ll learn how to architect clusters that work with cloud-provider features—not just to avoid pitfalls, but also to take full advantage of these services. You’ll also compare the Amazon, Google, and Microsoft clouds, and learn how to set up clusters in each of them.
- Learn how Hadoop clusters run in the cloud, the problems they can help you solve, and their potential drawbacks
- Examine the common concepts of cloud providers, including compute capabilities, networking and security, and storage
- Build a functional Hadoop cluster on cloud infrastructure, and learn what the major providers require
- Explore use cases for high availability, relational data with Hive, and complex analytics with Spark
- Get patterns and practices for running cloud clusters, from designing for price and security to dealing with maintenance
About the author
Bill Havanki is a software engineer working for Cloudera, where he has contributed to Hadoop components as well as systems for deploying Hadoop clusters into public cloud services. Prior to joining Cloudera, he worked for 15 years developing software for government contracts, focusing mostly on analytic frameworks and authentication and authorization systems. He earned his B.S. in Electrical Engineering from Rutgers University and his M.S. in Computer Engineering from North Carolina State University. A New Jersey native, he currently lives near Annapolis, Maryland, with his family.