Site Reliability Engineering: How Google Runs Production Systems


The overwhelming majority of a software system’s lifespan is spent in use, not in design or implementation. So, why does conventional wisdom insist that software engineers focus primarily on the design and development of large-scale computing systems?

In this collection of essays and articles, key members of Google’s Site Reliability Team explain how and why their commitment to the entire lifecycle has enabled the company to successfully build, deploy, monitor, and maintain some of the largest software systems in the world. You’ll learn the principles and practices that enable Google engineers to make systems more scalable, reliable, and efficient—lessons directly applicable to your organization.

This book is divided into four sections:

  • Introduction—Learn what site reliability engineering is and why it differs from conventional IT industry practices
  • Principles—Examine the patterns, behaviors, and areas of concern that influence the work of a site reliability engineer (SRE)
  • Practices—Understand the theory and practice of an SRE’s day-to-day work: building and operating large distributed computing systems
  • Management—Explore Google's best practices for training, communication, and meetings that your organization can use

About the author

Niall Murphy leads the Ads Site Reliability Engineering team at Google Ireland. He has been involved in the Internet industry for about 20 years, and is currently chairperson of INEX, Ireland’s peering hub. He is the author or coauthor of a number of technical papers and/or books, including "IPv6 Network Administration" for O’Reilly, and a number of RFCs. He is currently cowriting a history of the Internet in Ireland, and is the holder of degrees in Computer Science, Mathematics, and Poetry Studies, which is surely some kind of mistake. He lives in Dublin with his wife and two sons.

Betsy Beyer is a Technical Writer for Google Site Reliability Engineering in NYC. She has previously written documentation for Google Datacenters and Hardware Operations teams. Before moving to New York, Betsy was a lecturer on technical writing at Stanford University.

Chris Jones is a Site Reliability Engineer for Google App Engine, a cloud platform-as-a-service product serving over 28 billion requests per day. Based in San Francisco, he has previously been responsible for the care and feeding of Google’s advertising statistics, data warehousing, and customer support systems. In other lives, Chris has worked in academic IT, analyzed data for political campaigns, and engaged in some light BSD kernel hacking, picking up degrees in Computer Engineering, Economics, and Technology Policy along the way. He’s also a licensed professional engineer.

Jennifer Petoff is a Program Manager for Google’s Site Reliability Engineering team and is based in Dublin, Ireland. She has managed large global projects across wide-ranging domains including scientific research, engineering, human resources, and advertising operations. Jennifer joined Google after spending eight years in the chemical industry. She holds a PhD in Chemistry from Stanford University and a BS in Chemistry and a BA in Psychology from the University of Rochester.


Additional Information

Publisher: O'Reilly Media, Inc.
Published on: Mar 23, 2016
Pages: 552
ISBN: 9781491951170
Language: English
Genres:
  • Computers / Software Development & Engineering / Project Management
  • Computers / Software Development & Engineering / Quality Assurance & Testing
  • Computers / Software Development & Engineering / Systems Analysis & Design
  • Computers / System Administration / Disaster & Recovery
  • Computers / System Administration / General
  • Computers / System Administration / Linux & UNIX Administration
  • Computers / Systems Architecture / Distributed Systems & Computing
Content Protection: This content is DRM free.
Read Aloud: Available on Android devices
Eligible for Family Library

Even bad code can function. But if code isn’t clean, it can bring a development organization to its knees. Every year, countless hours and significant resources are lost because of poorly written code. But it doesn’t have to be that way.

Noted software expert Robert C. Martin presents a revolutionary paradigm with Clean Code: A Handbook of Agile Software Craftsmanship. Martin has teamed up with his colleagues from Object Mentor to distill their best agile practice of cleaning code “on the fly” into a book that will instill within you the values of a software craftsman and make you a better programmer–but only if you work at it.

What kind of work will you be doing? You’ll be reading code–lots of code. And you will be challenged to think about what’s right about that code, and what’s wrong with it. More importantly, you will be challenged to reassess your professional values and your commitment to your craft.

Clean Code is divided into three parts. The first describes the principles, patterns, and practices of writing clean code. The second part consists of several case studies of increasing complexity. Each case study is an exercise in cleaning up code–of transforming a code base that has some problems into one that is sound and efficient. The third part is the payoff: a single chapter containing a list of heuristics and “smells” gathered while creating the case studies. The result is a knowledge base that describes the way we think when we write, read, and clean code.

Readers will come away from this book understanding:

  • How to tell the difference between good and bad code
  • How to write good code and how to transform bad code into good code
  • How to create good names, good functions, good objects, and good classes
  • How to format code for maximum readability
  • How to implement complete error handling without obscuring code logic
  • How to unit test and practice test-driven development

This book is a must for any developer, software engineer, project manager, team lead, or systems analyst with an interest in producing better code.

An incredible, true-life adventure set on the most dangerous frontier of all: outer space.

In the nearly forty years since Neil Armstrong walked on the moon, space travel has come to be seen as a routine enterprise, at least until the shuttle Columbia disintegrated like the Challenger before it, reminding us, once again, that the dangers are all too real.
Too Far from Home vividly captures the hazardous realities of space travel. Every time an astronaut makes the trip into space, he faces the possibility of death from the slightest mechanical error or instance of bad luck: a cracked O-ring, an errant piece of space junk, an oxygen leak . . . There are a myriad of frighteningly probable events that would result in an astronaut’s death. In fact, twenty-one people who have attempted the journey have been killed.
Yet for a special breed of individual, the call of space is worth the risk. Men such as U.S. astronauts Donald Pettit and Kenneth Bowersox, and Russian flight engineer Nikolai Budarin, who in November 2002 left on what was to be a routine fourteen-week mission maintaining the International Space Station.
But then, on February 1, 2003, the Columbia exploded beneath them. Despite the numerous news reports examining the tragedy, the public remained largely unaware that three men remained orbiting the earth. With the launch program suspended indefinitely, these astronauts had suddenly lost their ride home.
Too Far from Home chronicles the efforts of the beleaguered Mission Controls in Houston and Moscow as they work frantically against the clock to bring their men safely back to Earth, ultimately settling on a plan that felt, at best, like a long shot.
Latched to the side of the space station was a Russian-built Soyuz TMA-1 capsule, whose technology dated from the late 1960s (in 1971, a malfunction in the Soyuz 11 capsule left three Russian astronauts dead). Despite the inherent danger, the Soyuz became the only hope to return Bowersox, Budarin, and Pettit home.
Chris Jones writes beautifully of the majesty and mystique of space travel, while reminding us all how perilous it is to soar beyond the sky.