000 07850nam a22003977a 4500
001 22699554
003 IIITD
005 20230616154457.0
008 230616b xxu||||| |||| 00| 0 eng d
010 _a 2022276249
015 _aGBC0K1553
_2bnb
016 7 _a020047577
_2Uk
020 _a9789385889516
035 _a(OCoLC)on1156176748
040 _aYDX
_beng
_cYDX
_erda
_dBDX
_dEXR
_dOCLCO
_dMNW
_dOCLCF
_dEXH
_dUKMGB
_dOCLCO
_dDLC
_dIIITD
042 _alccopycat
050 0 0 _aTA169
_b.N56 2020
082 _a620.001
_bSTO-9
245 0 0 _a97 things every SRE should know :
_bcollective wisdom from the experts
_cedited by Emil Stolarsky and Jaime Woo
246 3 _aNinety-seven things every Site Reliability Engineer should know
260 _aMumbai :
_bO'Reilly,
_c©2021
300 _axvii, 231 p. :
_bill. ;
_c24 cm.
504 _aIncludes bibliographical references and index.
505 0 0 _tNew to SRE.
_tSite reliability engineering in six words /
_rAlex Hidalgo --
_tDo we know why we really want reliability? /
_rNiall Murphy --
_tBuilding self-regulating processes /
_rDenise Yu --
_tFour engineers of an SRE seder /
_rJacob Scott --
_tThe reliability stack /
_rAlex Hidalgo --
_tInfrastructure: it's where the power is /
_rCharity Majors --
_tThinking about resilience /
_rJustin Li --
_tObservability in the development cycle /
_rCharity Majors and Liz Fong-Jones --
_tThere is no magic /
_rBouke van der Bijl --
_tHow Wikipedia is served to you /
_rEffie Mouzeli --
_tWhy you should understand (a little) about TCP /
_rJulia Evans --
_tThe importance of a management interface /
_rSalim Virji --
_tWhen it comes to storage, think distributed /
_rSalim Virji --
_tThe role of cardinality /
_rCharity Majors and Liz Fong-Jones --
_tSecurity is like an onion /
_rLucas Fontes --
_tUse your words /
_rTanya Reilly --
_tWhere to SRE /
_rFatema Boxwala --
_tDear future team /
_rFrances Rees --
_tSustainability and burnout /
_rDenise Yu --
_tDon't take advice from graybeards /
_rJohn Looney --
_tFacing that first page /
_rAndrew Louis --
_tZero to one.
_tSRE, at any size, is cultural /
_rMatthew Huxtable --
_tEveryone is an SRE in a small organization /
_rMatthew Huxtable --
_tAuditing your environment for improvements /
_rJoan O'Callaghan --
_tWith incident response, start small /
_rThai Wood --
_tSolo SRE: effecting large-scale change as a single individual /
_rAshley Poole --
_tDesign goals for SLO measurement /
_rBen Sigelman --
_tI have an error budget - now what? /
_rAlex Hidalgo --
_tHow to change things /
_rJoan O'Callaghan --
_tMethodological debugging /
_rAvishai Ish-Shalom and Nati Cohen --
_tHow startups can build an SRE mindset /
_rTamara Miner --
_tBootstrapping SRE in enterprises /
_rVanessa Yiu --
_tIt's okay not to know, and it's okay to be wrong /
_rTodd Palino --
_tStorytelling is a superpower /
_rAnita Clarke --
_tGet your work recognized: write a brag document /
_rJulia Evans and Karla Burnett --
_tOne to ten.
_tMaking work visible /
_rLorin Hochstein --
_tAn overlooked engineering skill /
_rMurali Suriar --
_tUnpacking the on-call divide /
_rJason Hand --
_tThe maestros of incident response /
_rAndrew Louis --
_tEffortless incident management /
_rSuhail Patel, Miles Bryant, and Chris Evans --
_tIf you're doing runbooks, do them well /
_rSpike Lindsey --
_tWhy I hate our playbooks /
_rFrances Rees --
_tWhat machines do well /
_rMichelle Brush --
_tIntegrating empathy into SRE tools /
_rDaniella Niyonkuru --
_tUsing ChatOps to implement empathy /
_rDaniella Niyonkuru --
_tMove fast to unbreak things /
_rMichelle Brush --
_tYou don't know for sure until it runs in production /
_rIngrid Epure --
_tSometimes the fix is the problem /
_rJake Pittis --
_tLegendary /
_rElise Gale --
_tMetrics are not SLIs (the measure everything trap) /
_rBrian Murphy --
_tWhen SLOs attack: pathological SLOs and how to fix them /
_rNarayan Desai --
_tHolistic approach to product reliability /
_rKristine Chen and Bart Ponurkiewicz --
_tIn search of the lost time /
_rIngrid Epure --
_tUnexpected lessons from office hours /
_rTamara Miner --
_tBuilding tools for internal customers that they actually want to use /
_rVinessa Wan --
_tIt's about the individuals and interactions /
_rVinessa Wan --
_tThe human baseline in SRE /
_rEffie Mouzeli --
_tRemotely productive or productively remote /
_rAvleen Vig --
_tOf margins and individuals /
_rKurt Andersen --
_tThe importance of margins in systems /
_rKurt Andersen --
_tFewer spreadsheets, more napkins /
_rJacob Bednarz --
_tSneaking in your DevOps deliciously /
_rVinessa Wan --
_tEffecting SRE cultural changes in enterprise /
_rVanessa Yiu --
_tTo all the SREs I've loved /
_rFelix Glaser --
_tComplex: the most overloaded word in technology /
_rLaura Nolan --
_tTen to hundred.
_tThe best advice I can give to teams /
_rNicole Forsgren --
_tCreate your supporting artifacts /
_rDaria Barteneva and Eva Parish --
_tThe order of operations for getting SLO buy-in /
_rDavid K. Rensin --
_tHeroes are necessary, but hero culture is not /
_rLei Lopez --
_tOn-call rotations that people want to join /
_rMiles Bryant, Chris Evans, and Suhail Patel --
_tStudy of human factors and team culture to improve pager fatigue /
_rDaria Barteneva --
_tOptimize for MTTBTB (mean time to back to bed) /
_rSpike Lindsey --
_tMitigating and preventing cascading failures /
_rRita Lu --
_tOn-call health: the metric you could be measuring /
_rCaitie McCaffrey --
_tThe SRE as a diplomat /
_rJohnny Boursiquot --
_tTest your disaster plan /
_rTanya Reilly --
_tWhy training matters to an SRE practice and SRE matters to your training program /
_rJennifer Petoff --
_tThe power of uniformity /
_rChris Evans, Suhail Patel, and Miles Bryant --
_tBytes per user value /
_rArshia Mufti --
_tMake your engineering blog a priority /
_rAnita Clarke --
_tDon't let anyone run code in your context /
_rJohn Looney --
_tTrading places: SRE and product /
_rShubheksha Jalan --
_tYou see teams, I see product /
_rAvleen Vig --
_tThe performance emergency fund /
_rDawn Parzych --
_tImportant but not urgent: roadmaps for SREs /
_rLaura Nolan --
_tThe future of SRE.
_tThat 50% thing /
_rTanya Reilly --
_tFollowing the path of safety-critical systems /
_rHeidy Khlaaf --
_tThe importance of formal specification /
_rHillel Wayne --
_tRisk and rot in sociotechnical systems /
_rLaura Nolan --
_tSRE in crisis /
_rNiall Murphy --
_tExpected risk limitations /
_rBlake Bisset --
_tBeyond local risk: accounting for Angry Birds /
_rBlake Bisset --
_tA word from software safety nerds /
_rJ. Paul Reed --
_tIncidents: a window into gaps /
_rLorin Hochstein --
_tThe third age of SRE /
_rBjörn "Beorn" Rabenstein.
520 _a"Site reliability engineering (SRE) is more relevant than ever. Knowing how to keep systems reliable has become a critical skill. With this practical book, newcomers and old hats alike will explore a broad range of conversations happening in SRE. You'll get actionable advice on several topics, including how to adopt SRE, why SLOs matter, when you need to upgrade your incident response, and how monitoring and observability differ. Editors Jaime Woo and Emil Stolarsky, co-founders of Incident Labs, have collected 97 concise and useful tips from across the industry, including trusted best practices and new approaches to knotty problems. You'll grow and refine your SRE skills through sound advice and thought-provoking questions that drive the direction of the field."--
650 0 _aReliability (Engineering)
650 0 _aEngineering
_xManagement.
650 0 _aCyberinfrastructure
_xManagement.
650 0 _aAllied operations
700 1 _aStolarsky, Emil
_eeditor
700 1 _aWoo, Jaime
_eeditor
856 _uhttps://www.google.co.in/books/edition/97_Things_Every_SRE_Should_Know/CdwKEAAAQBAJ?hl=en&gbpv=1&dq=97+things+every+SRE+should+know&printsec=frontcover
906 _a7
_bcbc
_ccopycat
_d2
_encip
_f20
_gy-gencatlg
942 _2ddc
_cBK
999 _c171376
_d171376