8 results found
A curated list of Site Reliability and Production Engineering resources.
Created: 2016-04-12
639 commits to master branch, last one 2 years ago
Compilation of public failure/horror stories related to Kubernetes
This repository has been archived
Created: 2019-01-19
92 commits to master branch, last one 4 years ago
A collection of postmortem templates
Created: 2017-05-29
35 commits to master branch, last one 2 years ago
A curated list of Site Reliability and Production Engineering Tools
Topics: sre, list, devops, awesome, monitoring, postmortem, production, post-mortem, reliability, availability, awesome-list, devops-tools, monitoring-tools, incident-responce, incident-management, reliability-engineering, service-level-agreement, service-level-objective, service-level-monitoring, site-reliability-engineering
Created: 2020-03-09
213 commits to master branch, last one about a month ago
A comprehensive list of Game Design related learning materials, examples and tools.
Created: 2020-06-27
19 commits to master branch, last one 9 months ago
A role-playing game for incident management training
Created: 2018-06-02
59 commits to master branch, last one 3 years ago
Calculate how much downtime should be permitted in your Service Level Agreement or Objective
Created: 2017-05-29
17 commits to master branch, last one 3 years ago
coredumpy saves your crash site for post-mortem debugging
Created: 2024-04-21
25 commits to master branch, last one 7 months ago