Passed exam, w00t
parent
bd5304a097
commit
9765410f11
@ -318,7 +318,7 @@ So, if we take A = S/T, S = A*T and change our target availability to 99.8%
|
|||||||
<br>
|
<br>
|
||||||
100% - 99.8% = 0.2%, 0.2% (in decimal) == 0.002
|
100% - 99.8% = 0.2%, 0.2% (in decimal) == 0.002
|
||||||
<br>
|
<br>
|
||||||
Sucessful Requests (Really this is allowed errors based on target availability) = 0.002 * 1,000,000 = 2000 Errors, 1,000,000 - 2000 = 998,000 Successful requests
|
Successful Requests (Really this is allowed errors based on target availability) = 0.002 * 1,000,000 = 2000 Errors, 1,000,000 - 2000 = 998,000 Successful requests
|
||||||
<br>
|
<br>
|
||||||
|
|
||||||
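The arithmetic in this hunk (99.8% target over 1,000,000 requests) can be sketched in a few lines of Python. This is only an illustration: the function name `error_budget` and its parameters are hypothetical and do not appear in the notes.

```python
def error_budget(total_requests: int, target_availability: float) -> tuple[int, int]:
    """Return (allowed_errors, required_successes).

    target_availability is a percentage, e.g. 99.8.
    """
    # 100% - 99.8% = 0.2%, which is 0.002 as a decimal fraction
    allowed_fraction = (100.0 - target_availability) / 100.0
    # Round to the nearest whole request to absorb floating-point noise
    allowed_errors = round(total_requests * allowed_fraction)
    return allowed_errors, total_requests - allowed_errors


allowed, successes = error_budget(1_000_000, 99.8)
print(allowed, successes)  # 2000 998000
```

The 2000 allowed errors are the error budget; the 998,000 figure is the number of requests that must succeed to meet the target.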
##### Error budgets, what are they good for?
|
##### Error budgets, what are they good for?
|
||||||
|
|||||||
10
Part_4.md
@ -822,14 +822,14 @@ sudo service google-fluentd start
|
|||||||
|
|
||||||
<u>Logs Viewer Query Interface</u>
|
<u>Logs Viewer Query Interface</u>
|
||||||
|
|
||||||
- View logsa through queries
|
- View logs through queries
|
||||||
- Basic and Advanced query interface
|
- Basic and Advanced query interface
|
||||||
- Basic
|
- Basic
|
||||||
- Dropdown menus - simple searches
|
- Dropdown menus - simple searches
|
||||||
- Advanced
|
- Advanced
|
||||||
- View across log categories - advanced search capabilities
|
- View across log categories - advanced search capabilities
|
||||||
|
|
||||||
<u>Basic and Advanced Filter Queries
|
<u>Basic and Advanced Filter Queries</u>
|
||||||
- Different query formats
|
- Different query formats
|
||||||
- Search field syntax different for each method
|
- Search field syntax different for each method
|
||||||
- Basic query
|
- Basic query
|
||||||
@ -915,7 +915,7 @@ sudo service google-fluentd start
|
|||||||
#### Routing and Exporting Logs
|
#### Routing and Exporting Logs
|
||||||
|
|
||||||
- Main premise - route a copy of logs from Cloud Logging to somewhere else
|
- Main premise - route a copy of logs from Cloud Logging to somewhere else
|
||||||
- BigQuery, Clous Storage, Pub/Sub, another logging bucket and more
|
- BigQuery, Cloud Storage, Pub/Sub, another logging bucket and more
|
||||||
- Can export all logs, or certain logs based on defined criteria
|
- Can export all logs, or certain logs based on defined criteria
|
||||||
|
|
||||||
<u>Why Route/Export logs?</u>
|
<u>Why Route/Export logs?</u>
|
||||||
@ -1017,7 +1017,7 @@ Custom logs based distribution metrics
|
|||||||
- https://sre.google/workbook/alerting-on-slos/
|
- https://sre.google/workbook/alerting-on-slos/
|
||||||
|
|
||||||
<u>Alerts Review - Why we need them</u>
|
<u>Alerts Review - Why we need them</u>
|
||||||
- Somethign is not working correctly
|
- Something is not working correctly
|
||||||
- Action is necessary to fix it
|
- Action is necessary to fix it
|
||||||
- Alerts inform relevant personnel that action is necessary when specified conditions met
|
- Alerts inform relevant personnel that action is necessary when specified conditions met
|
||||||
|
|
||||||
@ -1038,7 +1038,7 @@ Precision | Recall | Detection time | Reset time
|
|||||||
- Reset time: How long alerts persist after issue is resolved
|
- Reset time: How long alerts persist after issue is resolved
|
||||||
- Longer reset time = confusion/'white noise'
|
- Longer reset time = confusion/'white noise'
|
||||||
|
|
||||||
<u>How to we balance these parameters?</u>
|
<u>How do we balance these parameters?</u>
|
||||||
- Window Length: Time period measured
|
- Window Length: Time period measured
|
||||||
- % of errors over (x) time period
|
- % of errors over (x) time period
|
||||||
- Example: average CPU utilization per minute vs. per hour
|
- Example: average CPU utilization per minute vs. per hour
|
||||||
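As a rough illustration of the window-length trade-off noted above, the sketch below computes an error rate over a short and a long trailing window. The per-minute data and the helper name `windowed_error_rate` are made up for the example.

```python
def windowed_error_rate(errors_per_minute, requests_per_minute, window):
    """Error rate over the trailing `window` minutes."""
    errs = sum(errors_per_minute[-window:])
    reqs = sum(requests_per_minute[-window:])
    return errs / reqs if reqs else 0.0


# Hypothetical data: 60 minutes at 1000 requests/minute,
# with every error landing in the final minute.
requests = [1000] * 60
errors = [0] * 59 + [100]

short_window = windowed_error_rate(errors, requests, 1)   # 100 / 1000
long_window = windowed_error_rate(errors, requests, 60)   # 100 / 60000

print(f"1-minute window:  {short_window:.2%}")
print(f"60-minute window: {long_window:.2%}")
```

The same spike dominates the 1-minute window but is heavily diluted over 60 minutes, which is why shorter windows detect faster but tend to be noisier.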
|
|||||||
18
exam.md
Normal file
@ -0,0 +1,18 @@
|
|||||||
|
|
||||||
|
Practice these scenario type questions:
|
||||||
|
1. You look after a system with a well-defined SLO...
|
||||||
|
2. You look after a website that's experiencing latency xyz?
|
||||||
|
3. How would you define an SLI for an application where you've tracked down high latency to the record-generating systems, and resizing the Persistent Disk is the fix?
|
||||||
|
1. IO
|
||||||
|
2. A proportion
|
||||||
|
3.
|
||||||
|
4. Kubernetes deployment strategy to roll out a new version of an application to half of the web nodes
|
||||||
|
1. StatefulSet
|
||||||
|
2. ReplicaSet
|
||||||
|
3. Rolling-release with ???
|
||||||
|
5. How to track billing of systems?
|
||||||
|
1. Simply examine them within the cloud console?
|
||||||
|
2. Add labels to groupings of resources and export to BigQuery?
|
||||||
|
6. So as not to impact 3rd-party developers and users, how would you plan the roll-out of an updated API?
|
||||||
|
1. Steps to follow, e.g. announce new API, notify users still using the old one, deprecate the old one, provide support
|
||||||
|
2.
|
||||||