Professional Cloud DevOps Engineer
π Passed: March 30, 2025
Exam Reviewβ
I took the exam on March 30, 2025, and passed successfully. It was a baptism by fire with a doubleheader, but I made it through.
## Exam Summary
- 50 questions. I finished the first pass in about 50 minutes and submitted after a 10-minute review.
- I estimate I scored around 90%.
## Exam Impressions
- Most questions were either directly from the Udemy course or could be answered immediately with a solid understanding of SRE principles.
- There might have been some superficial questions from the Architect or other exams, but most were foundational.
- The first half felt easy, but some parts in the second half made me a bit anxious, so it's possible I just got lucky with familiar questions.
## Topics Covered
- Agent Policies β This came up about twice, and I realized my understanding was lacking.
- Private pools
- Cloud Build (in general)
- Foundational SRE concepts
- A few questions on IAM and GKE service accounts
- Questions on defining SLOs with a margin (e.g., 50β60, 100β110...).
Exam Information (March 30, 2025) β π Passedβ
- Exam Name: Professional Cloud DevOps Engineer
- Date: March 30, 2025
- Time: 03:45 PM
Exam Day TODO (Best Practices from PSE / PNE / PDE / PCA Exams) ββ
The Day Before:
- Get a good night's sleep.
- Set up eye mask, earplugs, and pillow.
On the Day:
-
Wake up by 9 AM (being well-rested is crucial).
- β Woke up at 11 AM.
-
Final review at a cafe.
- Doutor Odori store (for the mood).
- Review incorrect answers from the Official Practice Exam.
- Review weak topic areas.
- Review incorrect answers from practice question sets.
- Read through other important articles.
-
Leave the cafe at 14:00.
-
Print the exam confirmation email.
- Forward the email to my phone app.
-
Take a 10-minute nap before arriving at the venue to refresh my brain.
- Also, consume enough sugar.
-
Arrive at the venue by 15:00 (30 minutes before the test starts) and complete check-in.
- Remember to read the options first.
- Remember to save time for review.
π₯ Study Strategy π₯β
- Read the official reference materials:
Learning Strategy:β
-
Important Materials:
- Official Exam Guide
- π‘ SRE Book Updates, by Topic | This is a must-read as it's a free summary of three Google SRE books.
- sre.google/books | Read the three books online for free.
- Best practices for speeding up builds
-
Practice Questions:
-
Whizlabs (Free Questions Only):
- 1st Round: Mar 24, 2025 | 73%
- 2nd Round:
-
Udemy Practice Exams:
- Link to course
- 1st Round:
- Practice Test 1 | Mar 23, 2025 | 52%
- Practice Test 2 | Mar 27, 2025 | 53%
- Practice Test 3 | Mar 25, 2025 | 43%
- Review:
- Practice Test 1
- Practice Test 2
- Practice Test 3
- 2nd Round:
- Practice Test 1 | Mar 28, 2025 | 73%
- Practice Test 2 | Mar 29, 2025 | 81%
- Practice Test 3 | Mar 29, 2025 | 72%
-
Official Practice Exam:
- Link
- Mar 24, 2025 | 70%
- Question 12 - The answer in the practice set might differ from the official practice exam.
- | |
-
-
If time permits, read Google SRE articles:
Weak Areasβ
Overview of private pools (Cloud Build)
Private pools are private, dedicated pools of workers that offer greater customization over the build environment, including the ability to access resources in a private network. Like default pools, private pools are hosted and fully managed by Cloud Build, and can scale up to and down to zero with no infrastructure to set up, upgrade, or scale. Because private pools are customer-specific resources, you can configure them in more ways.
Using a cached Docker image (Cloud Build)
The simplest way to speed up a Docker image build is to specify a cached image that can be used for subsequent builds. You can specify the cached image by adding the
--cache-from
argument in your build config file, which will instruct Docker to build using that image as a cache source.
Caching directories with Google Cloud Storage (Cloud Build)
To increase the speed of your build, you can reuse the results from a previous build. You can do this by copying the results from a previous build to a Google Cloud Storage bucket, making calculations from those results, and then copying the new results back to the bucket. Use this method when your build takes a long time, and produces a small number of files that do not take long to copy to and from Google Cloud Storage.
Unlike the Docker-specific
--cache-from
, Google Cloud Storage caching can be used for any builder supported by Cloud Build.
Use a separate organization for experimenting (Organizations)
For experimental activities, use a separate organization as a sandbox environment. Using a separate organization allows you to work without being constrained by the policies, configurations, and automation that you use in your production organization.
Do not use a staging organization for testing. A staging environment must use similar IAM policies and organization policies as your production organization. Therefore, a staging environment might have the same limitations as your production environment. At the same time, loosening policies to allow for testing would defeat the purpose of the staging organization.
images (Cloud Build)
The
images
field in the build config file specifies one or more Linux Docker images that Cloud Build will push to Artifact Registry or Container Registry. You might have a build that runs tasks without producing a Linux Docker image; however, if you do not build and push an image to a registry, the image is discarded upon build completion. If any of the specified images are not produced at build time, the build will fail. For more information on storing images, see Storing artifacts in Artifact Registry.
artifacts (Cloud Build)
The
artifacts
field in the build config file specifies one or more non-container artifacts to store in Cloud Storage. For more information on storing non-container artifacts, see Storing build artifacts in Cloud Storage.
Log sampling rate (VPC Flow Logs)
- For VMs, 50% of the log entries are kept by default. You can set this parameter from
1.0
(100%, keep all log entries) to0.0
(0%, keep no logs).- For VLAN attachments and Cloud VPN tunnels, 100% of the log entries are kept by default. You can set this parameter from
1.0
to a value greater than0.0
.
Agent policies (Google Cloud CLI)
You can create and manage agent policies by using the gcloud beta compute instances ops-agents policies command group in the Google Cloud CLI or by using the agent-policy Terraform module. Agent policies manage OS policies by using the Compute Engine VM Manager tool suite. These policies automate the deployment and maintenance of software configurations, such as the Google Cloud Observability agents (the Ops Agent, the legacy Monitoring agent, and the legacy Logging agent).
Configure Terraform to store state in a Cloud Storage bucket (Terraform)
By default, Terraform stores state locally in a file named
terraform.tfstate
. This default configuration can make Terraform difficult for teams to use, especially when many users run Terraform at the same time and each machine has its own understanding of the current infrastructure.To help you avoid these issues, this section shows you how to configure a remote state that points to a Cloud Storage bucket. Remote state is a feature of backends, and in this tutorial, is configured in the
backend.tf
file. For example: ...
IAM Conditions overview (IAM Conditions)
Example date/time expression When you use the following condition expression in a role binding, it grants access until midnight on January 1, 2021.
Building VM images with Packer (Cloud Build)
Packer is an open-source tool for creating identical machine images for multiple platforms from a single source configuration. This page shows you how to create a VM image to be used in Compute Engine by Packer and Cloud Build.
Triggering on Pub/Sub Messages (Spinnaker)
Important to recognize the integration between Pub/Sub and Spinnaker.
create_before_destroy (Terraform)
create_before_destroy
(bool) - By default, when Terraform must change a resource argument that cannot be updated in-place due to remote API limitations, Terraform will destroy the existing object and then create a new replacement object with the new configured arguments.The
create_before_destroy
meta-argument changes this behavior so that the replacement object is created first, and the original object is destroyed after.
Summary of Common HTTP Status Errorsβ
HTTP Status Code | Error Message | Context/Cause |
---|---|---|
400 | Bad Request | The request format is invalid (e.g., missing parameters, invalid values). |
401 | Unauthorized | Invalid authentication credentials or insufficient access permissions. |
403 | Forbidden | Access is denied, often due to insufficient IAM permissions. |
404 | Not Found | The specified resource does not exist (e.g., incorrect resource ID). |
409 | Conflict | Resource conflict, such as trying to create an instance with a name that already exists. |
429 | Too Many Requests | The API request rate limit has been exceeded. |
500 | Internal Server Error | An unexpected error occurred on the server side. |
502 | Bad Gateway | An error occurred at a gateway or proxy server. |
503 | Service Unavailable | The service is down for maintenance or is temporarily unavailable. |
504 | Gateway Timeout | The request timed out (e.g., after waiting a long time for a response). |
Reported invalid URLs in Udemy on March 29, 2025.
Practice Test 2: Still Invalid URLs References
You oversee an application operating within Google Kubernetes Engine (GKE) and employing the blue/green deployment approach. Portions of the Kubernetes manifests are provided below: ...
Practice Test 2: The following correct answer choice is wrong:
Your organization aims to elevate the availability target of an application from 99.9% to 99.99% with a $2,000 investment. ...