Professional Cloud Data Engineer

🏆 Passed: December 28, 2024

Exam Information (December 28, 2024)

Reason for taking: My exam voucher was expiring on this date. I chose this exam as I had the most knowledge in this area and the best chance of passing. I opted for the remote proctored exam due to a tight schedule.

  • Exam Name: Google Cloud Certified - Professional Data Engineer (Japanese)
  • Date: December 28, 2024
  • Time: 09:15 AM
  • Method: Remote Proctored Online Exam

Preparation:

  • Prepared to take the exam in my kitchen. Decided it was best to hang a sheet on the wall to create a clear background.
  • Download the secure browser (to be done at home).
  • ID needed: Driver's license, insurance card, credit card.

Exam Notes (Passed on December 28, 2024)

This section contains notes and resources I found important either during or immediately after the exam.

Key Topics & Resources Reviewed

Here are links to the official documentation for topics that were prominent in the exam.

Exam Impressions & Key Takeaways

  • The corporate case study questions have been removed from the exam format.
  • There were numerous applied questions related to windowing functions.
  • Many questions focused on the system architecture of Dataproc.
  • Topics like networking, deployment automation, and security were frequently integrated with data service questions.

Post-Exam Clarification

  • Question: In Dataproc, are intermediate datasets stored on HDFS or in memory?
  • Answer: It's recommended to use HDFS for intermediate datasets to ensure persistence.

🔥 Study Strategy 🔥

After failing the networking exam:


Self-Study: Memos for Creating Practice Questions with Prompts

November 30, 2024

  • I have configured a custom GPT for this. However, during use, I find it necessary to reinforce the prompt as needed. For instance, the quality can degrade in longer outputs, and it's often better to start a new chat for subsequent prompts. These are AI weaknesses that need to be managed.

Self-Study: Final Working Prompt

  • As a last resort, I sometimes had to include the specific question number and text in the prompt to get the desired output.
"Regarding the questions in the 'Knowledge Base' document, under 'Chapter 4: Building and Operating Data Pipelines (Questions 151-200)', please improve the next five questions starting from question 165 as instructed. You must 'search' the 'Knowledge Base' for confirmation to correctly understand the correspondence between the original and improved questions and the intent of the instructions (e.g., format and explanation improvements).
- The explanations should be detailed and helpful for a Google Cloud beginner.
- In the explanation section, for any important Google Cloud service names that appear, write them in the format 'Official Name (e.g., Japanese Translation if applicable)'."

---
Reference information for searching within the 'Knowledge Base':
Question 165
What is the most common concept used for state management in streaming processing in Dataflow?
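
The prompt above can be templated so that the question number and text are filled in for each batch. The following is a minimal sketch under that assumption; `PROMPT_TEMPLATE`, `build_prompt`, and the shortened wording are hypothetical and only illustrate the idea of pinning the prompt to a specific question.

```python
# Hypothetical template: a shortened version of the prompt quoted above,
# with the chapter, batch size, question number, and question text as fields.
PROMPT_TEMPLATE = (
    "Regarding the questions in the 'Knowledge Base' document, under "
    "'{chapter}', please improve the next {count} questions starting from "
    "question {number} as instructed. "
    "You must 'search' the 'Knowledge Base' for confirmation.\n"
    "\n"
    "---\n"
    "Reference information for searching within the 'Knowledge Base':\n"
    "Question {number}\n"
    "{text}"
)


def build_prompt(chapter, number, text, count=5):
    """Fill the template so the prompt names the exact question to start from."""
    return PROMPT_TEMPLATE.format(
        chapter=chapter, count=count, number=number, text=text
    )


prompt = build_prompt(
    chapter="Chapter 4: Building and Operating Data Pipelines (Questions 151-200)",
    number=165,
    text=(
        "What is the most common concept used for state management "
        "in streaming processing in Dataflow?"
    ),
)
print(prompt)
```

Repeating the question number and text in the reference block gives the model a literal string to search for in the uploaded document.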

Self-Study: GPTs Tips

Most Effective Method for Long Text Generation

  • When the output becomes erratic, create a new chat and prompt again.
    • → This seems more effective than refining the prompt, possibly because it reduces the system's context load.

Second Best Method for Long Text Generation

  • Split the source material into smaller chunks before uploading it to the knowledge base.
    • → I suspect this is particularly effective for modification and formatting tasks that need to follow the order and content of the original material.
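
The splitting step could be scripted along these lines. This is a minimal sketch, assuming the source file marks each question with a line of the form `Question <number>`, as in the reference text quoted earlier. The function names, the marker format, and the chunk size are illustrative assumptions, and the placeholder questions are not real exam content.

```python
import re

# Assumed marker format: each question starts on its own line as "Question <number>".
QUESTION_HEADER = re.compile(r"^Question (\d+)[ \t]*$", re.MULTILINE)


def split_into_questions(text):
    """Return (number, block) pairs, one per question, in source order."""
    matches = list(QUESTION_HEADER.finditer(text))
    questions = []
    for i, match in enumerate(matches):
        # A question block runs until the next header or the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        questions.append((int(match.group(1)), text[match.start():end].strip()))
    return questions


def chunk_questions(questions, chunk_size=5):
    """Group consecutive questions into chunks of at most chunk_size."""
    return [
        questions[i:i + chunk_size]
        for i in range(0, len(questions), chunk_size)
    ]


sample = """Question 165
What is the most common concept used for state management in streaming processing in Dataflow?

Question 166
(placeholder question text)

Question 167
(placeholder question text)
"""

questions = split_into_questions(sample)
chunks = chunk_questions(questions, chunk_size=2)
for index, chunk in enumerate(chunks, start=1):
    numbers = [number for number, _ in chunk]
    print(f"chunk {index}: questions {numbers}")
# prints:
# chunk 1: questions [165, 166]
# chunk 2: questions [167]
```

Each chunk can then be written to its own file before upload, so that order and question boundaries are preserved within every file.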

The "Magic Words" (Prompt) for Contradictions:

  • "You must 'search' the 'Knowledge Base' for confirmation."
    • → By forcing the GPT to search the registered documents, it's possible to generate output that better adheres to the format, content, order, and instructions of the reference material.

Fundamental Strategy

  • Make prompts specific and precise.
    • → In GPTs, you can pre-register instructions, so it's best to create clear instructions from the start.
    • → It's also important to supplement the registered instructions in the prompt window during the task.

Self-Study: Tips for Generating Applied Questions

December 14, 2024

Improving Practice Exams:

Copy the official practice exam and save it as a text file. → Upload it to GPTs and clean up the formatting. → Set a system prompt in Gemini (Studio) and progressively improve the questions.

Improving Foundational Questions:

Generate a set of basic questions based on the exam guide using GPTs. → Set a system prompt in Gemini (Studio) and progressively develop them into more applied, scenario-based questions.