Building Batch Data Pipelines on GCP | Quiz Answers

About the Building Batch Data Pipelines on GCP Course

Data pipelines typically fall under one of the Extract-Load, Extract-Load-Transform, or Extract-Transform-Load paradigms. The Building Batch Data Pipelines on GCP course describes which paradigm to use, and when, for batch data.

Furthermore, this course covers several technologies on Google Cloud Platform for data transformation including BigQuery, executing Spark on Cloud Dataproc, pipeline graphs in Cloud Data Fusion, and serverless data processing with Cloud Dataflow.

Enroll on Coursera

Building Batch Data Pipelines on GCP – All Quiz Answers

EL, ELT, ETL Quiz Answers

Q1. Which of the following is the ideal use case for Extract and Load (EL)?

  • Ans: Scheduled periodic loads of log files (e.g. once a day)
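The three paradigms named in this quiz differ mainly in whether a transform step runs and where it runs. The sketch below is a minimal illustration under stated assumptions: the `warehouse` is a plain Python dict standing in for a data warehouse such as BigQuery, and `extract`, `transform`, `load`, and the `run_*` functions are hypothetical names invented for this example, not part of any GCP library.

```python
# Toy, in-memory illustration of EL vs ELT vs ETL.
# Assumption: `warehouse` is a plain dict, not a real BigQuery client.

def extract(lines):
    """Pull raw records from the source (here: a list of text lines)."""
    return [line.strip() for line in lines if line.strip()]

def transform(records):
    """Hypothetical cleanup: upper-case the level in 'level,message' rows."""
    cleaned = []
    for record in records:
        level, message = record.split(",", 1)
        cleaned.append(f"{level.upper()},{message}")
    return cleaned

def load(warehouse, table, records):
    """Append records to a named table in the toy warehouse."""
    warehouse.setdefault(table, []).extend(records)

def run_el(lines, warehouse):
    # EL: no transform step; records are loaded as extracted.
    load(warehouse, "logs", extract(lines))

def run_elt(lines, warehouse):
    # ELT: load the raw data first, then transform it inside the warehouse.
    load(warehouse, "logs_raw", extract(lines))
    load(warehouse, "logs", transform(warehouse["logs_raw"]))

def run_etl(lines, warehouse):
    # ETL: transform in the pipeline before anything reaches the warehouse.
    load(warehouse, "logs", transform(extract(lines)))
```

A daily log-file load, as in the answer above, would correspond to calling `run_el` on a schedule; in this toy model the scheduling itself is left out.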

Executing Spark on Cloud Dataproc Quiz Answers

Q1. Which of the following statements are true about Cloud Dataproc?

  • Lets you run Spark and Hadoop clusters with minimal administration
  • Helps you create job-specific clusters without HDFS

Q2. Match each of the terms with what they do when setting up clusters in Cloud Dataproc:

Terms:

  1. Zone
  2. Standard Cluster mode
  3. Preemptible

Definitions:

  A. Costs less but may not always be available
  B. Determines the Google data center where compute nodes will be
  C. Provides 1 master and N workers

  • 1 → B
  • 2 → C
  • 3 → A

Q3. Cloud Dataproc provides the ability for Spark programs to separate compute & storage by:

  • Reading and writing data directly from/to Cloud Storage

Cloud Data Fusion and Cloud Composer Quiz Answers

Q1. Cloud Data Fusion is the ideal solution when you need

  • to build visual pipelines

Data Processing with Cloud Dataflow Quiz Answers

Q1. Which of the following statements are true?

  • Dataflow executes Apache Beam pipelines
  • Dataflow transforms support both batch and streaming pipelines

Q2. Match each of the Dataflow terms with what they do in the life of a Dataflow job:

Terms:

  1. Transform
  2. PCollection
  3. Sink

Definitions:

  A. Output endpoint for your pipeline
  B. A data processing operation or step in your pipeline
  C. A set of data in your pipeline

  • 1 → B
  • 2 → C
  • 3 → A
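To make the three terms concrete, here is a minimal pure-Python sketch of how they relate. It is not the Apache Beam API: the `PCollection` and `ListSink` classes and the `drop_empty` and `to_upper` functions are toy stand-ins invented for this illustration, and a real Dataflow job would use the Beam SDK instead.

```python
from typing import Callable, Iterable, List

class PCollection:
    """Toy stand-in for a set of data in the pipeline (not the Beam class)."""

    def __init__(self, elements: Iterable[str]):
        self.elements: List[str] = list(elements)

    def apply(self, transform: Callable[[List[str]], List[str]]) -> "PCollection":
        # A transform is a processing step: it reads one collection
        # and produces a new one, leaving the input unchanged.
        return PCollection(transform(self.elements))

def drop_empty(elements: List[str]) -> List[str]:
    """A transform that filters out empty strings."""
    return [e for e in elements if e]

def to_upper(elements: List[str]) -> List[str]:
    """A transform that upper-cases every element."""
    return [e.upper() for e in elements]

class ListSink:
    """Toy sink: the output endpoint where pipeline results are written."""

    def __init__(self) -> None:
        self.written: List[str] = []

    def write(self, pcoll: PCollection) -> None:
        self.written.extend(pcoll.elements)

# Wire the pieces together: data set -> transforms -> sink.
source = PCollection(["alpha", "", "beta"])
sink = ListSink()
sink.write(source.apply(drop_empty).apply(to_upper))
```

Each `apply` call returns a new collection, so the data set flows through the processing steps and only the sink holds the final output.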
