ETL and Data Pipelines with Shell, Airflow and Kafka Quiz Answer

After taking this course, you will be able to describe two different approaches to converting raw data into analytics-ready data. One approach is the Extract, Transform, Load (ETL) process. The contrasting approach is the Extract, Load, Transform (ELT) process. ETL processes apply to data warehouses and data marts. ELT processes apply to data lakes, where the data is transformed on demand by the requesting or calling application.

ETL and Data Pipelines with Shell, Airflow and Kafka Week 01 Quiz Answer

Graded Quiz: ETL and ELT Processes

Q1. The ETL process consists of Extract > Transform > Load. Which of these three steps is also known as data wrangling?

  • Load 
  • Extraction 
  • Data wrangling is a term for another data warehouse process  
  • Transform 

Q2. The ELT process has no information loss. What is the main reason for this benefit?

  • Separates the data pipeline from processing
  • Data replication
  • Separation between moving and processing data
  • Data source integration

Q3. ETL processes include a storage facility called a staging area. In ELT, what does the staging area correspond to?

  • Data mart
  • Electronic repository
  • Data lake
  • Data warehouse

Q4. Which of the following pain points does ELT address?

  • Lack of secure data
  • Challenges imposed by Big Data
  • Cost effectiveness
  • Request for fixed processes

Q5. There are many techniques for extracting data. Choosing the technique usually depends on what?

  • Intended use
  • Optical or analog
  • Operating system
  • Type of client

Q6. Extracting data from IoT devices involves large volumes of redundant data. What is used to reduce the volume of redundant data and extract only the features of interest from the raw data?

  • Edge computing
  • SQL languages
  • APIs
  • Biometric sensors

Q7. ETL uses the schema-on-write approach and ELT uses the schema-on-read approach. What is the biggest difference between these two approaches?

  • Limited versatility vs. versatility
  • Consistency
  • Stability
  • More data access
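
The two approaches named in Q7 can be contrasted with a small sketch. This is a toy illustration in plain Python, not code from the course: the `SCHEMA` mapping, the function names, and the sample record are all hypothetical. Schema-on-write coerces each record to a fixed schema before storing it, while schema-on-read stores the raw record and lets each reader apply the schema it needs.

```python
import json

# Hypothetical schema: field name -> type the warehouse expects.
SCHEMA = {"id": int, "amount": float}

def load_schema_on_write(records, table):
    """ETL style: coerce every record to the fixed schema before storing it."""
    for rec in records:
        row = {name: cast(rec[name]) for name, cast in SCHEMA.items()}
        table.append(row)  # fields outside the schema are dropped here

def load_raw(records, lake):
    """ELT style: store each record untouched, as a raw JSON string."""
    for rec in records:
        lake.append(json.dumps(rec))

def read_with_schema(lake, schema):
    """ELT style: apply whichever schema the caller needs at read time."""
    for raw in lake:
        rec = json.loads(raw)
        yield {name: cast(rec[name]) for name, cast in schema.items() if name in rec}

records = [{"id": "1", "amount": "9.50", "note": "gift"}]
warehouse, lake = [], []
load_schema_on_write(records, warehouse)
load_raw(records, lake)
```

In this sketch the `note` field never reaches `warehouse`, but a later reader can still recover it from `lake` by passing a different schema to `read_with_schema`.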

Q8. Which of the following examples of information loss during transformation can involve false negatives?

  • Aggregation
  • Filtering
  • Lossy data compression
  • Edge computing

Q9. Which of the following loading techniques is between batch and stream loading?

  • On-demand loading
  • Incremental loading
  • Micro-batch loading
  • Parallel loading

Q10. Which of the following loading techniques can split a single file into smaller chunks?

  • Parallel loading
  • Stream loading
  • Batch loading
  • Scheduled loading
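
Splitting one file into smaller chunks and loading them concurrently might look like the sketch below. It is a minimal example under stated assumptions: `load_chunk` is a hypothetical stand-in that only counts rows, and the chunk size and worker count are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_chunks(lines, chunk_size):
    """Split a list of rows into consecutive chunks of at most chunk_size rows."""
    return [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]

def load_chunk(chunk):
    """Stand-in for a real loader; returns the number of rows it 'loaded'."""
    return len(chunk)

def parallel_load(lines, chunk_size, workers=4):
    """Load all chunks concurrently and return the total number of rows loaded."""
    chunks = split_into_chunks(lines, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(load_chunk, chunks))

rows = [f"row-{i}" for i in range(10)]
loaded = parallel_load(rows, chunk_size=4)
```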

ETL and Data Pipelines with Shell, Airflow and Kafka Week 02 Quiz Answer

Graded Quiz 01: ETL using Shell Scripts

Q1. What is the first stage of the ETL process?

  • Cleaning
  • Loading
  • Transformation
  • Extraction

Q2. Which of these transformations is correctly described?

  • Data structuring: Fixing any errors or missing values
  • Sorting: Selecting only what is needed
  • Normalizing: Converting data to common units
  • Cleaning: Merging disparate data sources

Q3. Which of these is NOT an example of a system in the data load phase?

  • A scanned medical document
  • An Excel spreadsheet
  • A comma-separated file
  • A data warehouse

Q4. Select the correct statement regarding ETL workflows as data pipelines.

  • Bottlenecks within the pipeline can often be handled by anonymizing slower tasks.
  • Data is fed through a data pipeline in large packets.
  • Overall accuracy of the ETL workflow has been a more important requirement than speed.
  • With conventional ETL pipelines data is processed in real time.

Q5. Select the correct statement regarding batch processing.

  • Batch processing triggers are rarely on demand.
  • Data is processed in batches, usually on a weekly schedule.
  • Batch processing intervals can be triggered by events.
  • When an event of interest occurs, such as an intruder alert, the interval would be periodic.

Q6. ETL pipelines are frequently used to integrate data from disparate and usually _____ systems within the enterprise.

  • siloed
  • batched
  • aggregating
  • simultaneous

Q7. Select the correct statement regarding Apache Airflow.

  • Apache Airflow represents the workflow in DAGs, but not in code.
  • Apache Airflow is a workflow orchestration tool.
  • Apache Airflow is a well-known commercial tool.
  • Apache Airflow tasks can be expressed as Python, but not Bash.

Q8. Bash uses _____ to turn your file into a Bash shell script.

  • loadstat
  • getstat
  • shebang
  • crontab
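
For readers unfamiliar with the first line of a shell script: a script that begins with `#!` followed by an interpreter path tells the operating system which program should run it. The helper below is a hypothetical sketch that only inspects that first line; it does not execute anything.

```python
def shebang_interpreter(script_text):
    """Return the interpreter named on the first line, or None if there is none."""
    lines = script_text.splitlines()
    first_line = lines[0] if lines else ""
    if not first_line.startswith("#!"):
        return None
    return first_line[2:].strip()

# Hypothetical two-line Bash script.
script = "#!/bin/bash\necho 'extracting data'\n"
interpreter = shebang_interpreter(script)
```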

Q9. SSIS, Amazon Redshift, IBM InfoSphere Information Server, and Oracle GoldenGate are examples of _____.

  • Popular commercial ETL tools
  • Popular commercial ELT tools
  • Popular open-source ELT tools
  • Popular open-source ETL tools

Q10. ETL jobs can be run on a schedule using _____.

  • shebang
  • crontab
  • loadstat
  • getstat
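
A scheduled job entry in a crontab file has five schedule fields (minute, hour, day of month, month, day of week) followed by the command to run. The sketch below splits such an entry into its parts; the script path `/home/project/etl.sh` is a made-up example, and the parser handles only this basic form.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")

def parse_crontab_entry(entry):
    """Split a crontab entry into a dict of schedule fields and the command."""
    parts = entry.split(None, 5)
    if len(parts) < 6:
        raise ValueError("expected five schedule fields followed by a command")
    schedule = dict(zip(CRON_FIELDS, parts[:5]))
    return schedule, parts[5]

# Hypothetical entry: run an ETL script every day at 02:30.
schedule, command = parse_crontab_entry("30 2 * * * /home/project/etl.sh")
```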

Graded Quiz 02: An Introduction to Data Pipelines

Q1. How does data flow through pipelines?

  • Processing threads
  • Files
  • Software processes
  • Data packets

Q2. Which of the following pipeline monitoring considerations affects the amount of data that passes through the pipeline over time?

  • Throughput
  • Latency
  • Utilization
  • Logging and alerting system

Q3. Which of the following data pipelines corresponds with the fraud detection use case?

  • Streaming data pipeline
  • Batch data pipeline
  • Micro-batch data pipeline
  • Lambda architectures

Q4. Which streaming data pipeline tool allows you to build applications using the Streams Processing Language (SPL)?

  • SQLstream
  • Apache Samza
  • Apache Spark
  • IBM Streams

Q5. Pipelines that incorporate parallelism are referred to as being _____?

  • Aligned
  • Linear
  • Dynamic or non-linear
  • Static

Q6. Batch data pipelines usually run periodically on fixed schedules. Which of the following is another method to run these?

  • Triggers
  • Error occurrence
  • Flags
  • Manually

Q7. Which of the following common features of modern ETL and ELT products is known as “no-code”?

  • Security
  • Data crawling
  • Drag-and-drop
  • Fully automated

Q8. Which of the following data pipeline use cases is the simplest?

  • File backup
  • Raw data preparation
  • Send/receive messages
  • Transactional record movement

Q9. Latency is the total time it takes for a single packet of data to pass through the pipeline. Which of the following limits latency?

  • Small data packets
  • Bad data
  • Data leak
  • Slowest process
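
The relationship between latency, throughput, and the slowest stage can be worked through with a small model. The sketch assumes the stages run in sequence, each handles one packet at a time, and there are no queueing delays; the stage names and timings are invented for the example.

```python
# Hypothetical per-stage processing times, in seconds per packet.
stage_seconds = {"extract": 0.2, "transform": 0.5, "load": 0.1}

def pipeline_latency(stages):
    """Time for one packet to pass through every stage in sequence."""
    return sum(stages.values())

def bottleneck(stages):
    """Name of the slowest stage, which caps how fast packets can flow."""
    return max(stages, key=stages.get)

def max_throughput(stages):
    """Packets per second once the pipeline is full: 1 / slowest stage time."""
    return 1 / max(stages.values())
```

Under these assumptions, speeding up any stage other than the one returned by `bottleneck` lowers latency slightly but does not raise the maximum throughput.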

Q10. Micro-batch data pipelines decrease the batch size. Which of the following do micro-batch pipelines increase?

  • Latency
  • Simple transformation
  • Storage
  • Batch process refresh rate

ETL and Data Pipelines with Shell, Airflow and Kafka Week 03 Quiz Answer

Graded Quiz: Using Apache Airflow to build Data Pipelines

Q1. Apache Airflow pipelines are built on four main principles. Which of the following principles includes parameterization?

  • Scalable
  • Extensible
  • Lean and explicit
  • Dynamic

Q2. Which of the following Apache Airflow use cases involves coordination of data in data warehouses?

  • Define machine learning pipeline dependencies
  • Decoupled batch processes
  • Scheduling tool
  • Orchestrate SQL transformation in data warehouses

Q3. An Apache Airflow DAG is a Python script consisting of logical blocks. Which of the following logical blocks might use the ‘from airflow import DAG’ statement?

  • Library imports
  • DAG definition
  • DAG arguments
  • Task pipeline

Q4. Sensors are a class of DAG operators. Which is another type of operator that defines DAG tasks?

  • Email
  • Python
  • Bash
  • All of the above

Q5. Which of the following advantages of Apache Airflow expressing workflows as code enables Git to track them?

  • Versionable
  • Testable
  • Maintainable
  • Collaborative

Q6. The ‘Task Instance Context Menu’ can be accessed from any of the DAG views that display what?

  • Tree view
  • Details
  • Task instances
  • Gantt

Q7. The final block in your Airflow pipeline script is where you specify the dependencies for your workflow. How do you specify the order of task 1 and task 2?

  • >
  • >>
  • >=
  • //
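
In an Airflow script, writing `task1 >> task2` declares that task1 runs before task2. The sketch below is a toy model of how such an operator can be built with Python's `__rshift__` method. The `Task` class here is a hypothetical stand-in, not Airflow's own operator class, so the example runs without Airflow installed.

```python
class Task:
    """Toy stand-in for an Airflow task; not the real airflow API."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = []

    def __rshift__(self, other):
        # task1 >> task2 records that task2 runs after task1.
        self.downstream.append(other)
        return other  # returning the right-hand task allows chaining

extract = Task("extract")
transform = Task("transform")
load = Task("load")

# Declares the order extract, then transform, then load.
extract >> transform >> load
```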

Q8. Which block specifies the DAG start date?

  • DAG definition
  • DAG arguments
  • Task definitions
  • Task pipeline

Q9. Which of the following Airflow metrics could fluctuate?

  • Timers
  • Gauges
  • Counters
  • None, they all can increase

Q10. Which of the following Apache Airflow basic components serves the interactive UI?

  • DAG directory
  • Executor
  • Scheduler
  • Web Server

ETL and Data Pipelines with Shell, Airflow and Kafka Week 04 Quiz Answer

Graded Quiz: Using Apache Kafka to build Pipelines for Streaming Data

Q1. ESPs are a middle layer between multiple event sources and destinations. ESPs may have different architectures and components but also some common components. Which of the following common components receives and consumes events?

  • Analytic engine
  • Query engine
  • Event storage
  • Event broker

Q2. The core component of any ESP is the event broker. Which event broker sub-component performs encryption on data?

  • Storage
  • Processor
  • Consumption
  • Ingester

Q3. The Kafka server side is a cluster with many associated servers. What are the associated servers called?

  • Associates
  • Sub-servers
  • Brokers
  • Controllers

Q4. Which of the following Kafka main features provides consumption without a deadline?

  • Distribution system
  • Reliability
  • Open source
  • Permanent persistency

Q5. Which of the following Kafka core components publish events into topics?

  • Partitions
  • Producers
  • Consumers
  • Brokers

Q6. Which of the Kafka CLI script files manages topics?

  • Kafka-console-producer
  • Kafka-console-consumer
  • Kafka-console
  • Kafka-topics

Q7. Which of the following is the Kafka Streams API based on?

  • Java
  • Gantt chart
  • Transformational graph
  • Computational graph

Q8. Which of the following do stream processors do?

  • Extracts, transforms, and loads
  • Extracts, loads, and transforms
  • Receives, transforms, and forwards
  • Processes and forwards

Q9. The Kafka Streams API is based on a computational graph called a stream processing topology. In the topology, each node is a stream processor, while the edges are the I/O streams. This topology has two special types of processors. What are they called?

  • Aggregation and stream processor
  • Source and sink processor
  • Stream and topic processor
  • Mapping and transformation processor
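
The topology described in Q9 can be pictured as a chain of nodes connected by streams. The sketch below models it with plain Python generators: one node reads from a topic, one transforms each event, and one writes to an output topic. Kafka Streams itself is a Java library; the function names, topics, and sample events here are assumptions for illustration only.

```python
def source_processor(topic):
    """Entry node: reads events from a topic and has no upstream processor."""
    for event in topic:
        yield event

def to_celsius(stream):
    """Ordinary stream processor: receives an event, transforms it, forwards it."""
    for event in stream:
        celsius = round((event["fahrenheit"] - 32) * 5 / 9, 1)
        yield {"city": event["city"], "celsius": celsius}

def sink_processor(stream, output_topic):
    """Exit node: writes events to a topic and has no downstream processor."""
    for event in stream:
        output_topic.append(event)

raw_topic = [
    {"city": "Oslo", "fahrenheit": 32.0},
    {"city": "Cairo", "fahrenheit": 86.0},
]
processed_topic = []
sink_processor(to_celsius(source_processor(raw_topic)), processed_topic)
```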

Q10. Once events are published and properly stored in topic partitions, you can create _________ to read them.

  • Partitions
  • Consumers
  • Producers
  • Brokers
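
The flow in Q5 and Q10, where events are published into topic partitions and later read back, can be sketched with an in-memory model. This is not the Kafka client API: the `Topic` and `Consumer` classes are hypothetical, and the partition is chosen with a simple byte sum rather than Kafka's own partitioner.

```python
class Topic:
    """Toy in-memory topic; real Kafka stores partitions on brokers."""

    def __init__(self, name, num_partitions=2):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def publish(self, key, value):
        """Producer side: append the event to a partition chosen from the key."""
        index = sum(key.encode()) % len(self.partitions)
        self.partitions[index].append((key, value))
        return index

class Consumer:
    """Reads one partition from its own offset; events stay in the topic."""

    def __init__(self, topic, partition):
        self.log = topic.partitions[partition]
        self.offset = 0

    def poll(self):
        """Return events published since the last poll and advance the offset."""
        events = self.log[self.offset:]
        self.offset = len(self.log)
        return events

topic = Topic("weather")
partition = topic.publish("b", "sunny")
topic.publish("b", "rain")

consumer = Consumer(topic, partition)
first = consumer.poll()
second = consumer.poll()

# A consumer created later still sees every stored event.
late = Consumer(topic, partition).poll()
```

Because reading does not remove events from the partition, each consumer tracks its own offset and can start reading at any time.
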
ETL and Data Pipelines with Shell, Airflow and Kafka Coursera Course Review:

In our experience, the ETL and Data Pipelines with Shell, Airflow and Kafka course on Coursera is worth enrolling in: you can gain new skills from professionals, completely free.

The ETL and Data Pipelines with Shell, Airflow and Kafka course is available on Coursera for free. If you are stuck on a quiz or graded assessment, visit Networking Funda to get the ETL and Data Pipelines with Shell, Airflow and Kafka Quiz Answers.

This course is part of the IBM Data Warehouse Engineer Professional Certificate.

Conclusion:

I hope these ETL and Data Pipelines with Shell, Airflow and Kafka Quiz Answers help you learn something new from this course. If they helped you, don’t forget to bookmark our site for more Coursera Quiz Answers.

This course is intended for audiences of all experience levels who are interested in learning about Data Science in a business context; there are no prerequisite courses.

Keep Learning!

Get all Course Quiz Answers of IBM Data Warehouse Engineer Professional Certificate

Introduction to Data Engineering Coursera Quiz Answers

Introduction to Relational Databases (RDBMS) Quiz Answers

Databases and SQL for Data Science with Python Quiz Answers

Hands-on Introduction to Linux Commands and Shell Scripting Quiz Answers

Relational Database Administration (DBA) Quiz Answers

ETL and Data Pipelines with Shell, Airflow and Kafka Quiz Answers

Getting Started with Data Warehousing and BI Analytics Quiz Answers

4 Comments

  1. Hello, you did not mark the answers in the quiz.
    The answers are not in bold in this case.
    Can you please have a look and update it?

    • Regret!
      There are a few courses available with no answers.
      Unfortunately, we have not found the answers yet. Our team is working to get them ASAP. Contact us or comment if you know any answers that are not highlighted here. However, we have marked a few correct answers, which you may find helpful!

