Introduction to Data Engineering Coursera Quiz Answers 2022

Introduction to Data Engineering Week 01 Quiz Answers

Q1. A modern data ecosystem includes a network of continually evolving entities. It includes:

  • Data sources, databases, and programming languages
  • Social media sources, data repositories, and APIs
  • Data providers, databases, and programming languages
  • Data sources, enterprise data repository, business stakeholders, and tools, applications, and infrastructure to manage data

Q2. Data Engineers work within the data ecosystem to:

  • Develop and maintain data architectures
  • Analyze data for actionable insights
  • Analyze data for deriving insights
  • Provide business intelligence solutions by monitoring data on different business functions

Q3. The goal of data engineering is to make quality data available for fact-finding and decision-making. Which one of these statements captures the process of data engineering?

  • Processing data and making it available to users securely
  • Collecting, processing, and storing data
  • Collecting, processing, and making data available to users securely
  • Collecting, processing, storing, and making data available to users securely

Q4. Data extracted from disparate sources can be stored in:

  • Data Lakes only
  • Databases only
  • Databases, data warehouses, data lakes, or any other type of data repository
  • Data Warehouses only

Q5. From the provided list, select the three emerging technologies that are shaping today’s data ecosystem.

  • Cloud Computing, Machine Learning, and Big Data
  • Big Data, Internet of Things, and Dashboarding
  • Machine Language, Cloud Computing, and Internet of Things
  • Cloud Computing, Internet of Things, and Dashboarding

Graded Quiz 02 Answers

Q1. Which one of these functional skills is essential to the role of a Data Engineer?

  • Inspect analytics-ready data for deriving insights
  • Proficiency in Mathematics
  • The ability to work with the software development lifecycle
  • Proficiency in working with ETL Tools

Q2. Oracle Exadata, IBM Db2 Warehouse on Cloud, IBM Netezza Performance Server, and Amazon RedShift are some of the popular __________________ in use today.

  • NoSQL Databases
  • Data Warehouses
  • Big Data Platforms
  • ETL Tools

Q3. Data Engineers manage the infrastructure required for the ingestion, processing, and storage of data.

  • True
  • False

Introduction to Data Engineering Week 02 Quiz Answers

Graded Quiz 01 Answers

Q1. There are two main types of data repositories – Transactional and Analytical. For high-volume day-to-day operational data such as banking transactions, Transactional, or OLTP, systems are the ideal choice.

  • True
  • False

Q2. Which of the following is an example of unstructured data?

  • Zipped files
  • XML
  • Spreadsheets
  • Video and Audio files

Q3. Which one of these file formats is independent of software, hardware, and operating systems, and can be viewed the same way on any device?

  • XML
  • XLSX
  • Delimited text file
  • PDF

Q4. Which data source can return data in plain text, XML, HTML, or JSON among others?

  • APIs
  • XML
  • Delimited text file
  • PDF

Q5. In the data engineer’s ecosystem, languages are classified by type. What are shell and scripting languages most commonly used for?

  • Manipulating data
  • Automating repetitive operational tasks
  • Building apps
  • Querying data

Graded Quiz 02 Answers

Q1. Data Marts and Data Warehouses have typically been relational, but the emergence of what technology has helped to let these be used for non-relational data?

  • NoSQL
  • SQL
  • Data Lake
  • ETL

Q2. What is one of the most significant advantages of an RDBMS?

  • Enforces a limit on the length of data fields
  • Can store only structured data
  • Is ACID-Compliant
  • Requires source and destination tables to be identical for migrating data
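
As an illustration of the ACID property named in the options above, here is a minimal sketch of atomicity using Python's built-in sqlite3 module. The `accounts` table, the names, and the `transfer` helper are assumptions made up for this example; they are not part of the course material.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates succeed or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This debit violates the CHECK constraint when funds are short,
            # which also undoes the credit made just above.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # enough funds
failed = transfer(conn, "alice", "bob", 500)  # not enough funds, rolled back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After the failed transfer, neither balance has changed, which is the all-or-nothing behavior that the "A" in ACID refers to.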

Q3. Which one of the NoSQL database types uses a graphical model to represent and store data, and is particularly useful for visualizing, analyzing, and finding connections between different pieces of data?

  • Key value store
  • Document-based
  • Column-based
  • Graph-based
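
To make the idea of "finding connections between different pieces of data" concrete, the sketch below models a tiny graph as a plain Python dictionary and searches it breadth-first. The `FOLLOWS` data and the `shortest_path` helper are hypothetical; a real graph database would expose its own query language instead.

```python
from collections import deque

# Hypothetical "who follows whom" graph: node -> list of connected nodes.
FOLLOWS = {
    "ana": ["ben"],
    "ben": ["cy", "dee"],
    "cy": [],
    "dee": ["eve"],
    "eve": [],
}


def shortest_path(graph, start, goal):
    """Return the shortest chain of connections from start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


path = shortest_path(FOLLOWS, "ana", "eve")
```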

Q4. Which of the data repositories serves as a pool of raw data and stores large amounts of structured, semi-structured, and unstructured data in their native formats?

  • Data Warehouses
  • Data Marts
  • Relational Databases
  • Data Lakes

Q5. While data integration combines disparate data into a unified view of the data, a data pipeline covers the entire data movement journey from source to destination systems, and ETL is a process within data integration.

  • True
  • False
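
The extract, transform, and load steps that the statement above refers to can be sketched in a few lines of Python. Everything here is an assumption for illustration: the inline CSV stands in for a source system, an in-memory SQLite database stands in for the destination, and the cleaning rules are arbitrary.

```python
import csv
import io
import sqlite3

# Stand-in for a source file; the last row has no amount.
RAW_CSV = """order_id,customer,amount
1, Alice ,19.99
2,bob,5.00
3,ALICE,
"""


def extract(text):
    """Read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Drop rows without an amount, normalize names, and cast types."""
    clean = []
    for row in rows:
        if not row["amount"].strip():
            continue
        clean.append(
            (int(row["order_id"]), row["customer"].strip().lower(), float(row["amount"]))
        )
    return clean


def load(conn, rows):
    """Write the cleaned rows to the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()
```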

Graded Quiz 03 Answers

Q1. What does the attribute “Veracity” imply in the context of Big Data?

  • Scale of data
  • The speed at which data accumulates
  • Accuracy and conformity of data to facts
  • Diversity of the type and sources of data

Q2. ______________, in the context of Big Data, is the speed at which data accumulates.

  • Velocity
  • Volume
  • Value
  • Variety

Q3. What does the attribute “Value” imply in the context of Big Data?

  • The diversity of the type and sources of data
  • The accuracy and conformity of data to facts
  • Our ability and need to turn data into value
  • The speed at which data accumulates

Q4. Apache Spark is a general-purpose data processing engine designed to extract and process Big Data for a wide range of applications. What is one of its key use cases?

  • Consolidate data across the organization
  • Perform complex analytics in real-time
  • Scalable and reliable Big Data storage
  • Fast recovery from hardware failures

Q5. Which of the Big Data processing tools is used for reading, writing, and managing large data set files that are stored in either HDFS or Apache HBase?

  • Hive
  • ETL
  • Hadoop
  • Spark

Introduction to Data Engineering Week 03 Quiz Answers

Graded Quiz 01 Answers

Q1. Which one of these steps is an intrinsic part of the “Data Processing Layer” of a data platform?

  • Transform and merge extracted data, either logically or physically
  • Read data in batch or streaming modes from storage and apply transformations
  • Deliver processed data to data consumers
  • Transfer data from data sources to the data platform in streaming, batch, or both modes

Q2. Systems that are used for capturing high-volume transactional data need to be designed for high-speed read, write, and update operations.

  • True
  • False

Q3. What is the role of “Network Access Control” systems in the area of network security?

  • To inspect incoming network traffic for intrusion attempts and vulnerabilities
  • To ensure attackers cannot tap into data while it is in transit
  • To ensure endpoint security by allowing only authorized devices to connect to the network
  • To create silos, or virtual local area networks, within a network so that you can segregate your assets

Q4. ____________ ensures that users access information based on their roles and the privileges assigned to their roles.

  • Firewalls
  • Authorization
  • Authentication
  • Security Monitoring
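
Role-based access of the kind described in this question can be sketched as a lookup from user to role to privileges. The role names, users, and the `is_authorized` helper below are hypothetical and only illustrate the idea; they do not come from any particular security product.

```python
# Hypothetical mapping of roles to the actions they may perform.
ROLE_PRIVILEGES = {
    "analyst": {"read"},
    "engineer": {"read", "write"},
    "admin": {"read", "write", "grant"},
}

# Hypothetical mapping of users to their assigned role.
USER_ROLES = {"dana": "analyst", "eli": "engineer"}


def is_authorized(user, action):
    """Allow an action only if the user's role carries that privilege."""
    role = USER_ROLES.get(user)
    return action in ROLE_PRIVILEGES.get(role, set())
```

For example, `is_authorized("dana", "write")` is `False` because the analyst role only carries the read privilege, and an unknown user is denied everything.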

Q5. Security Monitoring and Intelligence systems:

  • Ensure users access information based on their role and privileges
  • Create virtual local area networks within a network so that you can segregate your assets
  • Create an audit history for triage and compliance purposes
  • Ensure only authorized devices can connect to a network

Graded Quiz 02 Answers

Q1. Web scraping is used to extract what type of data?

  • Data from news sites and NoSQL databases
  • Images, videos, and data from NoSQL databases
  • Text, videos, and images
  • Text, videos, and data from relational databases

Q2. ___________ focuses on cleaning the database of unused data and reducing redundancy and inconsistency.

  • Data Profiling
  • Normalization
  • Denormalization
  • Data Visualization

Q3. OpenRefine is an open-source tool that allows you to:

  • Transform data into a variety of formats such as TSV, CSV, XLS, XML, and JSON
  • Use add-ins such as Microsoft Power Query to identify issues and clean data
  • Enforce applicable data governance policies automatically
  • Automatically detect schemas, data types, and anomalies

Q4. When you’re combining rows of data from multiple source tables into a single table, what kind of data transformation are you performing?

  • Unions
  • Denormalization
  • Normalization
  • Joins
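
The difference between two of the options above can be seen in a small sketch using Python's built-in sqlite3 module: a union stacks rows from tables that share the same columns, while a join adds columns from a related table. The table names and values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_q1 (customer_id INTEGER, amount REAL);
    CREATE TABLE sales_q2 (customer_id INTEGER, amount REAL);
    CREATE TABLE customers (customer_id INTEGER, name TEXT);
    INSERT INTO sales_q1 VALUES (1, 10.0), (2, 20.0);
    INSERT INTO sales_q2 VALUES (1, 15.0);
    INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
""")

# Union: rows from both sales tables end up in one result with the same columns.
union_rows = conn.execute(
    "SELECT customer_id, amount FROM sales_q1 "
    "UNION ALL "
    "SELECT customer_id, amount FROM sales_q2 "
    "ORDER BY customer_id, amount"
).fetchall()

# Join: each sales row is widened with the matching customer's name.
join_rows = conn.execute(
    "SELECT s.customer_id, c.name, s.amount "
    "FROM sales_q1 AS s JOIN customers AS c ON c.customer_id = s.customer_id "
    "ORDER BY s.customer_id"
).fetchall()
```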

Q5. When you detect a value in your data set that is vastly different from other observations in the same data set, what would you report that as?

  • Syntax error
  • Outlier
  • Irrelevant data
  • Missing value
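
One common way to flag a value that is vastly different from the rest is the interquartile range (IQR) rule. The sketch below is only one possible approach, not the course's prescribed method; the `iqr_outliers` helper, the multiplier of 1.5, and the sample readings are assumptions.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the middle 50% of the data."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


readings = [10, 12, 11, 13, 12, 11, 10, 95]
flagged = iqr_outliers(readings)
```

Whether a flagged value is an error or a genuine extreme still needs a human judgment; the rule only points at candidates.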

Graded Quiz 03 Answers

Q1. What are some of the querying techniques you can apply to identify extreme values in a data column?

  • Performing partial matches of data values
  • Slicing a data set
  • Maximum and Minimum values in a data column
  • Aggregation

Q2. You can perform partial matches of data values in a data column using:

  • Count function
  • Average function
  • Filtering patterns
  • Slicing a data set
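
The two querying techniques asked about in Q1 and Q2 of this quiz can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the `products` table and its rows are invented, and SQL's `MIN`, `MAX`, and `LIKE` are used as one way of finding extreme values and partial matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("USB cable", 4.5), ("USB hub", 19.0), ("Monitor", 149.0), ("HDMI cable", 7.25)],
)

# Extreme values in a column: the smallest and largest price.
lowest, highest = conn.execute(
    "SELECT MIN(price), MAX(price) FROM products"
).fetchone()

# Partial match with a filtering pattern: names that end in "cable".
cables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM products WHERE name LIKE '%cable' ORDER BY name"
    )
]
```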

Q3. Tools for ______________ break up a job into a series of logical steps which are monitored for completion and time to completion.

  • Application Performance Monitoring
  • Monitoring Query Performance
  • Job-level Runtime Monitoring
  • Monitoring the amount of data being processed in a data pipeline

Q4. Database partitioning helps optimize databases for performance. It does this by:

  • Reducing inconsistencies and anomalies in data
  • Dividing large tables into smaller individual tables
  • Tracking request response time and error messages
  • Minimizing the number of times a disk needs to be accessed when a query is processed

Q5. Database normalization is a design technique that helps reduce inconsistencies and anomalies from data.

  • True
  • False

Graded Quiz 04 Answers

Q1. In which phase of the data lifecycle do you establish the data you need, the amount of data you need, and how you intend to use the data you are collecting?

  • Data Acquisition
  • Data Retention
  • Data Sharing
  • Data Processing

Q2. The process of _____________ abstracts the presentation layer without changing the data in the database physically.

  • Anonymization
  • Encryption
  • Pseudonymization
  • Data Profiling

Introduction to Data Engineering Week 04 Quiz Answers

Graded Quiz Answers

Q1. Data Engineering is a highly technical field. While communication, collaboration, and project management skills are somewhat useful, you don’t need these skills in order to grow in your role as a data engineer.

  • True
  • False

Q2. As a Lead Data Engineer, what are some of the things you may be responsible for in addition to your hands-on skills?

  • Converting business requirements into technical specifications
  • Identify correlations, find patterns, and apply statistical methods to analyze and mine data
  • Visualize data to interpret and present the findings of data analysis
  • Provide business intelligence solutions by extracting insights from data

Q3. What are some of the factors that influence your growth on your journey from an Associate Data Engineer to a Principal Data Engineer role?

  • Domain specialization, such as in Healthcare, Banking, and Technology
  • If you spend enough time at one level, you are bound to grow into the next level role in a couple of years.
  • A Master’s degree in either Mathematics or Statistics
  • The amount of experience you gain within your chosen area of specialization and your understanding of other areas within data engineering

Q4. If you are an IT Support Specialist or a Software Tester, gaining entry into the field of data engineering will not be possible for you.

  • True
  • False

Q5. If you have basic familiarity with coding, you can develop some baseline technical skills that can get you started on your journey as a Data Engineer. What are some of these baseline skills?

  • Designing data pipelines
  • Familiarity with Big Data processing tools
  • Knowledge of operating systems, databases, and programming and query languages
  • Architecting data warehouses

Team Networking Funda

We are Team Networking Funda, a group of passionate authors and networking enthusiasts committed to sharing our expertise and experiences in networking and team building. With backgrounds in Data Science, Information Technology, Health, and Business Marketing, we bring diverse perspectives and insights to help you navigate the challenges and opportunities of professional networking and teamwork.
