Big Data, Genes, and Medicine Coursera Quiz Answers (All Weeks)
Big Data, Genes, and Medicine Week 01 Quiz Answers
DNA, RNA, Genes, and Proteins
Q1. The sequence “GTGAGCACCGTGCTGACCTCCAAATACCGTTAAGCTGGAGCCTCGGTGGC” can be a fragment of (check only one):
- Protein
- RNA
- DNA
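A quick way to reason about Q1 is to compare the letters in the fragment against the nucleotide alphabets. The sketch below is only an illustration; the function name is hypothetical. A letter check alone cannot exclude a peptide, because A, C, G, and T are also one-letter amino acid codes, so treat the result as a hint rather than proof.

```python
DNA_LETTERS = set("ACGT")
RNA_LETTERS = set("ACGU")


def compatible_nucleic_acids(sequence):
    """Return the nucleic-acid alphabets that contain every letter of `sequence`."""
    letters = set(sequence.upper())
    matches = []
    if letters <= DNA_LETTERS:
        matches.append("DNA")
    if letters <= RNA_LETTERS:
        matches.append("RNA")
    return matches


fragment = "GTGAGCACCGTGCTGACCTCCAAATACCGTTAAGCTGGAGCCTCGGTGGC"
# The fragment contains T and no U, so only the DNA alphabet matches.
print(compatible_nucleic_acids(fragment))  # ['DNA']
```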
Q2. Compare and contrast the different types of RNA in a cell by finding the one FALSE answer among the following:
- rRNA, or ribosomal RNA, is located in the ribosomes and is an integral part of them (60%).
- tRNA, or transfer RNA, assists in mRNA translation by carrying each new amino acid to the growing end of a protein being created.
- Each amino acid has its specific ribosomal rRNA in a ribosome.
- mRNA, or messenger RNA, is the one that specializes in coding for proteins.
Transcription and Translation Processes
Q1. What is the RNA sequence complementary to the DNA sequence “AGAATCGGGACA”?
- “TCTTAGCCCTGT”
- “AGAATCGGGACA”
- “UCUUAGCCCUGU”
- “AGAAUCGGGACA”
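The base-pairing rule behind Q1 (A pairs with U, T with A, C with G, G with C) can be sketched in a few lines. This is a minimal illustration with a hypothetical function name; it complements position by position, as the answer options do, and does not reverse the strand.

```python
# DNA base -> complementary RNA base: A->U, C->G, G->C, T->A
DNA_TO_RNA_COMPLEMENT = str.maketrans("ACGT", "UGCA")


def rna_complement(dna):
    """Return the RNA sequence complementary to `dna`, position by position."""
    return dna.upper().translate(DNA_TO_RNA_COMPLEMENT)


print(rna_complement("AGAATCGGGACA"))  # UCUUAGCCCUGU
```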
Q2. Which of the following codons have a different amino acid associated with them (you may use the protein wheel to do the translation)?
- AUU
- AUG
- AUA
- AUC
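For Q2, a lookup in the standard genetic code does the same job as the protein wheel. The sketch below includes only the four codons from the question, not the full 64-codon table, and the function name is hypothetical.

```python
from collections import defaultdict

# Excerpt of the standard genetic code (RNA codons) covering only the quiz options.
CODON_TO_AMINO_ACID = {
    "AUU": "Ile",
    "AUC": "Ile",
    "AUA": "Ile",
    "AUG": "Met",  # also the usual start codon
}


def group_by_amino_acid(codons):
    """Group codons by the amino acid they encode."""
    groups = defaultdict(list)
    for codon in codons:
        groups[CODON_TO_AMINO_ACID[codon]].append(codon)
    return dict(groups)


print(group_by_amino_acid(["AUU", "AUG", "AUA", "AUC"]))
# {'Ile': ['AUU', 'AUA', 'AUC'], 'Met': ['AUG']}
```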
Q3. Which of the following biological macromolecules will never contain a “Tyrosine” subunit? Choose all that apply.
- RNA
- DNA
- Protein
Data, Variables, and Big Datasets
Q1. What kind of data is represented by the Gender (Male / Female) and Insurance (Medicare / Medicaid / Blue Cross / Commercial / Other) variables? Choose all that apply. (Hint: Focus on the nature of the data contained by the variable rather than the graphical display of that data).
- Categorical
- Nominal
- Numeric
Q2. What kind of data would most likely be represented by a Body Temperature (°C) variable? Choose all that apply.
- Interval
- Ratio
- Nominal
- Ordinal
Working with cBioPortal
Q1. To answer the following question, submit a query using a “User-defined List” as input at www.cbioportal.org, with the “tcga” cancer study tag entered:
Querying JAK2 on cBioPortal, what accounts for the majority of its alteration in GBM (TCGA 2008)? Hint: see the “Overview” tab.
- Amplification
- Deletion
- Mutation
Q2. To answer the following question, submit a query using a “User-defined List” as input at www.cbioportal.org, with the “tcga” cancer study tag entered:
Querying EGFR ERBB2 ERBB3 ERBB4 on cBioPortal, which of the following mutation types does ERBB2 exhibit the most often? Hint: see the “Mutations” tab.
- Truncating
- Missense
- In-frame
Q3. To answer the following question, submit a query using a “User-defined List” as input at www.cbioportal.org, with the Breast Invasive Carcinoma (TCGA, Cell 2015) dataset selected. Then, make sure to select “mRNA Expression data z-Scores (RNA Seq V2 RSEM)” with a z-score threshold of 2.0:
Querying JAK2 on cBioPortal, how is the JAK2 gene mRNA mostly dysregulated? Hint: see the “OncoPrint” tab.
- Downregulated
- Upregulated
Big Data, Genes, and Medicine Week 02 Quiz Answers
Datasets and Files Quiz Answers
Q1. Which of the following is an example of a File?
- An Excel spreadsheet
- A cell in an Excel spreadsheet
- A row in an Excel spreadsheet
- A column in an Excel spreadsheet
Q2. A CSV file uses a special character to separate fields. Which of the following are commonly used separator characters?
- Character “a”
- A tab
- A space
- A comma
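To see how the separator character matters in practice, here is a small sketch using Python's standard csv module. The helper name and the two-row sample text are made up for illustration.

```python
import csv
import io


def parse_delimited(text, delimiter):
    """Split delimited text into rows of fields using the given separator."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# Hypothetical toy content, once comma-separated and once tab-separated.
comma_text = "gene,expression\nJAK2,2.5\n"
tab_text = "gene\texpression\nJAK2\t2.5\n"

print(parse_delimited(comma_text, ","))   # [['gene', 'expression'], ['JAK2', '2.5']]
print(parse_delimited(tab_text, "\t"))    # [['gene', 'expression'], ['JAK2', '2.5']]
```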
Data Preprocessing Tasks
Q1. Data preprocessing tasks include (choose all that apply):
- Discretizing variables
- Replacing missing values
- Normalizing data
- Adding variables
- Imputing missing values
Q2. Which of the following is a type of data reduction?
- Data normalization
- Feature selection
- Data imputation
- Data understanding
Normalization and Discretization
Q1. When normalizing the dataset [10, 20, 30, 40, 50] through Z-score normalization, which of the following elements would be normalized to 0?
- 20
- 10
- 40
- 50
- 30
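Z-score normalization maps each value x to (x - mean) / standard deviation, so the element equal to the mean becomes 0. The sketch below assumes the population standard deviation; using the sample standard deviation would change the other scores but not which element maps to 0.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return the Z-score of each value, using the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


data = [10, 20, 30, 40, 50]
scores = z_scores(data)
# The mean of the data is 30, so 30 is the element normalized to 0.
print(data[scores.index(0)])  # 30
```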
Q2. Consider the data [2, 4, 8, 16] discretized by equal-depth binning. Which of the following is a valid set of equal-depth bin intervals for these data?
- [2, 3], [4, 16]
- [2, 7], [8, 16]
- [1, 5], [6, 10], [11, 15], [16, 20]
- [2, 8], [9, 16]
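Equal-depth (equal-frequency) binning requires every bin to hold the same number of data points. One way to check the candidate intervals in Q2 is to count how many values fall into each one; the helper names below are hypothetical and the intervals are treated as closed on both ends.

```python
def counts_per_bin(values, bins):
    """Count how many values fall inside each closed interval (low, high)."""
    return [sum(low <= v <= high for v in values) for low, high in bins]


def is_equal_depth(values, bins):
    """True if every value lands in a bin and all bins hold the same count."""
    counts = counts_per_bin(values, bins)
    return sum(counts) == len(values) and len(set(counts)) == 1


data = [2, 4, 8, 16]
candidates = [
    [(2, 3), (4, 16)],
    [(2, 7), (8, 16)],
    [(1, 5), (6, 10), (11, 15), (16, 20)],
    [(2, 8), (9, 16)],
]
for bins in candidates:
    print(bins, counts_per_bin(data, bins), is_equal_depth(data, bins))
```

Only the second candidate puts two values in each bin.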
Data Reduction
Q1. When sampling a dataset [10, 20, 30, 40, 50, 60, 70, 80, 90] with Simple Random Sampling With Replacement, which samples can we obtain? Choose all that apply.
- [10]
- [20, 40, 40]
- [30, 30, 30, 30]
- [10, 50, 60]
Q2. When sampling a dataset [10, 10, 20, 30] with Simple Random Sampling Without Replacement, which samples can we obtain? Choose all that apply.
- [10, 20, 20]
- [10, 20, 30]
- [10, 10, 20]
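The difference between the two sampling schemes in Q1 and Q2 comes down to whether a drawn element goes back into the pool. The sketch below checks whether a given sample could have been drawn under each scheme; the function names are hypothetical. To actually draw samples, the standard library offers random.choices (with replacement) and random.sample (without replacement).

```python
from collections import Counter


def possible_with_replacement(dataset, sample):
    """With replacement, any element of the dataset may repeat any number of times."""
    return all(x in dataset for x in sample)


def possible_without_replacement(dataset, sample):
    """Without replacement, a value can appear at most as often as it does in the dataset."""
    available = Counter(dataset)
    needed = Counter(sample)
    return all(available[x] >= n for x, n in needed.items())


q1_data = [10, 20, 30, 40, 50, 60, 70, 80, 90]
for s in ([10], [20, 40, 40], [30, 30, 30, 30], [10, 50, 60]):
    print(s, possible_with_replacement(q1_data, s))

q2_data = [10, 10, 20, 30]
for s in ([10, 20, 20], [10, 20, 30], [10, 10, 20]):
    print(s, possible_without_replacement(q2_data, s))
```

In Q2 the value 10 occurs twice in the dataset, so it may appear twice in a sample, while 20 occurs once and may not.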
Working with R
Q1. The code below loads a previously installed package into R.
library(examplePackage)
Which of the following is not a conventional way to install an R package?
- From Bioconductor, or another contributed package resource
- Directly from The Comprehensive R Archive Network (CRAN), using install.packages()
- Purchasing an annual proprietary license for CRAN
Q2. Which of the following statements about Anaconda is not true?
- Anaconda can provide a Jupyter Notebook written in Python to run scripts online.
- Anaconda provides R by default.
- Anaconda is a set of tools for data science.
Big Data, Genes, and Medicine Week 03 Quiz Answers
Feature Selection Methods Quiz Answers
Q1. With a dataset having 4 features, how many different subsets of features are there?
- 16
- 8
- 4
- 2
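A set of n features has 2^n subsets when the empty set and the full set are both counted, because each feature is either in or out. The sketch below enumerates them for four placeholder feature names (the names are made up for illustration).

```python
from itertools import combinations


def all_subsets(features):
    """Return every subset of `features`, from the empty set to the full set."""
    return [
        subset
        for size in range(len(features) + 1)
        for subset in combinations(features, size)
    ]


features = ["f1", "f2", "f3", "f4"]  # hypothetical feature names
subsets = all_subsets(features)
print(len(subsets))  # 16
```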
Q2. Which of the following are examples of feature selection methods? Choose all that apply.
- Log transformation.
- Filter methods.
- Student’s t-test.
- Wrapper methods.
Differentially Expressed Genes
Q1. Which of the following statements is FALSE regarding the identification of differentially expressed genes?
- Identifying differentially expressed genes requires more than one subject in each population examined.
- Identifying differentially expressed genes requires one and only one population (e.g., diseased subjects).
- Identifying differentially expressed genes requires the consideration of multiple expression levels across multiple subjects.
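One common way to compare a gene's expression between two populations is a t-test. The sketch below computes Welch's t statistic from scratch as an illustration; the choice of Welch's variant is an assumption (the course may use a different test), the function name is hypothetical, and the expression values are invented toy numbers. It also shows why each population needs more than one subject: a variance cannot be estimated from a single value.

```python
from math import sqrt
from statistics import StatisticsError, mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for the difference in means of two groups."""
    var_a, var_b = variance(group_a), variance(group_b)  # needs >= 2 values each
    standard_error = sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical expression levels of one gene in two populations.
diseased = [5.0, 6.0, 7.0]
healthy = [1.0, 2.0, 3.0]
print(round(welch_t(diseased, healthy), 3))  # 4.899

try:
    welch_t([5.0], healthy)  # a single subject gives no variance estimate
except StatisticsError as error:
    print("cannot test:", error)
```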
Heatmaps
Q1. Which of the following best describes the axes of a heatmap?
- One axis may show genes while the other axis may show gene expression.
- One axis may show subjects while the other axis may show gene expression.
- One axis shows genes while the other axis shows subjects.
Big Data, Genes, and Medicine Week 04 Quiz Answers
Overview Quiz Answers
Q1. At which stage of Big Data analytics are classification and prediction methods used?
- Domain understanding
- Data understanding
- Data preparation
- Model building
- Training and evaluation
- Deployment
Q2. When building a classifier, which of the following steps are required at a minimum? Choose all that apply.
- Training step
- Test step
Classification with Analogy
Q1. Which of the following correctly describes a support vector machine (SVM) algorithm? Choose all that apply.
- It uses a kernel function to solve non-linear problems as if they were linear.
- It uses an ensemble of decision trees to separate classes.
- It aims at finding an optimal hyperplane to separate objects into separate classes.
Classification based on Rules
Q1. Which of the following best describes a decision tree?
- It requires the user to specify known class labels for the training step.
- It uses the analogy of the brain organization to create a tree.
- It builds a tree that is as compact as possible proceeding from bottom to top.
Classification with Neural Networks
Q1. Which of the following correctly describes a neural network?
- An input layer, one or several hidden layers, and an output layer are required to solve complex problems.
- It uses prior probabilities to calculate the classification rule.
- It uses a kernel function to solve non-linear problems as if they were linear.
Classification based on Statistics
Q1. Which of the following are statistical methods for classification? Choose all that apply.
- Generalized Linear Models (GLM)
- Bayesian network
- Logistic regression
Classification based on Probabilities
Q1. Which of the following are names for probabilistic models? Choose all that apply.
- Graphical models
- Generalized Linear Models (GLM)
- Bayesian networks
- Decision trees
Prediction Models
Q1. Consider the scenario where a researcher uses neural networks to predict the presence or absence of cancer from measured variables. This is an example of which of the following models?
- Classification
- Prediction
- Clustering
Big Data, Genes, and Medicine Week 05 Quiz Answers
Gene Alterations Quiz Answers
Q1. Which of the following are the main effects of gene alterations? Choose all that apply.
- Underexpression of the corresponding gene
- Replacement of a chromosome
- Repression of a transcript
- Overexpression of the corresponding gene
- Creation of different forms of proteins
Q2. Which of the following terms best describes a genetic event in which part of one chromosome combines with part of another chromosome?
- Silent mutation
- Translocation
- Missense mutation
- Nonsense mutation
Gene Mutations
Q1. Which terms describe the type of mutation shown in the following figure? Choose all that apply.
- A single-point mutation
- A nonsense mutation
- A truncating mutation
- A synonymous mutation
Q2. Which of the following describes an individual homozygous for Gene A?
- The individual has two different alleles for Gene A.
- The individual has two copies of the same allele for Gene A.
Copy Number Alterations
Q1. Which of the following events would most likely lead to over-expression of Gene A?
- Repression of Gene A through DNA methylation.
- A synonymous mutation in the coding portion of Gene A.
- A decrease in the copy number of Gene A.
- An increase in the copy number of Gene A.
Q2. Which of the following accurately describes the difference between gene amplification and gene copy number gain?
- The number of copies of a gene in gene amplification is much smaller than in gene copy number gain.
- The number of copies of a gene in gene amplification is much larger than in gene copy number gain.
Genomic Alterations and Gene Expressions
Q1. To answer the following question, submit a query using a “User-defined List” as input at www.cbioportal.org, with the Uterine Corpus Endometrial Carcinoma (TCGA, Nature 2013) dataset selected. Then, make sure to select “mRNA Expression data z-Scores (RNA Seq V2 RSEM)” with a z-score threshold of 2.0:
Querying EGFR on cBioPortal, what percentage of diagnosed patients have genetic mutations, copy number variations, or gene expression alterations for this gene? Hint: see the “OncoPrint” tab.
Q2. To answer the following question, submit a query using a “User-defined List” as input at www.cbioportal.org, with the Uterine Corpus Endometrial Carcinoma (TCGA, Nature 2013) dataset selected. Then, make sure to select “mRNA Expression data z-Scores (RNA Seq V2 RSEM)” with a z-score threshold of 2.0:
Querying EGFR on cBioPortal, which of the following is true about this gene? Hint: see the “OncoPrint” tab.
- Most gene alterations are truncating mutation events.
- Most gene alterations are mRNA upregulation events.
- Most gene alterations are amplification events.
- Most gene alterations are missense mutation events.
Big Data, Genes, and Medicine Week 06 Quiz Answers
Clustering Quiz Answers
Q1. Which of the following best describes a cluster?
- A type of supervised learning method.
- A group of users of data analytics services.
- A group of data within a dataset.
Q2. Which of the following are criteria used for assessing the performance of a clustering method? Choose all that apply.
- Robustness to noise
- Understandability
- Scalability
Q3. What is the Euclidean distance between the two points (1, 3) and (1, 4) in a dataset?
Answer: 1. The Euclidean distance is the square root of the sum of squared coordinate differences: sqrt((1 - 1)^2 + (4 - 3)^2) = sqrt(1) = 1.
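The distance asked for in Q3 can be computed directly from the definition. A minimal sketch, with a hypothetical function name:

```python
from math import sqrt


def euclidean_distance(p, q):
    """Square root of the sum of squared coordinate differences."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


print(euclidean_distance((1, 3), (1, 4)))  # 1.0
```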
Clustering Methods
Q1. Which of the following statements correctly applies to KMeans clustering? Choose all that apply.
- The centroid of a cluster is calculated as the average of the samples in that cluster.
- KMeans clustering is robust to outliers.
- KMeans automatically determines the optimal number of clusters in a dataset.
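To make the centroid statement concrete, the sketch below averages the points of a cluster coordinate by coordinate, which is how a KMeans centroid is defined. The points are invented toy data and the function name is hypothetical. Adding one extreme point shows how strongly a mean-based centroid reacts to an outlier.

```python
def centroid(points):
    """Mean of the points, computed separately for each coordinate."""
    count = len(points)
    return tuple(sum(coords) / count for coords in zip(*points))


cluster = [(1, 1), (2, 2), (3, 3)]  # hypothetical cluster members
print(centroid(cluster))                 # (2.0, 2.0)
print(centroid(cluster + [(100, 100)]))  # (26.5, 26.5)
```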
Q2. Which of the following statements correctly applies to density-based clustering? Choose all that apply.
- Clusters can have any shape.
- It is a computationally expensive clustering algorithm.
- The centroid of a cluster is calculated as the median of the samples in that cluster.
Q3. Hierarchical clustering that starts from the whole dataset and progressively creates clusters of smaller and smaller size is called:
- Divisive
- Agglomerative
Pathways Quiz Answers
Q1. A cis-regulatory circuit involves:
- Upstream regulation factors not located close to the gene being regulated.
- Only transcription factors located close to the gene being regulated.
Q2. Which of the following are databases through which known pathways can be found? Choose all that apply.
- BIOCARTA
- REACTOME
- DBSCAN