Command Line Tools for Genomic Data Science Quiz Answers

Get All Weeks Command Line Tools for Genomic Data Science Quiz Answers

Command Line Tools for Genomic Data Science Week 01 Quiz Answers

Quiz 1: Module 1 Quiz

Q1. Which of the following Unix commands can be used to view the content of a file?

less

Q2. Which of the following commands can be used to compress the content of a file?

gzip

Q3. The file “months” lists each of the 12 months on a separate line and no further lines. What would be the result if the following command was run:

cat months | head -1000 | wc -l

12
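
The count can be checked on a throwaway copy of the file. The sketch below is illustrative only: the scratch directory and the variable names are assumptions, and `head -n 1000` is used as the portable spelling of `head -1000`. Because the file holds fewer lines than the limit, `head` passes all of them through.

```shell
# Recreate "months" in a scratch directory (hypothetical location).
work=$(mktemp -d)
printf '%s\n' January February March April May June July \
  August September October November December > "$work/months"

# head keeps at most 1000 lines; the file has only 12, so all pass through.
# tr strips the padding that some wc implementations add.
n_lines=$(cat "$work/months" | head -n 1000 | wc -l | tr -d ' ')
echo "$n_lines"
```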

Q4. What is the effect of using the pipe operator ‘|’ in a sequence of commands:

Passes the output of the command on its left as input to the command on its right

Q5. If typing ‘pwd’ produces “/home/userA/Coursera/L1/”, which of the following commands will list the file content of the current directory?

ls .

Q6. Suppose your current working directory is “/home/Coursera/L1/”, and “peach”, “apple” and “pear” are subdirectories, each containing a single file named “genome”. What would be the current directory, as reported by running the ‘pwd’ command, after each of the four commands in the sequence below:

cd apple
rm *
cd ../..
mv apple pl

/home/Coursera/L1/apple, /home/Coursera/L1/apple, /home/Coursera, /home/Coursera
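
The sequence can be replayed safely under a scratch root. This is a sketch under assumptions: the layout is rebuilt beneath a temporary directory (so every path carries that prefix, which is stripped before printing), and `rm ./*` stands in for `rm *`. Only the `cd` commands change the working directory; `rm` and `mv` do not, and the `mv` fails because there is no `apple` entry two levels up.

```shell
# Build the directory layout from the question under a scratch root.
root=$(mktemp -d)
for d in peach apple pear; do
  mkdir -p "$root/home/Coursera/L1/$d"
  : > "$root/home/Coursera/L1/$d/genome"
done
cd "$root/home/Coursera/L1" || exit 1

# Record the working directory (minus the scratch prefix) after each command.
cd apple;                        step1=${PWD#"$root"}
rm ./*;                          step2=${PWD#"$root"}
cd ../..;                        step3=${PWD#"$root"}
mv apple pl 2>/dev/null || true; step4=${PWD#"$root"}   # no "apple" here

printf '%s\n' "$step1" "$step2" "$step3" "$step4"
```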

Q7. Consider the file “seasons” with the following columns separated by spaces ‘ ‘:

January 1 winter
…
December 12 winter

4, 3

Q8. Your current working directory is named “Plants”. Its subdirectory “apple” contains the files “apple.genome”, “apple.samples” and “apple.genes”. What would be the result of the command rmdir apple?

The command will have no effect, since the directory is not empty
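
A quick way to see this behaviour is to try it in a scratch directory. The sketch below assumes a temporary location and made-up empty files; `rmdir` reports an error for a non-empty directory and leaves its contents untouched.

```shell
# Create "apple" with three files inside a scratch directory.
work=$(mktemp -d)
mkdir "$work/apple"
for f in apple.genome apple.samples apple.genes; do
  : > "$work/apple/$f"
done

# rmdir only removes empty directories, so this call fails and changes nothing.
if rmdir "$work/apple" 2>/dev/null; then result="removed"; else result="kept"; fi
n_files=$(ls "$work/apple" | wc -l | tr -d ' ')
echo "$result $n_files"
```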

Q9. Suppose that you have two files, A and B, containing experiment data:

File A:     File B:

geneA +     geneB +
geneB +     geneC +
geneC -

What would be the sequence of outputs for the commands:

(1) comm -3 A B | wc -l
(2) comm -1 -3 A B | wc -l
(3) comm -2 A B | wc -l

3, 1, 3
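
The three counts can be reproduced directly. This sketch assumes the two files contain exactly the lines listed in the question (with a plain hyphen for the minus sign) and fixes the locale so that `comm` sees the sorted order it expects. Two lines are unique to A, one is unique to B, and one is shared.

```shell
export LC_ALL=C   # byte-wise ordering, so the inputs count as sorted for comm
work=$(mktemp -d)
printf 'geneA +\ngeneB +\ngeneC -\n' > "$work/A"
printf 'geneB +\ngeneC +\n'          > "$work/B"

# -3 hides the common column: lines unique to A plus lines unique to B
n1=$(comm -3 "$work/A" "$work/B" | wc -l | tr -d ' ')
# -1 -3 leaves only the lines unique to B
n2=$(comm -1 -3 "$work/A" "$work/B" | wc -l | tr -d ' ')
# -2 hides lines unique to B: lines unique to A plus the common lines
n3=$(comm -2 "$work/A" "$work/B" | wc -l | tr -d ' ')
echo "$n1,$n2,$n3"
```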

Q10. The current working directory contains four subdirectories named “apple”, “pear”, “peach” and “strawberry”, each with the following files: “genome”, “genes” and “samples”. Which of the following commands would extract the top line from all of the “genes” files?

head -1 */genes
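
The glob `*/genes` expands to one path per subdirectory, so a single `head` call covers all four files. The sketch below rebuilds the layout with made-up gene names in a scratch directory (both are assumptions) and uses `head -n 1`, the portable spelling of `head -1`.

```shell
work=$(mktemp -d)
cd "$work" || exit 1
for d in apple pear peach strawberry; do
  mkdir "$d"
  printf '%s_gene1\n%s_gene2\n' "$d" "$d" > "$d/genes"
  : > "$d/genome"
  : > "$d/samples"
done

# With several files, head prints a "==> name <==" banner before each excerpt.
out=$(head -n 1 */genes)
n_banners=$(printf '%s\n' "$out" | grep -c '^==> ')
n_first=$(printf '%s\n' "$out" | grep -c '_gene1$')
echo "$n_banners banners, $n_first first lines"
```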

Quiz 2: Module 1 Exam

Q1. How many chromosomes are there in the genome?

Q2. How many genes?

5453

Q3. How many transcript variants?

5456

Q4. How many genes have a single splice variant?

5450

Q5. How many genes have 2 or more splice variants?

3

Q6. How many genes are there on the ‘+’ strand?

2662

Q7. How many genes are there on the ‘-’ strand?

2791

Q8. How many genes are there on chromosome chr1?

1624

Q9. How many genes are there on chromosome chr2?

2058

Q10. How many genes are there on chromosome chr3?

1771

Q11. How many transcripts are there on chr1?

1625

Q12. How many transcripts are there on chr2?

2059

Q13. How many transcripts are there on chr3?

1772
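
Counts like the ones above usually come from `cut`, `sort` and `uniq` pipelines. The exam's annotation file is not reproduced on this page, so the sketch below assumes a hypothetical tab-separated layout (chromosome, gene, transcript, strand) and uses toy data; the column numbers would need adjusting for the real file.

```shell
export LC_ALL=C
annot=$(mktemp)
# Assumed columns: chrom, gene, transcript, strand (toy data, not the exam file)
printf 'chr1\tg1\tg1.1\t+\nchr1\tg1\tg1.2\t+\nchr1\tg2\tg2.1\t-\nchr2\tg3\tg3.1\t-\n' > "$annot"

n_chrom=$(cut -f1 "$annot" | sort -u | wc -l | tr -d ' ')
n_genes=$(cut -f2 "$annot" | sort -u | wc -l | tr -d ' ')
n_tx=$(cut -f3 "$annot" | sort -u | wc -l | tr -d ' ')
# Genes with a single splice variant: transcripts per gene, keep counts of 1
n_single=$(cut -f2 "$annot" | sort | uniq -c | awk '$1 == 1' | wc -l | tr -d ' ')
# Genes on chr1: de-duplicate (chrom, gene) pairs first, then count the chromosome
genes_chr1=$(cut -f1,2 "$annot" | sort -u | cut -f1 | grep -c -x 'chr1')
echo "$n_chrom $n_genes $n_tx $n_single $genes_chr1"
```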

Q14. How many genes are in common between condition A and condition B?

2410

Q15. How many genes are specific to condition A?

1205

Q16. How many genes are specific to condition B?

1243

Q17. How many genes are in common to all three conditions?

1608

Command Line Tools for Genomic Data Science Week 02 Quiz Answers

Quiz 1: Module 2 Quiz

Q1. Which of the following strings cannot denote a DNA sequence:

MASLLRG

Q2. How many lines does it take to specify:

i) one fasta sequence? and ii) one fastq sequence?

Select the best answer.

Fasta – a fasta header followed by any number of sequence lines; fastq – 4 lines
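
This difference shapes how records are counted in each format. The sketch below uses two toy files (made up for illustration) and assumes the common unwrapped fastq layout of exactly four lines per read: fasta records are counted by their `>` headers, fastq records by dividing the line count by four.

```shell
work=$(mktemp -d)
# Two fasta records; the first wraps its sequence over two lines.
printf '>seq1 example\nACGTACGT\nACGT\n>seq2\nGGCC\n' > "$work/toy.fa"
# Two fastq records; the second quality string happens to start with '@'.
printf '@r1\nACGT\n+\nIIII\n@r2\nGGCC\n+\n@III\n' > "$work/toy.fq"

n_fasta=$(grep -c '^>' "$work/toy.fa")
# Quality strings may begin with '@', so count lines instead of '@' headers.
n_fastq=$(( $(wc -l < "$work/toy.fq") / 4 ))
echo "$n_fasta fasta records, $n_fastq fastq records"
```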

Q3. Which of the following is incorrect:

BEDtools can be used to align sequences to the genome.

Q4. Which of the following is NOT an alignment operation:

Cut and paste

Q5. What is the minimum number of columns that are sufficient to specify a BED format?

3

Q6. Which of the following represents the most accurate conversion into BED of the GTF record:

chr1 CLASS exon 516 811 100 + . gene_id "genA"; transcript_id "genA.1";
chr1 CLASS exon 1001 1115 100 + . gene_id "genA"; transcript_id "genA.1";
chr1 CLASS exon 3010 3312 100 + . gene_id "genA"; transcript_id "genA.1";

chr1 515 3312 genA.1 100 + 515 3312 0 3 296,115,303 0,485,2494
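
The conversion rests on two rules: GTF coordinates are 1-based and inclusive, while BED starts are 0-based, and BED12 block starts are offsets from the record's start. The `awk` sketch below is illustrative rather than a general converter: it assumes tab-separated input holding the exons of a single transcript in ascending order, and it sets the thick start and end to the transcript bounds.

```shell
gtf=$(mktemp)
# printf reuses the format for each pair of coordinates, giving three exon lines.
printf 'chr1\tCLASS\texon\t%s\t%s\t100\t+\t.\tgene_id "genA"; transcript_id "genA.1";\n' \
  516 811 1001 1115 3010 3312 > "$gtf"

bed=$(awk -F '\t' '
  $3 == "exon" {
    if (n == 0) {
      chrom = $1; start = $4 - 1; score = $6; strand = $7
      split($9, a, "transcript_id \""); split(a[2], b, "\""); name = b[1]
    }
    sep = (n > 0) ? "," : ""
    len = $5 - $4 + 1          # inclusive GTF coordinates
    off = $4 - 1 - start       # block start relative to the BED start
    sizes = sizes sep len
    starts = starts sep off
    stop = $5
    n++
  }
  END {
    printf "%s\t%d\t%d\t%s\t%s\t%s\t%d\t%d\t0\t%d\t%s\t%s\n",
           chrom, start, stop, name, score, strand, start, stop, n, sizes, starts
  }' "$gtf")
echo "$bed"
```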

Q7. Determine the number of genes, transcripts, exons per transcript, gene orientation (strand), and the length of 5’ most exon(s) from the GTF snippet below. Select the correct answer.

chr1 HAVANA gene 3205901 3671498 . - . gene_id "MUSG51951.5";
chr1 HAVANA transcript 3205901 3216344 . - . gene_id "MUSG51951.5"; transcript_id "MUST162897.1";
chr1 HAVANA exon 3213609 3216344 . - . gene_id "MUSG51951.5"; transcript_id "MUST162897.1";
chr1 HAVANA exon 3205901 3207317 . - . gene_id "MUSG51951.5"; transcript_id "MUST162897.1
chr1 HAVANA transcript 3206523 3215632 . - . gene_id "MUSG51951.5"; transcript_id "MUST159265.1";
chr1 HAVANA exon 3213439 3215632 . - . gene_id "MUSG51

Genes: 1; Transcripts: 2; Exons: 2,2; Strand: -; Length of 5’ exon(s): 2736, 2194.

Q8. Which of the following is FALSE for the following read alignments:

R1 83 chr12 9232390 255 50M = 9232180 0 ATGGCAGAGCCTAATATGTCTCCTAGAGAATGGGAGAGATGGGAAGTCAT HGHHHHHHHHHHHHHHHHHHHHHHHHHHIGIIIIHHHHHHHHHHHGHHFH NM:i:0 NH:i:1 HI:i:0
R2 97 chr12 9232391 255 28M278N22M = 9242529 0 TGGCAGAGCCTAATATGTCTCCCAAAACTGAGACAGAAGCTCGGGCAGAT D>DDDHHHHHHHHHHIHIHHHHHIHHHHIGFFGGGHHHHHHHHHHFB.F NM:i:4 NH:i:3 HI:i:0 XS:A:+ NS:i:2
R3 77 * 0 0 0 * * 0 0 CTGATATGAGGAAAGAGGATTGCTTAAGCCCAGGAGGTAGAGGCTGTACC @@@FFDFFHFFHHJJJJIJEGFGIGHHIHIIIIGCDE?D?FGGCBHDGGG

R2 has an exact match to the genome.
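
The second column of each record is a bitwise FLAG, and reading it by hand is error-prone. The helper below is a minimal sketch: `decode_flag` and its short labels are names made up for this example, while the bit values (1, 2, 4, 8, 16, 32, 64, 128, 256, 512) follow the SAM specification. It is applied to the three flags in the records above.

```shell
# Print the names of the bits set in a SAM FLAG value (labels are informal).
decode_flag() {
  flag=$1
  out=""
  for pair in 1:paired 2:proper_pair 4:unmapped 8:mate_unmapped 16:reverse \
              32:mate_reverse 64:first_in_pair 128:second_in_pair \
              256:secondary 512:qc_fail; do
    bit=${pair%%:*}
    name=${pair#*:}
    if [ $(( flag & bit )) -ne 0 ]; then out="$out $name"; fi
  done
  echo "${out# }"
}

f83=$(decode_flag 83)
f97=$(decode_flag 97)
f77=$(decode_flag 77)
printf '83: %s\n97: %s\n77: %s\n' "$f83" "$f97" "$f77"
```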

Q9. For the alignment below, which statements are FALSE? In binary, 97 is 0000 0110 0001. Select all answers that apply.

R2 97 chr12 9232391 255 28M278N22M = 9242529 0 TGGCAGAGCCTAATATGTCTCCCAAAACTGAGACAGAAGCTCGGGCAGAT D>DDDHHHHHHHHHHIHIHHHHHIHHHHIGFFGGGHHHHHHHHHHFB.F NM:i:4 XS:A:+ NS:i:2

The alignment passes quality checks.
The sequence of the read’s mate is reverse-complemented in its alignment.

Q10. Files ‘A.bed’ and ‘B.bed’ contain the following sets of intervals:

Quiz 2: Module 2 Exam

Q1. How many alignments does the set contain?

221372

Q2. How many alignments show the read’s mate unmapped?

65521

Q3. How many alignments contain a deletion (D)?

2451
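
Deletions and splices are both visible in the CIGAR string, the sixth column of a SAM record: `D` marks a deletion and `N` a skipped region such as an intron. The sketch below runs on made-up SAM text rather than the exam's BAM file; for a real BAM, the same `cut`/`grep` steps would typically follow `samtools view`.

```shell
sam=$(mktemp)
# Toy records (made up); only column 6, the CIGAR string, matters here.
printf 'r1\t0\tChr1\t10\t60\t50M\t*\t0\t0\t*\t*\n'            >  "$sam"
printf 'r2\t0\tChr1\t20\t60\t20M2D30M\t*\t0\t0\t*\t*\n'       >> "$sam"
printf 'r3\t0\tChr1\t30\t60\t28M278N22M\t*\t0\t0\t*\t*\n'     >> "$sam"
printf 'r4\t0\tChr2\t40\t60\t10M1D5M100N35M\t*\t0\t0\t*\t*\n' >> "$sam"

# Skip header lines (they start with '@'), then inspect the CIGAR column.
n_total=$(grep -vc '^@' "$sam")
n_del=$(grep -v '^@' "$sam" | cut -f6 | grep -c 'D')
n_spliced=$(grep -v '^@' "$sam" | cut -f6 | grep -c 'N')
echo "$n_total alignments, $n_del with a deletion, $n_spliced spliced"
```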

Q4. How many alignments show the read’s mate mapped to the same chromosome?

150913

Q5. How many alignments are spliced?

Q6. How many alignments does the set contain?

7081

Q7. How many alignments show the read’s mate unmapped?

1983

Q8. How many alignments contain a deletion (D)?

Q9. How many alignments show the read’s mate mapped to the same chromosome?

4670

Q10. How many alignments are spliced?

Q11. How many sequences are in the genome file?

Q12. What is the length of the first sequence in the genome file?

29923332

Q13. What alignment tool was used?

stampy

Q14. What is the read identifier (name) for the first alignment?

GAII05_0002:1:113:7822:3886#0

Q15. What is the start position of this read’s mate on the genome? Give this as ‘chrom:pos’ if the read was mapped, or ‘*’ if unmapped.

Chr3:11700332

Q16. How many overlaps (each overlap is reported on one line) are reported?

3101

Q17. How many of these are 10 bases or longer?

2899

Q18. How many alignments overlap the annotations?

3101

Q19. Conversely, how many exons have reads mapped to them?

Q20. If you were to convert the transcript annotations in the file “athal_wu_0_A_annot.gtf” into BED format, how many BED records would be generated?

Command Line Tools for Genomic Data Science Week 03 Quiz Answers

Quiz 1: Module 3 Quiz

Q1. Which of the following statements is FALSE:

SNP refers to a Single Non-defined Polymorphism

Q2. Which of the following statements is FALSE:

The VCF format shows the changes in amino acid resulting from the nucleotide mutation, in column 3.

Q3. What program can be used to generate a list of candidate sites of variation in an exome data set:

samtools

Q4. In a comprehensive effort to study genome variation in a patient cohort, you sequence and call variants in the exome, whole genome shotgun and RNA-seq data from each patient. Which of the following is FALSE when comparing these three types of resources:

Exome sequencing comprehensively captures variants in the 3’ and 5’ UTRs of genes.

Q5. Which of the following options can be used to allow bowtie2 to generate partial alignments?

--local

Q6. Select the correct interpretation for the snippet of ‘mpileup’ output below.

Chr3 11700316 C 8 .$....... 8C@C;CB3
Chr3 11951491 G 16 AAAA,......aA..A C2@2BCBCCCAC2CC4

Only site 2 shows potential variation; the alternate letter is A, and 7 of the 16 reads at that site support it
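
In the base column of a pileup line, `.` and `,` mark reads that match the reference, letters mark mismatches, and `$` marks the end of a read rather than a base. The sketch below counts mismatching letters for the two sites, with the base strings written using plain `.` characters. `count_mismatches` is a hypothetical helper, and it deliberately ignores indel and `^` read-start notation, neither of which appears in this excerpt.

```shell
# Count letters (mismatches) in a pileup base string; '.', ',' and '$' are dropped.
# Simplification: indel (+n/-n) and read-start (^) notation is not handled.
count_mismatches() {
  printf '%s' "$1" | tr -cd 'ACGTNacgtn' | wc -c | tr -d ' '
}

site1=$(count_mismatches '.$.......')
site2=$(count_mismatches 'AAAA,......aA..A')
echo "site 1: $site1 mismatching reads; site 2: $site2 mismatching reads"
```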

Q7. Given the set of variants described in the VCF excerpt below, which of the following is FALSE?

INFO=

INFO=

FORMAT=

FORMAT=

Chr3 11966312 . G A 15.9 . DP=5;MQ=15 GT:PL 1/1:43,9,0
Chr3 11972108 . TAAAA TAAA 32.8 . INDEL;IDV=7;IMF=0.636364;DP=11;MQ=22 GT:PL 0/1:66,0,2
Chr3 13792328 rs145271872 G T 5.5 . DP=1;MQ=40 GT

The alternate allele for variant 1 is A

Q8. What does the following code do:

bowtie2 -x species/species -U in.fastq | grep -v "^@" | cut -f3 | sort | uniq -c

Run bowtie2 with a set of single-end reads, reporting the best alignment only; then determine the number of matches on each genomic sequence
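
The part of the pipeline after `bowtie2` can be tried on its own. The sketch below feeds made-up SAM text (an assumption standing in for real bowtie2 output) through the same `grep`, `cut`, `sort` and `uniq -c` steps; the trailing `awk` and `paste` only tidy the result into one line. Unmapped reads carry `*` in column 3, so they show up as their own group.

```shell
export LC_ALL=C
sam=$(mktemp)
printf '@HD\tVN:1.0\tSO:unsorted\n'                     >  "$sam"
printf '@SQ\tSN:Chr1\tLN:1000\n'                        >> "$sam"
printf 'r1\t0\tChr1\t5\t42\t4M\t*\t0\t0\tACGT\tIIII\n'  >> "$sam"
printf 'r2\t16\tChr1\t9\t42\t4M\t*\t0\t0\tACGT\tIIII\n' >> "$sam"
printf 'r3\t0\tChr2\t7\t42\t4M\t*\t0\t0\tACGT\tIIII\n'  >> "$sam"
printf 'r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n'       >> "$sam"

# Drop header lines, keep the reference name, and count records per reference.
counts=$(grep -v '^@' "$sam" | cut -f3 | sort | uniq -c \
  | awk '{print $2 "=" $1}' | paste -s -d ' ' -)
echo "$counts"
```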

Q9. What does the following snippet of code do NOT do:

samtools mpileup -O -f genome.fa in.bam | cut -f7

Report in the intermediate mpileup output the qualities of all read bases aligned at that position

Q10. What does the following code do NOT do:

bcftools call -v -c -O z -o out.vcf.gz in.vcf.gz

Report output in compressed VCF format

Quiz 2: Module 3 Exam

Q1. How many sequences were in the genome?

Q2. What was the name of the third sequence in the genome file? Give the name only, without the “>” sign.

Chr3

Q3. What was the name of the last sequence in the genome file? Give the name only, without the “>” sign.

mitochondria

Q4. How many index files did the operation create?

Q5. What is the 3-character extension for the index files created?

bt2

Q6. How many reads were in the original fastq file?

147354

Q7. How many matches (alignments) were reported for the original (full-match) setting? Exclude lines in the file containing unmapped reads.

137719

Q8. How many matches (alignments) were reported with the local-match setting? Exclude lines in the file containing unmapped reads.

141044

Q9. How many reads were mapped in the scenario in Question 7?

137719

Q10. How many reads were mapped in the scenario in Question 8?

141044

Q11. How many reads had multiple matches in the scenario in Question 7? You can find this in the bowtie2 summary; note that by default bowtie2 only reports the best match for each read.

43939

Q12. How many reads had multiple matches in the scenario in Question 8? Use the format above. You can find this in the bowtie2 summary; note that by default bowtie2 only reports the best match for each read.

56105

Q13. How many alignments contained insertions and/or deletions, in the scenario in Question 7?

2782

Q14. How many alignments contained insertions and/or deletions, in the scenario in Question 8?

2614

Q15. How many entries were reported for Chr3?

360295

Q16. How many entries have ‘A’ as the corresponding genome letter?

1150985

Q17. How many entries have exactly 20 supporting reads (read depth)?

1816

Q18. How many entries represent indels?

1972

Q19. How many entries are reported for position 175672 on Chr1?

Q20. How many variants are called on Chr3?

398

Q21. How many variants represent an A->T SNP? If useful, you can use ‘grep -P’ to allow tabular spaces in the search term.

392

Q22. How many entries are indels?

320
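
Counts like these come from filtering the body of the VCF by column. The sketch below works on a toy VCF (made up, not the exam data) and uses `awk` field tests in place of `grep -P`. It assumes, as in samtools/bcftools output, that indel records carry an `INDEL` flag at the start of the INFO column and that read depth is stored as `DP=` there.

```shell
vcf=$(mktemp)
printf '##fileformat=VCFv4.2\n'                                >  "$vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'       >> "$vcf"
printf 'Chr3\t100\t.\tA\tT\t30\t.\tDP=20\n'                    >> "$vcf"
printf 'Chr3\t200\t.\tG\tA\t15\t.\tDP=5\n'                     >> "$vcf"
printf 'Chr3\t300\t.\tTAAAA\tTAAA\t32\t.\tINDEL;IDV=7;DP=11\n' >> "$vcf"
printf 'Chr1\t400\t.\tA\tT\t50\t.\tDP=20\n'                    >> "$vcf"

# Header lines start with '#'; everything else is one variant per line.
n_chr3=$(grep -v '^#' "$vcf" | cut -f1 | grep -c -x 'Chr3')
n_a_to_t=$(grep -v '^#' "$vcf" | awk -F '\t' '$4 == "A" && $5 == "T"' | wc -l | tr -d ' ')
n_indel=$(grep -v '^#' "$vcf" | cut -f8 | grep -c '^INDEL')
n_dp20=$(grep -v '^#' "$vcf" | cut -f8 | grep -c -E '(^|;)DP=20(;|$)')
echo "$n_chr3 $n_a_to_t $n_indel $n_dp20"
```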

Q23. How many entries have precisely 20 supporting reads (read depth)?

Q24. What type of variant (i.e., SNP or INDEL) is called at position 11937923 on Chr3?

SNP

Command Line Tools for Genomic Data Science Week 04 Quiz Answers

Quiz 1: Module 4 Quiz

Q1. Which of the following is FALSE:

A human gene can express at most 12 splice variants.

Q2. Which of the following is FALSE about the organization of a eukaryotic gene:

The length of an intron cannot be a multiple of 3.

Q3. What programs could you use to align RNA-seq reads to: i) a reference genome, and ii) a transcript database?

tophat, bwa

Q4. Which of the following is FALSE:

RNA-seq can be used to quantify the expression levels of proteins.

Q5. What programs could be used to: i) assemble transcripts from RNA-seq reads, and ii) identify potentially novel transcripts and genes?

cufflinks, cuff-compare

Q6. Which of the following is FALSE about the gene annotations in the following GTF snippet:

chr1 MGF gene 3413609 3671498 . - . gene_id "MG051951";
chr1 MGF transcript 3413609 3416344 . - . gene_id "MG051951"; transcript_id "MT162897";
chr1 MGF exon 3413609 3416344 . - . gene_id "MG051951"; transcript_id "MT162897";
chr1 MGF transcript 3421702 3671498 . - . gene_id "MG051951"; transcript_id "MT070533";
chr1 MGF exon 3670552 3671498 . - . gene_id "MG051951"; transcript_id "MT070533";
chr1 MGF CDS 3670552 3671348 . - 0 gene_id "MG051951"; transcript_id "MT070533";
chr1 MGF exon 342170

Both exons of MT070533 contain both coding and non-coding sequences.

Q7. What does the following code NOT do:

BWT2IDX=/home/me/genomes/hg20/hg20
ANNOT=/home/me/genomes/hg20/myannot.gtf
ANNOTIDX=/home/me/genomes/hg20/myannot/myannot
mkdir -p /home/me/SRR100000
tophat2 -o /home/me/SRR100000 -p 10 --max-multihits 10 \
-r 26 --mate-std-dev 25 \
-a 6 \
-G $ANNOT --transcriptome-index $ANNOTIDX \

Report spliced reads with at most 6 mismatches in the anchor site

Q8. What does the following code NOT do:

TOPHATDIR=/home/florea/Tophat/
mkdir -p Test1
cd Test1
ln -s $TOPHATDIR/accepted_hits.bam .
cufflinks -L Test1 -p 8 -j 0.10 -F 0.05 accepted_hits.bam

Use the default reference transcript annotation to guide assembly

Q9. Which of the following is NOT described in the following summary file produced by tophat:

Left reads:
Input : 60586968
Mapped : 58163843 (96.0% of input)
of these: 6832240 (11.7%) have multiple alignments (359075 have >10)
Right reads:
Input : 60586968
Mapped : 56969290 (94.0% of input)
of these: 6668479 (11.7%) have multiple alignments (358573 have >10)
95.0% overall read mapping rate.

The reads were 100 bp long

Q10. Which of the following is NOT TRUE about the output below, obtained from a cuffdiff differential expression analysis:

XLOC_000002 XLOC_000002 AT1G01020 1:5927-8737 q1 q2 OK 1.13032 3.48406 1.62404 0.694576 0.5277 0.998846 no
XLOC_000004 XLOC_000004 AT1G01073 1:44676-44787 q1 q2 NOTEST 0 0 0 0 1 1 no
XLOC_000042 XLOC_000042 AT1G01580 1:209394-213041 q1 q2 OK 1.59512 0 -inf nan 5e-05 0.0096703 yes

There are too many alignments for testing for differential expression at locus XLOC_000004

Quiz 2: Module 4 Exam

Q1. How many alignments were produced for the ‘Day8’ RNA-seq data set?

63845

Q2. How many alignments were produced for the ‘Day16’ RNA-seq data set?

58398

Q3. How many reads were mapped in ‘Day8’ RNA-seq data set?

63489

Q4. How many reads were mapped in ‘Day16’ RNA-seq data set?

57951

Q5. How many reads were uniquely aligned in ‘Day8’ RNA-seq data set?

63133

Q6. How many reads were uniquely aligned in ‘Day16’ RNA-seq data set?

57504

Q7. How many spliced alignments were reported for ‘Day8’ RNA-seq data set?

8596

Q8. How many spliced alignments were reported for ‘Day16’ RNA-seq data set?

10695

Q9. How many reads were left unmapped from ‘Day8’ RNA-seq data set?

Q10. How many reads were left unmapped from ‘Day16’ RNA-seq data set?

Q11. How many genes were generated by cufflinks for Day8?

186

Q12. How many genes were generated by cufflinks for Day16?

Q13. How many transcripts were reported for Day8?

192

Q14. How many transcripts were reported for Day16?

Q15. How many single transcript genes were produced for Day8?

180

Q16. How many single transcript genes were produced for Day16?

Q17. How many single-exon transcripts were in the Day8 set?

119
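
Gene, transcript and single-exon counts can all be read off the assembled GTF. The sketch below uses a small made-up GTF, not the exam's output, and the `row` helper is a hypothetical convenience for writing it. It assumes `gene_id` comes before `transcript_id` in the attribute column, as in the snippets shown earlier, and that each transcript has its own `transcript` line.

```shell
export LC_ALL=C
gtf=$(mktemp)
# Append one GTF line: feature, start, end, gene id, transcript id.
row() {
  printf 'Chr1\tCufflinks\t%s\t%s\t%s\t.\t+\t.\tgene_id "%s"; transcript_id "%s";\n' "$@" >> "$gtf"
}
row transcript 100 900 G1 G1.1
row exon 100 300 G1 G1.1
row exon 500 900 G1 G1.1
row transcript 100 300 G1 G1.2
row exon 100 300 G1 G1.2
row transcript 2000 2500 G2 G2.1
row exon 2000 2500 G2 G2.1

n_tx=$(awk -F '\t' '$3 == "transcript"' "$gtf" | wc -l | tr -d ' ')
# The gene id is the text inside the first pair of double quotes in column 9.
n_genes=$(cut -f9 "$gtf" | cut -d '"' -f2 | sort -u | wc -l | tr -d ' ')
# Exons per transcript (second quoted value), keeping transcripts with one exon.
n_single_exon=$(awk -F '\t' '$3 == "exon"' "$gtf" | cut -f9 | cut -d '"' -f4 \
  | sort | uniq -c | awk '$1 == 1' | wc -l | tr -d ' ')
n_multi_exon=$(( n_tx - n_single_exon ))
echo "$n_genes genes, $n_tx transcripts, $n_single_exon single-exon, $n_multi_exon multi-exon"
```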

Q18. How many single-exon transcripts were in the Day16 set?

Q19. How many multi-exon transcripts were in the Day8 set?

Q20. How many multi-exon transcripts were in the Day16 set?

Q21. How many cufflinks transcripts fully reconstruct annotation transcripts in Day8?

Q22. How many cufflinks transcripts fully reconstruct annotation transcripts in Day16?

Q23. How many splice variants does the gene AT4G20240 have in the Day8 sample?

Q24. How many splice variants does the gene AT4G20240 have in the Day16 sample?

Q25. How many cufflinks transcripts are partial reconstructions of reference transcripts (‘contained’)? (Day8)

133

Q26. How many cufflinks transcripts are partial reconstructions of reference transcripts (‘contained’)? (Day16)

Q27. How many cufflinks transcripts are novel splice variants of reference genes? (Day8)

Q28. How many cufflinks transcripts are novel splice variants of reference genes? (Day16)

Q29. How many cufflinks transcripts were formed in the introns of reference genes? (Day8)

Q30. How many cufflinks transcripts were formed in the introns of reference genes? (Day16)

Q31. How many genes (loci) were reported in the merged.gtf file?

129

Q32. How many transcripts?

200

Q33. How many genes total were included in the gene expression report from cuffdiff?

129

Q34. How many genes were detected as differentially expressed?

Q35. How many transcripts were differentially expressed between the two samples?

Get All Course Quiz Answers for the Genomic Data Science Specialization

Introduction to Genomic Technologies Coursera Quiz Answers

Python for Genomic Data Science Coursera Quiz Answers

Algorithms for DNA Sequencing Coursera Quiz Answers

Command Line Tools for Genomic Data Science Coursera Quiz Answers

Bioconductor for Genomic Data Science Coursera Quiz Answers

Statistics for Genomic Data Science
