Introduction to Data Science in Python Coursera Quiz Answers – Networking Funda

All Weeks Introduction to Data Science in Python Coursera Quiz Answers

This course introduces the learner to the basics of the Python programming environment, including fundamental Python programming techniques such as lambdas, reading and manipulating CSV files, and the NumPy library.

The course also introduces data manipulation and cleaning techniques using the popular Python pandas data science library, and presents the Series and DataFrame abstractions as the central data structures for data analysis, along with tutorials on how to use functions such as groupby, merge, and pivot tables effectively.

By the end of this course, students will be able to take tabular data, clean it, manipulate it, and run basic inferential statistical analyses.


Introduction to Data Science in Python Week 01 Quiz Answers

Introduction to the Course

Q1. What will be the output of the following code?

import re
string = 'bat, lat, mat, bet, let, met, bit, lit, mit, bot, lot, mot'
result = re.findall('b[ao]t', string)
print(result)
  • 'bat, bet, bit, bot'
  • ['bat', 'bot']
  • ['bat', 'bet', 'bit', 'bot']
  • 'bat, bot'
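Running the Q1 snippet is a quick way to check the options. The character class [ao] matches exactly one character, either a or o, and re.findall returns a list of the matched strings.

```python
import re

string = 'bat, lat, mat, bet, let, met, bit, lit, mit, bot, lot, mot'
# [ao] matches a single 'a' or 'o' between 'b' and 't'
result = re.findall('b[ao]t', string)
print(result)  # ['bat', 'bot']
```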

Q2.

Assume a and b are two (20, 20) NumPy arrays. The L2-distance (defined above) between two arrays of equal dimensions can be calculated in Python as follows:

def l2_dist(a, b):
    result = ((a - b) * (a - b)).sum()
    result = result ** 0.5
    return result


Which of the following expressions using this function will produce a different result from the rest?

  • l2_dist(np.reshape(a, (20 * 20)), np.reshape(b, (20 * 20)))
  • l2_dist(a, b)
  • l2_dist(np.reshape(a, (20 * 20)), np.reshape(b, (20 * 20, 1)))
  • l2_dist(a.T, b.T)
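The four Q2 expressions can be compared directly. This is a minimal sketch with random input arrays; the seed and the variable names other than l2_dist, a and b are assumptions made for illustration. The point to watch is broadcasting: subtracting a (400, 1) array from a (400,) array produces a (400, 400) result.

```python
import numpy as np

def l2_dist(a, b):
    result = ((a - b) * (a - b)).sum()
    return result ** 0.5

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
a = rng.random((20, 20))
b = rng.random((20, 20))

same_shape = l2_dist(a, b)
flat = l2_dist(np.reshape(a, 20 * 20), np.reshape(b, 20 * 20))
transposed = l2_dist(a.T, b.T)

# (400,) minus (400, 1) broadcasts to (400, 400): every element of a
# is paired with every element of b, so the sum has 160000 terms.
mixed_diff_shape = (np.reshape(a, 20 * 20) - np.reshape(b, (20 * 20, 1))).shape
mixed = l2_dist(np.reshape(a, 20 * 20), np.reshape(b, (20 * 20, 1)))

print(same_shape, flat, transposed, mixed, mixed_diff_shape)
```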

Q3. Consider the following variables in Python:

a1 = np.random.rand(4)
a2 = np.random.rand(4, 1)
a3 = np.array([[1, 2, 3, 4]])
a4 = np.arange(1, 4, 1)
a5 = np.linspace(1, 4, 4)


Which of the following statements regarding these variables is correct?

  • a1.shape == a2.shape
  • a3.shape == a4.shape
  • a5.shape == a1.shape
  • a4.ndim() == 1
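Printing the shapes settles Q3. The sketch below reuses the variables from the question; the shapes dictionary is only a convenience added here.

```python
import numpy as np

a1 = np.random.rand(4)
a2 = np.random.rand(4, 1)
a3 = np.array([[1, 2, 3, 4]])
a4 = np.arange(1, 4, 1)    # stop value is exclusive: [1, 2, 3]
a5 = np.linspace(1, 4, 4)  # endpoint is included: [1., 2., 3., 4.]

shapes = {name: arr.shape for name, arr in
          [('a1', a1), ('a2', a2), ('a3', a3), ('a4', a4), ('a5', a5)]}
print(shapes)

# ndim is an attribute, not a method, so a4.ndim() would raise a TypeError
print(a4.ndim)
```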

Q4. Which of the following is the correct output for the code given below?

import numpy as np
old = np.array([[1, 1, 1], [1, 1, 1]])
new = old
new[0, :2] = 0
print(old)
  • [[1 1 0][1 1 0]]
  • [[0 1 1][0 1 1]]
  • [[1 1 1][1 1 1]]
  • [[0 0 1][1 1 1]]
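Q4 hinges on the fact that plain assignment does not copy a NumPy array. A short sketch; the independent variable is an addition here to contrast with an explicit copy.

```python
import numpy as np

old = np.array([[1, 1, 1], [1, 1, 1]])
new = old            # both names refer to the same array object
new[0, :2] = 0       # so this also changes old
print(old)

independent = old.copy()  # an explicit copy is a separate array
independent[1, :] = 9     # old is not affected by this
```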

Q5. Given the 6×6 NumPy array r shown below, which of the following options would slice the shaded elements?

  • r[2:3,2:3]
  • r[[2,4],[2,4]]
  • r[[2,3],[2,3]]
  • r[2:4,2:4]

Q6.

import re
s = 'ACBCAC'


For the given string, which of the following regular expressions can be used to check if the string starts with 'AC'?

  • re.findall('^AC', s)
  • re.findall('^[AC]', s)
  • re.findall('[^A]C', s)
  • re.findall('AC', s)
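The four Q6 patterns behave quite differently, which is easiest to see by running them. The variable names on the left are assumptions added for readability.

```python
import re

s = 'ACBCAC'

anchored = re.findall('^AC', s)      # ^ anchors the match at the start
char_class = re.findall('^[AC]', s)  # one character, 'A' or 'C', at the start
negated = re.findall('[^A]C', s)     # any non-'A' character followed by 'C'
unanchored = re.findall('AC', s)     # every 'AC', anywhere in the string

print(anchored, char_class, negated, unanchored)
```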

Q7. What will be the output of the variable L after the following code is executed?

import re
s = 'ACAABAACAAAB'
result = re.findall('A{1,2}', s)
L = len(result)
  • 5
  • 8
  • 4
  • 12
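For Q7, the quantifier {1,2} is greedy, so each run of A characters is consumed two at a time with any leftover single A matched on its own. Running the code shows the individual matches:

```python
import re

s = 'ACAABAACAAAB'
# A{1,2} matches one or two consecutive 'A' characters, preferring two
result = re.findall('A{1,2}', s)
L = len(result)
print(result, L)  # ['A', 'AA', 'AA', 'AA', 'A'] 5
```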

Q8. Which of the following is the correct regular expression to extract all the phone numbers from the following chunk of text:

Office of Research Administration: (734) 647-6333 | 4325 North Quad
Office of Budget and Financial Administration: (734) 647-8044 | 309 Maynard, Suite 205
Health Informatics Program: (734) 763-2285 | 333 Maynard, Suite 500
Office of the Dean: (734) 647-3576 | 4322 North Quad
UMSI Engagement Center: (734) 763-1251 | 777 North University
Faculty Adminstrative Support Staff: (734) 764-9376 | 4322 North Quad
  • [(]\d{3}[)]\d{3}[-]\d{4}
  • \d{3}\s\d{3}[-]\d{4}
  • [(]\d{3}[)]\s\d{3}[-]\d{4}
  • \d{3}[-]\d{3}[-]\d{4}
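The Q8 options can be tried against the text. In this sketch the text and phones names are assumptions; the pattern is the option that accounts for the parentheses around the area code and the space that follows them.

```python
import re

text = '''Office of Research Administration: (734) 647-6333 | 4325 North Quad
Office of Budget and Financial Administration: (734) 647-8044 | 309 Maynard, Suite 205
Health Informatics Program: (734) 763-2285 | 333 Maynard, Suite 500
Office of the Dean: (734) 647-3576 | 4322 North Quad
UMSI Engagement Center: (734) 763-1251 | 777 North University
Faculty Adminstrative Support Staff: (734) 764-9376 | 4322 North Quad'''

# [(] and [)] match literal parentheses, \s matches the space after ')'
phones = re.findall(r'[(]\d{3}[)]\s\d{3}[-]\d{4}', text)
# Without the \s the pattern cannot bridge the space, so nothing matches
no_space = re.findall(r'[(]\d{3}[)]\d{3}[-]\d{4}', text)
print(phones)
```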

Q9. Which of the following regular expressions can be used to get the domain names (e.g. google.com, www.baidu.com) from the following sentence?

'I refer to https://google.com and I never refer http://www.baidu.com if I have to search anything'
  • (?<=https:\/\/)([.]*)
  • (?<=[https]:\/\/)([A-Za-z0-9.]*)
  • (?<=https:\/\/)([A-Za-z0-9]*)
  • (?<=https:\/\/)([A-Za-z0-9.]*)
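Two of the Q9 options are worth running side by side. Note that [https] is a character class matching a single character, not the word "https"; it happens to work here because the character just before :// is s in one URL and p in the other. The variable names are assumptions.

```python
import re

sentence = ('I refer to https://google.com and I never refer '
            'http://www.baidu.com if I have to search anything')

# Lookbehind requires one of h, t, p, s followed by '://'
both = re.findall(r'(?<=[https]:\/\/)([A-Za-z0-9.]*)', sentence)
# Lookbehind requires the literal text 'https://'
https_only = re.findall(r'(?<=https:\/\/)([A-Za-z0-9.]*)', sentence)

print(both)        # ['google.com', 'www.baidu.com']
print(https_only)  # ['google.com']
```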

Q10. The text from the Canadian Charter of Rights and Freedoms section 2 lists the fundamental freedoms afforded to everyone. Of the four choices provided to replace X in the code below, which would accurately count the number of fundamental freedoms that Canadians have?

text=r'''Everyone has the following fundamental freedoms:
(a) freedom of conscience and religion;
(b) freedom of thought, belief, opinion and expression, including freedom
of the press and other media of communication;
(c) freedom of peaceful assembly; and
(d) freedom of association.'''
import re
pattern = X
print(len(re.findall(pattern,text)))
  • \(.\)
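Substituting the pattern for X in Q10 gives the count directly. Each freedom is introduced by a single letter in parentheses, so the pattern looks for a literal (, any one character, and a literal ).

```python
import re

text = r'''Everyone has the following fundamental freedoms:
(a) freedom of conscience and religion;
(b) freedom of thought, belief, opinion and expression, including freedom
of the press and other media of communication;
(c) freedom of peaceful assembly; and
(d) freedom of association.'''

pattern = r'\(.\)'  # escaped parentheses around exactly one character
markers = re.findall(pattern, text)
print(markers, len(markers))  # ['(a)', '(b)', '(c)', '(d)'] 4
```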

Introduction to Data Science in Python Week 02 Quiz Answers

Introduction to Pandas and Series Data

Q1. For the following code, which of the following statements will not return True?

import pandas as pd
sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
obj1 = pd.Series(sdata)
states = ['California', 'Ohio', 'Oregon', 'Texas']
obj2 = pd.Series(sdata, index=states)
obj3 = pd.isnull(obj2)
  • obj2['California'] == None
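The reason this comparison is not True is that 'California' has no entry in sdata, so pandas stores NaN for it, and NaN does not compare equal to None. A short check; equals_none and is_missing are names added for this sketch.

```python
import pandas as pd

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
states = ['California', 'Ohio', 'Oregon', 'Texas']
obj2 = pd.Series(sdata, index=states)

# 'California' is missing from sdata, so its value is NaN
equals_none = obj2['California'] == None   # comparison with None is False
is_missing = pd.isnull(obj2['California'])  # the reliable missing-value test
print(equals_none, is_missing)
```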

Q2.

import pandas as pd
d = {'1': 'Alice','2': 'Bob','3': 'Rita','4': 'Molly','5': 'Ryan'}
S = pd.Series(d)

In the above Python code, the keys of the dictionary d represent student ranks and the value for each key is a student name. Which of the following can be used to extract rows with student ranks that are lower than or equal to 3?

  • S.iloc[0:2]
  • S.loc[0:3]
  • S.iloc[0:3]
  • S.loc[0:2]
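Because the index labels in Q2 are strings ('1' to '5'), positional slicing with iloc is the straightforward route. The end of an iloc slice is exclusive, so 0:3 covers the first three rows. top_three is a name added for this sketch.

```python
import pandas as pd

d = {'1': 'Alice', '2': 'Bob', '3': 'Rita', '4': 'Molly', '5': 'Ryan'}
S = pd.Series(d)

# Positions 0, 1 and 2 correspond to ranks '1', '2' and '3'
top_three = S.iloc[0:3]
print(top_three)
```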

Q3. Suppose we have a DataFrame named df. We want to change the original DataFrame df in a way that all the column names are cast to upper case. Which of the following expressions is incorrect to perform the same?

  • df.rename(mapper = lambda x: x.upper(), axis = 1)
  • df = df.rename(mapper = lambda x: x.upper(), axis = 1)
  • df = df.rename(mapper = lambda x: x.upper(), axis = 'column')
  • df.rename(mapper = lambda x: x.upper(), axis = 1, inplace = True)
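The key point in Q3 is that rename returns a new DataFrame unless inplace=True is passed. A minimal sketch with a made-up two-column DataFrame (the data and column names are assumptions):

```python
import pandas as pd

# Hypothetical data, only to show the behavior
df = pd.DataFrame({'name': ['Alice'], 'score': [90]})

df.rename(mapper=lambda x: x.upper(), axis=1)  # result is discarded
unchanged = list(df.columns)

df = df.rename(mapper=lambda x: x.upper(), axis=1)  # result is kept
changed = list(df.columns)

print(unchanged, changed)
```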

Q4.

For the given DataFrame df we want to keep only the records with a toefl score greater than 105. Which of the following will not work?

  • df[df['toefl score'] > 105]
  • df.where(df['toefl score'] > 105)
  • All of these will work
  • df.where(df['toefl score'] > 105).dropna()
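The difference between boolean indexing and where in Q4 shows up in the number of rows returned. The DataFrame below is a made-up stand-in for the one in the missing image; only the 'toefl score' column name comes from the question.

```python
import pandas as pd

# Hypothetical scores standing in for the DataFrame shown in the quiz
df = pd.DataFrame({'toefl score': [100, 110, 118]}, index=['s1', 's2', 's3'])

filtered = df[df['toefl score'] > 105]      # drops non-matching rows
masked = df.where(df['toefl score'] > 105)  # keeps them, filled with NaN
dropped = masked.dropna()                   # removes the NaN rows again

print(len(filtered), len(masked), len(dropped))
```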

Q5. Which of the following can be used to create a DataFrame in Pandas?

  • Pandas Series object
  • All of these work
  • 2D ndarray
  • Python dict

Q6. Which of the following is an incorrect way to drop entries from the Pandas DataFrame named df shown below?

  • df.drop('Ohio')
  • df.drop(['Utah', 'Colorado'])
  • df.drop('two')
  • df.drop('one', axis = 1)

Q7. For the Series s1 and s2 defined below, which of the following statements will give an error?

import pandas as pd
s1 = pd.Series({1: 'Alice', 2: 'Jack', 3: 'Molly'})
s2 = pd.Series({'Alice': 1, 'Jack': 2, 'Molly': 3})
  • s2.loc[1]
  • s2.iloc[1]
  • s1.loc[1]
  • s2[1]
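In Q7, loc looks up labels and iloc looks up positions, so asking s2 for the label 1 fails because its labels are names. The sketch leaves out plain s2[1], whose behavior is understood to depend on the pandas version. loc_raised is a name added here.

```python
import pandas as pd

s1 = pd.Series({1: 'Alice', 2: 'Jack', 3: 'Molly'})
s2 = pd.Series({'Alice': 1, 'Jack': 2, 'Molly': 3})

try:
    s2.loc[1]          # no label 1 in s2, whose labels are strings
    loc_raised = False
except KeyError:
    loc_raised = True

by_position = s2.iloc[1]  # second element of s2
by_label = s1.loc[1]      # label 1 exists in s1
print(loc_raised, by_position, by_label)
```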

Q8. Which of the following statements is incorrect?

  • loc and iloc are two useful and commonly used Pandas methods.
  • We can use s.iteritems() on a pd.Series object s to iterate on it.
  • If s and s1 are two pd.Series objects, we cannot use s.append(s1) to directly append s1 to the existing series s
  • If s is a pd.Series object, then we can use s.loc[label] to get all data where the index is equal to label.
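A version note on Q8: Series.iteritems() and Series.append() were removed in pandas 2.0, so the quiz statements only apply to older releases. On current versions, items() and pd.concat cover the same ground. A sketch with made-up Series:

```python
import pandas as pd

# Hypothetical Series for illustration
s = pd.Series(['Alice', 'Jack'], index=[1, 2])
s1 = pd.Series(['Molly'], index=[3])

pairs = list(s.items())        # (index, value) pairs
combined = pd.concat([s, s1])  # returns a new Series; s is unchanged

print(pairs)
print(combined)
```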

Q9.

For the given DataFrame df shown above, we want to get all records with a toefl score greater than 105 but smaller than 115. Which of the following expressions is incorrect to perform the same?

  • df[df['toefl score'].gt(105) & df['toefl score'].lt(115)]
  • df[(df['toefl score'] > 105) & (df['toefl score'] < 115)]
  • df[(df['toefl score'].isin(range(106, 115)))]
  • (df['toefl score'] > 105) & (df['toefl score'] < 115)
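The Q9 options differ in whether the boolean mask is actually used to index df. On its own, the mask is a Series of True/False values rather than the matching records. The scores below are made up, and the isin variant assumes integer scores.

```python
import pandas as pd

# Hypothetical integer scores standing in for the quiz DataFrame
df = pd.DataFrame({'toefl score': [104, 106, 110, 115, 118]})

mask = (df['toefl score'] > 105) & (df['toefl score'] < 115)
rows = df[mask]  # the mask must be passed to df[...] to get records

via_methods = df[df['toefl score'].gt(105) & df['toefl score'].lt(115)]
via_isin = df[df['toefl score'].isin(range(106, 115))]

print(mask.tolist())
print(rows)
```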

Q10. Which of the following is the correct way to extract all information related to the student named Alice from the DataFrame df given below:

  • df.T['Mathematics']
  • df.iloc['Mathematics']
  • df['Mathematics']
  • df['Alice']

Introduction to Data Science in Python Week 03 Quiz Answers

More Data Processing with Pandas

Q1. Consider the two DataFrames shown below, both of which have Name as the index. Which of the following expressions can be used to get the data of all students (from student_df) including their roles as staff, where nan denotes no role?

  • pd.merge(student_df, staff_df, how='left', left_index=True, right_index=True)
  • pd.merge(student_df, staff_df, how='right', left_index=True, right_index=True)
  • pd.merge(staff_df, student_df, how='right', left_index=False, right_index=True)
  • pd.merge(staff_df, student_df, how='left', left_index=True, right_index=True)
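A left join on the index keeps every row of the first DataFrame and fills in NaN where the second has no match. The two DataFrames below are made-up stand-ins for the ones in the missing image; only the names student_df, staff_df and the Name index come from the question.

```python
import pandas as pd

# Hypothetical contents; the quiz image is not available
student_df = pd.DataFrame(
    {'School': ['Business', 'Law', 'Engineering']},
    index=pd.Index(['James', 'Mike', 'Sally'], name='Name'))
staff_df = pd.DataFrame(
    {'Role': ['Director of HR', 'Course liaison', 'Grader']},
    index=pd.Index(['Kelly', 'Sally', 'James'], name='Name'))

# how='left' keeps all students; students without a staff role get NaN
merged = pd.merge(student_df, staff_df, how='left',
                  left_index=True, right_index=True)
print(merged)
```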

Q2. Consider a DataFrame named df with columns named P2010, P2011, P2012, P2013, P2014 and P2015 containing float values. We want to use the apply method to get a new DataFrame named result_df with a new column AVG. The AVG column should average the float values across P2010 to P2015. The apply method should also remove the 6 original columns (P2010 to P2015). For that, what should be the value of x and y in the given code?

frames = ['P2010', 'P2011', 'P2012', 'P2013','P2014', 'P2015']
df['AVG'] = df[frames].apply(lambda z: np.mean(z), axis=x)
result_df = df.drop(frames,axis=y)
  • x = 0 , y = 1
  • x = 0 , y = 0
  • x = 1 , y = 0
  • x = 1 , y = 1
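With axis=1, apply passes each row to the function, and drop removes columns rather than rows. A sketch with two made-up rows; the Region column and the numbers are assumptions added so the result is not empty.

```python
import numpy as np
import pandas as pd

frames = ['P2010', 'P2011', 'P2012', 'P2013', 'P2014', 'P2015']

# Hypothetical data: two rows of float values plus one extra column
df = pd.DataFrame([[1, 2, 3, 4, 5, 6], [2, 4, 6, 8, 10, 12]],
                  columns=frames, dtype=float)
df.insert(0, 'Region', ['north', 'south'])

# axis=1: the lambda receives one row at a time
df['AVG'] = df[frames].apply(lambda z: np.mean(z), axis=1)
# axis=1: the labels in frames are column names
result_df = df.drop(frames, axis=1)
print(result_df)
```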

Q3. Consider the Dataframe df below, instantiated with a list of grades, ordered from best grade to worst. Which of the following options can be used to substitute X in the code given below, if we want to get all the grades between ‘A’ and ‘B’ where ‘A’ is better than ‘B’?

import pandas as pd
df = pd.DataFrame(['A+', 'A', 'A-', 'B+', 'B', 'B-', 'C+', 'C', 'C-', 'D+', 'D'], index=['excellent', 'excellent', 'excellent', 'good', 'good', 'good', 'ok', 'ok', 'ok', 'poor', 'poor'], columns = ['Grades'])
my_categories= X
grades = df['Grades'].astype(my_categories)
result = grades[(grades>'B') & (grades<'A')]
  • my_categories = pd.CategoricalDtype(categories=['D', 'D+', 'C-', 'C', 'C+', 'B-', 'B', 'B+', 'A-', 'A', 'A+'], ordered=True)
  • my_categories = pd.CategoricalDtype(categories=['D', 'D+', 'C-', 'C', 'C+', 'B-', 'B', 'B+', 'A-', 'A', 'A+'])
  • (my_categories=['A+', 'A', 'A-', 'B+', 'B', 'B-', 'C+', 'C', 'C-', 'D+', 'D'], ordered=True)
  • my_categories = pd.CategoricalDtype(categories=['A+', 'A', 'A-', 'B+', 'B', 'B-', 'C+', 'C', 'C-', 'D+', 'D'])
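Comparisons such as grades > 'B' need an ordered categorical whose categories run from worst to best. Running the Q3 code with that dtype shows which grades fall strictly between 'B' and 'A':

```python
import pandas as pd

df = pd.DataFrame(
    ['A+', 'A', 'A-', 'B+', 'B', 'B-', 'C+', 'C', 'C-', 'D+', 'D'],
    index=['excellent', 'excellent', 'excellent', 'good', 'good', 'good',
           'ok', 'ok', 'ok', 'poor', 'poor'],
    columns=['Grades'])

# Categories listed from lowest to highest, with ordering switched on
my_categories = pd.CategoricalDtype(
    categories=['D', 'D+', 'C-', 'C', 'C+', 'B-', 'B', 'B+', 'A-', 'A', 'A+'],
    ordered=True)

grades = df['Grades'].astype(my_categories)
result = grades[(grades > 'B') & (grades < 'A')]
print(list(result))  # ['A-', 'B+']
```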

Q4. Consider the DataFrame df shown in the image below. Which of the following can return the head of the pivot table as shown in the image below df?

  • df.pivot_table(values='score', index='country', columns='Rank_Level', aggfunc=[np.median], margins=True)
  • df.pivot_table(values='score', index='country', columns='Rank_Level', aggfunc=[np.median])
  • df.pivot_table(values='score', index='Rank_Level', columns='country', aggfunc=[np.median])
  • df.pivot_table(values='score', index='Rank_Level', columns='country', aggfunc=[np.median], margins=True)

Q5. Assume that the date '11/29/2019' (in MM/DD/YYYY format) is the 4th day of the week. What will be the result of the following?

import pandas as pd
(pd.Timestamp('11/29/2019') + pd.offsets.MonthEnd()).weekday()
  • 6
  • 4
  • 7
  • 5
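For Q5, adding MonthEnd() to a date that is not already a month end rolls it forward to the last day of the same month, here one day later. weekday() counts from Monday as 0. The names ts and shifted are assumptions.

```python
import pandas as pd

ts = pd.Timestamp('11/29/2019')        # a Friday, weekday() == 4
shifted = ts + pd.offsets.MonthEnd()   # rolls forward to 2019-11-30

print(ts.weekday(), shifted, shifted.weekday())
```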

Q6. Consider a DataFrame df. We want to create groups based on the column group_key in the DataFrame and fill the nan values with group means using:

filling_mean = lambda g: g.fillna(g.mean())

Which of the following is correct for performing this task?

  • df.groupby(group_key).transform(filling_mean)
  • df.groupby(group_key).filling_mean()
  • df.groupby(group_key).aggregate(filling_mean)
  • df.groupby(group_key).apply(filling_mean)
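One way to see group-wise filling in action is the sketch below. It is not a statement about which quiz option is marked correct: it selects a single column and uses transform, which returns values aligned with the original rows. The DataFrame and the value column are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value per group
df = pd.DataFrame({
    'group_key': ['a', 'a', 'a', 'b', 'b'],
    'value': [1.0, np.nan, 3.0, 10.0, np.nan],
})

filling_mean = lambda g: g.fillna(g.mean())

# Each group's NaN is replaced by that group's own mean
filled = df.groupby('group_key')['value'].transform(filling_mean)
print(filled.tolist())  # [1.0, 2.0, 3.0, 10.0, 10.0]
```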

Q7.

Consider the DataFrames above, both of which have a standard integer based index. Which of the following can be used to get the data of all students (from student_df) and merge it with their staff roles where nan denotes no role?

  • result_df = pd.merge(student_df, staff_df, how='inner', on=['First Name', 'Last Name'])
  • result_df = pd.merge(staff_df, student_df, how='outer', on=['First Name', 'Last Name'])
  • result_df = pd.merge(staff_df, student_df, how='right', on=['First Name', 'Last Name'])
  • result_df = pd.merge(student_df, staff_df, how='right', on=['First Name', 'Last Name'])

Q8. Consider a DataFrame df with columns name, reviews_per_month, and review_scores_value. This DataFrame also consists of several missing values. Which of the following can be used to:

  1. calculate the number of entries in the name column, and
  2. calculate the mean and standard deviation of the reviews_per_month, grouping by different review_scores_value?
  • df.groupby('review_scores_value').agg({'name': len, 'reviews_per_month': (np.mean, np.std)})
  • df.agg({'name': len, 'reviews_per_month': (np.mean, np.std)})
  • df.agg({'name': len, 'reviews_per_month': (np.nanmean, np.nanstd)})
  • df.groupby('review_scores_value').agg({'name': len, 'reviews_per_month': (np.nanmean, np.nanstd)})
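The grouped summary in Q8 can also be sketched with named aggregation, which is a swapped-in technique rather than the form used in the options. The data and the output column names are made up. pandas' 'mean' and 'std' skip NaN by default, but note that pandas' std uses ddof=1 while np.nanstd defaults to ddof=0, so the standard deviations are not interchangeable.

```python
import numpy as np
import pandas as pd

# Hypothetical listing data with one missing reviews_per_month value
df = pd.DataFrame({
    'name': ['a', 'b', 'c', 'd'],
    'reviews_per_month': [1.0, np.nan, 2.0, 4.0],
    'review_scores_value': [9, 9, 10, 10],
})

summary = df.groupby('review_scores_value').agg(
    n_names=('name', len),                   # entries per group
    mean_rpm=('reviews_per_month', 'mean'),  # NaN values are skipped
    std_rpm=('reviews_per_month', 'std'),    # sample std (ddof=1)
)
print(summary)
```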

Q9. What will be the result of the following code?

import pandas as pd
pd.Period('01/12/2019', 'M') + 5
  • Period('2019-12-01', 'D')
  • Period('2019-12-06', 'D')
  • Period('2019-06', 'M')
  • Period('2019-12', 'M')
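For Q9, the string is read as January 12, 2019, and the 'M' frequency reduces it to the month 2019-01. Adding an integer to a Period moves it by that many units of its frequency, so + 5 means five months later.

```python
import pandas as pd

start = pd.Period('01/12/2019', 'M')  # monthly period: 2019-01
later = start + 5                     # five months on

print(start, later)  # 2019-01 2019-06
```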

Q10. Which of the following is not a valid expression to create a Pandas GroupBy object from the DataFrame shown below?

  • df.groupby('vegetable')
  • df.groupby('class', axis = 0)
  • grouped = df.groupby(['class', 'avg calories per unit'])
  • df.groupby('class')

Introduction to Data Science in Python Week 04 Quiz Answers

Beyond Data Manipulation

Q1. Consider the given NumPy arrays a and b. What will be the value of c after the following code is executed?

import numpy as np
a = np.arange(8)
b = a[4:6]
b[:] = 40
c = a[4] + a[6]
46
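The result follows from the fact that a basic slice of a NumPy array is a view, not a copy: writing to b changes a[4] and a[5], while a[6] keeps its original value.

```python
import numpy as np

a = np.arange(8)   # [0 1 2 3 4 5 6 7]
b = a[4:6]         # view onto a[4] and a[5]
b[:] = 40          # writes through to a
c = a[4] + a[6]    # 40 + 6

print(a, c)
```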

Q2. Given the string s as shown below, which of the following expressions will be True?

import re
s = 'ABCAC'
  • re.match('A', s) == True
  • len(re.split('A', s)) == 2
  • len(re.search('A', s)) == 2
  • bool(re.match('A', s)) == True
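Each Q2 expression can be evaluated in turn. re.match returns a Match object (or None), which is truthy but not equal to True, and a Match object has no length. The variable names are assumptions.

```python
import re

s = 'ABCAC'

match_obj = re.match('A', s)
equals_true = (match_obj == True)  # a Match object is not the value True
as_bool = bool(match_obj)          # but it is truthy
parts = re.split('A', s)           # includes the empty string before the first 'A'

try:
    len(re.search('A', s))         # Match objects do not support len()
    len_raised = False
except TypeError:
    len_raised = True

print(equals_true, as_bool, parts, len_raised)
```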

Q3. Consider a string s. We want to find all characters (other than A) which are followed by triple A, i.e., have AAA to the right. We don’t want to include the triple A in the output and just want the character immediately preceding AAA . Complete the code given below that would output the required result.

import re

def result():
    s = 'ACAABAACAAABACDBADDDFSDDDFFSSSASDAFAAACBAAAFASD'
    result = []
    # complete the pattern below
    pattern =
    for item in re.finditer(pattern, s):
        # identify the group number below.
        result.append(item.group())
    return result
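One pattern that satisfies the Q3 description, offered as a possible solution rather than the official one, uses a lookahead: [^A] matches a single non-A character and (?=AAA) requires AAA to follow without including it in the match. With this approach the whole match, item.group(), is already the wanted character.

```python
import re

def result():
    s = 'ACAABAACAAABACDBADDDFSDDDFFSSSASDAFAAACBAAAFASD'
    matches = []
    # Assumed solution: one non-'A' character, then a lookahead for 'AAA'
    pattern = r'[^A](?=AAA)'
    for item in re.finditer(pattern, s):
        # group() with no argument returns the whole match
        matches.append(item.group())
    return matches

print(result())  # ['C', 'F', 'B']
```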

Q4.

Consider the following 4 expressions regarding the above pandas Series df. All of them have the same value except one expression. Can you identify which one it is?

  • df.iloc[0]
  • df.index[0]
  • df[0]
  • df['d']

Q5.

Consider the two pandas Series objects shown above, representing the no. of items of different yogurt flavors that were sold in a day from two different stores, s1 and s2. Which of the following statements is True regarding the Series s3 defined below?

s3 = s1.add(s2)

  • s3['Plain'] >= s3['Mango']
  • s3['Blueberry'] == s1.add(s2, fill_value = 0)['Blueberry']
  • s3['Mango'] >= s1.add(s2, fill_value = 0)['Mango']
  • s3['Blueberry'] == s1['Blueberry']
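The role of fill_value in Q5 is easiest to see with concrete numbers. The two Series below are made up, since the quiz image is not available: a flavor sold in only one store becomes NaN under a plain add, while fill_value=0 treats the missing side as zero.

```python
import pandas as pd

# Hypothetical sales figures; not the values from the quiz image
s1 = pd.Series({'Mango': 20, 'Strawberry': 15, 'Blueberry': 18, 'Vanilla': 31})
s2 = pd.Series({'Strawberry': 20, 'Vanilla': 30, 'Banana': 15,
                'Mango': 20, 'Plain': 20})

s3 = s1.add(s2)                   # labels missing on one side give NaN
filled = s1.add(s2, fill_value=0)  # missing side is treated as 0

print(s3)
print(filled)
```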

Q6. In the following list of statements regarding a DataFrame df, one or more statements are correct. Can you identify all the correct statements?

  • Every time we call df.set_index(), the old index will be discarded.
  • Every time we call df.set_index(), the old index will be set as a new column.
  • Every time we call df.reset_index(), the old index will be discarded.
  • Every time we call df.reset_index(), the old index will be set as a new column.
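The default behavior of both Q6 methods can be checked on a small made-up DataFrame. The statements hold for the defaults only: set_index has an append parameter and reset_index has a drop parameter that change what happens to the old index.

```python
import pandas as pd

# Hypothetical data with a custom index
df = pd.DataFrame({'name': ['Alice', 'Bob'], 'score': [90, 80]},
                  index=['r1', 'r2'])

by_name = df.set_index('name')    # old index r1, r2 is not kept
restored = by_name.reset_index()  # 'name' comes back as a column

print(by_name)
print(restored)
```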

Q7. Consider the Series object S defined below. Which of the following is an incorrect way to slice S such that we obtain all data points corresponding to the indices ‘b’, ‘c’, and ‘d’?

S = pd.Series(np.arange(5), index=['a', 'b', 'c', 'd', 'e'])

  • S['b':'e']
  • S[['b', 'c', 'd']]
  • S[S <= 3][S > 0]
  • S[1:4]
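The catch in Q7 is that label-based slices include the end label, while positional slices exclude the end position. This sketch uses iloc for the positional case to keep the intent explicit; the names on the left are assumptions.

```python
import numpy as np
import pandas as pd

S = pd.Series(np.arange(5), index=['a', 'b', 'c', 'd', 'e'])

label_slice = S['b':'e']       # label slices include the end: b, c, d, e
by_list = S[['b', 'c', 'd']]   # explicit list of labels
by_position = S.iloc[1:4]      # positions 1, 2, 3 (end is exclusive)

print(list(label_slice.index), list(by_list.index), list(by_position.index))
```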

Q8.

Consider the DataFrame df shown above with indexes ‘R1’, ‘R2’, ‘R3’, and ‘R4’. In the following code, a new DataFrame df_new is created using df. What will be the value of df_new[1] after the below code is executed?

f = lambda x: x.max() + x.min()
df_new = df.apply(f)
88

Q9.

Consider the DataFrame named new_df shown above. Which of the following expressions will output the result (showing the head of a DataFrame) below?

  • new_df.stack()
  • new_df.unstack()
  • new_df.stack().stack()
  • new_df.unstack().unstack()

Q10.

Consider the DataFrame df shown above. What will be the output (rounded to the nearest integer) when the following code related to df is executed:

df.groupby('Item').sum().iloc[0]['Quantity sold']
30

Introduction to Data Science in Python Coursera Course Review:

We suggest enrolling in the Introduction to Data Science in Python course to gain new skills from professionals. The course is free, and in our experience it is worth the time.

The Introduction to Data Science in Python course is available on Coursera for free. If you get stuck on a quiz or graded assessment, visit Networking Funda for the Introduction to Data Science in Python Coursera Quiz Answers.

Conclusion:

I hope these Introduction to Data Science in Python Coursera Quiz Answers help you learn something new from the course. If they helped, don't forget to bookmark our site for more Coursera Quiz Answers.

This course is intended for audiences of all experiences who are interested in learning about new skills in a business context; there are no prerequisite courses.

Keep Learning!

