# Introduction to Data Science in Python Coursera Quiz Answers – Networking Funda

### Introduction to Data Science in Python Week 01 Quiz Answers

#### Introduction to the Course

Q1. What will be the output of the following code?

```
import re
string = 'bat, lat, mat, bet, let, met, bit, lit, mit, bot, lot, mot'
result = re.findall('b[ao]t', string)
print(result)
```
• 'bat, bet, bit, bot'
• ['bat', 'bot']
• ['bat', 'bet', 'bit', 'bot']
• 'bat, bot'
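The snippet can be run as is to check the options. The character class `[ao]` matches exactly one character, either `a` or `o`, and `re.findall` returns a list of the matched substrings.

```python
import re

string = 'bat, lat, mat, bet, let, met, bit, lit, mit, bot, lot, mot'
# [ao] matches a single character that is either 'a' or 'o'
result = re.findall('b[ao]t', string)
print(result)  # ['bat', 'bot']
```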

Q2.

Assume a and b are two (20, 20) numpy arrays. The L2-distance (defined above) between two equal dimension arrays can be calculated in python as follows:

```
def l2_dist(a, b):
    result = ((a - b) * (a - b)).sum()
    result = result ** 0.5
    return result
```

Which of the following expressions using this function will produce a different result from the rest?

• l2_dist(np.reshape(a, (20 * 20)), np.reshape(b, (20 * 20)))
• l2_dist(a, b)
• l2_dist(np.reshape(a, (20 * 20)), np.reshape(b, (20 * 20, 1)))
• l2_dist(a.T, b.T)
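A quick way to compare the four expressions is to evaluate them on random inputs. The sketch below assumes arbitrary random arrays (the seed and the use of `default_rng` are my choices, not part of the quiz). The point to watch is broadcasting: subtracting a `(400, 1)` array from a `(400,)` array yields a `(400, 400)` array rather than an element-wise difference.

```python
import numpy as np

def l2_dist(a, b):
    result = ((a - b) * (a - b)).sum()
    return result ** 0.5

rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only
a = rng.random((20, 20))
b = rng.random((20, 20))

same = l2_dist(a, b)
flat = l2_dist(np.reshape(a, (20 * 20)), np.reshape(b, (20 * 20)))
transposed = l2_dist(a.T, b.T)

# (400,) minus (400, 1) broadcasts to all pairwise differences
mixed_shape = (np.reshape(a, (20 * 20)) - np.reshape(b, (20 * 20, 1))).shape
mixed = l2_dist(np.reshape(a, (20 * 20)), np.reshape(b, (20 * 20, 1)))

print(np.isclose(same, flat), np.isclose(same, transposed))
print(mixed_shape, np.isclose(same, mixed))
```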

Q3. Consider the following variables in Python:

```
import numpy as np
a1 = np.random.rand(4)
a2 = np.random.rand(4, 1)
a3 = np.array([[1, 2, 3, 4]])
a4 = np.arange(1, 4, 1)
a5 = np.linspace(1, 4, 4)
```

Which of the following statements regarding these variables is correct?

• a1.shape == a2.shape
• a3.shape == a4.shape
• a5.shape == a1.shape
• a4.ndim() == 1
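Printing the shapes makes the comparison concrete. Note also that `ndim` is an attribute, not a method, so calling `a4.ndim()` raises a `TypeError`.

```python
import numpy as np

a1 = np.random.rand(4)
a2 = np.random.rand(4, 1)
a3 = np.array([[1, 2, 3, 4]])
a4 = np.arange(1, 4, 1)     # values 1, 2, 3 (the stop value is excluded)
a5 = np.linspace(1, 4, 4)   # 4 evenly spaced values from 1 to 4 inclusive

shapes = {name: arr.shape for name, arr in
          [('a1', a1), ('a2', a2), ('a3', a3), ('a4', a4), ('a5', a5)]}
print(shapes)

try:
    a4.ndim()               # ndim is an int attribute, so this call fails
    ndim_call_failed = False
except TypeError:
    ndim_call_failed = True
print(a4.ndim, ndim_call_failed)
```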

Q4. Which of the following is the correct output for the code given below?

```
import numpy as np
old = np.array([[1, 1, 1], [1, 1, 1]])
new = old
new[0, :2] = 0
print(old)
```
• [[1 1 0][1 1 0]]
• [[0 1 1][0 1 1]]
• [[1 1 1][1 1 1]]
• [[0 0 1][1 1 1]]
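The behaviour here follows from `new = old` binding a second name to the same array rather than copying it. The sketch below repeats the quiz code and adds a contrast with `.copy()` (the `independent` name is mine, added for illustration).

```python
import numpy as np

old = np.array([[1, 1, 1], [1, 1, 1]])
new = old               # no copy: both names refer to the same array
new[0, :2] = 0          # first two elements of row 0
print(old)

# Contrast (my addition): an explicit copy is not affected by later writes
independent = old.copy()
old[1, :] = 9
print(independent)
```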

Q5. Given the 6×6 NumPy array r shown below, which of the following options would slice the shaded elements?

• r[2:3,2:3]
• r[[2,4],[2,4]]
• r[[2,3],[2,3]]
• r[2:4,2:4]
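The array image is not reproduced here, so the sketch below assumes a hypothetical `r = np.arange(36).reshape(6, 6)` simply to show how the four indexing styles differ. Slices select a rectangular block, while a pair of index lists selects individual elements pairwise.

```python
import numpy as np

# Hypothetical stand-in for the 6x6 array in the question
r = np.arange(36).reshape(6, 6)

block = r[2:4, 2:4]          # rows 2-3 and columns 2-3: a 2x2 block
single = r[2:3, 2:3]         # row 2, column 2 only: a 1x1 block
pairs = r[[2, 3], [2, 3]]    # elements (2, 2) and (3, 3)
pairs_far = r[[2, 4], [2, 4]]  # elements (2, 2) and (4, 4)

print(block)
print(single, pairs, pairs_far)
```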

Q6.

```
import re
s = 'ACBCAC'
```

For the given string, which of the following regular expressions can be used to check if the string starts with 'AC'?

• re.findall('^AC', s)
• re.findall('^[AC]', s)
• re.findall('[^A]C', s)
• re.findall('AC', s)
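Running the four patterns side by side shows what each one actually matches. `^` anchors to the start of the string outside a character class, but negates the class when it appears as the first character inside `[...]`.

```python
import re

s = 'ACBCAC'

anchored = re.findall('^AC', s)      # 'AC' only at the start of the string
one_char = re.findall('^[AC]', s)    # a single 'A' or 'C' at the start
negated = re.findall('[^A]C', s)     # any non-'A' character followed by 'C'
anywhere = re.findall('AC', s)       # every occurrence of 'AC'

print(anchored, one_char, negated, anywhere)
# ['AC'] ['A'] ['BC'] ['AC', 'AC']
```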

Q7. What will be the output of the variable L after the following code is executed?

```
import re
s = 'ACAABAACAAAB'
result = re.findall('A{1,2}', s)
L = len(result)
```
• 5
• 8
• 4
• 12
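The quantifier `{1,2}` is greedy, so each match takes two `A`s when two are available and one otherwise. Printing the matches makes the count easy to verify.

```python
import re

s = 'ACAABAACAAAB'
# Greedy: a run of three A's is split into 'AA' followed by 'A'
result = re.findall('A{1,2}', s)
L = len(result)
print(result, L)  # ['A', 'AA', 'AA', 'AA', 'A'] 5
```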

Q8. Which of the following is the correct regular expression to extract all the phone numbers from the following chunk of text:

```
Office of Research Administration: (734) 647-6333 | 4325 North Quad
Office of Budget and Financial Administration: (734) 647-8044 | 309 Maynard, Suite 205
Health Informatics Program: (734) 763-2285 | 333 Maynard, Suite 500
Office of the Dean: (734) 647-3576 | 4322 North Quad
UMSI Engagement Center: (734) 763-1251 | 777 North University
```
• [(]\d{3}[)]\d{3}[-]\d{4}
• \d{3}\s\d{3}[-]\d{4}
• [(]\d{3}[)]\s\d{3}[-]\d{4}
• \d{3}[-]\d{3}[-]\d{4}
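The candidate patterns can be tested directly against the text. The numbers are written as `(734) 647-6333`, so a pattern has to account for the parentheses and for the space after the area code.

```python
import re

text = '''Office of Research Administration: (734) 647-6333 | 4325 North Quad
Office of Budget and Financial Administration: (734) 647-8044 | 309 Maynard, Suite 205
Health Informatics Program: (734) 763-2285 | 333 Maynard, Suite 500
Office of the Dean: (734) 647-3576 | 4322 North Quad
UMSI Engagement Center: (734) 763-1251 | 777 North University'''

# Parentheses, whitespace and hyphen all accounted for
with_space = re.findall(r'[(]\d{3}[)]\s\d{3}[-]\d{4}', text)
# Same pattern without the whitespace between ')' and the next digits
without_space = re.findall(r'[(]\d{3}[)]\d{3}[-]\d{4}', text)

print(with_space)
print(without_space)
```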

Q9. Which of the following regular expressions can be used to get the domain names (e.g. google.com, www.baidu.com) from the following sentence?

```
'I refer to https://google.com and I never refer http://www.baidu.com if I have to search anything'
```
• (?<=https:\/\/)([.]*)
• (?<=[https]:\/\/)([A-Za-z0-9.]*)
• (?<=https:\/\/)([A-Za-z0-9]*)
• (?<=https:\/\/)([A-Za-z0-9.]*)
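The difference between the lookbehind variants is easiest to see by running them. `(?<=https:\/\/)` requires the literal text `https://` before the match, whereas `[https]` is a character class that matches any single one of the letters `h`, `t`, `p`, `s`, so it accepts both `http://` and `https://`.

```python
import re

sentence = ('I refer to https://google.com and I never refer '
            'http://www.baidu.com if I have to search anything')

# Literal 'https://' must precede the match
literal = re.findall(r'(?<=https:\/\/)([A-Za-z0-9.]*)', sentence)
# One character from the class [https], then '://'
char_class = re.findall(r'(?<=[https]:\/\/)([A-Za-z0-9.]*)', sentence)
# Without '.' in the class, the match stops at the first dot
no_dot = re.findall(r'(?<=https:\/\/)([A-Za-z0-9]*)', sentence)

print(literal)
print(char_class)
print(no_dot)
```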

Q10. The text from the Canadian Charter of Rights and Freedoms section 2 lists the fundamental freedoms afforded to everyone. Of the four choices provided to replace X in the code below, which would accurately count the number of fundamental freedoms that Canadians have?

```
text=r'''Everyone has the following fundamental freedoms:
(a) freedom of conscience and religion;
(b) freedom of thought, belief, opinion and expression, including freedom
of the press and other media of communication;
(c) freedom of peaceful assembly; and
(d) freedom of association.'''
import re
pattern = X
print(len(re.findall(pattern,text)))
```
• `\(.\)`
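Substituting the pattern for `X` confirms the count. The escaped parentheses match literal `(` and `)`, and `.` matches the single letter between them.

```python
import re

text = r'''Everyone has the following fundamental freedoms:
(a) freedom of conscience and religion;
(b) freedom of thought, belief, opinion and expression, including freedom
of the press and other media of communication;
(c) freedom of peaceful assembly; and
(d) freedom of association.'''

pattern = r'\(.\)'
matches = re.findall(pattern, text)
print(matches, len(matches))  # ['(a)', '(b)', '(c)', '(d)'] 4
```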

### Introduction to Data Science in Python Week 02 Quiz Answers

#### Introduction to Pandas and Series Data

Q1. For the following code, which of the following statements will not return True?

```
import pandas as pd
sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
obj1 = pd.Series(sdata)
states = ['California', 'Ohio', 'Oregon', 'Texas']
obj2 = pd.Series(sdata, index=states)
obj3 = pd.isnull(obj2)
```
• `obj2['California'] == None`
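When a Series is built from a dict with an explicit index, labels missing from the dict are filled with `NaN`, not `None`. The sketch below reuses the quiz variables to show why a comparison with `None` does not return True while `pd.isnull` does.

```python
import pandas as pd

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
states = ['California', 'Ohio', 'Oregon', 'Texas']
obj2 = pd.Series(sdata, index=states)   # 'California' has no entry in sdata
obj3 = pd.isnull(obj2)

print(obj2['California'])               # nan
is_none = obj2['California'] == None    # NaN is not equal to None
print(is_none)
print(obj3['California'])
```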

Q2.

```
import pandas as pd
d = {'1': 'Alice','2': 'Bob','3': 'Rita','4': 'Molly','5': 'Ryan'}
S = pd.Series(d)
```

In the above python code, the keys of the dictionary d represent student ranks and the value for each key is a student name. Which of the following can be used to extract rows with student ranks that are lower than or equal to 3?

• S.iloc[0:2]
• S.loc[0:3]
• S.iloc[0:3]
• S.loc[0:2]
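Because the index labels here are the strings `'1'` to `'5'`, positional slicing with `iloc` is the straightforward way to take the first rows. As with Python lists, the end position of an `iloc` slice is excluded.

```python
import pandas as pd

d = {'1': 'Alice', '2': 'Bob', '3': 'Rita', '4': 'Molly', '5': 'Ryan'}
S = pd.Series(d)

first_two = S.iloc[0:2]     # positions 0 and 1
first_three = S.iloc[0:3]   # positions 0, 1 and 2
print(first_two.tolist())
print(first_three.tolist())
```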

Q3. Suppose we have a DataFrame named df. We want to change the original DataFrame df so that all the column names are cast to upper case. Which of the following expressions will not accomplish this?

• df.rename(mapper = lambda x: x.upper(), axis = 1)
• df = df.rename(mapper = lambda x: x.upper(), axis = 1)
• df = df.rename(mapper = lambda x: x.upper(), axis = 'column')
• df.rename(mapper = lambda x: x.upper(), axis = 1, inplace = True)
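The key distinction is that `rename` returns a new DataFrame by default and leaves the original untouched unless the result is assigned back or `inplace=True` is passed. A small sketch with a hypothetical two-column frame:

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({'name': ['Alice'], 'score': [90]})

df.rename(mapper=lambda x: x.upper(), axis=1)        # result is discarded
cols_after_discard = list(df.columns)

df = df.rename(mapper=lambda x: x.upper(), axis=1)   # result assigned back
cols_after_assign = list(df.columns)

print(cols_after_discard, cols_after_assign)
```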

Q4.

For the given DataFrame df we want to keep only the records with a toefl score greater than 105. Which of the following will not work?

• df[df['toefl score'] > 105]
• df.where(df['toefl score'] > 105)
• All of these will work
• df.where(df['toefl score'] > 105).dropna()
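The DataFrame from the question is not reproduced here, so the sketch below uses a small hypothetical frame. It shows that boolean indexing drops the rows that fail the condition, while `where` keeps every row and replaces the failing ones with `NaN` until `dropna()` is applied.

```python
import pandas as pd

# Hypothetical data standing in for the DataFrame in the question
df = pd.DataFrame({'toefl score': [110, 100, 118],
                   'gre score': [320, 310, 330]})

masked = df[df['toefl score'] > 105]            # failing rows removed
kept_shape = df.where(df['toefl score'] > 105)  # failing rows become NaN
cleaned = kept_shape.dropna()                   # NaN rows removed

print(len(masked), len(kept_shape), len(cleaned))
```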

Q5. Which of the following can be used to create a DataFrame in Pandas?

• Pandas Series object
• All of these work
• 2D ndarray
• Python dict

Q6. Which of the following is an incorrect way to drop entries from the Pandas DataFrame named df shown below?

• df.drop('Ohio')
• df.drop('two')
• df.drop('one', axis = 1)

Q7. For the Series s1 and s2 defined below, which of the following statements will give an error?

```
import pandas as pd
s1 = pd.Series({1: 'Alice', 2: 'Jack', 3: 'Molly'})
s2 = pd.Series({'Alice': 1, 'Jack': 2, 'Molly': 3})
```
• s2.loc
• s2.iloc
• s1.loc
• s2

Q8. Which of the following statements is incorrect?

• loc and iloc are two useful and commonly used Pandas methods.
• We can use s.iteritems() on a pd.Series object s to iterate on it.
• If s and s1 are two pd.Series objects, we cannot use s.append(s1) to directly append s1 to the existing series s
• If s is a pd.Series object, then we can use s.loc[label] to get all data where the index is equal to label.

Q9.

For the given DataFrame df shown above, we want to get all records with a toefl score greater than 105 but smaller than 115. Which of the following expressions will not accomplish this?

• df[df['toefl score'].gt(105) & df['toefl score'].lt(115)]
• df[(df['toefl score'] > 105) & (df['toefl score'] < 115)]
• df[(df['toefl score'].isin(range(106, 115)))]
• (df['toefl score'] > 105) & (df['toefl score'] < 115)

Q10. Which of the following is the correct way to extract all information related to the student named Alice from the DataFrame df given below:

• df.T['Mathematics']
• df.iloc['Mathematics']
• df['Mathematics']
• df['Alice']

### Introduction to Data Science in Python Week 03 Quiz Answers

#### More Data Processing with Pandas

Q1. Consider the two DataFrames shown below, both of which have Name as the index. Which of the following expressions can be used to get the data of all students (from student_df) including their roles as staff, where nan denotes no role?

• pd.merge(student_df, staff_df, how='left', left_index=True, right_index=True)
• pd.merge(student_df, staff_df, how='right', left_index=True, right_index=True)
• pd.merge(staff_df, student_df, how='right', left_index=False, right_index=True)
• pd.merge(staff_df, student_df, how='left', left_index=True, right_index=True)
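The two DataFrames are not reproduced here, so the sketch below assumes small hypothetical versions of `student_df` and `staff_df`, both indexed by Name. A left merge keeps every row of the left frame and fills unmatched columns from the right frame with `NaN`.

```python
import pandas as pd

# Hypothetical data; names, schools and roles are illustrative only
student_df = pd.DataFrame({'Name': ['James', 'Mike', 'Sally'],
                           'School': ['Business', 'Law', 'Engineering']}
                          ).set_index('Name')
staff_df = pd.DataFrame({'Name': ['Kelly', 'Sally', 'James'],
                         'Role': ['Director of HR', 'Course liaison', 'Grader']}
                        ).set_index('Name')

merged = pd.merge(student_df, staff_df, how='left',
                  left_index=True, right_index=True)
print(merged)
```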

Q2. Consider a DataFrame named df with columns named P2010, P2011, P2012, P2013, P2014 and P2015 containing float values. We want to use the apply method to get a new DataFrame named result_df with a new column AVG. The AVG column should average the float values across P2010 to P2015. The 6 original columns (P2010 to P2015) should then be removed. For that, what should be the value of x and y in the given code?

```
frames = ['P2010', 'P2011', 'P2012', 'P2013','P2014', 'P2015']
df['AVG'] = df[frames].apply(lambda z: np.mean(z), axis=x)
result_df = df.drop(frames,axis=y)
```
• x = 0 , y = 1
• x = 0 , y = 0
• x = 1 , y = 0
• x = 1 , y = 1
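The two `axis` arguments mean slightly different things: in `apply`, `axis=1` passes each row to the function, and in `drop`, `axis=1` says the labels refer to columns. The sketch below uses a hypothetical two-row frame (the `Region` column and all values are made up) to show the combination that produces a per-row average and removes the six columns.

```python
import numpy as np
import pandas as pd

frames = ['P2010', 'P2011', 'P2012', 'P2013', 'P2014', 'P2015']
# Hypothetical data: row 0 holds 1.0 to 6.0, row 1 holds 2.0 throughout
df = pd.DataFrame({'Region': ['North', 'South'],
                   **{col: [float(i + 1), 2.0] for i, col in enumerate(frames)}})

df['AVG'] = df[frames].apply(lambda z: np.mean(z), axis=1)  # one mean per row
result_df = df.drop(frames, axis=1)                         # drop by column label
print(result_df)
```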

Q3. Consider the DataFrame df below, instantiated with a list of grades, ordered from best grade to worst. Which of the following options can be used to substitute X in the code given below, if we want to get all the grades between 'A' and 'B' where 'A' is better than 'B'?

```
import pandas as pd
df = pd.DataFrame(['A+', 'A', 'A-', 'B+', 'B', 'B-', 'C+', 'C', 'C-', 'D+', 'D'], index=['excellent', 'excellent', 'excellent', 'good', 'good', 'good', 'ok', 'ok', 'ok', 'poor', 'poor'], columns = ['Grades'])
my_categories= X
grades = df['Grades'].astype(my_categories)
result = grades[(grades>'B') & (grades<'A')]
```
• my_categories = pd.CategoricalDtype(categories=['D', 'D+', 'C-', 'C', 'C+', 'B-', 'B', 'B+', 'A-', 'A', 'A+'], ordered=True)
• my_categories = pd.CategoricalDtype(categories=['D', 'D+', 'C-', 'C', 'C+', 'B-', 'B', 'B+', 'A-', 'A', 'A+'])
• (my_categories=['A+', 'A', 'A-', 'B+', 'B', 'B-', 'C+', 'C', 'C-', 'D+', 'D'], ordered=True)
• my_categories = pd.CategoricalDtype(categories=['A+', 'A', 'A-', 'B+', 'B', 'B-', 'C+', 'C', 'C-', 'D+', 'D'])
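Comparisons such as `grades > 'B'` only work on a categorical that is marked as ordered, and the order is taken from the `categories` list, lowest first. The sketch below fills in `X` with the worst-to-best ordering and `ordered=True`.

```python
import pandas as pd

df = pd.DataFrame(['A+', 'A', 'A-', 'B+', 'B', 'B-', 'C+', 'C', 'C-', 'D+', 'D'],
                  index=['excellent', 'excellent', 'excellent', 'good', 'good',
                         'good', 'ok', 'ok', 'ok', 'poor', 'poor'],
                  columns=['Grades'])

# Categories listed from worst to best, and explicitly ordered
my_categories = pd.CategoricalDtype(
    categories=['D', 'D+', 'C-', 'C', 'C+', 'B-', 'B', 'B+', 'A-', 'A', 'A+'],
    ordered=True)
grades = df['Grades'].astype(my_categories)
result = grades[(grades > 'B') & (grades < 'A')]
print(list(result))  # ['A-', 'B+']
```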

Q4. Consider the DataFrame df shown in the image below. Which of the following can return the head of the pivot table as shown in the image below df?

• df.pivot_table(values='score', index='country', columns='Rank_Level', aggfunc=[np.median], margins=True)
• df.pivot_table(values='score', index='country', columns='Rank_Level', aggfunc=[np.median])
• df.pivot_table(values='score', index='Rank_Level', columns='country', aggfunc=[np.median])
• df.pivot_table(values='score', index='Rank_Level', columns='country', aggfunc=[np.median], margins=True)

Q5. Assume that the date '11/29/2019' in MM/DD/YYYY format is the 4th day of the week, what will be the result of the following?

```
import pandas as pd
(pd.Timestamp('11/29/2019') + pd.offsets.MonthEnd()).weekday()
```
• 6
• 4
• 7
• 5
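Adding `pd.offsets.MonthEnd()` rolls a date forward to the last day of its month, so the result can be checked in two steps: find the rolled date, then ask for its weekday (Monday is 0).

```python
import pandas as pd

ts = pd.Timestamp('11/29/2019')
month_end = ts + pd.offsets.MonthEnd()   # rolls forward to the end of November

print(ts.weekday())         # 4 (Friday)
print(month_end)            # 2019-11-30 00:00:00
print(month_end.weekday())  # 5 (Saturday)
```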

Q6. Consider a DataFrame df. We want to create groups based on the column group_key in the DataFrame and fill the nan values with group means using:

```
filling_mean = lambda g: g.fillna(g.mean())
```

Which of the following is correct for performing this task?

• df.groupby(group_key).transform(filling_mean)
• df.groupby(group_key).filling_mean()
• df.groupby(group_key).aggregate(filling_mean)
• df.groupby(group_key).apply(filling_mean)
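A minimal sketch of the idea, under assumptions of mine: the frame and its `value` column are hypothetical, a single column is selected before `apply`, and `group_keys=False` is passed so that the result keeps the original row labels across pandas versions. Each group's missing values are replaced by that group's own mean.

```python
import numpy as np
import pandas as pd

# Hypothetical data: group 'a' has mean 2.0, group 'b' has mean 10.0
df = pd.DataFrame({'group_key': ['a', 'a', 'a', 'b', 'b'],
                   'value': [1.0, np.nan, 3.0, 10.0, np.nan]})

filling_mean = lambda g: g.fillna(g.mean())

# group_keys=False and the column selection are assumptions made for this sketch
filled = (df.groupby('group_key', group_keys=False)['value']
            .apply(filling_mean)
            .sort_index())
print(filled.tolist())
```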

Q7.

Consider the DataFrames above, both of which have a standard integer based index. Which of the following can be used to get the data of all students (from student_df) and merge it with their staff roles where nan denotes no role?

• result_df = pd.merge(student_df, staff_df, how='inner', on=['First Name', 'Last Name'])
• result_df = pd.merge(staff_df, student_df, how='outer', on=['First Name', 'Last Name'])
• result_df = pd.merge(staff_df, student_df, how='right', on=['First Name', 'Last Name'])
• result_df = pd.merge(student_df, staff_df, how='right', on=['First Name', 'Last Name'])

Q8. Consider a DataFrame df with columns name, reviews_per_month, and review_scores_value. This DataFrame also consists of several missing values. Which of the following can be used to:

1. calculate the number of entries in the name column, and
2. calculate the mean and standard deviation of the reviews_per_month, grouping by different review_scores_value?
• df.groupby('review_scores_value').agg({'name': len, 'reviews_per_month': (np.mean, np.std)})
• df.agg({'name': len, 'reviews_per_month': (np.mean, np.std)})
• df.agg({'name': len, 'reviews_per_month': (np.nanmean, np.nanstd)})
• df.groupby('review_scores_value').agg({'name': len, 'reviews_per_month': (np.nanmean, np.nanstd)})

Q9. What will be the result of the following code?:

```
import pandas as pd
pd.Period('01/12/2019', 'M') + 5
```
• Period('2019-12-01', 'D')
• Period('2019-12-06', 'D')
• Period('2019-06', 'M')
• Period('2019-12', 'M')
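Adding an integer to a `Period` shifts it by that many units of its own frequency. With frequency `'M'`, the date `'01/12/2019'` is read as January 2019, and `+ 5` moves it forward five months.

```python
import pandas as pd

start = pd.Period('01/12/2019', 'M')   # monthly period containing 12 Jan 2019
shifted = start + 5                    # five monthly steps forward

print(start)    # 2019-01
print(shifted)  # 2019-06
```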

Q10. Which of the following is not a valid expression to create a Pandas GroupBy object from the DataFrame shown below?

• df.groupby('vegetable')
• df.groupby('class', axis = 0)
• grouped = df.groupby(['class', 'avg calories per unit'])
• df.groupby('class')

### Introduction to Data Science in Python Week 04 Quiz Answers

#### Beyond Data Manipulation

Q1. Consider the given NumPy arrays a and b. What will be the value of c after the following code is executed?

```
import numpy as np
a = np.arange(8)
b = a[4:6]
b[:] = 40
c = a + a
```
`46`
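Basic slicing in NumPy returns a view, so writing through `b` also changes `a`. Running the code shows the effect on `c`.

```python
import numpy as np

a = np.arange(8)
b = a[4:6]      # a view onto elements 4 and 5 of a, not a copy
b[:] = 40       # writes through to a
c = a + a

print(a)  # [ 0  1  2  3 40 40  6  7]
print(c)  # [ 0  2  4  6 80 80 12 14]
```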

Q2. Given the string s as shown below, which of the following expressions will be True?

```
import re
s = 'ABCAC'
```
• `re.match('A', s) == True`
• `len(re.split('A', s)) == 2`
• `len(re.search('A', s)) == 2`
• `bool(re.match('A', s)) == True`
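Each expression can be evaluated directly. `re.match` and `re.search` return a match object (or `None`), which is truthy but is not equal to `True` and has no length, while `re.split` returns a list that includes an empty string when the separator sits at the start.

```python
import re

s = 'ABCAC'

match_equals_true = re.match('A', s) == True      # match object is not True
parts = re.split('A', s)                          # ['', 'BC', 'C']
bool_equals_true = bool(re.match('A', s)) == True

try:
    len(re.search('A', s))                        # match objects have no len()
    search_len_failed = False
except TypeError:
    search_len_failed = True

print(match_equals_true, parts, search_len_failed, bool_equals_true)
```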

Q3. Consider a string s. We want to find all characters (other than A) which are followed by triple A, i.e., have AAA to the right. We don’t want to include the triple A in the output and just want the character immediately preceding AAA . Complete the code given below that would output the required result.

```
import re

def result():
    s = 'ACAABAACAAABACDBADDDFSDDDFFSSSASDAFAAACBAAAFASD'
    result = []
    # complete the pattern below
    pattern =
    for item in re.finditer(pattern, s):
        # identify the group number below.
        result.append(item.group())
    return result
```
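One possible completion, offered as a sketch and not as the official answer, uses a capturing group for the non-`A` character and a lookahead for `AAA`, so that the triple `A` is checked but not consumed. The pattern and the choice of `group(1)` are my assumptions.

```python
import re

def result():
    s = 'ACAABAACAAABACDBADDDFSDDDFFSSSASDAFAAACBAAAFASD'
    result = []
    # Assumed pattern: one non-'A' character, followed (not consumed) by 'AAA'
    pattern = r'([^A])(?=AAA)'
    for item in re.finditer(pattern, s):
        # group(1) is the captured character before the triple A
        result.append(item.group(1))
    return result

print(result())  # ['C', 'F', 'B']
```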

Q4.

Consider the following 4 expressions regarding the above pandas Series df. All of them have the same value except one expression. Can you identify which one it is?

• `df.iloc`
• `df.index`
• `df`
• `df['d']`

Q5.

Consider the two pandas Series objects shown above, representing the no. of items of different yogurt flavors that were sold in a day from two different stores, s1 and s2. Which of the following statements is True regarding the Series s3 defined below?

```
s3 = s1.add(s2)
```

• `s3['Plain'] >= s3['Mango']`
• `s3['Blueberry'] == s1.add(s2, fill_value = 0)['Blueberry']`
• `s3['Mango'] >= s1.add(s2, fill_value = 0)['Mango']`
• `s3['Blueberry'] == s1['Blueberry']`
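The two Series from the question are not reproduced here, so the sketch below assumes hypothetical sales figures. It shows the general rule: `add` aligns on index labels, a label present in only one Series gives `NaN`, and `fill_value=0` treats the missing side as zero instead.

```python
import pandas as pd

# Hypothetical sales figures; flavors and numbers are illustrative only
s1 = pd.Series({'Mango': 20, 'Strawberry': 15, 'Blueberry': 18, 'Vanilla': 31})
s2 = pd.Series({'Strawberry': 20, 'Vanilla': 30, 'Banana': 15,
                'Mango': 20, 'Plain': 20})

s3 = s1.add(s2)                      # labels in only one Series become NaN
s3_filled = s1.add(s2, fill_value=0) # missing side treated as 0

print(s3['Mango'], s3_filled['Mango'])
print(s3['Blueberry'], s3_filled['Blueberry'])
```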

Q6. In the following list of statements regarding a DataFrame df, one or more statements are correct. Can you identify all the correct statements?

• Every time we call df.set_index(), the old index will be discarded.
• Every time we call df.set_index(), the old index will be set as a new column.
• Every time we call df.reset_index(), the old index will be discarded.
• Every time we call df.reset_index(), the old index will be set as a new column.
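The default behaviour of both methods can be seen on a small hypothetical frame. With default arguments, `set_index` replaces the current index without keeping it, and `reset_index` moves the current index into a column.

```python
import pandas as pd

# Hypothetical data with a custom index
df = pd.DataFrame({'name': ['Alice', 'Bob'], 'score': [90, 85]},
                  index=['r1', 'r2'])

by_name = df.set_index('name')    # old index ('r1', 'r2') is not kept
restored = by_name.reset_index()  # 'name' index becomes a column again

print(list(by_name.columns), list(by_name.index))
print(list(restored.columns), list(restored.index))
```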

Q7. Consider the Series object S defined below. Which of the following is an incorrect way to slice S such that we obtain all data points corresponding to the indices 'b', 'c', and 'd'?

```
S = pd.Series(np.arange(5), index=['a', 'b', 'c', 'd', 'e'])
```

• `S['b':'e']`
• `S[['b', 'c', 'd']]`
• `S[S <= 3][S > 0]`
• `S[1:4]`
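The detail that matters here is that label-based slices include the end label, whereas positional slices exclude the end position. The sketch below uses `iloc` for the positional case to keep the intent explicit (that substitution is mine).

```python
import numpy as np
import pandas as pd

S = pd.Series(np.arange(5), index=['a', 'b', 'c', 'd', 'e'])

by_label_slice = S['b':'e']         # label slice: 'e' is included
by_label_list = S[['b', 'c', 'd']]  # explicit list of labels
by_position = S.iloc[1:4]           # positions 1, 2, 3 (4 is excluded)

print(list(by_label_slice.index))
print(list(by_label_list.index), list(by_position.index))
```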

Q8.

Consider the DataFrame df shown above with indexes 'R1', 'R2', 'R3', and 'R4'. In the following code, a new DataFrame df_new is created using df. What will be the value of df_new after the below code is executed?

```
f = lambda x: x.max() + x.min()
df_new = df.apply(f)
```
`88`

Q9.

Consider the DataFrame named new_df shown above. Which of the following expressions will output the result (showing the head of a DataFrame) below?

• new_df.stack()
• new_df.unstack()
• new_df.stack().stack()
• new_df.unstack().unstack()

Q10.

Consider the DataFrame df shown above. What will be the output (rounded to the nearest integer) when the following code related to df is executed:

```
df.groupby('Item').sum().iloc['Quantity sold']
```
`30`
