Machine Learning, concepts and Python - general

 1. Machine learning is

  The art of learning how to do anything with the help of a machine

  The selective acquisition of knowledge through the use of manual programs

  Machines taking over education to provide learning to students

  The autonomous acquisition of knowledge through the use of computer programs.

 2. The Netflix Prize was a famed competition in which Netflix offered $1,000,000 for a better _________________________. The winning team, BellKor's Pragmatic Chaos, achieved a 10% improvement by using an ensemble of different methods.

  N vs NP algorithm solution

  search complexity algorithm

  collaborative filtering algorithm.

  reduction synthesis algorithm
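
As a rough illustration of the idea behind collaborative filtering, the sketch below predicts a missing rating from the ratings of similar users. The users, items, ratings, and function names are all made up for this example, and it is not the method used by the prize winners.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}
ratings = {
    "ann": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 5, "B": 5, "C": 1, "D": 5},
    "cat": {"A": 1, "B": 1, "C": 5, "D": 1},
}

def cosine_similarity(a, b):
    """Cosine similarity over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def predict(user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    weighted, total = 0.0, 0.0
    for other, their_ratings in ratings.items():
        if other == user or item not in their_ratings:
            continue
        similarity = cosine_similarity(ratings[user], their_ratings)
        weighted += similarity * their_ratings[item]
        total += similarity
    return weighted / total if total else None

# ann has not rated D; her tastes match bob's far more closely than cat's
print(round(predict("ann", "D"), 2))
```

Because ann's ratings line up with bob's, the prediction lands much closer to bob's rating of D than to cat's.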

 3. What’s the trade-off between bias and variance? Read the excerpt below and decide whether it is true or false.
Bias is error due to erroneous or overly simplistic assumptions in the learning algorithm you’re using. Variance is error due to too much complexity in the learning algorithm you’re using. This makes the algorithm highly sensitive to variation in your training data, which can lead your model to overfit the data.

  FALSE

  TRUE
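
The two kinds of error can be seen in a small experiment. The sketch below is only an illustration under assumed conditions: the data source (a noisy straight line), the two toy models, and all names are invented for this example. A constant predictor is too simple (high bias), while a 1-nearest-neighbour predictor memorises the training set (high variance).

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

def noisy_line(x):
    """Hypothetical data source: a straight line plus Gaussian noise."""
    return 2.0 * x + 1.0 + random.gauss(0, 0.3)

train = [(i / 30, noisy_line(i / 30)) for i in range(30)]
test = [((i + 0.5) / 30, noisy_line((i + 0.5) / 30)) for i in range(30)]

def fit_mean(data):
    """High-bias model: always predicts the mean training target."""
    mean_y = sum(y for _, y in data) / len(data)
    return lambda x: mean_y

def fit_one_nn(data):
    """High-variance model: returns the target of the closest training point."""
    return lambda x: min(data, key=lambda point: abs(point[0] - x))[1]

def mse(model, data):
    """Mean squared error of a model on (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

mean_model = fit_mean(train)
nn_model = fit_one_nn(train)
for name, model in [("mean", mean_model), ("1-NN", nn_model)]:
    print(name, "train MSE:", round(mse(model, train), 3),
          "test MSE:", round(mse(model, test), 3))
```

The 1-NN model reproduces the training set exactly, so its training error is zero, yet it still makes errors on unseen points. The constant model has a sizeable error even on the data it was fitted to.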

 4. What is the difference between a Generative and Discriminative Algorithm?

  A generative model will learn categories of data while a discriminative model will simply learn the distinction between different categories of data

  A generative model is 'general' in that it does not ever look at data specifically. A discriminative model will be specific in its harvesting of data

  A generative model will learn the distinction between different categories of data while a discriminative model will simply learn categories of data

  A generative model accepts all data for processing whereas a discriminative model is selective

 5. How would you evaluate machine learning algorithms? Read the following excerpt and see if you can fill in the blanks.
You would first split the dataset into training and test sets, or perhaps use cross-validation techniques to further split the dataset into multiple training and test sets. You should then choose a selection of performance metrics. You could use measures such as the ______________

  discriminative accuracy, F1 matrix and the model statistics

  model performance, disability matrix and the F12 score

  F1 score, the accuracy, and the confusion matrix.

  generative model index, F1 matrix and the confusion score
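
For a binary classifier, the common metrics can be computed by hand from the four cells of a confusion matrix. The following is a minimal sketch in plain Python; the function names and the example labels are made up for illustration, and in practice a library implementation would normally be used.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall (0.0 when undefined)."""
    tp, fp, fn, _ = confusion_matrix(y_true, y_pred)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical test-set labels and model predictions
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(confusion_matrix(y_true, y_pred))  # (3, 1, 1, 3)
print(accuracy(y_true, y_pred))          # 0.75
```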

 6. What is the difference between supervised and unsupervised machine learning?
Supervised learning requires labeled training data. For example, in order to do classification (a supervised learning task), you’ll need to first label the data you’ll use to train the model to classify data into your labeled groups. Unsupervised learning, in contrast, does not require labeling data explicitly.

  FALSE

  TRUE
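
The contrast can be sketched on a tiny one-dimensional dataset. Everything here (the points, the labels, and the function names) is invented for illustration: the supervised routine needs the labels to build one centroid per class, while the unsupervised routine (a bare-bones 2-means clustering) finds the two groups from the points alone.

```python
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels = ["low", "low", "low", "high", "high", "high"]  # only used when supervised

def fit_centroids(points, labels):
    """Supervised: use the given labels to compute one centroid per class."""
    groups = {}
    for x, label in zip(points, labels):
        groups.setdefault(label, []).append(x)
    return {label: sum(xs) / len(xs) for label, xs in groups.items()}

def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))

def two_means(points, iterations=10):
    """Unsupervised: 1-D k-means with k=2; no labels are involved."""
    centers = [min(points), max(points)]
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for x in points:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            clusters[nearest].append(x)
        # Keep the old center if a cluster happens to be empty
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

centroids = fit_centroids(points, labels)
print(predict(centroids, 1.1))

centers, clusters = two_means(points)
print(clusters)
```

The clustering recovers the same two groups, but it cannot name them: the names "low" and "high" exist only because someone labeled the data.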

 7. Which of the following is implemented on DataFrame to compute the correlation between like-labeled Series contained in different DataFrame objects?

  conwith

  corrwith

  corwit

  None of the above
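
A short sketch of what "correlation between like-labeled Series" means in practice, assuming pandas is installed. The two frames and their values are made up; columns are matched by label across the two DataFrame objects.

```python
import pandas as pd

# Two hypothetical frames sharing the column labels "a" and "b"
first = pd.DataFrame({"a": [1, 2, 3, 4], "b": [4, 3, 2, 1]})
second = pd.DataFrame({"a": [2, 4, 6, 8], "b": [1, 2, 3, 4]})

# Correlate first["a"] with second["a"], and first["b"] with second["b"]
result = first.corrwith(second)
print(result)
```

Column "a" rises in both frames, so its correlation is close to 1, while column "b" falls in one frame and rises in the other, so its correlation is close to -1.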

 8. Which of the following takes a dict of dicts or a dict of array-like sequences and returns a DataFrame?

  DataFrame.from_items

  DataFrame.from_dict

  All of the Mentioned

  DataFrame.from_records
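
Building a frame from nested dictionaries or from a dictionary of sequences can be sketched as follows, assuming pandas is installed. The data and variable names are illustrative only.

```python
import pandas as pd

# Dict of array-like sequences: each key becomes a column
by_column = pd.DataFrame.from_dict({"x": [1, 2, 3], "y": [4, 5, 6]})

# Dict of dicts with orient="index": each outer key becomes a row label
by_row = pd.DataFrame.from_dict(
    {"r1": {"x": 1, "y": 4}, "r2": {"x": 2, "y": 5}},
    orient="index",
)

print(by_column)
print(by_row)
```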

 9. Which of the following libraries is similar to pandas?

  NumPy

  OutPy

  RpyN

  Maltlap

 10. Which of the following makes use of pandas and returns data in a Series or DataFrame?

  NumPy

  pandaSDMX

  freedapi

  OutPy

 11. Which of the following methods is used to transform a SparseSeries indexed by a MultiIndex into a scipy.sparse.coo_matrix?

  Series.to_coo()

  SparseSeries.to_coo()

  SparseSeries.to_cooser()

  None of the Mentioned

 12. Which of the following can be used to create subsamples using a maximum dissimilarity approach?

  inmaxDissim

  minDissim

  maxDissim

  All of the Mentioned

 13. Which of the following models includes a backwards elimination feature selection routine?

  MARS

  Pandas

  MCV

  MCRS

 14. Which of the following is a categorical outcome?

  Rsquared

  Accuracy

  RMSE

  All of the above

 15. Which of the following functions provides unsupervised prediction?

  cl_nowcast

  cl_precast

  cl_forecast

  None of the Mentioned

 16. How would you evaluate a logistic regression model?
You have to demonstrate an understanding of the typical goals of a logistic regression (___________________________________, etc.) and bring up a few examples and use cases.

  accuracy, maxDissim

  classification, prediction

  logarithmic time complexity, possibility

  probability, prevention of errors

 17. How would you handle missing or corrupted data in a dataset?
You could find missing/corrupted data in a dataset and either drop those rows or columns, or decide to replace them with another value.

In pandas, there are two very useful methods, __________________ and ______________________, that will help you find columns of data with missing or corrupted data and drop those values. If you want to fill the invalid values with a placeholder value (for example, 0), you could use the fillna() method.

  isPresent() and valueA()

  isnull() and dropna()

  isNot() and cullA()

  isBlank() and findA()
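
The find, drop, and fill steps described in the question can be sketched as follows, assuming pandas is installed. The frame, its column names, and the placeholder values are made up for illustration.

```python
import pandas as pd

# Hypothetical frame with one missing value in each column
df = pd.DataFrame({
    "age": [25.0, float("nan"), 40.0],
    "city": ["Oslo", "Lima", None],
})

# Find: count the missing entries per column
missing_per_column = df.isnull().sum()

# Drop: keep only the rows that have no missing values
complete_rows = df.dropna()

# Fill: replace missing values with a per-column placeholder
filled = df.fillna({"age": 0, "city": "unknown"})

print(missing_per_column)
print(complete_rows)
print(filled)
```

Dropping is the simplest option but discards whole rows, so filling is often preferable when only a few values are missing.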

 18. True or False: An array is a series of objects with pointers that direct how to process them sequentially. A linked list is an ordered collection of objects.
class Node:
  
  def __init__(self, data, nextNode=None):
    self.data = data
    self.nextNode = nextNode
    
class LinkedList:
  
  def __init__(self, head = None):
    self.head = head
    self.count = 0
    if self.head is not None:
      self.count = 1
      
  def appendNode(self, node):
    '''Append a node to the end of the linked list'''
    if self.head is None:
      self.head = node
    else:
      currentNode = self.head
      while currentNode.nextNode is not None:
        currentNode = currentNode.nextNode
      currentNode.nextNode = node
    self.count = self.count + 1
  
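  def appendNodeRecursive(self,newNode,currentNode):
    '''Append a node recursively, starting the walk at currentNode'''
    if self.head is None:
      # Empty list: the new node becomes the head
      self.head = newNode
      self.count = self.count + 1
    elif currentNode.nextNode is None:
      currentNode.nextNode = newNode
      self.count = self.count + 1
    else:
      self.appendNodeRecursive(newNode, currentNode.nextNode)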
  
  def deleteNode(self):
    '''Delete the last node in the linked list'''
    if self.head is None:
      return
    if self.head.nextNode is None:
      # Single-node list: removing the last node empties the list
      self.head = None
    else:
      previousNode = self.head
      currentNode = self.head.nextNode
      # Advance until currentNode is the last node
      while currentNode.nextNode is not None:
        previousNode = currentNode
        currentNode = currentNode.nextNode
      # Unlink the last node
      previousNode.nextNode = None
    self.count = self.count - 1
 
  def getCount(self):
    index = 0
    currentNode = self.head
    while currentNode is not None:
      index = index + 1
      currentNode = currentNode.nextNode
    
    return index
  
  def __iter__(self):
    '''Yield each node from head to tail'''
    # A generator keeps the position local, so nested loops over the
    # same list do not interfere with each other
    currentNode = self.head
    while currentNode is not None:
      yield currentNode
      currentNode = currentNode.nextNode
    
#llist = LinkedList(Node(1))
llist = LinkedList()

for n in range(3,13,2):
  llist.appendNode(Node(n))

llist.appendNodeRecursive(Node(20),llist.head)

print(llist.getCount())
print(llist.count)
for a in llist:
  print("Node: {}, Data: {}, Next: {}".format(a, a.data, a.nextNode))
  

  TRUE

  FALSE

 19. A hash table is a data structure that implements an associative array. A key is mapped to certain values through the use of a ______________________.
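class HashTable:
  # Assumed wrapper: the excerpt shows only the methods, so the class name,
  # constructor, and default size are illustrative.
  def __init__(self, size=11):
    self.slots = [None] * size  # keys
    self.data = [None] * size   # values, stored at the same index as the key

  def put(self,key,data):
    # Assumes integer keys and a table that is never completely full
    hashvalue = self.hashfunction(key,len(self.slots))

    if self.slots[hashvalue] is None:
      self.slots[hashvalue] = key
      self.data[hashvalue] = data
    else:
      if self.slots[hashvalue] == key:
        self.data[hashvalue] = data  #replace
      else:
        # Collision: probe linearly for the key or the next empty slot
        nextslot = self.rehash(hashvalue,len(self.slots))
        while self.slots[nextslot] is not None and \
                        self.slots[nextslot] != key:
          nextslot = self.rehash(nextslot,len(self.slots))

        if self.slots[nextslot] is None:
          self.slots[nextslot]=key
          self.data[nextslot]=data
        else:
          self.data[nextslot] = data #replace

  def hashfunction(self,key,size):
    return key%size

  def rehash(self,oldhash,size):
    return (oldhash+1)%size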

  A* Algorithm

  hashing mesh algorithm

  hash function

  Binary tree
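
The excerpt in this question covers insertion only. Lookup retraces the same probe sequence, as in the self-contained sketch below. The class name ProbingTable and its helper _probe are hypothetical, and the sketch assumes integer keys and linear probing with a step of one slot.

```python
class ProbingTable:
    """Minimal open-addressing hash table (hypothetical name, integer keys)."""

    def __init__(self, size=11):
        self.size = size
        self.slots = [None] * size  # keys
        self.data = [None] * size   # values at the matching index

    def _probe(self, key):
        """Return the slot holding key, or the first empty slot on its probe path."""
        start = key % self.size
        index = start
        while self.slots[index] is not None and self.slots[index] != key:
            index = (index + 1) % self.size
            if index == start:
                raise RuntimeError("table is full")
        return index

    def put(self, key, value):
        """Insert the key, or overwrite its value if it is already present."""
        index = self._probe(key)
        self.slots[index] = key
        self.data[index] = value

    def get(self, key):
        """Return the stored value, or None if the key is absent."""
        index = self._probe(key)
        return self.data[index] if self.slots[index] == key else None


table = ProbingTable()
table.put(77, "bird")  # 77 % 11 == 0
table.put(44, "goat")  # 44 % 11 == 0 as well, so it moves on to slot 1
table.put(54, "cat")   # 54 % 11 == 10
print(table.get(44))   # goat
print(table.get(55))   # None
```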

 20. _____________________can be used to categorize people into different tiers of intelligence based on IQ scores.

  classification linked lists

  decision trees

  hashing algorithms

  pandas databases
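
A tree that sorts scores into tiers can be written out by hand as nested threshold tests. The sketch below is purely illustrative: the thresholds and tier labels are arbitrary, and a real decision tree would learn its split points from labeled data rather than have them fixed in advance.

```python
# Each internal node is (threshold, subtree_if_below, subtree_if_at_or_above);
# each leaf is a tier label. Thresholds and labels are illustrative only.
TREE = (100, (85, "tier 1", "tier 2"), (115, "tier 3", "tier 4"))

def classify(score, node=TREE):
    """Walk from the root to a leaf, choosing a branch at each threshold."""
    while not isinstance(node, str):
        threshold, below, at_or_above = node
        node = below if score < threshold else at_or_above
    return node

for score in (80, 90, 100, 130):
    print(score, classify(score))
```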