
Machine Learning, concepts and Python - general

 1. Machine learning is

  Machines taking over education to provide learning to students

  The autonomous acquisition of knowledge through the use of computer programs.

  The selective acquisition of knowledge through the use of manual programs

  The art of learning how to do anything with the help of a machine

 2. The Netflix Prize was a famed competition where Netflix offered $1,000,000 for a better _________________________. The winning team, BellKor, achieved a 10% improvement using an ensemble of different methods.

  N vs NP algorithm solution

  search complexity algorithm

  reduction synthesis algorithm

  collaborative filtering algorithm.
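
As a rough illustration of the idea behind collaborative filtering, the sketch below predicts a missing rating from the ratings of similar users. The users, movies, and ratings are made up for the example, and the similarity is a cosine computed only over the movies two users have both rated; this is one simple variant, not the method used by the winning team.

```python
from math import sqrt

# Hypothetical ratings: user -> {movie: rating on a 1-5 scale}
ratings = {
    "ann": {"Alien": 5, "Brazil": 4, "Casablanca": 1},
    "bob": {"Alien": 4, "Brazil": 5, "Dune": 4},
    "cat": {"Alien": 1, "Casablanca": 5, "Dune": 2},
}

def cosine(a, b):
    """Cosine similarity over the movies both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = sqrt(sum(a[m] ** 2 for m in shared))
    norm_b = sqrt(sum(b[m] ** 2 for m in shared))
    return dot / (norm_a * norm_b)

def predict(user, movie):
    """Similarity-weighted average of other users' ratings for the movie."""
    numerator = denominator = 0.0
    for other, their_ratings in ratings.items():
        if other == user or movie not in their_ratings:
            continue
        weight = cosine(ratings[user], their_ratings)
        numerator += weight * their_ratings[movie]
        denominator += weight
    return numerator / denominator if denominator else None

prediction = predict("ann", "Dune")
print(round(prediction, 2))
```

Because ann's ratings resemble bob's more than cat's, the prediction lands closer to bob's rating of Dune than to cat's.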

 3. What’s the trade-off between bias and variance? Read the excerpt below - T/F?
Bias is error due to erroneous or overly simplistic assumptions in the learning algorithm you’re using. Variance is error due to too much complexity in the learning algorithm you’re using. This leads to the algorithm being highly sensitive to high degrees of variation in your training data, which can lead your model to overfit the data.
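
The excerpt can be made concrete with two deliberately extreme models, sketched below under made-up assumptions: the data come from a hypothetical linear relationship plus Gaussian noise, `mean_model` ignores its input entirely (high bias), and `nearest_neighbor_model` memorizes every training point (high variance).

```python
import random

def true_fn(x):
    return 2.0 * x  # hypothetical ground-truth relationship

def make_data(n, rng):
    """Noisy samples of true_fn on [0, 1)."""
    xs = [rng.random() for _ in range(n)]
    return [(x, true_fn(x) + rng.gauss(0, 0.3)) for x in xs]

def mean_model(train):
    """High bias: ignores x and always predicts the mean training target."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def nearest_neighbor_model(train):
    """High variance: returns the target of the closest training point."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]

def mse(model, data):
    """Mean squared error of a model on (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

rng = random.Random(0)
train, test = make_data(30, rng), make_data(200, rng)
simple = mean_model(train)
memorizer = nearest_neighbor_model(train)

print("mean model train/test MSE:", mse(simple, train), mse(simple, test))
print("1-NN model train/test MSE:", mse(memorizer, train), mse(memorizer, test))
```

The memorizing model reproduces its training data exactly, so its training error is zero while its test error is not; that gap is the overfitting the excerpt describes. The mean model cannot fit even the training data, which is the signature of high bias.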



 4. What is the difference between a Generative and Discriminative Algorithm?

  A generative model will learn the distinction between different categories of data while a discriminative model will simply learn categories of data

  A generative model accepts all data for processing whereas a discriminative model is selective

  A generative model will learn categories of data while a discriminative model will simply learn the distinction between different categories of data

  A generative model is 'general' in that it does not ever look at data specifically. A discriminative model will be specific in its harvesting of data

 5. How would you evaluate machine learning algorithms? Read the following excerpt and see if you can fill in the blanks.
You would first split the dataset into training and test sets, or perhaps use cross-validation techniques to further segment the dataset into composite sets of training and test sets within the data. You should then implement a choice selection of performance metrics: here is a fairly comprehensive list. You could use measures such as the ______________

  F1 score, the accuracy, and the confusion matrix.

  generative model index, F1 matrix and the confusion score

  model performance, disability matrix and the F12 score

  discriminative accuracy, F1 matrix and the model statistics
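
For a binary classifier, the metrics named in this question can be computed by hand. The sketch below uses made-up labels and predictions, and the helper names are illustrative rather than taken from any library.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that were correct."""
    return (tp + tn) / (tp + fp + fn + tn)

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall (0.0 if both are undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical held-out labels and model predictions
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print("confusion matrix:", [[tp, fn], [fp, tn]])
print("accuracy:", accuracy(tp, fp, fn, tn))
print("F1:", f1_score(tp, fp, fn))
```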

 6. What is the difference between supervised and unsupervised machine learning?
Supervised learning requires labeled training data. For example, in order to do classification (a supervised learning task), you’ll need to first label the data you’ll use to train the model to classify data into your labeled groups. Unsupervised learning, in contrast, does not require labeling data explicitly.
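
A minimal sketch of the contrast, using made-up one-dimensional data: the supervised part learns one centroid per provided label, while the unsupervised part (a two-cluster k-means, chosen here only for illustration) groups the same values without ever seeing a label.

```python
# Supervised: every training value comes with a label.
labeled = [(1.0, "low"), (1.2, "low"), (0.8, "low"),
           (5.0, "high"), (5.3, "high"), (4.7, "high")]

def fit_centroids(samples):
    """Learn the mean value of each labeled group."""
    sums, counts = {}, {}
    for x, label in samples:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(centroids, x):
    """Assign x to the label whose centroid is closest."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))

centroids = fit_centroids(labeled)
print(predict(centroids, 1.1))

# Unsupervised: only the raw values, no labels.
def two_means(values, iterations=10):
    """Split values into two groups by repeatedly re-centering."""
    c0, c1 = min(values), max(values)
    for _ in range(iterations):
        g0 = [v for v in values if abs(v - c0) <= abs(v - c1)]
        g1 = [v for v in values if abs(v - c0) > abs(v - c1)]
        c0, c1 = sum(g0) / len(g0), sum(g1) / len(g1)
    return g0, g1

values = [x for x, _ in labeled]
group_a, group_b = two_means(values)
print(group_a, group_b)
```

The clustering recovers the same two groups, but it cannot name them "low" and "high"; those names exist only in the labeled data.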



 7. Which of the following is implemented on DataFrame to compute the correlation between like-labeled Series contained in different DataFrame objects?




  None of the above

 8. Which of the following takes a dict of dicts or a dict of array-like sequences and returns a DataFrame?



  All of the Mentioned


 9. Which of the following libraries is similar to Pandas?





 10. Which of the following makes use of pandas and returns data in a Series or DataFrame?





 11. Which of the following methods is used for transforming a SparseSeries indexed by a MultiIndex to a scipy.sparse.coo_matrix?


  None of the Mentioned



 12. Which of the following can be used to create sub-samples using a maximum dissimilarity approach?

  All of the Mentioned




 13. Which of the following models includes a backwards elimination feature selection routine?





 14. Which of the following is a categorical outcome?




  All of the above

 15. Which of the following functions provides unsupervised prediction?

  None of the Mentioned




 16. How would you evaluate a logistic regression model?
You have to demonstrate an understanding of what the typical goals of a logistic regression are (___________________________________, etc.) and bring up a few examples and use cases.

  logarithmic time complexity, possibility

  accuracy, maxDissim

  classification, prediction

  probability, prevention of errors

 17. How would you handle missing or corrupted data in a dataset?
You could find missing/corrupted data in a dataset and either drop those rows or columns, or decide to replace them with another value.

In Pandas, there are two very useful methods (__________________ and ______________________) that will help you find columns of data with missing or corrupted data and drop those values. If you want to fill the invalid values with a placeholder value (for example, 0), you could use the fillna() method.

  isNot() and cullA()

  isPresent() and valueA()

  isBlank() and findA()

  isnull() and dropna()
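
A short sketch of the workflow described in this question, assuming pandas and NumPy are installed. The DataFrame contents and the placeholder values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical dataset with one missing value in each column
df = pd.DataFrame({
    "age": [25, np.nan, 31],
    "city": ["Oslo", "Lima", None],
})

# Find missing values: count them per column
missing_per_column = df.isnull().sum()
print(missing_per_column)

# Option 1: drop every row that contains a missing value
dropped = df.dropna()

# Option 2: replace missing values with a per-column placeholder
filled = df.fillna({"age": 0, "city": "unknown"})
print(filled)
```

Dropping is the simpler choice but discards whole rows; filling keeps the rows at the cost of introducing placeholder values that later analysis has to account for.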

 18. True or False: An array is a series of objects with pointers that direct how to process them sequentially. A linked list is an ordered collection of objects.
class Node:
  def __init__(self, data, nextNode=None):
    self.data = data
    self.nextNode = nextNode
class LinkedList:
  def __init__(self, head = None):
    self.head = head
    self.count = 0
    if self.head is not None:
      self.count = 1
  def appendNode(self, node):
    '''Append a node to the end of the linked list'''
    if self.head is None:
      self.head = node
    else:
      # walk to the last node, then link the new node after it
      currentNode = self.head
      while currentNode.nextNode is not None:
        currentNode = currentNode.nextNode
      currentNode.nextNode = node
    self.count = self.count + 1
  def appendNodeRecursive(self, newNode, currentNode):
    '''Append a node by recursing from currentNode to the end of the list'''
    if currentNode.nextNode is None:
      currentNode.nextNode = newNode
      self.count = self.count + 1
    else:
      self.appendNodeRecursive(newNode, currentNode.nextNode)
  def deleteNode(self):
    '''Delete the last node in the linked list'''
    if self.head is None:
      return  # empty list: nothing to delete
    if self.head.nextNode is None:
      self.head = None  # the list had a single node
    else:
      currentNode = self.head
      previousNode = currentNode
      #set currentNode to the last node
      while currentNode.nextNode is not None:
        previousNode = currentNode
        currentNode = currentNode.nextNode
      previousNode.nextNode = None  # unlink the last node
    self.count = self.count - 1
  def getCount(self):
    index = 0
    currentNode = self.head
    while currentNode is not None:
      index = index + 1
      currentNode = currentNode.nextNode
    return index
  def __iter__(self):
    self.currentNode = self.head
    return self
  def __next__(self):
    currentNode = self.currentNode
    if currentNode is None:
      raise StopIteration
    # advance the cursor before returning the current node
    self.currentNode = currentNode.nextNode
    return currentNode
#llist = LinkedList(Node(1))
llist = LinkedList()

for n in range(3,13,2):
  llist.appendNode(Node(n))

for a in llist:
  print("Node: {}, Data: {}, Next: {}".format(a, a.data, a.nextNode))



 19. A hash table is a data structure that produces an associative array. A key is mapped to certain values through the use of a ______________________
def put(self,key,data):
  # self.slots holds the keys; self.data (assumed here) is a parallel list of values
  hashvalue = self.hashfunction(key,len(self.slots))

  if self.slots[hashvalue] is None:
    self.slots[hashvalue] = key
    self.data[hashvalue] = data
  else:
    if self.slots[hashvalue] == key:
      self.data[hashvalue] = data  #replace
    else:
      # collision: probe the following slots until a free slot or the key is found
      nextslot = self.rehash(hashvalue,len(self.slots))
      while self.slots[nextslot] is not None and \
                      self.slots[nextslot] != key:
        nextslot = self.rehash(nextslot,len(self.slots))

      if self.slots[nextslot] is None:
        self.slots[nextslot] = key
        self.data[nextslot] = data
      else:
        self.data[nextslot] = data #replace

def hashfunction(self,key,size):
  return key%size

def rehash(self,oldhash,size):
  return (oldhash+1)%size

  hashing mesh algorithm

  hash function

  Binary tree

  A* Algorithm
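
To see how a key ends up in a slot, here is a self-contained sketch of an open-addressing table with integer keys. The class name `HashTable`, the helper `_find_slot`, the `get` method, and the table size of 11 are assumptions made for this example, not part of the question's code.

```python
class HashTable:
    def __init__(self, size=11):
        self.size = size
        self.slots = [None] * size  # keys
        self.data = [None] * size   # values, parallel to slots

    def hashfunction(self, key):
        """Map an integer key to a slot index."""
        return key % self.size

    def _find_slot(self, key):
        """Linear probing: return the slot holding key, or the first free slot."""
        slot = self.hashfunction(key)
        for _ in range(self.size):
            if self.slots[slot] is None or self.slots[slot] == key:
                return slot
            slot = (slot + 1) % self.size
        raise RuntimeError("hash table is full")

    def put(self, key, value):
        """Insert the key, or replace its value if it is already present."""
        slot = self._find_slot(key)
        self.slots[slot] = key
        self.data[slot] = value

    def get(self, key):
        """Return the stored value, or None if the key is absent."""
        slot = self._find_slot(key)
        return self.data[slot] if self.slots[slot] == key else None

table = HashTable()
table.put(54, "cat")
table.put(65, "dog")   # 54 % 11 == 65 % 11 == 10, so this key collides
table.put(54, "bird")  # same key again: the value is replaced

print(table.get(54), table.get(65), table.get(20))
```

The second key hashes to an occupied slot, so probing wraps around and places it in slot 0; lookups follow the same probe sequence to find it again.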

 20. _____________________ can be used to categorize people into different tiers of intelligence based on IQ scores.

  classification linked lists

  hashing algorithms

  decision trees

  pandas databases
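
A tier assignment like the one in this question amounts to a sequence of threshold tests on a score. The sketch below hand-builds such a tree as nested dictionaries; the thresholds and tier names are placeholders invented for the example, and a real decision tree would learn its split points from labeled data.

```python
# Hypothetical tree: each internal node tests score < threshold,
# each leaf is a tier name.
tree = {
    "threshold": 100,
    "below": {"threshold": 85, "below": "tier 1", "at_or_above": "tier 2"},
    "at_or_above": {"threshold": 115, "below": "tier 3", "at_or_above": "tier 4"},
}

def classify(node, score):
    """Follow threshold tests from the root until a leaf (a string) is reached."""
    while isinstance(node, dict):
        branch = "below" if score < node["threshold"] else "at_or_above"
        node = node[branch]
    return node

for score in (80, 90, 100, 130):
    print(score, classify(tree, score))
```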