Preview

10 - Hash Table

 1. Hashing involves applying a hashing algorithm to a data item, known as the hashing key, to create a ____________
Note: The following presentation may be useful and is available from www.teachingcomputing.com

  hash value

  dictionary set

  second hashing algorithm

  list

 2. Hashing algorithms take a large range of values (such as all possible strings or all possible files) and map them onto a _______________________ (such as a 128 bit number).

  similar decimal value (usually one more or one less)

  set of hexadecimal values

  larger set of values

  smaller set of values

 3. Hashing is often compared to providing the 'best of both worlds', in this case, the best of the features of both:

  graphs and binary trees

  graphs and tables

  ordinary lists and radix trees

  linked lists and arrays

 4. Hashing has two main applications. Hashed values can be used to speed data retrieval, and can be used to ________________________

  create IP addresses and DNS tables

   check the validity of data

  create graphs and insert nodes

  decrypt any password in the world

 5. In the following example, a hashing value is produced using MOD in the hashing algorithm. What is 60 MOD 6?
hashKey MOD 6


Hash Key	Hashing Algorithm	Hashing Value
======================================================
12345		12345 MOD 6		3
67564		67564 MOD 6		4
34237		34237 MOD 6		1
23423		23423 MOD 6		5
00332		00332 MOD 6		2

  3

  0

  4

  2

 6. The following excerpt summarises some rules for creating a good hash function. Which rule, if any, is incorrect?
Rules for a good hash function
==================================
1. Use only the data being hashed

2. Use all of the data being hashed

3. Be deterministic (predictable, not random)

4. Uniformly distribute data (a big range of hash codes is good)

5. Generate very similar hash codes for similar data.

  They are all correct

  Only rule 3 is incorrect - it is better to use random data for better results

  All of the rules are incorrect and do not apply in any way to creating a hash function

  Only rule 5 is incorrect - you should generate different hash codes for similar data

 7. The following shows the creation of a hash table to store scores. What number would go in the missing blanks?
usingahashtablequestion1.png

  4

  2

  3

  1

 8. Analyse the image below and select a statement from the options below that best describe what is happening.
usingahashtablequestion2.png

  Both names generate the same hash value and the second one simply goes into the second slot.

  There's a problem - both names generate a hash value of 1 causing what is termed a 'collision'

  The term for this is 'competition' as both hash values compete for one cell position.

  Mr Goose's hash value would be changed to 1+1 to give 2, and this could then be put in the next slot.

 9. What is one method for resolving the problem caused by collision?

  direct addressing

  semi-hash addressing

  hash algorithmic pointing

  open addressing

 10. Resolving collision can lead to ______________, and ____________ can be resolved by using an 'overflow'.

  clustering

  cluttering

  coping

  cloitering

 11. A good hash function would ideally have thousands, or even millions, of duplicate values.

  FALSE

  TRUE

 12. Another resolution to collision is _________. This allows each slot to hold a reference to a collection (or chain) of items.

  carting

  cpu-ing

  chaining

  clipping

 13. Ideally, a hashing algorithm should:

  provide a smaller output than input (to allow for faster comparing of hashes than the original data)

  either encrypt or delete all outputs on generation, to ensure the maximum security

  provide a bigger output than input (to allow for more careful comparisons)

  provide a completely identical output to input (to ensure consistency)

 14. These are tables which can be searched for an item in O(1) time using a hash function to form an address from the key

  hash table

  hash function

  secondary clustering

  collision

 15. This is a function which, when applied to the key, produces an integer which can be used as an address in a hash table.

  hash table

  collision

  secondary clustering

  hash function

 16. These are collision sequences generated by addresses calculated with quadratic probing.
Note: Quadratic probing is an open addressing scheme in computer programming 
for resolving collisions in hash tables—whenan incoming data's hash value 
indicates it should be stored in an already-occupied slot or bucket

  hash table

  hash function

  secondary clustering

  collision

 17. This is a function which, when applied to all the members of the set of items to be stored in a hash table, produces a unique set of integers within some suitable range.

  primary hash function

  non-collision function

  unique collision function

  perfect hash function

 18. This is the process of searching for another slot after a collision.
For example: Rehash(pos) = (pos+1)% size_of_table

  poshashing

  positional hashing

  post-collision unhashing

  rehashing

 19. This is a simple re-hashing scheme in which the next slot in the table is checked on a collision.

  linear probing

  rehashing

  positional hashing

  quadratic probing

 20. What is the following excerpt describing?
The tendency for many adjacent slots to be filled when linear probing is used

  collising

  clustering

  chaining

  rehashing

 21. The following excerpt describes how a hash value is stored in a table. Fill in the blanks
How a hash value is stored
===========================

1. Hashing algorithm is applied to the _____

2. The result is  _____ of where to store the value
in an array

3. It is possible that the hashing algorithm might
generate the same _____ for two different values.

4. If the above occurs it is called a collision.

5. A collision handling method like open addressing or ________
is used. 

  key / an index / key / chaining

  key / an index / collision / secondary hashing

  index / a key / index / chaining

  hash value / a key /index / closed addressing

 22. In re-hashing it is often necessary to create a larger hash table.

  TRUE

  FALSE

 23. In re-hashing, each value in the existing table is inserted into the new table - in a position _______________________________

  that starts at 0 and goes up by +1 for each value

  determined by a new hashing algorithm

  that starts at 0 and ends at 10

  randomly assigned to the value

 24. A dictionary (e.g. in Python) is a ____________________________________________

  typical index structure which is identical to the structure of a variable

  type of linked list that has pointers and dynamic cells to hold data

  type of online book that stores paragraphs of information along with index numbers, keys, values and pairs

  data structure that contains a collection of key-value pairs where the value is accessed via the key

 25. In the following example, showing a Python dictionary at work, what would the output be?
>>> D['a'] = 1
>>> D['b'] = 2
>>> D['c'] = 3
>>> D
{'a': 1, 'c': 3, 'b': 2}


>>> D['b'] 

  2

  b

  3

  1