CP164 Study Guide - Final Guide: Disk Array, Binary Search Algorithm

42 views2 pages

maroonmacaw543

14 Jun 2018

School

Department

Course

Professor

For unlimited access to Study Guides, a Grade+ subscription is required.

Hashing

We've seen various data structures for storing and retrieving data. Linear lists

provide O(1) insertion and O(n)searching, binary search trees provide O(log n) (average

case) insertion and retrieval. Can we do better in terms of time complexity for insertion

and retrieval? Yes, it turns out that we can, using hashing. There are a number of different

ways of using hashing, but we will look only at the HashSet data structure.

Hashing is implemented by a hash function, a function that, given a key value, returns an

integer value. We can then use this integer value to decide where in a storage array the

data should be stored. Ideally each key value generates a unique index value, thus placing

each piece of data in a separate and unique location in the data array. In practise we have to

be able to deal with situations where different key values generate the same index value,

which we will look at later.

Using Movie data as an example, we know that Movie objects use their title and year as a

key. We therefore must have method of turning those two pieces of data into an integer

number. To do this we are going to extend the Movie class by implementing

a __hash__ method. (Python then allows you to call this method with the keyword hash.)

The __hash__ method must turn the movie title and year into an integer. The following

method does just that:

def __hash__(self):

"""

-------------------------------------------------------

Generates a hash value from a movie title and year.

Use: h = hash(movie)

-------------------------------------------------------

Postconditions:

returns

value - the total of the characters in the title string

multiplied by the year (int > 0)

-------------------------------------------------------

"""

value = 0

for c in self.title:

value = value + ord(c)

value *= self.year

return value

(The ord function returns the integer code version of a character.)

Applied to some of our Movie data, this hash function produces values such as:

find more resources at oneclass.com

Unlock document

This preview shows half of the first page of the document.
Unlock all 2 pages and 3 million more documents.

Already have an account? Log in

Document Summary

We"ve seen various data structures for storing and retrieving data. Linear lists provide o(1) insertion and o(n)searching, binary search trees provide o(log n) (average case) insertion and retrieval. Yes, it turns out that we can, using hashing. There are a number of different ways of using hashing, but we will look only at the hashset data structure. Hashing is implemented by a hash function, a function that, given a key value, returns an integer value. We can then use this integer value to decide where in a storage array the data should be stored. Ideally each key value generates a unique index value, thus placing each piece of data in a separate and unique location in the data array. In practise we have to be able to deal with situations where different key values generate the same index value, which we will look at later.

CP164 Study Guide - Final Guide: Disk Array, Binary Search Algorithm

Document Summary

Get access

Related Documents

CP164 Study Guide - Final Guide: Perfect Hash Function, Linear Search, Disk Array

CP164 Study Guide - Final Guide: Cemetery Man, Init

CP164 Study Guide - Final Guide: Cemetery Man, Init