Monday, 28 September 2015

Hash, Hashing, Hashtable

Hash/hash value can be used to uniquely identify secret information. This requires that the hash function is collision resistant, which means that it is very hard to find data that generate the same hash value.

Hash function is any function that can be used to map data of arbitrary size to data of fixed size. The values returned by a hash function are called hash values/hash codes/ hashes.
Hash functions accelerate table or database lookup by detecting duplicated records in a large file.


Majorly used hash function: Division
The hash function is:  fD (x) = x % M   This gives bucket address that range from 0 to M-1, where M = the table size.

Hashing is the process of mapping large amount of data item to a smaller table with the help of a hashing function. The essence of hashing is to facilitate the next level searching method when compared with the linear or binary search. The advantage of this searching method is its efficiency to hand vast amount of data items in a given collection.

Loading density
The identifier density of a hash table is the ratio n/T, where n is the number of identifiers in the table.  The loading density or loading factor of a hash table is a=n /(sb).

When hash function generates same hash for two different values is known as collision. Because when be store the data in bucket than there will be more than one value in bucket.

Linear probing is a scheme in computer programming for resolving hash collisions of values of hash functions by sequentially searching the hash table for a free location.


stepSize = probes.
newLocation = (startingValue + stepSize) % arraySize

Linear probing function (H(x, i)):

H(x, i) = (H(x) + i) (mod n), Here ordinary hash function H(x).

Here H(x) is the starting value, n the size of the hash table, and the stepsize is i in this case.

Often, the step size is one; that is, the array cells that are probed are consecutive in the hash table. Double hashing is a variant of the same method in which the step size is itself computed by a hash function.

Chained Hashing/Chaining
Instead of storing the data item directly in the hashtable, each hashtable entry contains a reference to a datastructure, e.g. a linked list (chain), or a balanced search tree. We compute the hash value  v of the data item d and then store d in the data structure referred to from the hashtable entry  at index  v.

Chaining using linked list:

Hash table (hash map) is a data structure used to implement an associative array, a structure that can map keys to values. A hash table uses a hash function to compute an index into an array of buckets or slots, from which the desired value can be found.

Ideally, the hash function will assign each key to a unique bucket, but it is possible that two keys will generate an identical hash causing both keys to point to the same bucket. Instead, most hash table designs assume that hash collisions—different keys that are assigned by the hash function to the same bucket—will occur and must be accommodated in some way.


Worst case

No comments:

Post a Comment

Related Posts Plugin for WordPress, Blogger...