Comparison of search - sort - and hash functions.

Empirical Comparison of Array Search, Sort, and Hash Functions

Task A: Search Algorithm Analysis

In this task you will be comparing the performance of your selected array search algorithm (ternary, exponential or Fibonaccian) to the search algorithms provided as examples using both static analysis and empirical experimentation. Closely follow the structure or the test harness and search functions provided as examples (Jump Search example in Appendix A) to implement your chosen algorithm; note that you are only required to examine the scenario of successful searches. The examples are written in MATLAB, but you are free to use an alternative language if you wish (e.g., Python);

however, you will then need to convert the examples I have provided to that language to enable you to plot graphs that compare their empirical performance to that of your chosen algorithm. If you use Python, you could use matplotlib to plot graphs; if you use a language like C# or Java you may need to store your results in a text file and plot them using Excel or similar. Obviously, the easiest approach is to use MATLAB and simply modify the examples provided.

1. Empirical Analysis Test Function (Code):

2. Empirical Analysis Results Line Graph (Figure):

3. Evaluation of Empirical Test Findings (Written; Max 100 Words):

References:

In this task you will empirically compare three different gap sequence functions that can be used with the Shell sort algorithm, and compare to Shell (1959). Appendix B contains an example function for generating Shell’s original gap sequence, an implementation of Shell sort that uses Shell’s original gap sequence function, and a table of alternative sequences, including algebraic processes that can generate them, the first few terms of each sequence, and the worst case time complexities that are thought to be produced when these gap sequences are used. As noted in class, big-O notation, and especially worst case analyses, do not necessarily capture how an algorithm will function in the real world,

so this task requires that you create a simulation to test the following scenarios: i. reversed arrays (the putative worst case scenario); ii. random arrays of different lengths containing unique values. Note that, unlike Task A, your test harness cannot test all possibilities (i.e., arrays unsorted in all possible ways) because there are too many (). You will discuss the results of your experimentation and compare the relative performance of the gap sequences you selected.

Do not shy away from choosing gap sequences that share worst case time complexities, as there could be differences in performance in practice that would be interesting to uncover. For sorting there are two unit of work that will test us how much work the algorithm has done: comparisons and element moves. You will need to add to the Shell sort function provided to ensure that this information is recorded. You may use any language you wish, but clearly MATLAB and Python with matplotlib provide in-built data science functions for graph plotting and statistics, so will be easier.

1. Functions for Generating Selected Gap Sequences (Code):

2. Modified Shell Sort Function for Recording Metrics of Interest - Comparisons and Element Movements (Code):

3. Test Harness for Generating Test Data, Running Shell Sort Function, and Aggregating Metrics of Interest (Code):

4. Empirical Analysis Results Line Graph (Figure, Statistics):

5. Evaluation of Empirical Test Findings (Written; Max 100 Words):

References:

In this task, you are to perform an empirical comparison of the efficacy of three different hash functions (selecting from midsquare, modulus, folding, truncation, and Fibonacci) to assess their usefulness for transforming integer keys to array indices in a hash table with the objective of minimising collisions and clustering.

This task is a little more open-ended than the previous two, however you should build a simulation that involves the generation of keys (random and sequential) and measure the degree of clustering (i.e., how evenly spread the keys are in the table) under different load factors, how often collisions occur, and also evaluate the avalanche property and other desirable features of a hash function.

such as having a surjective relationship with the target array and being fast to compute. For each, do some background reading to understand how they may be applied in practice (e.g., folding may, in practice, involve grouping digits before summing), and state any assumptions you make.

1. Brief and Precise Description and Hash Tables and the Role of Hash Functions (Written, Max 100 Words):

2. Description and Static Demonstration of Hash Functions Selected (Written, Equations, Max 50 Words):

3. Implementation of Chosen Hash Functions as Function accepting a Key and Returning an Index (Code):

4. Test Harness for Evaluating Hash Functions with Random and Sequential Keys (Code):

5. Clustering Visualisations and Summary Statistics (Figure, Statistics):

6. Evaluation of Relative Efficacy of Hash Functions Compared (Written, Max 100 Words):

References:

Get instant help from 5000+ experts for