CSP2348 Assignment 2, Semester 1, 2

CSP2348 Data Structures

Assignment 2: Algorithm Implementation and Analysis

– Theme: Algorithm design, analysis & implementation

Objectives from the Unit Outline

· Apply mathematical theory to algorithm analysis.

· Design algorithms using various data structures.

· Analyse complexity and performance of their associated algorisms.

General Information:

· This is an individual assignment. Required skills include algorithm design, complexity analysis and implementation using Python programming language.

· An array is a typical data structure often used in many applications. An array is a random-access structure. You can use unique indexes to access array components in constant time. In term of data manipulation, this property makes array the most effective data structure.

· Algorithm complexity analysis is one of the most important skills covered in this unit. It can be done in either or both of theoretical algorithm analysis and experimental studies. Generally, theoretical analysis is performed by applying asymptotic theory/methods (such as in O-notation, etc.) to the algorithm’s complexity function that reflects the number of basic operations the algorithm must execute on the input size, so the growth rate of the function can be estimated. On the other hand, the experimental study requires the implementation of the algorithms to reveal the growth trend of the complexity function.

· This assignment focuses on array searching and sorting algorithms. You are requested to design, implement, analyse some algorithms, and do an experimental study on some array sorting algorithms. There are three questions. Question 1 is a task to design and implement a sorting algorithm and analyse it using O-notation. Question 2 is an experimental study for a total of 6 array- based sorting algorithms. The last question requires you to design and implement a tiny application using array data structures and employing specific array sorting/searching strategies into the application development.

· You will need to submit one assignment file (i.e., a report, in Word or PDF format) plus three (or more) code files.

Due: (See Due Date in Assignments section on Canvas)

Marks: 100 marks (which will be converted to 30% of unit mark)

Tasks

Q1: Array sorting algorithm design, implementation, and analysis (20 marks)

Assume that an array, A[0…n-1], contains n distinct integers. One way to sort A[ ] is based on the following observation:

For each element of A, say A[i], if there are k elements less than A[i] in A[ ], then after the array is sorted, A[i] will eventually be stored in A[k].

Index of A[…]

A[…] (before sorting)

–21

–2

A[…] (after sorting)

–21

–2

The above observation is the basis of the Counting Sort algorithm:

For each element of array, A, count the total number of elements of A that are less than it therefore to determine the final index/position at which the element is to be stored in the sorted version of A.

Task (1) – Sort the array A, assuming all elements in A are distinct.

(a) Based on the above idea/observation, design/write an algorithm CountingSort(A[ ], n) to sort the array

A. You can assume that A contains distinct integers only.

(Hint: If necessary, create an auxiliary array, e.g., C[0…n-1], such that C[i] stores the counts of individual elements of A that are less than

A[i] (i = 0, 1, 2, …, n-1). For example, the auxiliary array C[ ] for the example array A[ ] may look like:

Index of A[…]

A[…] (before
sorting)

–21

–2

C[…] (the counts of A[i])

You can then re-arrange elements of A[ ] based on the values in C[ ] therefore to sort the array A).

(b) Using O-notation to analyze the time complexity of your algorithm.

(d) Convert your algorithm to a Python program (e.g., you may name the code CountingSort1.py) , and then test it use an array of some 20 elements (note that the array used for testing must contain distinct elements only).

Task (2) – Sort the array A, allowing duplicated elements in A

Note that the above algorithm developed in Task (1) can sort arrays that contain distinct elements only. It may not work if the array contains duplicated elements.

(e) Modify your algorithm completed in Task (1) (a) so that it works for general integer arrays (i.e., the array elements may or may not be distinct).

Index of B[…]

B[…] (before sorting)

–21

B[…] (after sorting)

–21

(hint: Same as in the case of Task (1), create an auxiliary array, e.g., C[0…n-1], to store the counts of individual elements of the array. For example, the auxiliary array for the example array B[ ] looks like this:

Index of B[…]

B[…] (before sorting)

–21

C[…] (counts)

Note that for some i and j, if B[i] = B[j] then C[i] = C[j]. That is, the counts for all duplicated elements of B[ ] are equal. For example, there are three occurrences of 3 in B[…] (i.e., B[2], B[5], B[7]). Therefore, the counts for these elements will be the same, i.e., C[2]=C[5]=C[7]. – You need to develop a way so that those duplicated elements in B[ ] are stored consecutively in the sorted version of B[ ]. In this way, B[ ] can be sorted based on the values in C[…]).

(f) Re-do the sub-tasks of (b), (c) and (d) as that in Task (1) for this case.

(note that the array used for testing may contain duplicated element/s. You may name the code

CountingSort2.py).

Q2: Algorithm analysis by experimental studies (30 marks)

Produce Python code/s to implement a system that can be used to aid an experimental study on array- based sorting algorithms. There are two subtasks:

(a) Develop a system to test various cases of sorting algorithms. The system starts with a main Menu with three menu options:

This allows the user to choose to test individual or multiple sorting algorithms.

(i). Test individual sorting algorithms: This option leads to a sub-menu, like:

from which the user can choose to test any one specific sorting algorithm.

(Note: The counting sort is the algorithm that you have completed in Q1 Task (2), i.e., CountingSort2.py).

– After an option is chosen, the system prompts the user to enter the size of the array, i.e., an integer number n (n>0), and then it randomly generates n integers and store them in an array, say A[ ].

– Then it runs the sorting algorithm (taking A[ ] and n as input parameters), and outputs the number of comparisons (see step (2) in the Recommended procedure).

– Once completed, it goes back to the main menu.

(ii). Test multiple sorting algorithms:

– The system prompts the user to enter an integer number n (where n>0) as the size of an array, and then randomly generates n integers, and stores them in an array, say A[ ] (Note: For performance comparison, you must use the same A[…] and n to test all sorting algorithms in the following steps, i.e., do not re-generate array data for different sorting algorithms – why?).

– Print a table heading on screen, like:

Sorting algorithm name

Array size

Num. of
Comparisons

Run time (in ms.)

– Loop to generate a table: For each of the 5 sorting algorithms (see those listed in step

(a) (i)), the system calls individual algorithms, taking A[ ] and n as input/test data, and outputs the sorting results (including the name of the sorting algorithm, the size of the test data, the number of comparisons and the running time). That is, it produces a line of result (for each algorithm), e.g., for the selection sort algorithm, the output line may look like (when n is 1000):

selection sort

1000

499500

1.3 (ms.)

– Once completed, it goes back to the main menu.

(b) Conduct an experimental study by completing the following Table 2 and Table 3.

This can be done by either a manual or an automatic way:

· Manually: Run the above code/s (e.g., see (a) (ii) above), one at a time, with appropriate parameters (e.g., with various n values), collect data from the outputs, do necessary calculations (e.g., averaging over 10 runs for a specific n value, etc.) and fill in the tables; or

· Automatically: produce a code (e.g., by modifying/calling the above code/s) so that it produces the two tables as the output.

(Note: 1. your output format can be slightly different, but it must contain the required information, see those in the tables below.

2. You may find that countingSort1 is excluded from Table 1 and Table 2 because it doesn’t work for

general arrays – it works only for arrays of distinct integers/values).

Table 1: Experimental study: Average number of comparisons for sorting arrays of n integers (over 10 runs).

Sorting Algorithm

n=100

n=200

n=400

n=800

n=1000

n=2000

Selection sort

Insertion sort

Merge sort

Quick sort

Heap sort

CountingSort2

Table 2: Experimental study: Average running time (in ms) for sorting arrays of n integers (over 10 runs).

Sorting Algorithm

n=100

n=200

n=400

n=800

n=1000

n=2000

Selection sort

Insertion sort

Merge sort

Quick sort

Heap sort

CountingSort2

Recommended procedure (for Q2)

To assist you prepare and achieve the solution/s, the following steps are recommended:

(1) Write Python codes to implement the following array-based algorithms that can be used to sort an array of n integers:

· Selection sort

· Insertion sort

· Merge sort

· Quick sort

· Heap sort

· CountingSort2

(Note: the first 5 array-sorting algorithms are implemented and tested in the lab sessions and you have implemented the 6^th algorithm in Q1 – you should use them as your start points).

(2) For each of the above codes, make necessary modification so that it not only sorts the array, but

also counts the number of comparisons, and records running time (say, in ms).

(3) Write Python code to implement (a) (i).

(4) Write Python code to implement (a) (ii).

(5) Complete (b).

Q3: Array-based Application Programming (35 marks)

– A tiny programming project simulating Lotto’s award checking system

Application scenario:

A tiny Lotto system allows up to 1000 people to play “lotto”. This system provides some functions to facilitate lotto players to check their game status and get more information of lotto playing statistics. To simplify the system, we generate random data to mimic the lotto drawing and other functions.

The lotto drawing is a process that the system randomly generates winning numbers, which is a list of 8 distinct integers from a barrel of 30 integer numbers (i.e., all numbers are between 1 and 30, inclusive). These 8 winning numbers are divided into two parts: the first part consists of the first 6 winning numbers in sequence, called Primary Winning Numbers (PWNs), and the second part consists of the last two winning numbers, called Supplementary Winning Numbers (SWNs). One example of the lotto drawing result may look like (see Figure 1):

PWNs

SWNs

14, 18,
22, 9, 29, 6

17,
30

Figure 1. Format of PWNs & SWNs

For simplicity, we assume that each player is assigned an ID number, k (1£ k £1000). We sometimes use the ID numbers to identify players (e.g., player i, player k, etc.). We also limit one player to one set of game-numbers. That is, each player can choose one set of 6 distinct integer numbers as his/her game-numbers, each of which is also from a barrel of 30 numbers (i.e., all game numbers are integers between 1 and 30, inclusive). All players’ game-numbers are stored in a tiny “database”, which in this case is implemented/replaced by a two-dimensional array (matrix, or list), lotto[0…999][0…5]. More specifically, for each player i, his/her game-numbers are stored in the array of lotto[i-1][0…5]. The array lotto[0…999][0…5] can be represented schematically as follows (see Figure 2, note that the data within the array are for illustration only):

Player’s ID, i

Player i’s game
numbers

…

999

Figure 2. Format of game data in lotto[ ][ ]

There is a total of 4 classes/levels of lotto winners. To be a winner, a player’s game-numbers must meet the winning conditions/criteria:

· A 1st class winner is one whose game-numbers match/contain all 6 PWNs;

· A 2nd class winner is one whose game-numbers contain any 5 PWNs;

· A 3rd class winner is one whose game-numbers contain any 4 PWNs;

· A 4th class winner is one whose game-numbers contain any 3 PWNs or contain the two SWNs.

Note: As a rule, if a player is a winner, he/she is considered as a winner of his/her highest class only, not a winner of any lower classes (e.g., if p is 1^st class winner, p will not be considered as a 2^nd, or 3^rd, or 4^th class winner, although his/her game-numbers may still meet the winning conditions of lower classes)

System behaviour:

The tiny lotto system starts with an initialization stage, followed by a data pre-processing stage. It then displays a Menu (see below) that allows the user to navigate through and execute the following menu options:

The system continues looping and prompting user to choose one of these menu options until the user chooses to exit.

· Initialization stage: This stage has two initialization tasks:

(i) To initialize “lotto” players’ data, the system randomly generates all game-numbers for all players and stores the data in the array of lotto[0…999 ][0…5].

(ii) To imitate a lotto draw, the tiny “lotto” system also randomly generates a list of winning numbers (a sequence of 8 distinct numbers) and stores them in an integer array WinNo[0…7]. The first 6 numbers are PWNs, and the last 2 numbers are SWNs. Once

again, each winning number comes from the same barrel of 30 numbers (i.e., 1 £

WinNo[i] £ 30, for all i = 0, 1, 2, 3, 4, 5, 6, 7).

· Data pre–processing stage: This step/stage sorts all data stored in the key arrays (or lists) to facilitate future processes:

(i) It sorts game-numbers for all 1000 players. That is, it sorts the array lotto[ i ][0…5] for all i = 0, 1, 2, …, 999;

(ii) It then sorts the PWNs (i.e., data stored in WinNo[0…5]; and finally

(iii) It sorts the SWNs (i.e., data stored in WinNo[6…7].

(1) Menu option 1 (i.e., “Show initialized data” feature): This option prints two tables: the first table contains the game data stored in the array lotto[ ] [ ] and the second table contains the winning numbers stored in WinNo[ ]. It is required that you print/display the first table in the format showing in Figure 2 (e.g., one row per player for each player’s game numbers).

(2) Menu option 2 (i.e., “Display statistics of winners” feature): The system calculates the total number of winners for each class, thereby generating a statistic table, which is displayed on screen in the following format:

Winners statistics table:

Winner class

Total number of
winners

1^st class

2^nd class

3^rd class

4^th class

where the question marks (“?”) are to be replaced with data calculated from this option.

(3) Menu option 3 (i.e., “Check my lotto status” feature): The system displays an interface where a lotto player can check his/her lotto status (i.e., whether he/she is a lotto winner). When the player inputs his ID number k (1 £ k £ 1000), the system checks whether he/she wins the lotto and print the following output:

Player’s ID:

<k>

Player’s game-numbers:

<Game-numbers, in sequence, of lotto[k-1][0…5]>

PWNs:

<PWNs, in sequence>

SWNs:

<SWNs, in sequence>

Player’s status:

while <k> is the Player’s ID entered, and most of above data inside < … > are clear in meaning, and the <Message> is a piece of information, as described below:

If the player’s game-numbers (stored in lotto[k-1][0…5] ) match or contain

· all 6 Primary winning numbers (PWNs), the <Message> is “You win the game, congratulations!”

· any 5 PWNs, the <Message > is

“You are a 2^nd class winner, congratulations!”

· any 4 PWNs, the <Message > is

“You are a 3^rd class winner, congratulations!”

· any 3 PWNs, the <Message > is

“You are a 4^th class winner , congratulations!”

· less than 3 PWNs, but the game-numbers contain two SWNs, the <Message> is “You won a 4^th-class prize with SWNs, congratulations!”

· less than 3 PWNs and less than two SWNs, the <Message> is

“You are not a winner. Thanks for playing lotto. Good luck next time!”

(Note that there are two cases where a player could be a 4th class winner).

Your Tasks (for Q3):

Design algorithm/s and write Python code/s to implement the above tiny “lotto” system. Some further requirements are:

a) Sorting (in data pre-processing stage):

To reduce the searching costs, all arrays must be sorted in the data pre-processing stage. For practical purpose, you are required to use

(i) Insertion-sort algorithm to sort the PWNs (i.e., sub-array WinNo[0…5]);

(ii) Selection-sort algorithm to sort the SWNs (i.e., sub-array WinNo[6…7]); and

(iii) Merge-sort algorithm to sort all player’s game-numbers (i.e., array lotto[i] [0…5], for i = 0, 1, 2, …, 999).

b) Searching (for Menu option 2 only):

You should use binary-search algorithm (or develop your own, however not linear- search algorithm) as part of the algorithm that implements Menu option 2.

For each player i, determine if he/she is a winner and, if so, to which class? This can be done by searching each data item x = lotto[ i ][ j ] (j = 0, 1, …, 5) from PWNs (i.e., sub-array WinNo[0…5 ]) to determine whether x is a winning number. By counting the total number of winning numbers, you can determine if player i is a winner (of any class). Collate the data from this process and complete the first three rows of the winner statistics table and part of the fourth row of the table.

If player i is not a winner of any class based on above calculations, do the following to further determine if he/she still qualifies as a 4^th class winner for which his/her game-numbers contain the two SWNs: to do this, after the above steps, further check whether player i’s game-numbers contains the two SWNs, thereby determining whether player i is a 4^th class winner.

c) Matching (for Menu option 3 only):

(i). As a sub-task, develop a new algorithm match_value(A, B), that computes the matching values of two sorted integer arrays A and B and returns the total number of matching values of the two arrays.

For example, suppose

A[ ] = {2, 35, 42, 46, 72, 80, 97},

B[ ] = {7, 46, 65, 72, 89, 100}, and

C[ ] = {35, 41, 62, 80}.

Then match_value(A, B) would return 2 (because arrays A[ ] and B[ ] contain two matching values, 46 and 72), match_value(A, C) would return 1 (because there is one matching value, 35, in the two arrays A[ ] and C[ ]), and match_value(B, C) would return 0 (because values in array B[ ] do not match any values in array C[ ]).

(Hint: you may consider modifying the array-merging-algorithm to achieve this – see Workshop & PPT in Module 3)

(ii) Once you have completed and tested the match_value(A, B) algorithm, use it as a part of the algorithm that implements your Menu option 3. That is, once you receive a player’s ID (e.g., k), call the algorithm match_value(…) with lotto[k-1][0…5] as the first array and the PWNs as the second array. This will allow you determine if the player won the 1^st, 2^nd, 3^rd, or 4^th class prize, or none, completing part of Menu option 3.

(Please note that the game-numbers for each player (i.e., data stored in the array lotto[ i

][0…5], for any i value) are distinct. Pay attention to this requirement when you generate random numbers for lotto[ ][ ]. The same is true for generating Winning numbers)

***************************************************************************************** Some requirement/restriction on coding:

Text Box: (1). The only packages you can use (or import) are time, random, math and sys. You should not import any other package (such as tabulate, numpy, etc.) in your code. If you "import" any non-standard package/s into your code that prevent your code from running in our lab environment (e.g., when marking your assignment), you should bear the risk of losing marks. Therefore, it is recommended to you test your codes in the lab environment before submitting the assignment.
(2). This assignment requires the use of the Python programming language to implement the coding tasks. If you want to use a programming language other than Python, you should get a permission from your tutor beforehand, and you may be asked to personally demonstrate your work at the end.
(3). Please use one single programming language for all your programming tasks.

————————————————————————————————————–

Submission Instructions (3 marks)

· Submit your Assignment 2 via Canvas assessment submission facility.

· Your submission should include the assignment main document (i.e., a report) and Python source code/s. The main assignment document must be in report style, in Word or PDF format (see detailed format requirements below). Pages must be numbered. Your source code file/s must be runnable.

· Your submission should be in a single compressed file (e.g., .zip file) that contains your main document / report and source code files, as well as any other support documents (if any). Rename the .zip file to the following format

<student ID>_< First name>_<LAST NAME>_CSP2348_A2.zip.

· No hard copy submission is required.

· ECU rules/policies will apply for late submission of this assignment.

Format requirement of the Assignment report (main document) (12 marks):

Required
Content

Cover page

Must show unit code/name, assignment title, the
student ID and name, due date etc.

Executive Summary
(optional)

This should represent a snapshot of the entire
report that your
tutor will browse
through and should contain
the most vital information needed.

Table of Contents (ToC)

This must accurately reflect the content of your report and should be generated automatically in Microsoft Word with clear page numbers.

Introduction

Introduce the report, define
its scope, and state
any assumption if any; Use in-text citation/reference where
appropriate.

Main body (note: you should
use appropriate section title/s)

This part should contain (but not limited to):

· Understanding of concepts/techniques involved in the document/report.

·
Any strategies used to solve
the problems (e.g., an approach to develop a solution, if any).

·
The questions being solved/answered, e.g., Q1: (should include
but not limited to)

(i)
algorithms (in pseudo codes)
and algorithm analysis; and

(ii)
screen shots
(results of running the code/s, e.g.,
one screenshot per
algorithm). Q2: required result data/details (e.g., in tables)

screen snapshots of code execution results (e.g., one snapshot per sort algorithm, showing the code’s inputs, outputs or running result/s, including the number/s of comparisons and algorithm’s running
time, etc.).

Q3: the
match_value(…) algorithm (for Menu option
3);

screen shots (results of running the
code/s, e.g., one screenshot per function, etc.).

·
Comment/discussion or critiques on the solutions developed /achieved, etc.

·
No Python
codes are included/attached in body of the report
(all codes should
be saved as separate files).

Conclusion

Outcomes or summary of the works
completed in this
assignment.

References

A
list of end-text reference, if any,
formatted according to the ECU requirements using
the APA format (Web references are also OK).

Other requirements

The document/report should
be no more
than 10 pages (excluding references, snapshots, and diagrams). The text must use font Times New Roman and be not smaller than 12pt.

Academic Integrity and
Misconduct

·
This assignment is an individual work (not a teamwork). Your entire assignment must be
your own work (unless quoted otherwise) and produced for the current instance
of the unit. Any use of uncited content that was not created by you yourself
constitutes plagiarism and is
regarded as Academic Misconduct. All assignments will be submitted to
plagiarism checking software, which compares your submission version with
previous copies of the assignment, and the
work submitted by all other
students in the
unit (in the
current semester or previous
years/semesters).

·
Never
give any part of your assignment to someone else, even after the due date
or the results are released. Do not work on individual assignment with other
students. – You can help someone by explaining concepts, demonstrating skills,
or directing them
to the relevant resources. But doing any part
of the assignment for them or with them or showing them your work is inappropriate. An
unacceptable level of cooperation between students on an assignment is collusion and is deemed an act of
Academic Misconduct. If you are not sure about plagiarism, collusion or
referencing, simply contact your lecturer or unit coordinator.

·
You may be asked to explain and
demonstrate your understanding of the work you have
submitted. Your submission should accurately reflect your understanding and
ability to apply the unit content and the skills learnt from this unit.

·
Before submitting your assignment 2, make sure
you have checked all items in the Academic Integrity tick-before-submit checklist below and watch the video by clicking
the link next to the checklist image:

Watch this video
before submitting your assignment

Indicative Marking Guide:

Description

Allocated Marks (%)

Marks achieved &
Comments

Counting sort algorithm for
arrays of distinct elements:

o Algorithm design
& analysis

o Code/s to implement the algorithm, code testing /running result (by snapshot/s)

Code runs as expected? Inputs/Outputs & others

General counting sort algorithm (for arrays with
duplicated

elements):

o Algorithm design
& analysis

o Code/s to implement the algorithm, code testing /running result (by snapshot/s)

Code running? Inputs/Outputs as expected?

Algorithm analysis by experimental studies: re-implementation
of 5 sorting algorithms
2-level menus implemented

experimental data generation, correctly use of experimental data.

Code/s run as expected? Outputs correctly?

Q3: Tiny lotto system
development:

Initialization stage:
data generated as required
Data pre-processing stage: data sorted.

Menu option 1: shows data
as required.

Menu option 2: Sorting & searching algorithms used as required: winner
statistics table generated correctly.

Menu option 3: match_value( …) algorithm works,
and player’s status data shows as desired.

System runs OK and functions!

Overall quality of the work.

Report

Document presented as per format
requirement?

Submitted as per
submission requirement? etc.

Total mark of 100

(which is then converted to 30% of unit mark)

100 (/30)

Q1: Array sorting algorithm design, implementation, and analysis (20 marks)

Q3: Array-based Application Programming (35 marks)

Application scenario:

System behaviour:

Related Posts