Question 1: provide detailed solutions with comments
Consider a DMS whose output is generated from an alphabet {a, b, c, d, e, f, g, h} with respective probabilities {0.25, 0.1, 0.01, 0.05, 0.25, 0.04, 0.25, 0.05}.
 (a) Design a Huffman code and sketch the corresponding coding tree. Specify the codewords for the letters in the alphabet.
 (b) Determine the average codeword length of the Huffman code.
 (c) Calculate the entropy of the source and compare it with the average codeword length calculated in (b).
 (d) Encode the sequence aaabdceefghddca.
 (e) Solve and plot the results of questions (a)-(d) in MATLAB.
Question 2: provide detailed solutions with comments
For the DMS in problem (1), we now consider Shannon encoding.
 (a) Design a Shannon code and sketch the corresponding coding tree. Specify the codewords for the letters in the alphabet.
 (b) Determine the average codeword length of the Shannon code.
 (c) Calculate the entropy of the source and compare it with the average codeword length calculated in (b).
 (d) Encode the sequence aaabdceefghddca.
 (e) Solve and plot the results of questions (a)-(d) in MATLAB.
Marking Scheme
Please note that working with others is an important part of computing; however, if your assignment answers are the same, or mostly the same, as somebody else's, you will both fail the assignment and will need to attend a plagiarism hearing.
The marks will be allocated as follows
A submission is unlikely to attract an excellent grade unless the following criteria are met.
 You must provide evidence that the work is your own, give references and quotations in the standard manner, and include clear discussion and documentation.
 Your report should be clearly written and presented.
 The University’s standard marking descriptions apply, see below
Description of Assessment Performance                                          Grade (Grade Point)
Excellent work                                                                 A+ (16), A (15), A- (14)
Generally good work, but with some defects                                     B+ (13), B (12), B- (11)
Generally sound work, but with a number of notable errors and/or omissions     C+ (10), C (9), C- (8)
Work satisfactory, but with a number of significant errors and/or omissions    D+ (7), D (6), D- (5)
Work unsatisfactory, with serious errors and/or omissions                      E (4)
Work of a very poor standard, with little relevant information                 F (2)
Work containing nothing of merit                                               G (1)
Together with the qualifying features below.
 As a guide and with reference to above, note that to obtain Grade A you must have completed all tasks.
 The difference between B and A grades will depend upon the quality of the discussion and comments.
To obtain a pass mark it must be clear that the work is your own and that there is some evidence of a more than superficial understanding.
Huffman coding is applied in the transmission of text files and fax messages to enable data encoding and compression [1]. It is a statistical coding method whose prioritization is based on the frequency with which characters occur. In the DMS above, the output is generated according to the probabilities assigned to the alphabet symbols. In conventional computing, every character is allocated the same amount of space regardless of which character it is; in modern data transmission, by contrast, codeword lengths vary, unlike in ASCII [2].
Characters that occur more frequently receive shorter codewords, and characters that occur less frequently receive longer ones. In this problem, the probability of each symbol defines its frequency [3]. David Huffman devised an efficient algorithm for constructing such a coding tree, ensuring better compression of data for transmission purposes.
Step 1: Review the DMS output to determine the probability assigned to each character.

Event  Probability
a      0.25
b      0.1
c      0.01
d      0.05
e      0.25
f      0.04
g      0.25
h      0.05
clc;
clear all;
close all;
% Define the character string
my_str = 'abcdefgh';
% Manually define the probability distribution
pbs_dist = [0.25 0.1 0.01 0.05 0.25 0.04 0.25 0.05];
num_bits = ceil(log2(length(pbs_dist)))
disp('Character Probability:');
for i = 1:length(pbs_dist)
display(strcat(my_str(i),' > ',num2str(pbs_dist(i))));
end
total = sum(pbs_dist)
for i = 1:length(my_str)
Pngd_str{i} = my_str(i);
end

The output is obtained as,
Step 2: The characters are sorted, or prioritized, on the basis of their occurrence in the text. The occurrence in this case is determined by the probability of the event. Therefore,

Event        c     f     d     h     b     a     e     g
Probability  0.01  0.04  0.05  0.05  0.1   0.25  0.25  0.25
% Save initial set of symbols and probabilities for later use
init_str = Pngd_str;
init_pbs = pbs_dist;
Pngd_pbs = pbs_dist;
rear = 1;
while (length(Pngd_pbs) > 1)
% Sort pbss
[Pngd_pbs,ndx] = sort(Pngd_pbs,'ascend');
% Sort string based on ndx
Pngd_str = Pngd_str(ndx);
% Create new symbol
new_node = strcat(Pngd_str(2),Pngd_str(1));
new_pbs = sum(Pngd_pbs(1:2));
% Dequeue used symbols from "old" queue
Pngd_str = Pngd_str(3:length(Pngd_str));
Pngd_pbs = Pngd_pbs(3:length(Pngd_pbs));
% Add new symbol back to "old" queue
Pngd_str = [Pngd_str, new_node];
Pngd_pbs = [Pngd_pbs, new_pbs];
% Add new symbol to "new" queue
newq_str(rear) = new_node;
newq_pbs(rear) = new_pbs;
rear = rear + 1;
end
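The merging loop above can also be sketched with a min-heap. The following Python sketch (an illustration added here, not part of the original MATLAB solution) exploits the fact that the average codeword length equals the sum of all merged node probabilities, which is independent of how ties between equal probabilities are broken:

```python
import heapq

def huffman_average_length(probs):
    """Average codeword length via the sum of merge costs.

    Each merge of the two smallest probabilities adds one bit to every
    symbol under the merged node, so the total over all merges equals
    the expected codeword length.
    """
    heap = list(probs)
    heapq.heapify(heap)
    avg_len = 0.0
    while len(heap) > 1:
        p1 = heapq.heappop(heap)  # smallest remaining probability
        p2 = heapq.heappop(heap)  # second smallest
        merged = p1 + p2
        avg_len += merged         # each merge costs one bit per covered symbol
        heapq.heappush(heap, merged)
    return avg_len

probs = [0.25, 0.1, 0.01, 0.05, 0.25, 0.04, 0.25, 0.05]
print(round(huffman_average_length(probs), 2))  # 2.55
```

Because the merge-cost total is tie-break independent, this gives the same average length as any valid Huffman tree for these probabilities.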
Step 3: Now one can build the Huffman coding tree based on the prioritized list in Step 2. At each stage, the two least probable nodes are merged into a new node whose probability is their sum [4].
[Figure: Huffman coding tree construction. The leaves c (0.01), f (0.04), d (0.05), h (0.05), b (0.1), a (0.25), e (0.25), g (0.25) are merged pairwise, two lowest probabilities at a time, up to the root node of probability 1.]
tree = [newq_str,init_str];
tree_pbs = [newq_pbs, init_pbs];
Step 4: Read the tree from left to right. The implementation in MATLAB is as follows:
% Sort all tree elements
[Pngd_tree_pbs,ndx] = sort(tree_pbs,'descend');
Pngd_tree = tree(ndx);
parent(1) = 0;
num_children = 2;
for i = 2:length(Pngd_tree)
% Extract my symbol
me = Pngd_tree{i};
% Find my parent's symbol (search until shortest match is found)
count = 1;
parent_maybe = Pngd_tree{i-count};
diff = strfind(parent_maybe,me);
while (isempty(diff))
count = count + 1;
parent_maybe = Pngd_tree{i-count};
diff = strfind(parent_maybe,me);
end
parent(i) = i - count;
end
treeplot(parent);
title(strcat('Huffman Coding Tree - "',my_str,'"'));
display(Pngd_tree)
display(Pngd_tree_pbs)
[xs,ys,h,s] = treelayout(parent);
text(xs,ys,Pngd_tree);
for i = 2:length(Pngd_tree)
% Get my coordinate
my_x = xs(i);
my_y = ys(i);
% Get parent coordinate
parent_x = xs(parent(i));
parent_y = ys(parent(i));
% Calculate weight coordinate (midpoint)
mid_x = (my_x + parent_x)/2;
mid_y = (my_y + parent_y)/2;
% Calculate weight (positive slope = 1, negative = 0)
slope = (parent_y - my_y)/(parent_x - my_x);
if (slope > 0)
weight(i) = 1;
else
weight(i) = 0;
end
text(mid_x,mid_y,num2str(weight(i)));
end
for i = 1:length(Pngd_tree)
% Initialize code
code{i} = '';
% Loop until root is found
index = i;
p = parent(index);
while(p ~= 0)
% Turn weight into code symbol
w = num2str(weight(index));
% Concatenate code symbol
code{i} = strcat(w,code{i});
% Continue towards root
index = parent(index);
p = parent(index);
end
end
codeBook = [Pngd_tree', code']
grid on
To obtain the average codeword length of the Huffman code [5]:

Solution

The average codeword length is given as

L = sum over k of p(k) * l(k)

where p(k) is the probability of symbol k and l(k) is the length of its codeword. From the codeword set obtained in the solution, the average codeword length is determined as shown below. Therefore,
Event  Probability  Code   Length
a      0.25         11     2
b      0.1          0111   4
c      0.01         01100  5
d      0.05         0101   4
e      0.25         10     2
f      0.04         01101  5
g      0.25         00     2
h      0.05         0100   4

L = 0.25(2) + 0.1(4) + 0.01(5) + 0.05(4) + 0.25(2) + 0.04(5) + 0.25(2) + 0.05(4) = 2.55 bits per symbol
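As a cross-check, the tabulated codebook can be evaluated numerically. The following Python sketch (an illustration added here, not part of the original MATLAB solution; codeword assignments may vary with tie-breaking) computes the expected codeword length and verifies that the Kraft sum equals 1, as it must for a complete prefix code:

```python
# Codebook as tabulated above (one of several valid Huffman codebooks)
codebook = {'a': '11', 'b': '0111', 'c': '01100', 'd': '0101',
            'e': '10', 'f': '01101', 'g': '00', 'h': '0100'}
probs = {'a': 0.25, 'b': 0.1, 'c': 0.01, 'd': 0.05,
         'e': 0.25, 'f': 0.04, 'g': 0.25, 'h': 0.05}

# Expected codeword length: sum of p_k * l_k over all symbols
avg_len = sum(probs[s] * len(codebook[s]) for s in codebook)

# Kraft sum: a complete prefix code satisfies sum 2^(-l_k) == 1
kraft = sum(2.0 ** -len(c) for c in codebook.values())

print(round(avg_len, 2), kraft)  # 2.55 1.0
```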
Computing the entropy of the source, modelled as a random process with the associated probability distribution [6]: the self-information of a symbol x(k), i.e. its information content, is given as

I(x(k)) = -log2 p(k)

for the alphabet with the respective probabilities given above. Let K be the number of source outputs provided by the DMS; the average self-information, i.e. the average information per source output, is then

H(X) = sum over k of p(k) * I(x(k)) = -sum over k of p(k) * log2 p(k)

The entropy of the source measures the uncertainty of the source. The relationship between uncertainty and entropy can be illustrated using two probabilities [7].
pbs_dist = [0.25 0.1 0.01 0.05 0.25 0.04 0.25 0.05];
%% Compute the source entropy for the given distribution
clc
xv = 0;
for f = 1:length(pbs_dist)
Hz = -pbs_dist(f)*log2(pbs_dist(f)); % self-information weighted by probability
disp(Hz)
xv = xv + Hz;
end
disp('The source entropy of the codeword is given as')
disp(xv) % source entropy in bits per symbol
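For reference, the same entropy can be computed directly with base-2 logarithms. A short Python sketch (an illustration, not part of the original solution):

```python
import math

probs = [0.25, 0.1, 0.01, 0.05, 0.25, 0.04, 0.25, 0.05]

# Source entropy H = -sum(p * log2(p)), in bits per symbol
entropy = -sum(p * math.log2(p) for p in probs)
print(round(entropy, 2))  # 2.52
```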
The solution is given as H(X) ≈ 2.52 bits per symbol. The comparison between the average codeword length and the source entropy gives the coding efficiency:

eta = H(X) / L = 2.52 / 2.55 ≈ 0.99
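Question (d) asks for the encoding of the sequence aaabdceefghddca. A Python sketch (illustrative; the exact bitstring depends on the tie-breaking used when building the tree) that encodes with the codebook tabulated above and verifies the prefix property by decoding back:

```python
# Codebook as tabulated above; other valid Huffman codebooks exist
codebook = {'a': '11', 'b': '0111', 'c': '01100', 'd': '0101',
            'e': '10', 'f': '01101', 'g': '00', 'h': '0100'}

def encode(seq, codebook):
    """Concatenate the codeword of each symbol in the sequence."""
    return ''.join(codebook[ch] for ch in seq)

def decode(bits, codebook):
    """Greedy left-to-right decoding; valid because the code is prefix-free."""
    inverse = {code: sym for sym, code in codebook.items()}
    out, buf = [], ''
    for b in bits:
        buf += b
        if buf in inverse:
            out.append(inverse[buf])
            buf = ''
    return ''.join(out)

bits = encode('aaabdceefghddca', codebook)
print(len(bits))                                    # 49
print(decode(bits, codebook) == 'aaabdceefghddca')  # True
```

The 15-symbol sequence encodes to 49 bits, versus 45 bits (15 x 3) for a fixed-length 3-bit code; the Huffman code wins on average, not on every individual sequence.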
QUESTION 2 (50 points)
For the DMS in question 1, we now consider Shannon encoding.
 (a) Design a Shannon code and sketch the corresponding coding tree. Specify the codewords for the letters in the alphabet.
 (b) Determine the average codeword length of the Shannon code.
 (c) Calculate the entropy of the source and compare it with the average codeword length calculated in (b).
Shannon coding is used for noiseless channels: the code is lossless and the channel is assumed to be noiseless. According to Shannon's source coding theorem, the average codeword length of a source of symbols can at best be equal to the source entropy and can never be less than it [8]. Under Shannon's noiseless coding theorem, Huffman coding is optimal for a fixed alphabet size, subject to the constraint that the source symbols are coded one at a time [9]. The coding algorithm is implemented as follows.
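For comparison, the classical Shannon code assigns each symbol a length of ceil(-log2 p(k)). A Python sketch (an illustration of the theorem's bound, using the textbook length assignment rather than the tree construction that follows):

```python
import math

probs = [0.25, 0.1, 0.01, 0.05, 0.25, 0.04, 0.25, 0.05]

# Shannon length assignment: l_k = ceil(-log2 p_k)
lengths = [math.ceil(-math.log2(p)) for p in probs]
avg_len = sum(p * l for p, l in zip(probs, lengths))
entropy = -sum(p * math.log2(p) for p in probs)

print(lengths)            # [2, 4, 7, 5, 2, 5, 2, 5]
print(round(avg_len, 2))  # 2.67
print(avg_len >= entropy) # True, as the source coding theorem requires
```

The average length always sits within one bit above the entropy (here 2.67 vs. 2.52 bits), consistent with the theorem.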
Step 1: The events are arranged in ascending order of their probabilities,
Event  Probability
c      0.01
f      0.04
d      0.05
h      0.05
b      0.1
a      0.25
e      0.25
g      0.25
%% define the inception parameters
my_str='abcdefgh'; %characters
%probability distribution over the characters
h=[0.25 0.1 0.01 0.05 0.25 0.04 0.25 0.05];
m=length(h)
T=cell(0); % empty cell; cell_set, cell_sub and plot_hufftree below are helper functions assumed to be on the MATLAB path
for i=1:m
T=cell_set(T,i,i);
end
Step 2: The two lowest-probability symbols are merged into a single compound symbol that replaces them in the next source reduction. The algorithm performs an iterative grouping of events based on their probabilities.
%% The algorithm here is slightly different: the tree is formed from the
% lowest probabilities, which are merged together progressively in an
% iterative manner.
%step 1: probability at the inception
p=h;
%merge the leading probabilities in order of decay
while length(p)>1
%perform sorting (ascending)
[v,I]=sort(p);
if v(1)>v(length(v))
v=fliplr(v);
I=fliplr(I);
end
q=sum(v(1:2));
t=cell_sub(T,I(1:2));
%% simplifying the tree slightly
T = cell_sub(T, I(3:length(I)) );
p = v(3:length(v));
% Adding new events alongside their given probabilities
p(length(p)+1) = q;
T = cell_set(T, length(p), t);
end
%% graphical illustration of the binary coded tree
clf;
plot_hufftree(T)
Based on the output,
Event  Probability  Assigned Code  Code length
a      0.25         0              1
f      0.04         10             2
c      0.01         11             2
h      0.05         110            3
d      0.05         1110           4
b      0.1          1111           4
g      0.25         11110          5
e      0.25         111110         6
/* Inverse of the binary entropy function: given a target entropy,
 * iterate to find a probability p with that entropy. ENTROPY(p) is
 * assumed to be a macro or function returning the binary entropy of p. */
double entropy2probability( double entropy )
{
    if (entropy > 1.0) return 0.0;
    double d, p = 0.5;
    double f = entropy * 0.75;   /* damping factor for the iteration */
    do {
        d = ENTROPY(p) - entropy;
        p += d * f;
    } while (fabs(d) > 0.000000001);
    return p;
}
Let l(k) be the codeword length of symbol x(k). The average codeword length of the Shannon code is

L = sum over k of p(k) * l(k)

and it is given as 3.49 bits per event.

The Shannon entropy [10] is

H(X) = -sum over k of p(k) * log2 p(k) = 2.52 bits

The code efficiency is given as

eta = H(X) / L = 2.52 / 3.49 ≈ 0.72
The Shannon code therefore achieves a coding efficiency of about 72 percent, lower than that of the Huffman code [11]. The aim of a digital communication system is to have a coding efficiency as close to unity as possible. A value of unity denotes that the transmission is successful without any losses or noise in the channel. The efficiency should, however, not fall below 0.5, as that would indicate that the technique is quite poor. In that case the data transmission is ineffective, and the quality of service (QoS) of such a communication system is rated as very poor [12].
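The two efficiencies can be checked numerically. A Python sketch (illustrative, using the average lengths reported in the worked solutions above: 2.55 bits for the Huffman code and 3.49 bits for the Shannon code):

```python
import math

probs = [0.25, 0.1, 0.01, 0.05, 0.25, 0.04, 0.25, 0.05]
entropy = -sum(p * math.log2(p) for p in probs)  # about 2.52 bits/symbol

# Average codeword lengths taken from the worked solutions above
L_huffman = 2.55
L_shannon = 3.49

print(round(entropy / L_huffman, 2))  # 0.99
print(round(entropy / L_shannon, 2))  # 0.72
```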
References
[1] T. Lee and H. Park, "Design and implementation of a static Huffman encoding hardware using a parallel shifting algorithm," IEEE, vol. 51, 2004.
[2] R. G. Gallager, Information Theory and Reliable Communication. New York: Wiley, 2009.
[3] U. Abrahams, Code and Parse Trees for Lossless Source Encoding, 2008.
[4] M. B. Baer, "Redundancy-related bounds for generalized Huffman codes," IEEE, 2011.
[5] C. E. Shannon, "A mathematical theory of communication," Bell System Technical Journal, vol. 27, pp. 379-423, 623-656, 1948.
[6] L. Y. Liu, W. L. G, R. Wang and L. Y, "Data structure of Huffman codes and its application to efficient encoding and decoding," IEEE, 2005.
[7] T. M. Cover, "On the competitive optimality of Huffman codes," IEEE Transactions on Information Theory, 2001.
[8] J. Ji-Han, C. Chin-Chen and C. Tung-Shou, An Efficient Huffman Decoding Method Based on Pattern Partition and Lookup Table, 2009.
[9] R. Gailly, "Performance analysis of Huffman coding algorithm," IJARCSSE, vol. 3, no. 10, 2013.
[10] A. Rabia, S. Adeel and K. Danista, "Performance comparison of Huffman coding and double Huffman coding," Innovative Computing Technology (INTECH), vol. 6, pp. 361-364, 2016.
[11] R. Capocelli, R. Giancarlo and I. Taneja, "Bounds on the redundancy of Huffman codes," IEEE Transactions on Information Theory, vol. 32, no. 6, pp. 854-857, 1986.
[12] P. M. Fenwick, "Huffman code efficiencies for extensions of sources," IEEE Transactions on Communications, vol. 43, no. 2/3/4, 1995.
Source: My Assignment Help (2021). DMS With Shannon Encoding - Detailed Solution With Comments. Retrieved from https://myassignmenthelp.com/freesamples/cis117digitalmicrowaveandopticalcommunications/communication.html.