Explaining the Role of Activation Functions in Neural Networks and Other Related Topics

Task:


Q1 (a) Explain the role of activation functions in neural networks.


(b) A feedforward neural network has two input nodes, one hidden layer with three hidden neurons, and one output node. The activation functions for all hidden and output neurons are the same linear function of the form g(x) = c*x, where "c" is a constant. Please explain why this neural network is a linear classifier (you should use mathematical equations with analysis to illustrate this).
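One way to see the linearity numerically: with g(x) = c*x at every neuron, the layer-by-layer forward pass collapses into a single matrix-vector product. The sketch below uses hypothetical weights (none are given in the question) for the 2-3-1 network described above.

```python
import numpy as np

# Hypothetical weights for the 2-3-1 network (not from the question).
W1 = np.array([[0.5, -1.0],
               [2.0,  0.3],
               [-0.7, 1.2]])      # hidden layer: 3 neurons x 2 inputs
w2 = np.array([1.0, -0.5, 0.8])  # output neuron: 3 hidden activations
c = 2.0                          # the constant in g(x) = c*x

x = np.array([1.0, 2.0])

# Layer-by-layer forward pass with g(x) = c*x everywhere.
h = c * (W1 @ x)   # hidden activations
y = c * (w2 @ h)   # network output

# The same output from one collapsed linear map: y = c^2 * (w2 W1) x,
# so the whole network is equivalent to a single linear unit.
w_eff = (c ** 2) * (w2 @ W1)
y_collapsed = w_eff @ x

assert np.isclose(y, y_collapsed)
```

Since the composed map is w_eff @ x for a fixed vector w_eff, thresholding the output gives a linear decision boundary, which is the claim the question asks you to prove symbolically.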


(c) For the same neural network as in (b), if we replace the activation function for the output neuron with the sigmoid function, will this neural network still be a linear classifier? Please present your analysis and use mathematical equations to illustrate your ideas.
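A numerical hint for (c): because the sigmoid is strictly monotonic, sigmoid(z) >= 0.5 exactly when z >= 0, so thresholding the sigmoid output reproduces the linear threshold. The weight vector below is a hypothetical collapsed weight vector standing in for the linear hidden layers, not a value from the question.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical collapsed weights for the linear hidden layers.
w_eff = np.array([-4.24, -0.76])

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))   # random test inputs

# Classify by thresholding the sigmoid output at 0.5 ...
pred_sigmoid = sigmoid(X @ w_eff) >= 0.5
# ... and by thresholding the raw linear score at 0.
pred_linear = (X @ w_eff) >= 0.0

# Monotonicity of the sigmoid makes the two rules agree everywhere,
# so the decision boundary is still the hyperplane w_eff . x = 0.
assert np.array_equal(pred_sigmoid, pred_linear)
```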

(d) Explain the role of activation functions in neural networks.


A feedforward neural network has two input nodes, one hidden layer with three hidden neurons, and one output node. The activation functions for all hidden and output neurons are the same linear function of the form g(x) = c*x, where "c" is a constant. Please explain why this neural network is a linear classifier (you should use mathematical equations with analysis to illustrate this).


For the same neural network as in (b), if we replace the activation function for the output neuron with the sigmoid function, will this neural network still be a linear classifier? Please present your analysis and use mathematical equations to illustrate your ideas.

(e) Consider the same neural network in (d). For a training sample with input x1 = 1, x2 = 2, calculate the output of the network if the activation function g() in the hidden layer and the output layer is the identity relation: a = g(a). Please ignore the bias in this case for ease of calculation.
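The question's weight values are not reproduced in this excerpt, so the numbers below are placeholders purely to show the calculation pattern for (e): with the identity activation and no bias, each layer is a plain dot product.

```python
import numpy as np

# Placeholder weights (the question's actual weights are not given here).
W1 = np.array([[0.1, 0.2],
               [0.3, 0.4],
               [0.5, 0.6]])       # hidden layer weights (3 neurons x 2 inputs)
w2 = np.array([0.7, 0.8, 0.9])   # output neuron weights

x = np.array([1.0, 2.0])         # x1 = 1, x2 = 2

# Identity activation a = g(a) and no bias: forward pass is two dot products.
h = W1 @ x   # hidden activations: [0.5, 1.1, 1.7]
y = w2 @ h   # output: 0.35 + 0.88 + 1.53 = 2.76
```

With the weights from the actual question, substitute them into W1 and w2 and repeat the same two dot products.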

Q2 (a) Explain why deep learning is still needed for many complicated problems even though the universal approximation theorem states that a shallow neural network can approximate any continuous function.


(b) You are training a multilayer perceptron using backpropagation, and after many epochs you find that the loss function is very small, but it is oscillating within a very small range. What might be the reason for this and how will you deal with this situation?


(c) In the context of backpropagation, explain what characteristics an activation function should have in order to achieve effective training of neural networks.
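One property worth quantifying in an answer to (c) is the size of the activation's derivative, since backpropagation multiplies these derivatives layer by layer. A minimal sketch comparing the sigmoid and ReLU derivatives (the choice of these two functions is illustrative, not required by the question):

```python
import numpy as np

z = np.linspace(-6, 6, 1001)

# Sigmoid derivative: sigma'(z) = sigma(z) * (1 - sigma(z)), peaks at 0.25,
# so each sigmoid layer shrinks the backpropagated gradient by at least 4x.
sig = 1.0 / (1.0 + np.exp(-z))
d_sig = sig * (1 - sig)

# ReLU derivative: 1 for z > 0, 0 otherwise -- active units pass the
# gradient through unshrunk, which helps avoid vanishing gradients.
d_relu = (z > 0).astype(float)

print(d_sig.max())   # ~0.25
print(d_relu.max())  # 1.0
```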


(d) When training a neural network with backpropagation, the learning rate is first set to be a larger value, and then it is gradually decreased during the learning process. Explain the reason and draw a graph to illustrate this (2 marks for explaining the reason and 2 marks for the graph).
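One common way to realise "large first, then gradually smaller" is an exponential decay schedule; the particular constants below are an illustrative choice, not values from the question.

```python
# Exponential learning-rate decay: lr_t = lr_0 * decay^epoch.
# Early epochs take big steps toward the minimum; later epochs take
# small steps so the weights settle instead of oscillating around it.
initial_lr = 0.1
decay = 0.95

lrs = [initial_lr * decay ** epoch for epoch in range(50)]

print(lrs[0])    # 0.1
print(lrs[-1])   # much smaller by the final epoch
```

Plotting epoch against lrs gives the monotonically decreasing curve the question asks you to draw.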


In a convolutional neural network, suppose there are 6 filters of size 3 x 3 and stride 1 (the step size of the filter) in the first layer. If an input of dimension 10 x 10 x 3 is passed through this layer, what are the dimensions of the data which the next layer will receive? Draw a graph to illustrate your ideas (three marks for calculating the dimensions, and three marks for the graph).


Note: the input dimension 10 x 10 x 3 means the input images have three channels (RGB), and the convolution operation needs to be adapted to the three-channel situation.
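The dimension calculation can be checked with the standard no-padding formula, output = (input - filter) / stride + 1 per spatial axis, with the output depth equal to the number of filters; a naive convolution loop (assuming no padding, since the question does not mention any) confirms the shape.

```python
import numpy as np

def conv_output_dims(h, w, filt, stride, n_filters):
    """No-padding output size: (input - filter) // stride + 1 per axis;
    depth equals the number of filters."""
    out_h = (h - filt) // stride + 1
    out_w = (w - filt) // stride + 1
    return out_h, out_w, n_filters

print(conv_output_dims(10, 10, 3, 1, 6))   # (8, 8, 6)

# Verify with a naive convolution: each 3x3x3 filter spans all three
# input channels, producing one 8x8 output map; 6 filters give depth 6.
x = np.random.rand(10, 10, 3)
filters = np.random.rand(6, 3, 3, 3)
out = np.zeros((8, 8, 6))
for f in range(6):
    for i in range(8):
        for j in range(8):
            out[i, j, f] = np.sum(x[i:i + 3, j:j + 3, :] * filters[f])
print(out.shape)   # (8, 8, 6)
```

So the next layer receives data of dimension 8 x 8 x 6.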