Explaining the Role of Activation Functions in Neural Networks and Other Related Topics

Task:


Q1 (a) Explain the role of activation functions in neural networks.


(b) A feedforward neural network has two input nodes, one hidden layer with three hidden neurons, and one output node. The activation functions for all hidden and output neurons are the same linear function of the form g(x) = c*x, where "c" is a constant. Please explain why this neural network is a linear classifier (you should use mathematical equations with analysis to illustrate this).
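One way to see the linearity numerically: with g(x) = c*x at every neuron, the layer-by-layer forward pass collapses into a single matrix-vector product. The sketch below uses hypothetical weights (none are given in the question) for the 2-3-1 network described above.

```python
import numpy as np

# Hypothetical weights for the 2-3-1 network (not from the question).
W1 = np.array([[0.5, -1.0],
               [2.0,  0.3],
               [-0.7, 1.2]])      # hidden layer: 3 neurons x 2 inputs
w2 = np.array([1.0, -0.5, 0.8])  # output neuron: 3 hidden activations
c = 2.0                          # the constant in g(x) = c*x

x = np.array([1.0, 2.0])

# Layer-by-layer forward pass with g(x) = c*x everywhere.
h = c * (W1 @ x)   # hidden activations
y = c * (w2 @ h)   # network output

# The same output from one collapsed linear map: y = c^2 * (w2 W1) x,
# so the whole network is equivalent to a single linear unit.
w_eff = (c ** 2) * (w2 @ W1)
y_collapsed = w_eff @ x

assert np.isclose(y, y_collapsed)
```

Since the composed map is w_eff @ x for a fixed vector w_eff, thresholding the output gives a linear decision boundary, which is the claim the question asks you to prove symbolically.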


(c) For the same neural network as in (b), if we replace the activation function for the output neuron with the sigmoid function, will this neural network still be a linear classifier? Please present your analysis and use mathematical equations to illustrate your ideas.
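A numerical hint for (c): because the sigmoid is strictly monotonic, sigmoid(z) >= 0.5 exactly when z >= 0, so thresholding the sigmoid output reproduces the linear threshold. The weight vector below is a hypothetical collapsed weight vector standing in for the linear hidden layers, not a value from the question.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical collapsed weights for the linear hidden layers.
w_eff = np.array([-4.24, -0.76])

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))   # random test inputs

# Classify by thresholding the sigmoid output at 0.5 ...
pred_sigmoid = sigmoid(X @ w_eff) >= 0.5
# ... and by thresholding the raw linear score at 0.
pred_linear = (X @ w_eff) >= 0.0

# Monotonicity of the sigmoid makes the two rules agree everywhere,
# so the decision boundary is still the hyperplane w_eff . x = 0.
assert np.array_equal(pred_sigmoid, pred_linear)
```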

(d) Explain the role of activation functions in neural networks.


A feedforward neural network has two input nodes, one hidden layer with three hidden neurons, and one output node. The activation functions for all hidden and output neurons are the same linear function of the form g(x) = c*x, where "c" is a constant. Please explain why this neural network is a linear classifier (you should use mathematical equations with analysis to illustrate this).


For the same neural network as in (b), if we replace the activation function for the output neuron with the sigmoid function, will this neural network still be a linear classifier? Please present your analysis and use mathematical equations to illustrate your ideas.

(e) Consider the same neural network in (d). For a training sample with input x1 = 1, x2 = 2, calculate the output of the network if the activation function g() in the hidden layer and the output layer is the identity relation: a = g(a). Please ignore the bias in this case for ease of calculation.
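The question's weight values are not reproduced in this excerpt, so the numbers below are placeholders purely to show the calculation pattern for (e): with the identity activation and no bias, each layer is a plain dot product.

```python
import numpy as np

# Placeholder weights (the question's actual weights are not given here).
W1 = np.array([[0.1, 0.2],
               [0.3, 0.4],
               [0.5, 0.6]])       # hidden layer weights (3 neurons x 2 inputs)
w2 = np.array([0.7, 0.8, 0.9])   # output neuron weights

x = np.array([1.0, 2.0])         # x1 = 1, x2 = 2

# Identity activation a = g(a) and no bias: forward pass is two dot products.
h = W1 @ x   # hidden activations: [0.5, 1.1, 1.7]
y = w2 @ h   # output: 0.35 + 0.88 + 1.53 = 2.76
```

With the weights from the actual question, substitute them into W1 and w2 and repeat the same two dot products.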

Q2 (a) Explain why deep learning is still needed for many complicated problems even though the universal approximation theorem states that a shallow neural network can approximate any continuous function.


(b) You are training a multilayer perceptron using backpropagation, and after many epochs you find that the loss function is very small, but it is oscillating within a very small range. What might be the reason for this and how will you deal with this situation?


(c) In the context of backpropagation, explain what characteristics an activation function should have in order to achieve effective training of neural networks.
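One property worth quantifying in an answer to (c) is the size of the activation's derivative, since backpropagation multiplies these derivatives layer by layer. A minimal sketch comparing the sigmoid and ReLU derivatives (the choice of these two functions is illustrative, not required by the question):

```python
import numpy as np

z = np.linspace(-6, 6, 1001)

# Sigmoid derivative: sigma'(z) = sigma(z) * (1 - sigma(z)), peaks at 0.25,
# so each sigmoid layer shrinks the backpropagated gradient by at least 4x.
sig = 1.0 / (1.0 + np.exp(-z))
d_sig = sig * (1 - sig)

# ReLU derivative: 1 for z > 0, 0 otherwise -- active units pass the
# gradient through unshrunk, which helps avoid vanishing gradients.
d_relu = (z > 0).astype(float)

print(d_sig.max())   # ~0.25
print(d_relu.max())  # 1.0
```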


(d) When training a neural network with backpropagation, the learning rate is first set to be a larger value, and then it is gradually decreased during the learning process. Explain the reason and draw a graph to illustrate this (2 marks for explaining the reason and 2 marks for the graph).
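One common way to realise "large first, then gradually smaller" is an exponential decay schedule; the particular constants below are an illustrative choice, not values from the question.

```python
# Exponential learning-rate decay: lr_t = lr_0 * decay^epoch.
# Early epochs take big steps toward the minimum; later epochs take
# small steps so the weights settle instead of oscillating around it.
initial_lr = 0.1
decay = 0.95

lrs = [initial_lr * decay ** epoch for epoch in range(50)]

print(lrs[0])    # 0.1
print(lrs[-1])   # much smaller by the final epoch
```

Plotting epoch against lrs gives the monotonically decreasing curve the question asks you to draw.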


In a convolutional neural network, suppose there are 6 filters of size 3 x 3 and stride 1 (the step size of the filter) in the first layer. If an input of dimension 10 x 10 x 3 is passed through this layer, what are the dimensions of the data which the next layer will receive? Draw a graph to illustrate your ideas (three marks for calculating the dimensions, and three marks for the graph).


Note: the input dimension 10 x 10 x 3 means the input images have three channels (RGB), and the convolution operation needs to be adapted to the three-channel situation.
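The dimension calculation can be checked with the standard no-padding formula, output = (input - filter) / stride + 1 per spatial axis, with the output depth equal to the number of filters; a naive convolution loop (assuming no padding, since the question does not mention any) confirms the shape.

```python
import numpy as np

def conv_output_dims(h, w, filt, stride, n_filters):
    """No-padding output size: (input - filter) // stride + 1 per axis;
    depth equals the number of filters."""
    out_h = (h - filt) // stride + 1
    out_w = (w - filt) // stride + 1
    return out_h, out_w, n_filters

print(conv_output_dims(10, 10, 3, 1, 6))   # (8, 8, 6)

# Verify with a naive convolution: each 3x3x3 filter spans all three
# input channels, producing one 8x8 output map; 6 filters give depth 6.
x = np.random.rand(10, 10, 3)
filters = np.random.rand(6, 3, 3, 3)
out = np.zeros((8, 8, 6))
for f in range(6):
    for i in range(8):
        for j in range(8):
            out[i, j, f] = np.sum(x[i:i + 3, j:j + 3, :] * filters[f])
print(out.shape)   # (8, 8, 6)
```

So the next layer receives data of dimension 8 x 8 x 6.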