In this question you will generate and plot 2-dimensional data for a binary classification problem. We will call the two classes Class 0 and Class 1 (for which the target values are t = 0 and t = 1, respectively).

(a) (6 points) Write a Python function genData(mu0,mu1,Sigma0,Sigma1,N) that generates two clusters of data, one for each class. Each cluster consists of N data points. The cluster for class 0 is centred at mu0 and has covariance matrix Sigma0. The cluster for class 1 is centred at mu1 and has covariance matrix Sigma1. Note that mu0 and mu1 and all the data points are 2-dimensional vectors. Sigma0 and Sigma1 are 2 × 2 symmetric matrices that describe the shape of the clusters: the diagonal entries specify the variance of a cluster along each of the two dimensions, and the off-diagonal entries describe how correlated the two dimensions are. The function should return two arrays, X and t, representing data points and target values, respectively. X is a 2N × 2 dimensional array in which each row is a data point. t is a 2N-dimensional vector of 0s and 1s. Specifically, t[i] is 0 if X[i] belongs to class 0, and 1 if it belongs to class 1. The data for the two classes should be distributed randomly in the arrays. In particular, the data for class 0 should not all be in the first half of the arrays, with the data for class 1 in the second half.

We will model each cluster as a multivariate normal distribution. Recall that the probability density of such a distribution is given by

p(x) = 1 / ((2π)^(k/2) |Σ|^(1/2)) · exp(−(1/2) (x − µ)^T Σ^(−1) (x − µ))

where µ is the mean (cluster centre), Σ is the covariance matrix, and k is the dimensionality of the data (2 in our case). To generate data for a cluster, use the function multivariate_normal in numpy.random. Use the function shuffle in sklearn.utils to distribute the data randomly in the arrays.
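A minimal sketch of genData under these specifications; it uses a numpy permutation in place of sklearn.utils.shuffle (which the assignment asks for) so the example is self-contained:

```python
import numpy as np

def genData(mu0, mu1, Sigma0, Sigma1, N):
    """Generate N points per class from two Gaussian clusters, shuffled."""
    X0 = np.random.multivariate_normal(mu0, Sigma0, N)  # class 0 cluster
    X1 = np.random.multivariate_normal(mu1, Sigma1, N)  # class 1 cluster
    X = np.vstack((X0, X1))
    t = np.concatenate((np.zeros(N), np.ones(N)))
    # The assignment specifies sklearn.utils.shuffle; a random permutation
    # of the row indices has the same effect and avoids the dependency.
    idx = np.random.permutation(2 * N)
    return X[idx], t[idx]
```

The key point is that X and t are permuted with the same index array, so each point keeps its label after shuffling.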

(b) (1 point) Use your function from part (a) to generate two clusters with 10,000 points each, with mu0 = (0, −1), mu1 = (−1, 1), and

Sigma0 = [ 2.0  0.5 ]
         [ 0.5  1.0 ]

Sigma1 = [ 1.0  −1.0 ]
         [ −1.0  2.0 ]

You will have to encode these argument values as Numpy arrays.
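For example, the parameters above can be encoded as:

```python
import numpy as np

# The Question 1(b) parameters as Numpy arrays.
mu0 = np.array([0.0, -1.0])
mu1 = np.array([-1.0, 1.0])
Sigma0 = np.array([[2.0, 0.5],
                   [0.5, 1.0]])
Sigma1 = np.array([[1.0, -1.0],
                   [-1.0, 2.0]])
```

Both covariance matrices are symmetric, as required.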


(c) Display the data from part (b) as a scatter plot, using red dots for points in cluster 0, and blue dots for points in cluster 1. Use the function scatter in matplotlib.pyplot. Specify a relatively small dot size by using the named argument s=2. Use the functions xlim and ylim to extend the x and y axes from -5 to 6. Title the plot, “Question 1(c): sample cluster data (10,000 points per cluster)”. If you have done everything correctly, the scatter plot should look something like Figure 1, which shows two heavily overlapping clusters.

2. (?? points) Binary Logistic Regression.
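A sketch of the plotting calls; the data here are a small random stand-in rather than the genData output, and the Agg backend is selected only so the script runs headless:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this when viewing plots
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Stand-in data; in the assignment X and t come from genData in part (b).
X = rng.normal(size=(200, 2))
t = rng.integers(0, 2, 200)

plt.scatter(X[t == 0, 0], X[t == 0, 1], color="red", s=2)   # cluster 0
plt.scatter(X[t == 1, 0], X[t == 1, 1], color="blue", s=2)  # cluster 1
plt.xlim(-5, 6)
plt.ylim(-5, 6)
plt.title("Question 1(c): sample cluster data (10,000 points per cluster)")
```

Boolean masks on t select each cluster, so the two classes get separate colours in one figure.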

In this question you will use logistic regression to generate a classifier for cluster data. You will also generate a precision-recall curve for the classifier. Use the Python class LogisticRegression in sklearn.linear_model to do the logistic regression. This class generates a Python object, much as the function Ridge did in Question 5 of Assignment 1. The class comes with a number of attributes and methods that you will find useful for answering the questions below.

(a) Use genData to generate training data consisting of two clusters with 1000 points each. Use the same cluster centers and covariance matrices as in Question 1(b).

(b) Carry out logistic regression on the data in part (a). Print out the values of the bias term, w0, the weight vector, w, and the mean accuracy of the classifier on the training data. (Accuracy is the fraction of correct predictions.)
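A sketch of the fit-and-report step with sklearn; the clusters here are a quick stand-in for the genData output:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Stand-in clusters; the assignment uses genData with the Question 1(b) parameters.
X = np.vstack((rng.normal(0, 1, (1000, 2)), rng.normal(2, 1, (1000, 2))))
t = np.concatenate((np.zeros(1000), np.ones(1000)))

clf = LogisticRegression().fit(X, t)
print("bias w0:", clf.intercept_[0])
print("weights w:", clf.coef_[0])
print("mean accuracy:", clf.score(X, t))  # fraction of correct predictions
```

intercept_ holds the bias, coef_ holds the weight vector, and score computes the mean accuracy directly.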

(c) Generate a scatter plot of the training data as in Question 1(c), and draw the decision boundary of the classifier as a black line on top of the data. Title the figure, “Question 2(c): training data and decision boundary”.

(d) Recall that the standard decision boundary tends to make the number of false positives equal to the number of false negatives. However, these two kinds of error may have different costs, and we may want to shift the decision boundary to account for this. That is, instead of defining the decision boundary by w^T x + w0 = 0, we may want to define it by w^T x + w0 = t for some threshold, t. Generate a scatter plot of the data, and plot seven different decision boundaries on top of it, for t = 3, 2, 1, 0, −1, −2, −3. Plot the decision boundary as a blue line when t is positive, as a red line when t is negative, and as a black line when t is 0. Title the figure, “Question 2(d): decision boundaries for seven thresholds”.
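Each boundary is the line w^T x + w0 = t, which can be solved for x2 and plotted; the weights below are made up for illustration (in the assignment they come from the fitted classifier):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend for scripting
import matplotlib.pyplot as plt

# Hypothetical weights; the assignment uses clf.intercept_ and clf.coef_.
w0, w = -0.2, np.array([1.0, 1.5])

x1 = np.linspace(-5, 6, 100)
for thresh in [3, 2, 1, 0, -1, -2, -3]:
    # Solve w[0]*x1 + w[1]*x2 + w0 = thresh for x2.
    x2 = (thresh - w0 - w[0] * x1) / w[1]
    color = "blue" if thresh > 0 else ("red" if thresh < 0 else "black")
    plt.plot(x1, x2, color=color)
plt.title("Question 2(d): decision boundaries for seven thresholds")
```

Since only t changes, the seven boundaries are parallel lines shifted along the weight direction.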

(e) Which of the seven values of t in part (d) gives the greatest number of false positives (i.e., false blue predictions)? Explain your answer.

(f) For t = 1, what is the probability of a point on the decision boundary being in class 1 (i.e., blue)?
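For a point on the shifted boundary, w^T x + w0 = 1, and logistic regression maps this score through the sigmoid, so the class-1 probability is sigmoid(1). A quick check:

```python
import math

# P(class 1) for a point with score w.x + w0 = 1 under the logistic model.
p = 1.0 / (1.0 + math.exp(-1.0))
print(round(p, 4))  # ≈ 0.7311
```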

(g) Use genData to generate test data consisting of two clusters with 10,000 points each. Use the same cluster centers and covariance matrices as for the training data.

(h) Use the test data to compute and print out the following values for t = 1:

• The number of predicted positives (i.e., points predicted to be in class 1)

• The number of predicted negatives (i.e., points predicted to be in class 0)

• The number of true positives (i.e., predictions for class 1 that are correct)

• The number of false positives (i.e., predictions for class 1 that are incorrect)

• The number of true negatives (i.e., predictions for class 0 that are correct)

• The number of false negatives (i.e., predictions for class 0 that are incorrect)

• The precision

• The recall


The number of predicted positives should be less than the number of predicted negatives. The number of true positives should be much greater than the number of false positives. (Explain both of these points, generating an appropriate figure to simplify your explanation. Title the figure, “Question 2(h): explanatory figure”.)
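The counts and ratios above can be computed with boolean masks. A minimal sketch on a toy label/prediction pair (the assignment uses the classifier's test-set predictions at t = 1):

```python
import numpy as np

# Toy labels and predictions for illustration only.
t_true = np.array([1, 1, 1, 0, 0, 0, 1, 0])
t_pred = np.array([1, 0, 1, 0, 1, 0, 1, 0])

TP = np.sum((t_pred == 1) & (t_true == 1))  # true positives
FP = np.sum((t_pred == 1) & (t_true == 0))  # false positives
TN = np.sum((t_pred == 0) & (t_true == 0))  # true negatives
FN = np.sum((t_pred == 0) & (t_true == 1))  # false negatives
precision = TP / (TP + FP)
recall = TP / (TP + FN)
```

Predicted positives are TP + FP and predicted negatives are TN + FN, so all eight quantities come from these four counts.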

5

(i) Use the test data to generate a precision/recall curve for the classifier. That is, plot precision vs recall for 1000 different values of the threshold, t. You should choose the range of t values so that the curve is as long as possible. You should find that 0.5 ≤ precision ≤ 1 and 0 ≤ recall ≤ 1. The result should look something like Figure 2 (although the minimum precision in this curve is different). Label the axes, and title the figure, “Question 2(i): precision/recall curve”.
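One way to trace the curve is to sweep a threshold over the raw scores and record precision and recall at each step. A sketch on stand-in Gaussian scores (in the assignment the scores are w^T x + w0 on the test data):

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in scores: higher score means more likely class 1.
scores = np.concatenate((rng.normal(-1, 1, 1000), rng.normal(1, 1, 1000)))
labels = np.concatenate((np.zeros(1000), np.ones(1000)))

precisions, recalls = [], []
for thresh in np.linspace(scores.min() - 1e-9, scores.max(), 1000):
    pred = scores > thresh
    TP = np.sum(pred & (labels == 1))
    FP = np.sum(pred & (labels == 0))
    FN = np.sum(~pred & (labels == 1))
    if TP + FP > 0:                      # precision is undefined with no positives
        precisions.append(TP / (TP + FP))
        recalls.append(TP / (TP + FN))
```

At the lowest threshold everything is predicted positive, so recall is 1 and precision is the fraction of positives in the data (0.5 for balanced classes), which is why the curve bottoms out near 0.5.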

(j) Explain why the minimum precision is 0.5.

(k) Compute and print the area under the precision/recall curve. The area should be between 0.5 and 1.0. (Recall that the area under a curve (AUC) is the area between the curve and the x axis.)
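The area can be computed with the trapezoidal rule over the recall axis. A toy example with made-up precision/recall values:

```python
import numpy as np

# Toy precision/recall points, with recall in increasing order.
recall = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
precision = np.array([1.0, 0.9, 0.8, 0.65, 0.5])

# Trapezoidal rule: sum of interval width times average height.
auc = np.sum((recall[1:] - recall[:-1]) * (precision[1:] + precision[:-1]) / 2)
print(auc)  # 0.775 for these toy values
```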


(l) Explain why the area under the curve must be between 0.5 and 1.0. (You may want to include a figure in your explanation. If so, title it, “Question 2(l): explanatory figure”.)

3. (?? points total) Multi-class Classification. In this question, you will use logistic regression and K nearest neighbors (KNN) to classify images of handwritten digits. There are ten different digits (0 to 9), so you will be using multi-class classification. To start, download and uncompress (if necessary) the MNIST data file from the course web page. The file, called mnist.pickle.zip, contains training and test data. Next, start the Python interpreter and import the pickle module. You can then read the file mnist.pickle with the following command ('rb' opens the file for reading in binary):

with open('mnist.pickle', 'rb') as f:
    Xtrain, Ytrain, Xtest, Ytest = pickle.load(f)

The variables Xtrain and Ytrain contain training data, while Xtest and Ytest contain test data. Use this data for training and testing in this question and in the rest of this assignment. Xtrain is a Numpy array with 60,000 rows and 784 columns. Each row represents a hand-written digit. Although each digit is stored as a row vector with 784 components, it actually represents an array of pixels with 28 rows and 28 columns (784 = 28 × 28). Each pixel is stored as a floating-point number, but has an integer value between 0 and 255 (i.e., the values representable in a single byte).

The variable Ytrain is a vector of 60,000 image labels, where a label is an integer between 0 and 9. For example, if row n of Xtrain is an image of the digit 7, then Ytrain[n] = 7. Likewise for Xtest and Ytest, which represent 10,000 test images.

To view a digit, you must first convert it to a 28 × 28 array using the function numpy.reshape. To display a 2-dimensional array as an image, you can use the function imshow in matplotlib.pyplot. To see an image in black-and-white, add the keyword argument cmap='Greys' to imshow. To remove the smoothing and see the 784 pixels clearly, add the keyword argument interpolation='nearest'. Try displaying a few digits as images. (Figure 3 shows an example.) For comparison, try printing them as vectors. (Do not hand this in.)
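A sketch of the reshape-and-display step; the pickle load is shown as a comment because it needs the course file, and the digit below is a stand-in array, not real MNIST data:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend for scripting
import matplotlib.pyplot as plt

# Loading (requires the course file mnist.pickle):
# import pickle
# with open('mnist.pickle', 'rb') as f:
#     Xtrain, Ytrain, Xtest, Ytest = pickle.load(f)

# Stand-in 784-vector with values in 0..255, like one row of Xtrain.
digit = np.arange(784, dtype=float) % 256

img = np.reshape(digit, (28, 28))
plt.imshow(img, cmap='Greys', interpolation='nearest')
```

reshape turns the flat 784-component row into the 28 × 28 pixel grid that imshow expects.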

This project implements matrix multiplication, binary logistic regression, multi-class classification, and gradient descent in Python, using the mnist.pickle file for the image-classification tasks. Each task is developed in a separate Python file. The first task implements matrix multiplication and displays a scatter plot. The second task performs binary logistic regression on generated data and displays a scatter plot. The third task performs multi-class classification on the MNIST file and generates a precision/recall curve. The fourth task computes the cross entropy for logistic regression. The fifth task implements the softmax function and the cross entropy related to task 4. The sixth task implements batch gradient descent for multi-class classification. The seventh task implements stochastic gradient descent and shows how the output depends on the batch size.

- The first task implements matrix multiplication. Numpy arrays are indexed from 0, not 1: for example, one element sits at row 7, column 0, and another at row 0, column 4. Given two 2-dimensional Numpy arrays A and B, we can perform both element-wise multiplication and matrix multiplication in Python. We also combine arrays with vectors: such operations are executed in parallel, broadcasting the vector across the rows or the columns of the matrix (Allison and Allison, 2012). For example, A + v adds the vector v to every row of A, which avoids explicit iteration.

- Finally, the output is displayed as a scatter plot. For the binary classification problem, we first generate and plot the two-dimensional data. The function returns the arrays X and t, and generates the two clusters from the covariance matrices Sigma0 and Sigma1.

- The data are displayed as a scatter plot with red dots for points in cluster 0 and blue dots for points in cluster 1.

This task generates a classifier for the two-cluster data, along with a precision/recall curve. We first generate the data, using the same cluster centres and covariance matrices as before, and then vary the threshold t to shift the decision boundary, plotting the resulting boundaries. The predictions are summarised by the counts of predicted positives and negatives, and of true positives, true negatives, false positives, and false negatives (Feng et al., 2016). The minimum precision is 0.5 because the two clusters contain the same number of points: as the threshold decreases, every point is eventually predicted positive, so the true positives and false positives each approach 10,000, and the precision TP/(TP + FP) approaches 10,000/20,000 = 0.5.

This task uses logistic regression and K nearest neighbours (KNN) to classify the MNIST images. The file contains the Xtrain, Ytrain, Xtest, and Ytest data; KNN classifies a point from the labels of its k nearest neighbours. The training data has 60,000 rows and 784 columns; each image is converted to a 28 × 28 array, with pixel values between 0 and 255 (Gusev and Ristov, 2013). Multiple images from the MNIST file are displayed together in a 6 × 6 grid. We also print the test accuracy of a classifier trained on the training dataset, trying values of k up to 20 and reporting the test accuracy for each.

This task uses the training data. For each training point we take a real vector x^n and a binary target t^n, and compute the cross entropy between the targets and the predictions. We supply a prediction value and an epsilon value: the epsilon keeps the logarithm well defined when a prediction is exactly 0 or 1, so we can check whether the answer is correct. Cross entropy is the loss function minimised in logistic regression (Huang, 2012), and the resulting predictions are summarised as true/false positives and negatives.
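A minimal cross-entropy implementation with the epsilon guard described above; the example vectors are made up:

```python
import numpy as np

def cross_entropy(y, t, eps=1e-12):
    """Mean binary cross entropy; eps guards against log(0)."""
    y = np.clip(y, eps, 1 - eps)
    return -np.mean(t * np.log(y) + (1 - t) * np.log(1 - y))

# Example: confident, mostly correct predictions give a low loss.
t = np.array([1.0, 0.0, 1.0])
y = np.array([0.9, 0.1, 0.8])
loss = cross_entropy(y, t)
```

Clipping means a prediction of exactly 1.0 for a true positive yields a loss near zero instead of a NaN.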

Proof

- a) The second-last equation: the gradient of the cross entropy is X^T (Y − T).

- b) Let us assume values for X, T, and Y: X^T = , Y = and T = . Apply the values to X^T (Y − T).

- c) We take i = 0 and evaluate.
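The gradient expression X^T (Y − T) can be checked numerically against a finite-difference approximation of the cross entropy; a small sketch with made-up dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, K = 6, 4, 3                        # made-up sizes for the check
X = rng.normal(size=(N, D))
T = np.eye(K)[rng.integers(0, K, size=N)]  # one-hot targets
W = rng.normal(size=(D, K))

def softmax_rows(Z):
    Z = Z - Z.max(axis=1, keepdims=True)   # numerical stability
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def loss(W):
    return -np.sum(T * np.log(softmax_rows(X @ W)))

analytic = X.T @ (softmax_rows(X @ W) - T)   # the claimed gradient

# Central finite differences, one weight at a time.
eps = 1e-6
numeric = np.zeros_like(W)
for i in range(D):
    for j in range(K):
        Wp, Wm = W.copy(), W.copy()
        Wp[i, j] += eps
        Wm[i, j] -= eps
        numeric[i, j] = (loss(Wp) - loss(Wm)) / (2 * eps)
```

The two gradients agree to within finite-difference error, supporting the identity used in part (a).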

This task implements multi-class logistic regression. First we implement the softmax function, which generalises logistic regression to k classes (Kleinbaum and Klein, 2011). Let z be a k-dimensional vector and let y be the softmax of z, defined by the k-dimensional formula. We compare two implementations, softmax and softmax1. On the input z = (0, 0) the softmax function should return (0.5, 0.5).

The naive implementation overflows for large inputs: one case returns the correct value, another raises a warning and returns (nan, 0), and a third returns (−inf, 0). Computing the first element directly gives the value 0.5. In the second part of the task we transform the vector z into z′.

b (i) Consider z_i = e^{x_i} and z′_i = e^{x_i − m}.

Hence we prove that softmax(z′) = softmax(z): the common factor e^{−m} cancels between the numerator and the denominator.

b (ii) Consider the log loss

L_i = −log(y_k)

We compute the derivative with respect to z_k, writing P_k = y_k for the softmax output of the correct class:

∂L_i/∂z_k = −(1/P_k) · (P_k (1 − P_k))
          = P_k − 1

We then implement this in Python: the function returns the two vectors y and logy, where y is the softmax of z and logy is the log of the softmax of z, computed in a numerically stable way.
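A sketch of the numerically stable softmax based on the shift property from b (i): the maximum is subtracted before exponentiating, which leaves the result unchanged but prevents overflow:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax: subtracting max(z) does not change the
    result because e^(z_i - m) / sum_j e^(z_j - m) = e^z_i / sum_j e^z_j."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()
```

Large inputs that would overflow a naive implementation now produce finite values.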

This task builds on tasks 4 and 5: multi-class classification trained with batch gradient descent. Batch gradient descent strikes a balance between the speed of stochastic gradient descent and the efficacy of full gradient descent. Using the training dataset, we compute the gradients and update the weights, then test on the MNIST data and display the results. We also define the cross entropy per data point (Zheng and Luo, 2013), where N is the number of data points in the sum. The weights are initialised from a Gaussian distribution with mean 0 and standard deviation 0.01, and the updates are repeated for 5,000 iterations. We compute the training loss, test loss, training accuracy, and test accuracy; the training loss is the loss of the fitted model on the data it was trained on. Additionally, we plot the first 200 values of the training loss.
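A compact sketch of batch gradient descent for softmax regression under stand-in dimensions (the real task uses 784 inputs, 10 classes, and 5,000 updates):

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, K = 200, 5, 3                      # stand-in sizes for illustration
X = rng.normal(size=(N, D))
T = np.eye(K)[rng.integers(0, K, size=N)]  # one-hot targets

W = rng.normal(0.0, 0.01, size=(D, K))   # Gaussian init, sd 0.01, as in the task

def forward(W):
    Z = X @ W
    Z = Z - Z.max(axis=1, keepdims=True)   # stable softmax
    Y = np.exp(Z)
    return Y / Y.sum(axis=1, keepdims=True)

def mean_cross_entropy(Y):
    return -np.mean(np.sum(T * np.log(Y + 1e-12), axis=1))

loss_before = mean_cross_entropy(forward(W))
lr = 0.5                                  # illustrative learning rate
for _ in range(200):                      # the task runs 5,000 updates
    Y = forward(W)
    W -= lr * (X.T @ (Y - T)) / N         # gradient of the mean cross entropy
loss_after = mean_cross_entropy(forward(W))
```

Each update uses the full batch, so the training loss decreases monotonically for a suitable learning rate.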

We take 200 training data points to implement the loss function.

We then take 2,000 training data points for the loss function.

This task varies the batch size, running on 500 batches of training data and computing the training and test accuracy. Stochastic gradient descent is an iterative method, also called incremental gradient descent. Instead of solving the optimisation mathematically, it computes the gradient on a small mini-batch of the training dataset, updates the weights, and then moves on to the next mini-batch. The mini-batch size is 100, so each update uses 100 points, and 500 batches are processed in total. The procedure is step by step: first compute the gradient, then update the weights, then compute the training accuracy, training loss, test accuracy, and test loss. We plot the curves for every dataset in a single graph, with blue dots for the test loss and red dots for the training loss, and finally print the values for all graphs. The training accuracy is greater than the test accuracy, and the training loss is less than the test loss. The graphs depend on the batch size: as the batch size increases, the curves take different values.
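A mini-batch SGD sketch under the same stand-in dimensions; the batch size and learning rate are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, K, batch = 500, 5, 3, 100   # the task uses mini-batches of 100 points
X = rng.normal(size=(N, D))
T = np.eye(K)[rng.integers(0, K, size=N)]  # one-hot targets
W = rng.normal(0.0, 0.01, size=(D, K))
lr = 0.1

for epoch in range(20):
    order = rng.permutation(N)             # reshuffle each pass over the data
    for start in range(0, N, batch):
        idx = order[start:start + batch]
        Z = X[idx] @ W
        Z -= Z.max(axis=1, keepdims=True)  # stable softmax
        Y = np.exp(Z)
        Y /= Y.sum(axis=1, keepdims=True)
        W -= lr * (X[idx].T @ (Y - T[idx])) / len(idx)
```

The only change from batch gradient descent is that each update sees 100 points instead of the whole training set, so the weights move after every mini-batch.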

In this project we solved matrix multiplication, binary logistic regression, and multi-class logistic regression problems in Python. Using the training data, we computed the training loss, training accuracy, test loss, and test accuracy. The first task implemented matrix multiplication and the Sigma covariance values. The second task generated the regression problem: we generated the training dataset, computed the covariance matrices, and examined how the results depend on the threshold t. The third task combined many MNIST images into a single displayed image. The fourth task computed the cross entropy mathematically. The fifth task implemented softmax for multi-class logistic regression, comparing two softmax implementations in Python. The sixth task implemented batch gradient descent for multi-class logistic regression on the MNIST dataset; we found that the training accuracy is greater than the test accuracy and the training loss is less than the test loss, and plotted graphs of these quantities. The last task implemented stochastic gradient descent, whose graphs depend on the batch size, and again computed the training accuracy, test accuracy, training loss, and test loss.

Allison, P. and Allison, P. (2012). Logistic regression using SAS. Cary, NC: SAS Institute.

Feng, W., Sarkar, A., Lim, C. and Maiti, T. (2016). Variable selection for binary spatial regression: Penalized quasi-likelihood approach. Biometrics, 72(4), pp.1164-1172.

Gusev, M. and Ristov, S. (2013). A superlinear speedup region for matrix multiplication. Concurrency and Computation: Practice and Experience, 26(11), pp.1847-1868.

Huang, T. (2012). Neural information processing. Heidelberg: Springer.

Kleinbaum, D. and Klein, M. (2011). Logistic regression. New York: Springer.

Zheng, X. and Luo, Y. (2013). Improved clonal selection algorithm for multi-class data classification. Journal of Computer Applications, 32(11), pp.3201-3205.

In this question you will generate and plot 2-dimensional data for a binary classification problem. We will call the two classes Class 0 and Class 1 (for which the target values are t = 0 and t = 1, respectively).

(a) (6 points) Write a Python function genData(mu0,mu1,Sigma0,Sigma1,N) that generates two clusters of data, one for each class. Each cluster consists of N data points. The cluster for class 0 is centred at mu0 and has covariance matrix Sigma0. The cluster for class 1 is centred at mu1 and has covariance matrix Sigma1. Note that mu0 and mu1 and all the data points are 2-dimensional vectors. Sigma0 and Sigma1 are 2 × 2 symmetric matrices that describe the shape of the clusters: the diagonal entries specify the variance of a cluster along each of the two dimensions, and the off-diagonal entries describe how correlated the two dimensions are. The function should return two arrays, X and t, representing data points and target values, respectively. X is a 2N × 2 dimensional array in which each row is a data point. t is a 2N-dimensional vector of 0s and 1s. Specifically, t[i] is 0 if X[i] belongs to class 0, and 1 if it belongs to class 1. The data for the two classes should be distributed randomly in the arrays. In particular, the data for class 0 should not all be in the first half of the arrays, with the data for class 1 in the second half.

We will model each cluster as a multivariate normal distribution. Recall that the probability density of such a distribution is given bywhere µ is the mean (cluster centre), Σ is the covariance matrix, and k is the dimensionaliy of the data (2 in our case). To generate data for a cluster, use the function multivariate normal in numpy.random. Use the function shuffle in sklearn.utils to distribute the data randomly in the arrays.

(b) (1 point) Use your function from part (a) to generate two clusters with 10,000 points each with mu0 = (0, −1), mu1 = (−1, 1) and Sigma0 = 2.0 0.5 0.5 1.0 !

Sigma1 = 1.0 −1.0

−1.0 2.0

!

You will have to encode these argument values as Numpy arrays.

−1.0 2.0

!

You will have to encode these argument values as Numpy arrays.

(c) Display the data from part (b) as a scatter plot, using red dots for points in cluster 0, and blue dots for points in cluster 1. Use the function scatter in numpy.pyplot. Specify a relatively small dot size by using the named argument s=2. Use the functions xlim and ylim to extend the x and y axes from -5 to 6. Title the plot, “Question 1(c): sample cluster data (10,000 points per cluster)”. If you have done everything correctly, the scatter plot should look something like Figure 1, which shows two heavily overlapping clusters. In particular, the2. (?? points) Binary Logistic Regression.

In this question you will use logistic regression to generate a classifier for cluster data. You will also generate a precision-recall curve for the classifier. Use the Python class LogisticRegression in sklearn.linear model to do the logistic regression. This class generates a Python object, much as the function Ridge did in Question 5 of 4

Assignment 1. The class comes with a number of attributes and methods that you will find useful for answering the questions below.

Assignment 1. The class comes with a number of attributes and methods that you will find useful for answering the questions below.

(a) Use genData to generate training data consisting of two clusters with 1000 points each. Use the same cluster centers and covariance matrices as in Question 1(b).

(b) Carry out logistic regression on the data in part (a). Print out the values of the bias term, w0, the weight vector, w, and the mean accuracy of the classifier on the training data. (Accuracy is the number of correct predictions.)

(b) Carry out logistic regression on the data in part (a). Print out the values of the bias term, w0, the weight vector, w, and the mean accuracy of the classifier on the training data. (Accuracy is the number of correct predictions.)

(c) Generate a scatter plot of the taining data as in Question 1(c), and draw the decision boundary of the classifier as a black line on top of the data. Title the figure, “Question 2(c): training data and decision boundary”.

(d) Recall that the standard decision boundary tends to make the number of false positives equal to the number of false negatives. However, these two kinds of error may have different costs, and we may want to shift the decision boundary to account for this. That is, instead of defining the decision boundary by w T x+w0 =0, we may want to define it by w T x + w0 = t for some threshold, t.Generate a scatter plot of the data, and plot seven different decision boundaries on top of it, for t = 3, 2, 1, 0, −1, −2, −3. Plot the decision boundary as a blue line when t is positive, as a red line when t is negative, and as a black line when t is 0. Title the figure, “Question 2(d): decision boundaries for seven thresholds”.

(e) Which of the seven values of t in part (d) gives the greatest number of false positives (i.e., false blue predictions)? Explain your answer.

(f) For t = 1, what is the probability of a point on the decision boundary being in class 1 (i.e., blue).

(g) Use genData to generate test data consisting of two clusters with 10,000 points each. Use the same cluster centers and covariance matrices as for the training data.

(h) Use the test data to compute and print out the following values for t = 1:

• The number of predicted positives (i.e., points predicted to be in class 1)

• The number of predicted negatives (i.e., points predicted to be in class 0)

• The number of true positives (i.e., predictions for class 1 that are correct).

• The number of false postives (i.e., predictions for class 1 that are incorrect)

• The number of true negatives (i.e., predictions for class 0 that are correct)

• The number of false negatives (i.e., predictions for class 0 that are incorrect).

• The precision.

• The recall.

• The number of predicted negatives (i.e., points predicted to be in class 0)

• The number of true positives (i.e., predictions for class 1 that are correct).

• The number of false postives (i.e., predictions for class 1 that are incorrect)

• The number of true negatives (i.e., predictions for class 0 that are correct)

• The number of false negatives (i.e., predictions for class 0 that are incorrect).

• The precision.

• The recall.

The number of predicted positives should be less than the number of predicted negatives. The number of true positives should be much greater than the number of false positives. (Explain both of these points, generating an appropriate figure to simplify your explanation. Title the figure, “Question 2(h): explanatory figure”.)

5

(i) Use the test data to generate a precision/recall curve for the classifier. That is, plot precision vs recall for 1000 different values of the threshold, t. You should choose the range of t values so that the curve is as long as possible. You should find that 0.5 ≤ precision ≤ 1 and 0 ≤ recall ≤ 1. The result should look something like Figure 2 (although the minimum precision in this curve is different). Label the axes, and title the figure, “Question 2(i): precision/recall curve”.

(j) Explain why the minimum precision is 0.5.

(k) Compute and print the area under the precision/recall curve. The area should be between 0.5 and 1.0. (Recall that the area under a curve (AUC) is the area between the curve and the x axis.)

(k) Compute and print the area under the precision/recall curve. The area should be between 0.5 and 1.0. (Recall that the area under a curve (AUC) is the area between the curve and the x axis.)

(l) Explain why the area under the curve must betwen 0.5 and 1.0. (You may want to include a figure in your explanation. If so, title it, “Question 2(l): explanatory figure”.)

3. (?? points total) Multi-class Classification. In this question, you will use logistic regression and K nearest neighbors (KNN) to classify images of handwritten digits. There are ten different digits (0 to 9), so you will be using multi-class classification. To start, download and uncompress (if necessary) the MNIST data file from the course

web page. The file, called mnist.pickle.zip, contains training and test data. Next, start the Python interpreter and import the pickle module. You can then read the file

mnist.pickle with the following command (’rb’ opens the file for reading in binary):

web page. The file, called mnist.pickle.zip, contains training and test data. Next, start the Python interpreter and import the pickle module. You can then read the file

mnist.pickle with the following command (’rb’ opens the file for reading in binary):

with open(’mnist.pickle’,’rb’) as f: Xtrain,Ytrain,Xtest,Ytest = pickle.load(f) The variables Xtrain and Ytrain contain training data, while Xtest and Yest contain test data. Use this data for training and testing in this question and in the rest of this assignment. Xtrain is a Numpy array with 60,000 rows and 784 columns. Each row represents a hand-written digit. Although each digit is stored as a row vector with 784 components, it actually represents an array of pixels with 28 rows and 28 columns (784 = 28 × 28). Each pixel is stored as a floating-point number, but has an integer value between 0 and 255 (i.e., the values representable in a single byte). The variable Ytrain is a vector of 60,000 image labels, where a label is an integer betwen 0 and 9. For example, if row n of Xtrain is an image of the digit 7, then Ytrain[n] = 7. Likewise for Xtest and Ytest, which represent 10,000 test images. To view a digit, you must first convert it to a 28 × 28 array using the function numpy.reshape. To display a 2-dimensional array as an image, you can use the function imshow in matplotlib.pyplot. To see an image in black-and-white, add the keyword argument cmap=’Greys’ to imshow. To remove the smoothing and see the 784 pixels clearly, add the keyword argument interpolation=’nearest’. Try displaying a few digits as images. (Figure 3 shows an example.) For comparison, try printing them as vectors. (Do not hand this in.)

In this project to compute the binary regression, logistic regression and generate the data using python code. We are using the mnist pickle file to run the program. We are developing the matrix multiplication, multi- class classification, binary regression and logistic regression using python. It has the separated file and using python code to develop the task. The first task is to develop the matrix multiplication and display the scatter plot. The second task to develop the binary logistic regression for generates the data and it displays the scatter plot. The third task to develop the multi classification and using the mniist file to generate the recall curve. The fourth task is logistic regression and it calculates the entropy by using the formula. The fifth task is softmax and we are calculate the cross entropy related to task 4.The sixth task is batch gradient descent and it develop the muliti class classification. The seventh task is stochastic gradient descent and it displays the output to relate to the batch size.

- In the first task to create the matrix multiplication and it calculate the values. We are create the A×A matrix. The matrix begins at 0 not 1.The first element has the 7 row and 0 column. And the second element has the 0 row and 4 columns. We are using the numpy array for create the element wise multiplication. Consider the two dimensional numpy array A and B. It perform the matrix multiplication using the python code. In additionally we using the vector. The array and the vector can be executed in parallel. If A is matrix in row and v is vector for column (Allison and Allison, 2012). If A is matrix in column and v is vector in row. We adding the matrix and vector like A+V to every element. It also avoiding the iterating function.

- Finally, the results are displayed as a scatter plot. This task also includes an additional part: for the binary classification problem, we first generate and plot two-dimensional data. The function genData should return two arrays, X and t, and generates two clusters whose shapes are given by the covariance matrices Sigma0 and Sigma1.

- The data are displayed as a scatter plot using red and blue dots, one colour per cluster: red dots for points in cluster 0 and blue dots for points in cluster 1.
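A minimal sketch of genData as described in part (a), using numpy.random.multivariate_normal for sampling; the assignment asks for sklearn.utils.shuffle, and an equivalent NumPy permutation is used here to keep the sketch self-contained:

```python
import numpy as np

def genData(mu0, mu1, Sigma0, Sigma1, N):
    """Generate N points per class from two 2-D Gaussian clusters.

    Returns X (2N x 2) and t (2N,) with the two classes randomly interleaved.
    """
    X0 = np.random.multivariate_normal(mu0, Sigma0, N)   # class 0 cluster
    X1 = np.random.multivariate_normal(mu1, Sigma1, N)   # class 1 cluster
    X = np.vstack([X0, X1])
    t = np.concatenate([np.zeros(N), np.ones(N)])
    # The assignment uses sklearn.utils.shuffle; a random permutation
    # of the rows is equivalent.
    perm = np.random.permutation(2 * N)
    return X[perm], t[perm]

# Part (b): the specified cluster parameters, encoded as Numpy arrays.
mu0 = np.array([0.0, -1.0])
mu1 = np.array([-1.0, 1.0])
Sigma0 = np.array([[2.0, 0.5], [0.5, 1.0]])
Sigma1 = np.array([[1.0, -1.0], [-1.0, 2.0]])
X, t = genData(mu0, mu1, Sigma0, Sigma1, 10000)
```

Plotting is then a matter of matplotlib.pyplot.scatter on X[t == 0] in red and X[t == 1] in blue, with a small marker size so 20,000 points stay readable.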

This task builds a classifier for the two clusters of data and also generates a precision-recall curve. First we generate the data and calculate the covariance matrix for each cluster. We use the target values t to separate the classes and plot the points. The predictions break down into true positives, true negatives, false positives and false negatives (Feng et al., 2016). Comparing the thresholds 0.4 and 0.5: 0.5 has an exact binary representation (0.1 in base 2), whereas 0.4 does not (0.0110011... repeating), so 0.5 behaves exactly in floating point while 0.4 does not. We therefore take 0.5 as the minimum precision threshold.
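The confusion counts and the precision/recall computation described above can be sketched as follows (the scores and labels are made-up illustrations, and the helper precision_recall is a hypothetical name, not part of the assignment's API):

```python
import numpy as np

# Hypothetical predicted probabilities and true labels (made-up illustration).
scores = np.array([0.1, 0.4, 0.35, 0.8, 0.7, 0.2])
t      = np.array([0,   0,   1,    1,   1,   0  ])

def precision_recall(scores, t, threshold):
    """Precision and recall for predictions thresholded at `threshold`."""
    pred = (scores >= threshold).astype(int)
    tp = np.sum((pred == 1) & (t == 1))   # true positives
    fp = np.sum((pred == 1) & (t == 0))   # false positives
    fn = np.sum((pred == 0) & (t == 1))   # false negatives
    precision = tp / (tp + fp) if tp + fp > 0 else 1.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    return precision, recall

p, r = precision_recall(scores, t, 0.5)
```

Sweeping the threshold over the sorted scores and recording (recall, precision) pairs at each value traces out the precision-recall curve.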

We use logistic regression and K-nearest neighbours (KNN). We import the MNIST images and classify them; the file contains the Xtrain, Ytrain, Xtest and Ytest data. KNN classifies a point by majority vote among its k nearest neighbours in the training set. The data have 60,000 rows and 784 columns; each image is converted to a 28 × 28 array, with pixel values between 0 and 255. The MNIST file contains many images (Gusev and Ristov, 2013); the code displays several of them in a single figure, arranged in a 6 × 6 grid. It also prints the test accuracy obtained after training on the training dataset. We take k values up to 20 and report the test accuracy for each k.
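A dependency-free sketch of the KNN classification step (knn_predict is a hypothetical helper, and the 2-D stand-in data replaces the 784-dimensional MNIST vectors for illustration):

```python
import numpy as np

def knn_predict(Xtrain, Ytrain, Xtest, k):
    """Classify each test point by majority vote among its k nearest training points."""
    preds = []
    for x in Xtest:
        dists = np.sum((Xtrain - x) ** 2, axis=1)   # squared Euclidean distances
        nearest = np.argsort(dists)[:k]             # indices of the k closest points
        labels = Ytrain[nearest]
        preds.append(np.bincount(labels).argmax())  # majority vote
    return np.array(preds)

# Made-up 2-D stand-in for the MNIST vectors: two well-separated clusters.
rng = np.random.default_rng(0)
Xtrain = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
Ytrain = np.array([0] * 50 + [1] * 50)
Xtest = np.array([[0.0, 0.0], [5.0, 5.0]])
pred = knn_predict(Xtrain, Ytrain, Xtest, k=5)
```

Repeating the prediction for k = 1 … 20 and computing the fraction of correct labels at each k gives the accuracy-versus-k results reported above.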

In this task we use the training data. We take a real vector x^{n} and a binary target t^{n} and calculate the cross entropy from these vectors. We test the implementation with sample values to check whether it is correct, supplying a prediction value and an epsilon value (used to keep the logarithms finite). Cross entropy is the loss function used here for machine learning and optimization; its value is determined by the predicted probabilities (Huang, 2012), and the predictions break down into true positives, true negatives, false positives and false negatives.
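The cross-entropy calculation with an epsilon guard can be sketched as below (binary_cross_entropy is a hypothetical helper name; eps clips the predictions away from 0 and 1 so the logarithms stay finite):

```python
import numpy as np

def binary_cross_entropy(y, t, eps=1e-12):
    """Mean binary cross entropy; y are predicted probabilities, t are 0/1 targets."""
    y = np.clip(y, eps, 1 - eps)   # avoid log(0)
    return -np.mean(t * np.log(y) + (1 - t) * np.log(1 - y))

# Confident correct predictions give near-zero loss; confident wrong
# predictions give a large but finite loss thanks to the eps clipping.
good = binary_cross_entropy(np.array([0.99, 0.01]), np.array([1, 0]))
bad  = binary_cross_entropy(np.array([0.01, 0.99]), np.array([1, 0]))
```

Without the clipping, a prediction of exactly 0 or 1 on the wrong side would make the loss infinite, which is exactly the failure case the epsilon test value is meant to exercise.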

Proof

- a) The second-last equation gives the gradient of the cross-entropy loss with respect to the weights:

∇_W E = X^{T} (Y − T)

- b) Assume values for X, T and Y: let X^{T} = …, Y = … and T = …, and substitute them into X^{T} (Y − T).

- c) Take i = 0; then the i-th component of the gradient follows from the same expression.
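The gradient identity from part (a), ∇_W E = X^{T} (Y − T), can be checked numerically by comparing it against central finite differences of the loss (a sketch on a small random dataset; the X, t and w here are not the values assumed in part (b)):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, X, t):
    """Binary cross-entropy loss of logistic regression with weights w."""
    y = sigmoid(X @ w)
    return -np.sum(t * np.log(y) + (1 - t) * np.log(1 - y))

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 3))
t = (rng.random(20) < 0.5).astype(float)
w = rng.normal(size=3) * 0.1

# Analytic gradient from the identity in part (a).
y = sigmoid(X @ w)
analytic = X.T @ (y - t)

# Central finite differences, one weight at a time.
h = 1e-6
numeric = np.zeros(3)
for i in range(3):
    e = np.zeros(3)
    e[i] = h
    numeric[i] = (loss(w + e, X, t) - loss(w - e, X, t)) / (2 * h)
```

If the derivation is right, the two vectors agree to several decimal places, which is a standard sanity check before using the gradient in training.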

In this task we implement multi-class logistic regression, which is a generalization of binary logistic regression (Kleinbaum and Klein, 2011). First we implement the softmax function. Consider a k-dimensional vector z and let y equal the softmax of z; we use the k-dimensional softmax formula to implement the function. We compare two implementations, softmax(z) and softmax1(z). For a uniform two-component input, the softmax function should return the values (0.5, 0.5).

The first case returns the correct value; the second case raises a warning and returns (nan, 0); the third case should return (−inf, 0). Computing the first element's value correctly returns 0.5. In the second part of the task we transform the vector z into z′.

b (i) Consider z′ with components z′_i = z_i − m, so that e^{z′_i} = e^{z_i − m}.

Hence we prove that softmax(z′) = softmax(z):

softmax(z′)_i = e^{z_i − m} / Σ_j e^{z_j − m} = (e^{−m} e^{z_i}) / (e^{−m} Σ_j e^{z_j}) = softmax(z)_i

b (ii) Consider the log loss for the true class k,

L_i = −log(y_k)

We compute the derivative with respect to z_k:

∂L_i/∂z_k = −(1/y_k) · (y_k (1 − y_k))

= (y_k − 1)

We implement the softmax function in Python. It returns two vectors, y and logy: y is the softmax of z, and logy is the log of the softmax of z.
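A sketch of a numerically stable implementation returning both y and logy, using the shift by max(z) justified in part b (i):

```python
import numpy as np

def softmax(z):
    """Return (y, logy), computed stably by subtracting max(z) first.

    Subtracting m = max(z) leaves the result unchanged (part b(i))
    but prevents overflow in exp for large inputs.
    """
    m = np.max(z)
    logsum = m + np.log(np.sum(np.exp(z - m)))   # log of the normalizer
    logy = z - logsum
    return np.exp(logy), logy

y, logy = softmax(np.array([0.0, 0.0]))
y_big, _ = softmax(np.array([1000.0, 1000.0]))   # naive exp(1000) would overflow
```

Computing logy directly as z − logsum, rather than as log of the exponentiated result, is what keeps the log probabilities accurate even when some y components underflow to zero.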

This task builds on tasks 4 and 5: binary logistic regression and multi-class classification trained with batch gradient descent. Batch gradient descent strikes a balance between stochastic gradient descent and the efficiency of full gradient descent. We use the training dataset to implement it: compute the gradients, then update the weights. We train and test on the MNIST data and display the results as a plot. We also define the cross entropy per data point (Zheng and Luo, 2013), dividing by N, the number of data points in the sum. We initialize the weights from a Gaussian distribution with mean 0 and standard deviation 0.01, and repeat the weight updates for 5,000 iterations, computing the training loss, training accuracy, test loss and test accuracy. Training loss measures how well the fitted model matches the training data. Additionally, we plot the graph for 200 training points, which we use to track training accuracy and training loss, and likewise calculate the test loss and test accuracy.
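The loop described above (Gaussian-initialized weights, full-batch gradient, cross entropy per data point) can be sketched on made-up stand-in data; the shapes, learning rate and iteration count here are illustrative assumptions, not the MNIST values:

```python
import numpy as np

def softmax_rows(Z):
    """Row-wise softmax, stabilized by subtracting each row's max."""
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

# Made-up stand-in data: 200 points, 4 features, 3 classes (not the MNIST arrays).
rng = np.random.default_rng(2)
N, D, K = 200, 4, 3
X = rng.normal(size=(N, D))
labels = rng.integers(0, K, size=N)
T = np.eye(K)[labels]                     # one-hot targets

W = rng.normal(0.0, 0.01, size=(D, K))    # weights ~ N(0, 0.01), as in the text
lr = 0.1
losses = []
for step in range(200):
    Y = softmax_rows(X @ W)
    # Cross entropy per data point: divide the sum by N.
    losses.append(-np.mean(np.sum(T * np.log(Y + 1e-12), axis=1)))
    grad = X.T @ (Y - T) / N              # full-batch gradient
    W -= lr * grad
```

With near-zero initial weights the first loss sits close to log K, and each full-batch step should push it down, which is the curve the task plots.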

We take 200 training points to implement the loss function.

We take 2,000 training points for the loss function.

In this task we vary the batch size, running on 500 training points and computing the training accuracy and test accuracy. Stochastic gradient descent (SGD) is an iterative method, also called incremental gradient descent. We take the training dataset and iterate over it; SGD is a standard way of solving this machine learning problem, and here we implement it in Python, with the code counting the number of iterations it performs. At each step we take a mini-batch of the training data, compute the gradient on that small subset, update the weights, and then move on to the next mini-batch. The mini-batch size is 100, so each step processes 100 points. The procedure is step by step: first compute the gradient and update the weights, then compute the training accuracy and training loss and the test accuracy and test loss. We plot every run in a single graph. We have already plotted the training and test loss: the blue dots denote the test loss and the red dots denote the training loss. Finally, we print all values for all graphs. The training accuracy is greater than the test accuracy, and the training loss is less than the test loss. The graphs depend on the batch size: as the batch size changes, the curves take different values.
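The mini-batch procedure (shuffle, take batches of 100, compute the gradient on each batch, update the weights, then record the loss each pass) can be sketched as follows, again on made-up stand-in data rather than the MNIST arrays:

```python
import numpy as np

def softmax_rows(Z):
    """Row-wise softmax, stabilized by subtracting each row's max."""
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def full_loss(W, X, T):
    Y = softmax_rows(X @ W)
    return -np.mean(np.sum(T * np.log(Y + 1e-12), axis=1))

# Made-up data standing in for MNIST; mini-batch size 100 as described above.
rng = np.random.default_rng(3)
N, D, K, batch = 1000, 5, 3, 100
X = rng.normal(size=(N, D))
labels = rng.integers(0, K, size=N)
T = np.eye(K)[labels]

W = rng.normal(0.0, 0.01, size=(D, K))
lr = 0.1
init_loss = full_loss(W, X, T)
epoch_losses = []
for epoch in range(20):
    perm = rng.permutation(N)                 # reshuffle each pass
    for start in range(0, N, batch):
        idx = perm[start:start + batch]       # one mini-batch of 100 points
        Y = softmax_rows(X[idx] @ W)
        W -= lr * (X[idx].T @ (Y - T[idx]) / batch)
    epoch_losses.append(full_loss(W, X, T))   # record loss after each pass
```

Recording the full-data loss once per pass, rather than per mini-batch, gives the smoother per-epoch curves that the plots compare across batch sizes.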

This project solves matrix multiplication, binary logistic regression and multi-class logistic regression in Python. Using the training data, we find the training loss, training accuracy, test loss and test accuracy; each problem is handled with Python code. The first task calculates matrix products in Python and also sets up the Sigma values. The second task generates the data for the regression problem: we generate the training dataset and calculate the covariance matrices, with the class of each point given by its t value. The third task combines many images and displays them in a single figure, using the MNIST dataset. The fourth task calculates the cross entropy mathematically. The fifth task is softmax, which underlies multi-class logistic regression; we compare two softmax implementations written in Python. The sixth task is batch gradient descent, based on multi-class logistic regression and again using the MNIST dataset, implemented in Python. We find that the training accuracy is greater than the test accuracy and the training loss is less than the test loss, and we plot graphs for the MNIST dataset. The last task is stochastic gradient descent: its graphs depend on the batch size, and we again compute the training accuracy, test accuracy, training loss and test loss.

Allison, P. and Allison, P. (2012). Logistic regression using SAS. Cary, NC: SAS Institute.

Feng, W., Sarkar, A., Lim, C. and Maiti, T. (2016). Variable selection for binary spatial regression: Penalized quasi-likelihood approach. Biometrics, 72(4), pp.1164-1172.

Gusev, M. and Ristov, S. (2013). A superlinear speedup region for matrix multiplication. Concurrency and Computation: Practice and Experience, 26(11), pp.1847-1868.

Huang, T. (2012). Neural information processing. Heidelberg: Springer.

Kleinbaum, D. and Klein, M. (2011). Logistic regression. New York: Springer.

Zheng, X. and Luo, Y. (2013). Improved clonal selection algorithm for multi-class data classification. Journal of Computer Applications, 32(11), pp.3201-3205.
