Kaggle Comp. Instructions and Requirements

Instructions and Submission Requirements for Kaggle Competition

Task

To participate in the competition, you must provide a list of predicted outputs for the instances on the
Kaggle website. To solve the problem, we expect you to try the following methods:

• A baseline of SVM or logistic regression: using your own implementation or using a library.

• Any other ML method of your choice. Be creative! Some suggestions are neural networks trained by backpropagation, k-NN, random forests, kernelized SVMs, CNNs, etc.

For the Kaggle competition, you can submit results from your best-performing system.
Note: We suggest that you start early, so that you have enough time to submit multiple times and get a sense of how well you are doing.
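
As a rough illustration of the logistic regression baseline listed above, the sketch below trains a small hand-written model with batch gradient descent and checks it on a held-out validation split. It uses only the Python standard library. The toy data, the 80/20 split, the learning rate, and all function names are assumptions made for this example, not part of the assignment. A library implementation (for instance scikit-learn's LogisticRegression) would be an equally valid choice.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logreg(X, y, lr=0.1, epochs=200, l2=0.0):
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Optional L2 penalty on the weights (not on the bias).
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    # Threshold the predicted probability at 0.5.
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
            for xi in X]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data (made up): two Gaussian blobs, one per class.
rng = random.Random(0)
X, y = [], []
for label, center in ((0, -1.5), (1, 1.5)):
    for _ in range(100):
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)

# Shuffle the indices, then hold out 20% of the data for validation.
idx = list(range(len(X)))
rng.shuffle(idx)
split = int(0.8 * len(idx))
train_idx, val_idx = idx[:split], idx[split:]
X_train, y_train = [X[i] for i in train_idx], [y[i] for i in train_idx]
X_val, y_val = [X[i] for i in val_idx], [y[i] for i in val_idx]

w, b = train_logreg(X_train, y_train)
val_acc = accuracy(y_val, predict(X_val, w, b))
print(f"validation accuracy: {val_acc:.3f}")
```

Tracking a validation score like this between Kaggle submissions gives a local estimate of progress that does not depend on the public leaderboard.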

In addition to your methods, you must write a report that details your pre-processing, validation, algorithmic, and optimization techniques, and that provides your Kaggle results for comparison. The report should contain the following sections and elements:

• Feature design: Describe and justify your pre-processing methods, and how you designed
and selected your features.

• Algorithms: Give an overview of the learning algorithms used, without repeating details already covered in the class notes (e.g., the SVM derivation) unless you judge it necessary.

• Methodology: Include any decisions about training/validation split, distribution choice for Naive Bayes, regularization strategy, any optimization tricks, setting hyper-parameters, etc.

• Results: Present a detailed analysis of your results, including graphs and tables as appropriate.

This analysis should be broader than just the Kaggle result: include a short comparison of the most important hyper-parameters and all the methods you implemented.

• Discussion: Discuss the pros and cons of your approach and methodology, and suggest areas of future work.

• References (optional).

• Appendix (optional). Here you can include additional results, more details on the methods, etc.

The main text of the report should not exceed 6 pages. The references and appendix do not count toward the 6-page limit.

We expect you to follow these rules:

• You must submit the code developed during the project. The code must be well documented.
The code should include a README file containing instructions on how to run the code.

• Make sure to fix the random seeds so that rerunning your code reproduces exactly the predictions in your submitted prediction file.

• You should submit your results in .csv format. More information about the correct structure and format can be found on the Kaggle website (go to: Overview → Evaluation).

• You must submit a written report according to the general layout described above.
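
The seed and .csv rules above can be sketched as follows. This is a minimal illustration, not the required format: the column names "Id" and "Category", the seed value, and both function names are assumptions for this example, and the authoritative format is the one given on the Kaggle Evaluation page. The stand-in "model" simply draws random labels so that the effect of the seed is visible.

```python
import csv
import io
import random

SEED = 42  # arbitrary; any fixed value works as long as it is recorded


def make_predictions(n_instances, seed=SEED):
    """Stand-in for a real model: returns pseudo-random 0/1 labels."""
    rng = random.Random(seed)  # local generator, independent of global state
    return [rng.randint(0, 1) for _ in range(n_instances)]


def write_submission(predictions, handle):
    """Write predictions as CSV with a header row (column names are assumed)."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["Id", "Category"])
    for i, label in enumerate(predictions):
        writer.writerow([i, label])


# Two independent runs with the same seed produce identical file contents.
first, second = io.StringIO(), io.StringIO()
write_submission(make_predictions(5), first)
write_submission(make_predictions(5), second)
print(first.getvalue() == second.getvalue())  # True: same seed, same file
```

To write a real file, pass a handle opened with open("submission.csv", "w", newline="") instead of the in-memory buffer. Libraries with their own random number generators need their own seeding calls as well, for example numpy.random.seed or torch.manual_seed.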

Marks will be awarded as follows: 50% for performance on the private test set in the competition and 50% for the written report. For the competition, the performance grade will be calculated as follows: the top team, according to the score on the private test set, will receive 100%. A team that does not beat the basic baseline entered by the instructor will score 0%. All other grades will be calculated by interpolating the private test set scores between those two extremes.
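
To make the grading scheme concrete, here is a small sketch of how a performance grade could be computed. It assumes the interpolation is linear and that a higher score is better; the text above does not specify either, so treat this as one possible reading. The function names and the numbers in the example are made up for illustration.

```python
def performance_grade(score, baseline, top):
    """Map a private-test score to a 0-100 grade by linear interpolation.

    Assumes higher scores are better and top > baseline.
    """
    if score <= baseline:
        return 0.0  # did not beat the instructor's baseline
    if score >= top:
        return 100.0  # top team
    return 100.0 * (score - baseline) / (top - baseline)


def final_mark(performance, report):
    """Combine the two components, each out of 100, with equal weight."""
    return 0.5 * performance + 0.5 * report


# Made-up numbers, for illustration only.
grade = performance_grade(0.75, baseline=0.5, top=1.0)
print(grade)                  # 50.0
print(final_mark(grade, 80))  # 65.0
```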

For the written report, the evaluation criteria include:

• Technical correctness of the description of the algorithms (may be validated with the submitted code).

• Meaningful analysis of final and intermediate results.

• Clarity of descriptions, plots, figures, tables.

• Organization and writing. Please use a spell-checker and don’t underestimate the power of a well-written report.

Do note that the grading of the report will emphasize the rationale behind the pre-processing and optimization techniques. The code should be clear enough to reflect the logic articulated in the report.
We are looking for a combination of insight and clarity when grading the reports.
