In this project, you are expected
(1) to select a particular area of Machine Learning that interests you,
(2) to conduct a literature search on this area,
(3) to focus on a specific problem in the area you selected, and
(4a) to design and implement a novel learning scheme or
(4b) to extend an existing scheme to deal with the problem you have identified.
(4c) Alternatively, you can compare the performance of different existing schemes on the specific problem you have identified in (1), (2) and (3), or on a particular real-world data set. This must not be one of the benchmark data sets, such as those in the UCI repository; it must be a data set of interest to industry or research.
Project Suggestions
Design a scheme for combining learning methods that have different strengths and weaknesses. The scheme should benefit from each method's advantages without suffering from its individual weaknesses.
Ensemble-based combination schemes often perform more accurately than a single "best classifier". Investigate the relationship between the accuracy of the individual combined classifiers and that of their combination.
Identify an area of Natural Language Processing that could be handled by a machine learning method (for example, the translation of certain prepositions from one language to another), propose a method for automatically constructing a training set for that problem from raw text and a lexicon, and apply one or several learning algorithms to that data set.
Implement a program for detecting domain-specific keywords in a collection of texts written for that domain.
If you have a data set of interest to you (for example, from a past or present job, or another academic project), evaluate the performance of standard learning techniques on that set, identify particular properties of your data set that may negatively affect the learning performance, and devise and implement a scheme for addressing them.
Design a method for generating new features and selecting the most useful ones for a given learning task.
Use the Mixture-of-Experts framework with different learning schemes. Is it a useful scheme for combining different classifiers?
Design and implement a concept-learner (or extend an existing concept-learner) for dealing with class imbalance, i.e., the situation where a training set contains many more positive than negative examples, or the other way around.
Design and implement a concept-learner (or extend an existing concept-learner) for dealing with the case of small disjuncts and rare cases.
Compare the performance of a number of unsupervised classifiers used in supervised mode to the performance of supervised classifiers.
Compare the performance of combination methods such as bagging or boosting when used with different learning methods.
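As a starting point for the suggestion on the relationship between individual and combined accuracy, here is a minimal sketch, not a prescribed method. It assumes an idealized setting that real ensembles rarely satisfy: a two-class problem, an odd number of classifiers, every classifier correct with the same probability p, and errors that are independent across classifiers. Under those assumptions the accuracy of a simple majority vote is a binomial tail probability. The function name majority_vote_accuracy is illustrative only.

```python
from math import comb


def majority_vote_accuracy(n_classifiers: int, p: float) -> float:
    """Accuracy of a majority vote over independent two-class classifiers.

    Assumptions (idealized, for illustration only): n_classifiers is odd,
    every classifier is correct with the same probability p, and the
    classifiers' errors are independent of one another.
    """
    if n_classifiers < 1 or n_classifiers % 2 == 0:
        raise ValueError("n_classifiers must be a positive odd integer")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    majority = n_classifiers // 2 + 1
    # Add up the probability that exactly k classifiers are correct,
    # for every k large enough to form a majority.
    return sum(
        comb(n_classifiers, k) * p**k * (1 - p) ** (n_classifiers - k)
        for k in range(majority, n_classifiers + 1)
    )


if __name__ == "__main__":
    for p in (0.4, 0.5, 0.6, 0.7):
        row = ", ".join(
            f"n={n}: {majority_vote_accuracy(n, p):.3f}" for n in (1, 3, 11, 51)
        )
        print(f"p={p}: {row}")
```

Under these assumptions the vote improves on its members only when p is above 0.5 and does worse than them when p is below 0.5. Real classifiers make correlated errors, so measured ensemble accuracy will generally depart from this idealized curve; quantifying that departure is one way to approach the project.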
Your report should contain:
A statement of the problem you are studying.
A review of the related literature on the topic and a discussion of where your study fits within this literature.
A description of the method you have designed or of the methods you are comparing. Assume that the reader does not know how the systems you have designed and/or used work.
A description of the data to which you applied your research (this description should include the number of features, the values these features can take, the size of the data set, the sizes of the training and testing sets, etc.).
A description of the methodology you used to set the various learning parameters of the systems you tested and a discussion of the optimal settings you found. This is particularly relevant for Neural Networks, for example, where the number of hidden units, learning rates, momentum, number of RBFs, etc. have to be chosen by the user. The idea here is that your results should be reproducible by anyone reading your paper.
A description of your testing methodology (e.g., 10-fold cross-validation) and a discussion of why this testing methodology is appropriate.
A description of your results. Think of the format that would best illustrate the points you are trying to make. Should you list your results in a table? Represent them with a graph? What sort of graph? Which results are necessary to report?
A discussion of your results, i.e., a section that explains why, in your opinion, the results you reported were obtained: why the learners you considered were successful or why they failed. If you want, you can also discuss what you think would happen under conditions different from those you specifically tested.
A discussion of the relevance of your results: what have you achieved with your study? How do your results support the claims you have made in the earlier parts of your report?
A section discussing future work. There, you should try to identify sets of experiments that would be interesting to run and discuss why they would be interesting (i.e., which issues such experiments would test).
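The 10-fold cross-validation named above as an example testing methodology can be sketched as follows. This is a minimal illustration under stated assumptions, not a required implementation: the helper names (k_fold_indices, cross_validated_accuracy, majority_class_learner) are hypothetical, the split is a plain shuffled split rather than a stratified one, and a learner is assumed to be a function that takes training examples and labels and returns a prediction function. The fixed random seed is there so that the split, and therefore the reported numbers, can be reproduced.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence, Tuple


def k_fold_indices(n_examples: int, k: int = 10, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_examples) into k (train_indices, test_indices) pairs."""
    if not 2 <= k <= n_examples:
        raise ValueError("k must satisfy 2 <= k <= n_examples")
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)  # fixed seed: the same split every run
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most one
    splits = []
    for i, test in enumerate(folds):
        # Training set = every fold except the one held out for testing.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


def cross_validated_accuracy(
    examples: Sequence,
    labels: Sequence,
    train_fn: Callable[[list, list], Callable],
    k: int = 10,
    seed: int = 0,
) -> float:
    """Mean test accuracy over the k folds.

    train_fn(train_examples, train_labels) is assumed to return a function
    predict(example) -> label.
    """
    accuracies = []
    for train, test in k_fold_indices(len(examples), k, seed):
        predict = train_fn([examples[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(examples[i]) == labels[i] for i in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)


def majority_class_learner(train_examples: list, train_labels: list) -> Callable:
    """Baseline learner: always predict the most frequent training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda example: most_common


if __name__ == "__main__":
    xs = list(range(100))
    ys = ["a"] * 80 + ["b"] * 20
    print(cross_validated_accuracy(xs, ys, majority_class_learner))
```

A majority-class baseline like the one used here is a useful reference point when reporting results, especially on imbalanced data, where a stratified split may be a better choice than the plain shuffle assumed in this sketch.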