DIT380/TDA231: Algorithms for Machine Learning and Inference
January-March 2011.
Machine Learning means inferring hypotheses from given data or experience, resulting in a model that generalizes the data and, hopefully, draws correct conclusions. In this sense, algorithms (implemented as computer programs) that generate such hypotheses can "learn" from data and improve their performance. "Learning" algorithms do not contain direct instructions
for solving a problem; rather, we only tell them how to adapt themselves to
new data. This is appropriate for complex problems where we lack complete
knowledge and even an explicit specification. Machine learning methods are
commonly used in classification tasks (recognizing objects, understanding
situations, predicting future data, etc.) and in expert systems (e.g., for
diagnosis). A commercially and scientifically important area of application
is Data Mining, where such algorithms are used to detect relevant information
and patterns in large databases. Another major field is Bioinformatics.
You won't see many implementation details of specific learning algorithms in
this course. (We assume that you are already sufficiently familiar with the
programming aspects of Computer Science.) Rather, the course deals with the
principal and general issues of Machine Learning: Why do these algorithms
"learn"? What is behind them? We also refer to existing implementations that
you can use "off the shelf". What really matters is knowing how to
apply them and interpret their results properly. Nevertheless, you should notice that
most of the algorithms work in a rather simple fashion and could be
implemented from scratch using only standard programming skills (and enough
time).
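To make the last remark concrete: a learner such as the perceptron (a topic of Lecture 8) fits in a few lines of standard code. The following is only an illustrative sketch; the toy data set and parameter defaults are invented here and are not part of the course material.

```python
# A perceptron learner written from scratch. The data and the parameter
# defaults below are hypothetical, chosen only for illustration.

def train_perceptron(examples, epochs=20, lr=1.0):
    """Learn weights w and bias b from (features, label) pairs, labels in {-1, +1}."""
    n = len(examples[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, y in examples:
            # Evaluate the current hypothesis on this example ...
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            # ... and adapt the weights only when it is wrong:
            if y * score <= 0:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

# Toy linearly separable data (logical AND, with labels in {-1, +1}).
data = [([0, 0], -1), ([1, 0], -1), ([0, 1], -1), ([1, 1], 1)]
w, b = train_perceptron(data)
predictions = [1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
               for x, _ in data]
```

Note that the program never states the decision rule; the rule emerges from the updates on the data, which is exactly the "adapting to data" described above.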
Course Goals
After the course you should
- have learned sound mathematical foundations for inferring hypotheses and models from data on scientific grounds,
- know a representative set of available Machine Learning approaches,
- to some extent be able to evaluate the methods qualitatively and quantitatively, and to recognize both their strengths and limitations.
Practical Information
Prerequisites: Some elementary knowledge of discrete mathematics,
combinatorics, probability theory, mathematical statistics, and calculus is essential. A basic course in algorithms and data structures is also a useful
prerequisite, but you will need "algorithmic thinking" more than
specific algorithms. Here is a little
prerequisite self-test.
Instructor:
Peter Damaschke (ptr, room 6478)
Assistants:
Azam Sheikh Muhammad (azams, room 6453), Bassel Mannaa (bassel)
Lectures:
Tuesday, 10:00-11:45, room HC1
Thursday, 13:15-15:00, room HC2
Consultation time: by appointment, in office 6478. Book a consultation time if you have any course-related questions or difficulties you want to
discuss, alone or in small groups.
Possible changes in the schedule will be announced.
Examination and Grading Criteria
To pass the course, you must do (1) the compulsory exercises during
the course, and (2) a take-home exam with summarizing questions.
Details will be announced. Some practical exercises may be part of the
hand-ins, but no programming exercises will be given. We do not use a formal
point system but apply the following grading criteria:
5/VG: Solutions are correct and clear, apart from minor difficulties.
4/G: Solutions are mainly correct, but with some substantial difficulties.
3/G: Solutions show a basic understanding, but exercises have been
managed only after several attempts, or mistakes or gaps remain.
U: The understanding is in general insufficient, major parts of the
exercises have not been solved.
You may improve your grade later on, by an individual follow-up
assignment; however, you must show a real improvement to reach the next higher level.
Rules and Policies
Read them carefully and take them very seriously.
- Deadlines are firm. Delays must be justified before the deadline.
Unannounced late submissions will not be considered. It is everyone's
responsibility to observe the deadlines.
- If you get feedback on the exercises, complete and improve your solutions
accordingly and resubmit as soon as possible. The earlier you resubmit, the
better your chances. The number of resubmissions is not limited, up to a final deadline for all exercises at the end of the course.
- It is allowed, even encouraged, to discuss the exercises during the course. Also, do not hesitate to ask if you have difficulties with the exercises, or if something is unclear.
- However, you must write your final solutions on your own, using your own
words, and express them in the way you understood them yourself.
- Submitting others' work in your own name is cheating! It can lead to severe consequences, in very bad cases even suspension from studies.
- Specifically, it is prohibited to copy from each other, to copy from books, articles, web pages, etc., or to submit solutions that you got from
other persons, unless you explicitly quote the sources and add your own
explanations. And: a copy with modifications is, in this sense, still a copy.
- You are also responsible for not giving others the opportunity to copy from your work. (We will not investigate who copied from whom.)
- In the take-home exam you must work completely on your own and direct
questions to the teachers only.
Exercise Submission
Solutions to the theoretical hand-in exercises must be submitted
individually (not in groups) by email to "ptr..." The subject line
should contain the course code and exercise number. Please send only plain text or PDF attachments, no other formats. Write your name and personal
number also on the submitted attachments (as we may want to print them).
Literature
The course does not exactly follow a particular book, but it can be helpful
to read alongside "Mitchell: Machine Learning (McGraw-Hill 1997)", ISBN
0-07-042807-7, which should be available at Cremona. Chapters 2-8 were earlier
used as the course book. The book is somewhat outdated, but it still gives readable introductions to the main topics.
Other books that may be of interest are:
Russell, Norvig: Artificial Intelligence. (A good general AI book with several chapters related to this course. It has also been used in the AI course.)
Bishop: Pattern Recognition and Machine Learning. (See here.)
Fayyad et al: Advances in Knowledge Discovery and Data Mining.
Michie, Spiegelhalter, Taylor: Machine Learning, Neural and Statistical
Classification. (This text is
freely available.)
Grünwald: The Minimum Description Length Principle. (More specialized and advanced! Selected chapters are
freely available.)
Contents
Lecture 1. Scope of machine learning, typical problems, basic
framework, inductive bias.
Figures.
Lecture 2. Version space, classification, learning by queries,
candidate elimination.
Figures.
Lecture 3. Remarks on the inductive bias, computational issues,
noisy data, Minimum Description Length (MDL) principle, the decision
tree "language".
Figures.
More figures.
Lecture 4. Description length of decision trees vs. Boolean
expressions, ID3 algorithm, overfitting, validation.
Figures.
Lecture 5. Ensemble methods. Probability and learning: entropy
again, Bayesian learning, prior and posterior probabilities, MAP and ML hypotheses.
Lecture 6. Bayesian learning as an extension of concept learning, small
examples, regression and ML hypotheses, about learning probabilistic
concepts.
Lecture 7. Bayesian classifiers: Optimal, Gibbs, and Naive; the space
of linear separators.
Figures.
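To give a flavor of the Naive classifier from this lecture, here is a minimal sketch of a Naive Bayes learner over binary attributes, using Laplace smoothing; the toy data and all names are invented for illustration and are not taken from the lecture.

```python
# A tiny Naive Bayes classifier over binary attributes. Training data and
# attribute names below are hypothetical.
from collections import defaultdict

def train_naive_bayes(examples):
    """examples: list of (attribute_tuple, label). Returns count tables."""
    label_counts = defaultdict(int)
    attr_counts = defaultdict(int)  # (label, position, value) -> count
    for x, y in examples:
        label_counts[y] += 1
        for i, v in enumerate(x):
            attr_counts[(y, i, v)] += 1
    return label_counts, attr_counts

def classify(x, label_counts, attr_counts):
    """Pick the label maximizing P(y) * prod_i P(x_i | y)."""
    total = sum(label_counts.values())
    best, best_p = None, -1.0
    for y, cy in label_counts.items():
        p = cy / total
        for i, v in enumerate(x):
            # Laplace smoothing, assuming two possible values per attribute.
            p *= (attr_counts[(y, i, v)] + 1) / (cy + 2)
        if p > best_p:
            best, best_p = y, p
    return best

# Toy training set: (sunny?, windy?) -> play tennis?
data = [((1, 0), "yes"), ((1, 1), "yes"), ((0, 1), "no"), ((0, 0), "no")]
lc, ac = train_naive_bayes(data)
```

The "naive" assumption is visible in `classify`: the attribute probabilities are simply multiplied, i.e., the attributes are treated as conditionally independent given the label.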
Lecture 8. Training perceptrons, inductive bias and training of sigmoid
neural networks.
Figures.
Lecture 9. Support vector machines. Sample size estimates, probably
approximately correct (PAC) learning.
Figures.
Lecture 10. VC dimension, mistake bounds, boosting again.
Figures.
Lecture 11. Perceptron mistake bound, nearest neighbor learning,
discretization of a numerical attribute.
Figures.
Lecture 12. Learning relevant attributes.