COMS 4771

Thursday, April 17, 2008

Final

The final is worth 25 points (+7 bonus points), covering the lectures after the midterm. The following lectures will not be covered:

Nearest neighbor methods (covered by the final assignment)
Prediction bounds (the last lecture before the midterm)
Large-scale learning (exponentiated gradient, VW); the standard gradient method may still be covered.

If necessary, the final will be curved.

Wednesday, April 16, 2008

Since a number of you chose the reading assignment option, the oral quiz is not going to happen (even at 10 minutes per paper, the quiz would take about 10 hours). We will have a written quiz instead. The quiz will take 30 minutes per paper and will start at 1pm on May 1. (If you are doing only one paper, you can either come at 1pm and leave at 1:30pm, or come at 1:30pm and leave at 2pm.) The final will be held at 2:40pm the same day.

1) Please email me IF you want to do paper #2. (As I recall, nobody expressed interest in reading it, so unless I hear from someone, the paper is off the list.)

2) Please email me IF there is any time conflict.

3) Please don't email me asking what type of questions there will be on the quiz. You should understand the papers in depth. You CAN use the papers on the quiz, but you can't use any other material.

Saturday, April 5, 2008

Questions about final projects

- In Problem #1, do we have to come up with a new method, or can we use an already existing one (I was thinking about k-means, for example)?

You can use an existing method, provided that you are happy with how it performs on the problem. The webpage describing the dataset has a list of test error rates achievable by different methods, so you can see how well you are doing.

- How about the code? Do we have to provide it as well? Does it have to be 100% original, or can we use (and maybe adapt) toolboxes? (I saw, for example, that you pointed us to Weka.) Are there restrictions on the language?

You have to provide the code. You can use toolboxes. The code doesn't have to be original if you can make it work well on the problem. The grade *will* depend on the performance of other students. There is no restriction on the programming language as long as you make it easy for me to run your solution to verify the reported test error rates. (Again, if you somehow use the test set to tune your solution, you will automatically get 0 points.)

- If we turn in some projects before May 1st, will they be graded earlier (so that we get an idea whether we should attempt others)?

Yes, but every student will be given only one additional attempt.

An important note: If you choose to do a reading assignment with a quiz, I will subtract points if you clearly don't understand an important concept from the paper. So choose this option only if you are serious about it.

Wednesday, April 2, 2008

Nearest Neighbor Methods

Dimensionality reduction:
Isomap
Locally Linear Embedding (LLE)
Maximum variance unfolding

Distance Metric Learning for Large Margin Nearest Neighbor Classification

Nearest neighbor schemes:
Piotr Indyk's tutorial
Ken Clarkson's tutorial

Cover trees
Locality-Sensitive Hashing (LSH)
ANN: A Library for Approximate Nearest Neighbor Searching

Test set bound

Pantelis Monogioudis, a student in the class, implemented the test set bound in MATLAB. The implementation is here.
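
For those not using MATLAB, here is a minimal Python sketch of one common form of the test set bound: given k errors on m held-out examples, it inverts the binomial tail to find the largest true error rate p that still produces k or fewer errors with probability at least delta. This is an illustration only, not the implementation linked above; the function names (binomial_cdf, error_upper_bound) and the bisection tolerance are my own assumptions.

```python
from math import exp, lgamma, log


def binomial_cdf(k, m, p):
    """P[X <= k] for X ~ Binomial(m, p), computed in log space."""
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if k >= m else 0.0
    total = 0.0
    for i in range(k + 1):
        # log of C(m, i) * p^i * (1 - p)^(m - i)
        log_term = (lgamma(m + 1) - lgamma(i + 1) - lgamma(m - i + 1)
                    + i * log(p) + (m - i) * log(1.0 - p))
        total += exp(log_term)
    return min(total, 1.0)


def error_upper_bound(k, m, delta, tol=1e-10):
    """Upper bound on the true error rate, holding with probability
    at least 1 - delta over the draw of the m test examples.

    Returns (approximately) the largest p with P[X <= k] >= delta.
    """
    if k >= m:
        return 1.0
    lo, hi = k / m, 1.0
    # binomial_cdf is decreasing in p, so bisect on p.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if binomial_cdf(k, m, mid) >= delta:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    # Hypothetical numbers: 10 errors on 100 test examples, delta = 0.05.
    print(error_upper_bound(10, 100, 0.05))
```

With zero test errors the bound has a closed form, 1 - delta^(1/m), which is a quick sanity check for the bisection.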

Tuesday, April 1, 2008

Large Scale Learning

A link to Vowpal Wabbit (Fast Online Learning at Yahoo! Research).
A link to Hadoop, also here and here.

Friday, March 28, 2008

Active Learning

D. Cohn, L. Atlas, R. Ladner. Improving generalization with active learning, Machine Learning, 15(2), 1994.

S. Dasgupta, D.J. Hsu, and C. Monteleoni. A general agnostic active learning algorithm.
Neural Information Processing Systems (NIPS), 2007.

S. Dasgupta. Coarse sample complexity bounds for active learning.
Neural Information Processing Systems (NIPS), 2005.

S. Dasgupta, A. Kalai, and C. Monteleoni. Analysis of perceptron-based active learning.
Eighteenth Annual Conference on Learning Theory (COLT), 2005.

The A² algorithm described in class:
M-F. Balcan, A. Beygelzimer, and J. Langford. Agnostic Active Learning, ICML 2006.
Video lecture on A².

Hanneke, S. (2007). Teaching Dimension and the Complexity of Active Learning. In Proceedings of the 20th Annual Conference on Learning Theory (COLT).

Hanneke, S. (2007). A Bound on the Label Complexity of Agnostic Active Learning. In Proceedings of the 24th Annual International Conference on Machine Learning (ICML).

M-F. Balcan, A. Broder, and T. Zhang. Margin Based Active Learning, COLT 2007.