Talk:MLRG/spring09
Jan 15, 2009
On the issue of representation
It seems like the problem of PAC-learning DNF formulae might be related to the hardness of approximation? Intuitively, we're placing a distance function on the concepts, and we want a solution that's epsilon-close. The argument appears to be that if you could do this, you could solve an NP-complete problem, which is exactly how hardness of approximation works. So the question is: is there any general result of the form "any problem hard to approximate under the RP != NP assumption is hard to PAC-learn"? --Schizoid 10:31, 15 January 2009 (MST)
Arvind
- I do not understand why we define two classes (or two sets), C and H. What I understand is that there should be one hypothesis class (let's say H) and, apart from that, just the target concept c (not a whole class), and we measure whether c(x) = h(x) for the examples x drawn from the distribution.
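For reference, a minimal sketch of the standard setup, paraphrasing the chapter's definitions (notation may differ slightly from the book): the target concept c is drawn from the concept class C, the learner outputs a hypothesis h from a (possibly larger) class H, and agreement is measured over the distribution D rather than pointwise. The error of a hypothesis is

<math>\mathrm{err}_D(h) = \Pr_{x \sim D}[\,h(x) \neq c(x)\,],</math>

and C is PAC learnable (using H) if for every c in C, every D, and every ε, δ in (0,1), the learner outputs some h in H satisfying

<math>\Pr[\,\mathrm{err}_D(h) \leq \epsilon\,] \geq 1 - \delta</math>

from a sample of size polynomial in 1/ε and 1/δ. Allowing H to be larger than C (definition 4) is what "learning C by H" refers to, and it is exactly the point where the two classes stop being interchangeable.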
Jan 29, 2009
Learning in the Presence of Noise
Adam
- The second paragraph in the introduction to the chapter states that much of the theory developed for the non-noise case is preserved in the presence of noise. It even says that the examples it showed to be efficiently PAC learnable without noise continue to be so even when there is noise. Furthermore, the book proves (and states in section 5.4) that concept classes that are learnable from statistical queries ... are efficiently PAC learnable in the presence of classification noise. Has it been proven for the general case that if we can efficiently PAC learn a concept class using a particular representation in the absence of noise, then it is necessarily possible to efficiently learn that concept class from statistical queries? If not, has it been proven that this is not the case (and if so, is there a simple counterexample)?
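For anyone who wants the statistical query model spelled out, a rough sketch (paraphrased; the book's formulation may differ in details): instead of seeing labeled examples, the SQ learner asks queries (χ, τ), where χ is a polynomially evaluatable predicate on labeled examples and τ is an inverse-polynomial tolerance, and the oracle answers with any value v satisfying

<math>\bigl|\,v - \Pr_{x \sim D}[\,\chi(x, c(x)) = 1\,]\bigr| \leq \tau.</math>

A class is SQ learnable if it can be learned from polynomially many such queries alone, which is why SQ algorithms tolerate classification noise: each query answer can be simulated from noisy examples.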
Feb 5, 2009
The Vapnik-Chervonenkis Dimension
Adam (These questions are bound to expose my ignorance.)
- From definitions 2 and 4 in the book (chapter 1), I get the idea that "PAC learnable" and "efficiently PAC learnable" are one and the same. Is that right?
- Piyush: Definitions 2 and 4 are both about efficient PAC learnability. The difference is that definition 4 talks about learning a concept class C using a more expressive hypothesis class H (to get around intractability issues). The most basic definition is actually Definition 1, which doesn't impose the conditions required for efficient PAC learnability (i.e., the polynomial running-time requirement in terms of the various parameters).
- The book defines what it means to have VCD(C) = ∞. Are there any interesting examples of concept classes with infinite VC dimension?
- John Moeller: FYI, "\infty" = ∞
- Piyush: Interesting example: an SVM with a Gaussian kernel of almost zero width. Uninteresting example (if you care!): a lookup table.
- How can we tell (and does it make sense to even ask this question?) if a concept class with infinite VC dimension (whatever that is) is PAC learnable or not? (It isn't clear to me based on theorems 3.3, 3.4, and 3.5.)
- Piyush: If the VC dimension of a concept class is infinite, it is not PAC learnable in the distribution-free sense (at least formally).
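A rough sketch of why (constants dropped; see theorems 3.3 and 3.5 in the chapter for the precise statements): in the distribution-free setting, the number of examples needed is controlled on both sides by the VC dimension d = VCD(C). Roughly,

<math>m = O\!\left(\frac{1}{\epsilon}\log\frac{1}{\delta} + \frac{d}{\epsilon}\log\frac{1}{\epsilon}\right)</math> examples suffice, while <math>m = \Omega\!\left(\frac{d}{\epsilon}\right)</math> examples are necessary.

With d = ∞ the lower bound rules out any finite sample size that works for every distribution, which is the formal content of the statement above. (For a fixed, known distribution the situation can be different, but that is outside the standard PAC definition.)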
Piyush
- (A question and a possible answer) Okay, so if I were correct in what I said just above, then it's a bit pessimistic for cases where the concept class does have an infinite VC dimension. Indeed, we know that k-nearest neighbors (kNN) theoretically has infinite VC dimension. The same is true for an SVM with a Gaussian kernel of very small width. Then why is it that they both do so well in practice? The answers in both cases, I think, are possibly related: with SVMs, as the Gaussian kernel width goes to zero (and the VC dimension goes to infinity), the classifier starts behaving like a kNN classifier. However, kNN is a memory-based learning algorithm, and I think kNN is not studied in the PAC learning framework in the first place (someone please correct me if I am wrong here). And since an SVM with a Gaussian kernel of almost zero width and kNN are essentially equivalent, I think the behavior of such an SVM classifier cannot be justified using the PAC learning framework. Such an SVM works for the same reasons kNN does. Thoughts? (A small code sketch of the SVM/kNN similarity follows the next question below.)
- A follow-up on the above: What should a kernel-based SVM classifier be called: model-based or memory-based? I think the latter, since in this case we need to keep all the data in memory?
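A quick numerical sketch of the SVM-vs-kNN point using scikit-learn (the dataset, the gamma/C values, and the agreement check are illustrative choices only, not anything from the book): an RBF-kernel SVM with a very narrow kernel (large gamma) and a large C tends to memorize the training labels and to predict, on new points, much like a 1-nearest-neighbor classifier.

# Sketch: an RBF-SVM with a very narrow Gaussian kernel behaves much like 1-NN.
# All parameter choices (gamma, C, dataset) are illustrative, not canonical.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# A small 2-D toy problem.
X, y = make_moons(n_samples=400, noise=0.15, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Narrow kernel (large gamma) + large C: the SVM essentially memorizes the
# training labels, i.e. it can fit arbitrary labelings of the training points,
# which is the practical face of infinite VC dimension as gamma -> infinity.
svm = SVC(kernel="rbf", gamma=200.0, C=1e6).fit(X_train, y_train)
knn = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)

print("SVM training accuracy:", svm.score(X_train, y_train))
print("1-NN training accuracy:", knn.score(X_train, y_train))

# On held-out points the two decision rules tend to agree on most inputs,
# which is the informal sense in which the narrow-kernel SVM "is" a
# nearest-neighbor classifier.
agree = (svm.predict(X_test) == knn.predict(X_test)).mean()
print("Test-set agreement between SVM and 1-NN:", agree)
print("SVM test accuracy:", svm.score(X_test, y_test))
print("1-NN test accuracy:", knn.score(X_test, y_test))

Running it and varying gamma is instructive: with a moderate gamma the SVM generalizes better than 1-NN, and as gamma grows the two classifiers' predictions converge, matching the intuition in the bullet above.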