Last year, I read a blog post from Brendan O'Connor entitled "Statistics vs. Machine Learning, fight!" that discussed some of the differences between the two fields. Andrew Gelman responded favorably to this:
Simon Blomberg:
From R's fortunes
…
Can anyone tell me what is meant by the phrase 'weak learner'? Is it supposed to be a weak hypothesis? I am confused about the relationship between a weak learner and a weak classifier. Are they the same, or is there some difference?
In the AdaBoost…
In some sense this is a cross-post of mine from math.stackexchange, and I have the feeling that this site might reach a broader audience.
I am looking for a mathematical introduction to machine learning, in particular literature that can be…
I am new to machine learning. I am taking a machine learning course (Stanford University) and I did not understand what is meant by this theory or what its utility is. I am wondering if someone could explain this theory to me in detail.
This theory is…
I wonder why we use the Gaussian assumption when modelling the error. In Stanford's ML course, Prof. Ng describes it in basically two ways:
It is mathematically convenient. (It's related to Least Squares fitting and easy to solve with…
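For context, the standard derivation behind the "mathematically convenient" point (a sketch in Ng's notation, added here and not part of the original question): assume

```latex
y^{(i)} = \theta^{\top} x^{(i)} + \epsilon^{(i)}, \qquad
\epsilon^{(i)} \sim \mathcal{N}(0, \sigma^2) \text{ i.i.d.}
```

Then the log-likelihood of $\theta$ over $m$ examples is

```latex
\ell(\theta)
= \sum_{i=1}^{m} \log \frac{1}{\sqrt{2\pi}\,\sigma}
  \exp\!\left( -\frac{\big(y^{(i)} - \theta^{\top} x^{(i)}\big)^2}{2\sigma^2} \right)
= m \log \frac{1}{\sqrt{2\pi}\,\sigma}
  - \frac{1}{2\sigma^2} \sum_{i=1}^{m} \big( y^{(i)} - \theta^{\top} x^{(i)} \big)^2,
```

so maximizing the likelihood in $\theta$ is exactly minimizing the sum of squared errors, i.e. least-squares fitting.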
The 'fundamental' idea of statistics for estimating parameters is maximum likelihood. I am wondering what the corresponding idea in machine learning is.
Qn 1. Would it be fair to say that the 'fundamental' idea in machine learning for estimating…
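To make the maximum-likelihood idea concrete (a minimal illustration added here, not part of the question): for i.i.d. Bernoulli data, the likelihood is maximized at the sample mean.

```python
import math

def bernoulli_log_likelihood(p, data):
    """Log-likelihood of i.i.d. Bernoulli(p) observations (data of 0s and 1s)."""
    return sum(math.log(p) if x == 1 else math.log(1.0 - p) for x in data)

data = [1, 0, 1, 1, 0, 1, 1, 1]   # made-up sample
mle = sum(data) / len(data)       # for Bernoulli, the MLE is the sample mean
print(mle)                        # 0.75
# The MLE attains a higher log-likelihood than other candidate values:
print(bernoulli_log_likelihood(mle, data) > bernoulli_log_likelihood(0.5, data))
```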
I have come across some basic ways to measure the complexity of neural networks:
Naive and informal: count the number of neurons, hidden neurons, layers, or hidden layers
VC-dimension (Eduardo D. Sontag [1998] "VC dimension of neural networks"…
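For the naive count in the first item, a throwaway helper makes the bookkeeping explicit (a sketch added here; the layer sizes below are made up, and only plain fully connected feed-forward nets are assumed):

```python
# Naive complexity measures: count units and parameters of a fully
# connected feed-forward net, given its layer sizes.
def count_complexity(layer_sizes):
    """layer_sizes: [input, hidden..., output], e.g. [10, 32, 16, 1]."""
    hidden_neurons = sum(layer_sizes[1:-1])
    # weights plus biases between each pair of consecutive layers
    parameters = sum(a * b + b for a, b in zip(layer_sizes, layer_sizes[1:]))
    return {
        "layers": len(layer_sizes),
        "hidden_layers": len(layer_sizes) - 2,
        "hidden_neurons": hidden_neurons,
        "parameters": parameters,
    }

print(count_complexity([10, 32, 16, 1]))
```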
I want to get deeper into machine learning (theory and applications in finance). How relevant are complex analysis and functional analysis as a basis for machine learning? Do I need to learn these subjects, or should I concentrate…
I've been reading Shalev-Shwartz & Ben-David's book, "Understanding Machine Learning", which presents PAC theory in Part I. While the theory of PAC learnability does appear very elegant and remarkable to me, I'm not so sure about its…
I am trying to answer the following question: "How much (binary) data do I need for my learner to have seen every variable of the dataset at least once?" In my set-up I am feeding my algorithm binary vectors (i.e. with all elements equal to either 1…
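One way to explore this question empirically (a sketch added here, under an assumed generative model: i.i.d. vectors whose bits are independently 1 with probability p, with "seen" taken to mean the variable was 1 at least once; the actual data model in the question may differ):

```python
import random

def draws_until_all_seen(n_vars, p, rng):
    """Number of random binary vectors drawn until every variable
    has been 1 at least once (coupon-collector-style simulation)."""
    seen = [False] * n_vars
    draws = 0
    while not all(seen):
        draws += 1
        for i in range(n_vars):
            if rng.random() < p:
                seen[i] = True
    return draws

rng = random.Random(0)
trials = [draws_until_all_seen(20, 0.1, rng) for _ in range(200)]
print(sum(trials) / len(trials))  # average number of vectors needed
```

Averaging over many trials gives a rough empirical answer for a given sparsity p; the required sample size grows roughly logarithmically in the number of variables.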
For classification, what theoretical results relate the cross-validation estimate of accuracy to generalisation accuracy?
I am particularly asking about results in a PAC-like framework where no assumption is made that your function class contains…
I have recently been reading a 2001 paper: Michael D. Ernst, Jake Cockrell, William G. Griswold, and David Notkin, "Dynamically Discovering Likely Program Invariants to Support Program Evolution", TSE 2001. In this paper, it says:
Learning approach such as…
I recently came across a topic known as PAC-Bayesian, but I cannot find a source to read about it. Every article I have come across discusses its application in a specific area, but there is no introduction to what it exactly is.
All common treatments of PAC bounds based on Rademacher complexity assume a bounded loss function (for a self-contained treatment, see this handout by Schapire). However, I could not find any results for unbounded losses (with finite moments or…
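For reference, the kind of bounded-loss statement being referred to (a standard form, added here for context, for losses taking values in $[0,1]$): with probability at least $1-\delta$ over an i.i.d. sample of size $n$, uniformly over $f \in \mathcal{F}$,

```latex
\mathbb{E}\big[\ell(f(X), Y)\big]
\;\le\;
\frac{1}{n} \sum_{i=1}^{n} \ell\big(f(x_i), y_i\big)
\;+\; 2\,\mathfrak{R}_n\big(\ell \circ \mathcal{F}\big)
\;+\; \sqrt{\frac{\log(1/\delta)}{2n}},
```

where $\mathfrak{R}_n$ denotes the Rademacher complexity of the loss class. Boundedness of $\ell$ is what allows the final concentration term via McDiarmid's inequality, which is exactly what fails for unbounded losses.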
I have a regression problem. The aim is to estimate the best-fitting curve from a set of features. I have extracted a set of features that are relevant based on the literature.
Now the performance with the present set of features is not…