Introduction to machine learning for mathematicians

Question

In some sense this is a crosspost of mine from math.stackexchange, but I have the feeling that this site might reach a broader audience.

I am looking for a mathematical introduction to machine learning. In particular, much of the literature I have found is relatively imprecise, and many pages are spent without any real content.

However, starting from such literature, I discovered Andrew Ng's Coursera courses, Bishop's book on pattern recognition, and finally a book by Smola. Unfortunately, Smola's book is only in draft state. It even contains proofs, which appeals to me. Bishop's book is already quite good, but a certain amount of rigor is missing.

In short: I am looking for a book like Smola's, that is, one that is as precise and rigorous as possible and makes use of mathematical background (though short introductions are of course OK).

Any recommendations?

It looks like the question is unfinished - it breaks off after "and". – J W Mar 25 '15 at 20:30
You might want to explain why a mathematician wants to learn about machine learning (to find a job as a data scientist, to do research, etc.), which will help people point you in the right direction. – seanv507 Mar 25 '15 at 20:46
@sean, I want to find a job as a data scientist. Besides, it's an interesting topic. I just finished my PhD in mathematics, specifically in the field of PDEs. Further focal points of my studies were numerics, differential geometry, geometric modelling and computational geometry. However, in Germany, the market for PDE guys is almost nonexistent, especially if you want to leave the universities (at least in that field). Having spent some time on mathematics, I want to stay in contact with math. I think ML is ideal for that. – Quickbeam2k1 Mar 25 '15 at 21:23
For data science I would argue you need a basic understanding of statistics (e.g. linear/logistic regression), experimental design (e.g. A/B testing), and in addition an understanding of recommender system techniques. – seanv507 Mar 25 '15 at 22:58
@Mike Miller, Smola's draft book has only little content. The book Learning with Kernels, posted by Marc Claesen, seems more fitting to me. @seanv507: What if you are a data scientist in the field of (geometric) pattern recognition and algorithm development? – Quickbeam2k1 Mar 26 '15 at 05:16
@Quickbeam2k1 - I don't recognise the field - can you give examples? I was just pointing out where the majority of the jobs are. – seanv507 Mar 26 '15 at 12:09
@seanv507: Assume you have CAD data and you need to classify it according to shape (easiest cases: donuts, spheres, balls, cylinders). New algorithms, e.g. for faster search, are required. When working in the area of trading systems and risk control, e.g. in banks, you need to analyse data. Additionally, you might also need to improve existing algorithms because of different or new models or a lack of speed. Or, maybe the most prominent example: working on autonomous cars. – Quickbeam2k1 Mar 26 '15 at 12:31
I second @seanv507's comment: yes, there are a few exciting jobs in machine learning. Sure, aim for them. However, I think the majority of "machine learning" jobs are actually more like "automated basic statistics" (as featured in Ng's course, btw). – P.Windridge Mar 26 '15 at 14:50
Has my question been undeleted? Somehow it was lost before today? – Quickbeam2k1 Apr 01 '15 at 08:30

score 15 · Answer 1 · edited Mar 26 '15 at 10:09

I would recommend Elements of Statistical Learning (free PDF file). It has sufficient maths and a good introduction to all the relevant techniques - together with some insights on why the techniques work (and when they don't).

Also consider An Introduction to Statistical Learning, which is more practical (how to do it in R). There is an accompanying online course on statistical learning; you might find the lectures on YouTube, and the book is again available as a free PDF.

answered Mar 25 '15 at 20:44 by seanv507 · edited Mar 26 '15 at 10:09 by Peter Mortensen

That is a very nice recommendation. In addition to this, I suggest "Learning from Data" by Yaser S. Abu-Mostafa. It is heavily theoretical but explains very clearly topics such as feasibility of learning and VC dimension. There are videos and slides available [online](https://work.caltech.edu/telecourse.html). – tiagotvv Mar 26 '15 at 10:29
I second the suggestion of "Learning from Data" by Yaser S. Abu-Mostafa. The book is very short but packed with valuable information. Much focus is indeed put on feasibility of learning and complexity. – Vladislavs Dovgalecs Mar 31 '15 at 23:23

score 11 · Accepted Answer · answered Mar 31 '15 at 23:11

For what you describe, I highly recommend "Foundations of Machine Learning" by Mohri et al. It is an undergraduate text, but it is for really good undergraduates. It is readable, and it is the only place I have found what I would call a mathematical definition of machine learning (PAC and weak PAC). It is worth reading for that reason alone. I also have a math PhD. I'm familiar with, and like, many of the books mentioned above. I'm particularly fond of ESL for a broad spectrum of techniques and ideas, but it's a statistics book with lots of mathematics.
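
For readers who have not met the terms, the two definitions mentioned above can be sketched roughly as follows. This is a paraphrase in commonly used notation, not a quotation from the book: the symbols ($\mathcal{C}$ for the concept class, $h_S$ for the hypothesis the algorithm returns on sample $S$, $R$ for the generalization error, $n$ for the input dimension, $\operatorname{size}(c)$ for the representation cost of a concept) are assumptions chosen for illustration.

```latex
% Sketch only; notation is an assumption, not quoted from any particular text.
% Generalization error of a hypothesis h with respect to target c and distribution D:
\[
  R(h) = \Pr_{x \sim D}\bigl[\, h(x) \neq c(x) \,\bigr].
\]

% PAC learnability: arbitrarily small error with arbitrarily high confidence.
A concept class $\mathcal{C}$ is \emph{PAC-learnable} if there exist an algorithm
$\mathcal{A}$ and a polynomial $p$ such that for all $\varepsilon, \delta \in (0,1)$,
all distributions $D$ on the input space, and all targets $c \in \mathcal{C}$,
\[
  \Pr_{S \sim D^m}\bigl[\, R(h_S) \le \varepsilon \,\bigr] \ge 1 - \delta
  \quad \text{whenever } m \ge p\bigl(1/\varepsilon,\, 1/\delta,\, n,\, \operatorname{size}(c)\bigr).
\]

% Weak PAC learnability: only slightly better than random guessing is required.
$\mathcal{C}$ is \emph{weakly PAC-learnable} if there exist an algorithm $\mathcal{A}$,
a constant $\gamma > 0$, and a polynomial $p$ such that for all $\delta \in (0,1)$,
all distributions $D$, and all targets $c \in \mathcal{C}$,
\[
  \Pr_{S \sim D^m}\bigl[\, R(h_S) \le \tfrac{1}{2} - \gamma \,\bigr] \ge 1 - \delta
  \quad \text{whenever } m \ge p\bigl(1/\delta,\, n,\, \operatorname{size}(c)\bigr).
\]
```

The only difference is the accuracy requirement: a fixed edge $\gamma$ over chance in the weak case versus every $\varepsilon > 0$ in the strong case.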

answered Mar 31 '15 at 23:11 by meh

Btw, I'm told that Schapire, in his thesis, proved that weak PAC implies PAC. His proof amounts to the boosting technique, so it's a nice example of how a theoretical question led to a very practical result. – meh Mar 31 '15 at 23:14
Thanks for your remarks. I think I will work with ESL later, after working with Mohri's and Shalev-Shwartz's books. – Quickbeam2k1 Apr 01 '15 at 09:59
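
The idea in the comment above, that weak hypotheses can be combined into a strong one, can be illustrated with a small sketch. Note the assumptions: this is AdaBoost (a later algorithm by Freund and Schapire), not the construction from Schapire's original proof; the weak learner is a one-dimensional threshold "stump"; and the toy data `XS`, `YS` and all function names are made up for illustration.

```python
import math

# Toy 1-D data (an assumption for illustration): no single threshold separates it.
XS = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
YS = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]


def stump_predict(x, threshold, polarity):
    """Weak hypothesis: predict `polarity` left of the threshold, else its negation."""
    return polarity if x < threshold else -polarity


def best_stump(xs, ys, weights):
    """Exhaustively pick the stump with the smallest weighted training error."""
    candidates = [x + 0.5 for x in xs] + [min(xs) - 0.5]
    best = None
    for threshold in candidates:
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Return a list of (alpha, threshold, polarity) triples."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = best_stump(xs, ys, weights)
        if err == 0:
            # A perfect stump: use it alone from here on and stop.
            ensemble.append((1.0, threshold, polarity))
            break
        if err >= 0.5:
            # No edge over random guessing: the weak-learning assumption fails.
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, threshold, polarity))
        # Increase the weight of misclassified points, decrease the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def ensemble_predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


def training_error(ensemble, xs, ys):
    """Fraction of training points the ensemble misclassifies."""
    wrong = sum(1 for x, y in zip(xs, ys) if ensemble_predict(ensemble, x) != y)
    return wrong / len(xs)


for rounds in (1, 3):
    print(rounds, training_error(adaboost(XS, YS, rounds), XS, YS))
```

On this toy data a single stump misclassifies three of the ten points, while three boosted rounds fit all ten. That is only a statement about training error on one example; the actual theorem concerns generalization error and is proved in the books mentioned in this thread.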

score 9 · Answer 3 · edited Mar 26 '15 at 10:08

You will probably like Learning With Kernels by Schölkopf and Smola. Most of Schölkopf's work is mathematically rigorous.

That said, you are probably better off reading research papers instead of textbooks. Research papers contain full derivations and proofs of convergence, bounds on performance, etc., which are very often not included in textbooks. A good place to start is the Journal of Machine Learning Research (JMLR), which is highly regarded and fully open access. I also recommend the proceedings of conferences like ICML, NIPS, COLT and IJCNN.

answered Mar 25 '15 at 19:52 by Marc Claesen · edited Mar 26 '15 at 10:08 by Peter Mortensen

Thanks for the hints about the journal. However, I fear that the journals are, so far, too advanced for me. Nevertheless, this might be a valuable source for the future. – Quickbeam2k1 Mar 25 '15 at 20:42

score 5 · Answer 4 · answered Mar 31 '15 at 23:16

I would suggest Understanding Machine Learning: From Theory to Algorithms by Shai Shalev-Shwartz and Shai Ben-David. I admit that I have read only small portions of it, but I immediately noticed the rigor with which the authors approach every problem and discussion.

answered Mar 31 '15 at 23:16 by Vladislavs Dovgalecs
