I came across the term Factorization Machines in recommender systems. I know what Matrix Factorization is for recommender systems, but I have never heard of Factorization Machines. So what's the difference?
3 Answers
Matrix factorization is a method to, well, factorize matrices. It does one job: it decomposes a matrix into two matrices whose product closely approximates the original matrix.
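As a rough illustration of that one job, here is a minimal sketch that factorizes a small, fully observed ratings matrix with a truncated SVD. The matrix `R`, the helper `factorize`, and the rank `k` are made up for this example. In a real recommender most entries are missing, and the factors are usually fit on the observed entries only (e.g. with SGD or ALS), so treat the SVD here as a shortcut for brevity rather than how it is done in practice.

```python
import numpy as np

def factorize(R, k):
    """Rank-k factorization R ~= P @ Q.T of a fully observed matrix via truncated SVD."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    root = np.sqrt(s[:k])            # split each singular value between the two factors
    P = U[:, :k] * root              # (n_users, k) user factors
    Q = Vt[:k, :].T * root           # (n_items, k) item factors
    return P, Q

# Hypothetical 4 users x 3 movies ratings matrix
R = np.array([[5.0, 3.0, 1.0],
              [4.0, 2.0, 1.0],
              [1.0, 1.0, 5.0],
              [1.0, 2.0, 4.0]])

P, Q = factorize(R, k=2)
R_hat = P @ Q.T                      # product of the two factors approximates R
print(np.round(R_hat, 2))
```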
Factorization Machines, by contrast, are much more general than Matrix Factorization. The problem formulation itself is very different: an FM is formulated as a linear model, with additional terms for the interactions between pairs of features. These interaction weights are not free parameters; each one is the dot product of the two features' latent vectors. So along with latent-space feature interactions like those in Matrix Factorization, an FM also learns a linear weight for each feature.
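For reference, the second-order model equation from the original FM paper can be written as follows, where $n$ is the number of features, $w_0$ is a global bias, $w_i$ is the linear weight of feature $i$, and $\mathbf{v}_i \in \mathbb{R}^k$ is its latent vector:

$$\hat{y}(\mathbf{x}) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle \mathbf{v}_i, \mathbf{v}_j \rangle \, x_i x_j$$

The first two terms are an ordinary linear model; the last term is the factorized pairwise interaction.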
So compared to Matrix Factorization, here are key differences:
- In recommender systems, where Matrix Factorization is generally used, plain Matrix Factorization cannot use side features. For example, in a movie recommendation system we cannot feed the movie's genre, its language, etc. into Matrix Factorization; the factorization itself has to learn these from the existing interactions. With Factorization Machines, we can pass this information in as features.
- Factorization Machines can also be used for other prediction tasks such as regression and binary classification. This is usually not the case with Matrix Factorization.
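To make the two points above concrete, here is a small sketch of how side features enter an FM: they are simply extra columns in the feature vector. The feature layout (3 users, 2 movies, 2 genres, one scaled real-valued feature), the `encode` helper, and the random untrained parameters are all assumptions for illustration. The prediction uses the linear-time rewrite of the pairwise term from the FM paper, and is checked against the naive double loop.

```python
import numpy as np

# Hypothetical layout: [users | movies | genres | one real-valued feature]
N_USERS, N_MOVIES, N_GENRES = 3, 2, 2
N_FEATURES = N_USERS + N_MOVIES + N_GENRES + 1
K = 4  # latent dimension

def encode(user, movie, genre, year_scaled):
    """One-hot user, movie and genre, plus a real-valued side feature."""
    x = np.zeros(N_FEATURES)
    x[user] = 1.0
    x[N_USERS + movie] = 1.0
    x[N_USERS + N_MOVIES + genre] = 1.0
    x[-1] = year_scaled
    return x

def fm_predict(x, w0, w, V):
    """Second-order FM score, pairwise term computed in O(k * n)."""
    xv = x @ V                                   # (k,) sum_i v_if * x_i
    pairwise = 0.5 * np.sum(xv ** 2 - (x ** 2) @ (V ** 2))
    return w0 + w @ x + pairwise

def fm_predict_naive(x, w0, w, V):
    """Same score with the explicit double sum over feature pairs."""
    score = w0 + w @ x
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            score += (V[i] @ V[j]) * x[i] * x[j]
    return score

rng = np.random.default_rng(0)
w0 = 0.1
w = rng.normal(scale=0.1, size=N_FEATURES)       # untrained, random parameters
V = rng.normal(scale=0.1, size=(N_FEATURES, K))

x = encode(user=1, movie=0, genre=1, year_scaled=0.5)
score = fm_predict(x, w0, w, V)                  # use directly for regression
prob = 1.0 / (1.0 + np.exp(-score))              # one common choice for binary classification
```

The raw score can serve as a regression output; passing it through a sigmoid is one common way to get a binary classification probability, not the only one.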
The paper shared in the previous answer is the original paper on FMs. It also has a great illustrative example of what exactly an FM is.
Edit: A note on side features that can be used in Factorization Machines but not Matrix factorization:
Matrix Factorization is purely a collaborative filtering approach, which needs user engagement with the items. So it doesn't work for what is called the "cold start" problem. Think of a new movie released on Netflix. As no one has watched it yet, matrix factorization has nothing to learn from. But since Netflix knows the genre, actors, director, etc., a Factorization Machine can kick-start the recommendations for this movie from day one, which is crucial for many websites that use recommendation systems.
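A toy sketch of the cold-start case, under stated assumptions: the feature layout, the random "trained" parameters, and the choice to zero out the new movie's ID parameters (standing in for "never updated") are all hypothetical. The point is only that the score for an unseen movie still varies with its genre, because the user-genre interaction has been learned from other movies.

```python
import numpy as np

def fm_score(x, w0, w, V):
    """Second-order FM score (linear-time form of the pairwise term)."""
    xv = x @ V
    return w0 + w @ x + 0.5 * np.sum(xv ** 2 - (x ** 2) @ (V ** 2))

# Hypothetical layout: [user_0, user_1 | movie_0, movie_1, movie_new | action, drama]
N, K = 7, 3
NEW_MOVIE, ACTION, DRAMA = 4, 5, 6

rng = np.random.default_rng(0)
w0 = 3.0
w = rng.normal(scale=0.1, size=N)        # stand-ins for trained parameters
V = rng.normal(scale=0.5, size=(N, K))

# Assumption: the new movie has no interactions, so its ID parameters carry no signal
w[NEW_MOVIE] = 0.0
V[NEW_MOVIE] = 0.0

def encode(user, movie_col, genre_col):
    x = np.zeros(N)
    x[user] = 1.0
    x[movie_col] = 1.0
    x[genre_col] = 1.0
    return x

# Same user, same unseen movie ID, two possible genres
as_action = fm_score(encode(0, NEW_MOVIE, ACTION), w0, w, V)
as_drama = fm_score(encode(0, NEW_MOVIE, DRAMA), w0, w, V)
print(as_action, as_drama)               # the scores differ only through the genre terms
```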

Just an extension to Dileep's answer.
If the only features involved are two categorical variables (e.g. users and items), then FM is equivalent to the matrix factorization model (with global, user, and item bias terms). But FM can easily be applied to more than two features, and to real-valued features.
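To spell out what "equivalent" means here: with a feature vector that is just a one-hot user concatenated with a one-hot item, only one pair of features is non-zero, so the FM equation collapses to `w0 + w_u + w_i + <v_u, v_i>`. That is matrix factorization with a global bias and user/item biases, as shown in the FM paper. A quick numerical check with random parameters (sizes and names below are arbitrary choices for the sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 4, 5, 3
n = n_users + n_items

w0 = rng.normal()
w = rng.normal(size=n)            # linear weights = user and item biases
V = rng.normal(size=(n, k))       # latent vectors = user and item factors

def one_hot_pair(u, i):
    """Feature vector with exactly two ones: user u and item i."""
    x = np.zeros(n)
    x[u] = 1.0
    x[n_users + i] = 1.0
    return x

def fm_predict(x):
    """Generic second-order FM, explicit double sum over feature pairs."""
    score = w0 + w @ x
    for a in range(n):
        for b in range(a + 1, n):
            score += (V[a] @ V[b]) * x[a] * x[b]
    return score

def biased_mf_predict(u, i):
    """Matrix factorization with global, user and item biases."""
    return w0 + w[u] + w[n_users + i] + V[u] @ V[n_users + i]

max_gap = max(
    abs(fm_predict(one_hot_pair(u, i)) - biased_mf_predict(u, i))
    for u in range(n_users)
    for i in range(n_items)
)
print(max_gap)  # agrees up to floating-point error
```

Dropping the linear terms (`w0` and `w`) from both sides gives plain matrix factorization without biases.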

- What do you mean by "equivalent"? Would they really have the same model equation in that case? – Zaid Gharaybeh Sep 13 '20 at 03:30
- @dontloo: can you elaborate more on the "equivalence" please? – Betty Aug 24 '21 at 16:30
Matrix factorization is a different factorization model. From the article about FM:
"There are many different factorization models like matrix factorization, parallel factor analysis or specialized models like SVD++, PITF or FPMC. The drawback of these models is that they are not applicable for general prediction tasks, but work only with special input data. Furthermore their model equations and optimization algorithms are derived individually for each task. We show that FMs can mimic these models just by specifying the input data (i.e. the feature vectors). This makes FMs easily applicable even for users without expert knowledge in factorization models."
From libfm.org:
"Factorization machines (FM) are a generic approach that allows to mimic most factorization models by feature engineering. This way, factorization machines combine the generality of feature engineering with the superiority of factorization models in estimating interactions between categorical variables of large domain."
