Training the learning model on the log transformed data?

Question

So I've gone through this CV post, and in my primitive understanding I assume we do a log transformation when we 'care' about relative changes, and also to even out the positive skewness in our data.
So for example let's imagine having a dataset as follows:

Quantity  Area  TotalSF  Year  Price
7         1710  856      2003  321510
6         1262  1262     1976  183190
7         1786  920      2001  228410
7         1717  756      1915  171230
8         2198  1145     2000  201000

After log transforming 'Area', 'Price' and 'TotalSF', the respective plots show a nice, roughly normal distribution. The log-transformed data looks something like this:

Quantity  Area      TotalSF   Year  Price
7         7.444249  6.752270  2003  12.68
6         7.140453  7.140453  1976  12.12
7         7.487734  6.824374  2001  12.34
7         7.448334  6.628041  1915  12.05
8         7.695303  7.043160  2000  12.21

My questions are:

1. Are we going to train our model on this log-transformed data?
2. If yes, then do we need to log transform our test data as well?
3. How do we get back the normal/actual values, say for the variable 'Price', after log transforming it?

Edit:
This question asks specifically whether to train a model on log-transformed data, and how to get the actual values back; it is more of a beginner-friendly question. The question it was flagged as a duplicate of, on the other hand, asks whether back-transforming is valid at all. That's a completely different premise, I believe.

What are you planning to do with the data? Train a linear regression model? — Stergios, Apr 25 '18 at 07:32
@Nuhman: Thanks, will have a look. Can you please post a link for (some) data so I can make some examples? — kjetil b halvorsen, Sep 28 '18 at 09:35
What, in practical terms, do you want to optimize while performing this regression? — Sextus Empiricus, Sep 28 '18 at 09:38

score 6 · Accepted Answer · answered Apr 26 '18 at 10:47

1. Yes, you can train your model on the log-transformed data.
2. If you do (1), then you necessarily need to log-transform your test data in the same way.
3. You can get back the original 'Price' value by applying exp() to the log-scale value, since exp(ln(x)) = x.
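
The three steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a recommended modelling pipeline: it assumes a one-predictor least-squares fit of log(Price) on log(Area), treats the last row of the question's table as a stand-in test set, and uses a hypothetical helper `fit_line` that is not part of any library.

```python
import math

# (Area, Price) rows taken from the question; used here only as toy data.
train = [(1710, 321510), (1262, 183190), (1786, 228410), (1717, 171230)]
test = [(2198, 201000)]  # assumption: last row held out as the "test set"


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# 1. Train on the log scale: transform both the predictor and the target.
log_area = [math.log(area) for area, _ in train]
log_price = [math.log(price) for _, price in train]
intercept, slope = fit_line(log_area, log_price)

# 2. Apply the same log transform to the test predictor before predicting.
test_log_area = [math.log(area) for area, _ in test]
pred_log_price = [intercept + slope * x for x in test_log_area]

# 3. Back-transform the predictions to the original price scale with exp().
pred_price = [math.exp(p) for p in pred_log_price]

# Round trip: exp(ln(x)) recovers x up to floating-point error.
round_trip = [math.exp(math.log(price)) for _, price in train]
```

Here `pred_log_price` lives on the log scale the model was trained on, while `pred_price` is directly comparable to the 'Price' column in the original table.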

Stergios
