Questions tagged [feature-scaling]
122 questions
7
votes
1 answer
Scaling dataset in Random Forest
Scaling a dataset for Random Forest modelling is not necessary. However, if we have already scaled and normalized the dataset, will that affect our Random Forest modelling?

Subrata Mukherjee
- 71
- 1
4
votes
1 answer
Which machine learning algorithms get affected by feature scaling?
Which of the following machine learning algorithms will be affected if we apply feature scaling?
Naïve-Bayes
k-Nearest Neighbor (KNN)
Support Vector Machine (SVM)
Decision Trees
Neural Network (NN)

Josp Andrew
- 43
- 1
- 5
3
votes
2 answers
Scaling features for neural network input
I have a df with many features that take both negative and positive values. For example, a feature may have values in the range (-10, 10). For each feature that has negative values, the negative sign means direction, and this -10 is actually a larger value…

Epitheoritis 32
- 157
- 4
3
votes
1 answer
How important is outcome variable scaling in SVM regression?
Should I scale the outcome variable for SVM regression? How large is the impact of outcome variable scaling in SVM regression?

vasili111
- 755
- 2
- 10
- 21
3
votes
1 answer
Feature engineering before or after scaling?
I am doing feature engineering on a set of features to reduce the size of the dataset. The features can have different scales. E.g., one feature has values that vary between 1000 and 1500, and the other features vary between 0 and 100. One of the…

xeon123
- 225
- 2
- 6
3
votes
1 answer
Look ahead bias induced by standardization of a time series?
Let's say I'm using some machine learning model to predict future values of a time series (e.g. stock price, air temperature, etc). In my model, I'm using some autoregressive features such as lagged target variable, rolling mean of the target…

Liz
- 33
- 4
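One way to make the concern in this question concrete is to compare a z-score computed from full-sample statistics with one computed from an expanding window that only sees observations available at each time step. The following is a minimal sketch under that assumption; the function names and the `prices` values are hypothetical and not taken from the question.

```python
from statistics import mean, pstdev


def full_sample_zscores(series):
    """Standardize with the mean/std of the whole series (uses future data)."""
    mu, sigma = mean(series), pstdev(series)
    return [(x - mu) / sigma for x in series]


def expanding_zscores(series, min_periods=2):
    """Standardize each point using only observations up to and including it."""
    out = []
    for t in range(len(series)):
        window = series[: t + 1]
        if len(window) < min_periods:
            out.append(None)  # not enough history yet
            continue
        mu, sigma = mean(window), pstdev(window)
        out.append((series[t] - mu) / sigma if sigma > 0 else 0.0)
    return out


prices = [10.0, 11.0, 12.0, 11.5, 30.0]  # illustrative values, not real data
leaky = full_sample_zscores(prices)   # every value depends on the final jump
causal = expanding_zscores(prices)    # early values ignore later observations
```

In this sketch, changing the last observation alters every entry of `leaky` but leaves the earlier entries of `causal` untouched, which is the property a look-ahead-free feature needs.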
3
votes
0 answers
Why does LASSO predict random data "well" during leave-one-out cross validation?
Preamble:
While investigating different cross-validation strategies for small-sample-size datasets with a relatively large number of features, I came across this peculiar result. While making a simple Leave-One-Out-Cross-Validation setup with scaled…

TCulos
- 31
- 3
2
votes
1 answer
Scaling a power law distribution for k-means clustering
For my project I want to group some products by using a few variables. For grouping, I am using k-means clustering. One of my variables is a metric called CR (conversion rate) which takes values ranging from 0 to some positive integer (the upper…

gülsemin
- 65
- 4
2
votes
0 answers
Is it always better to use the RobustScaler (vs StandardScaler)?
From reading the docs, I believe the RobustScaler is more immune to outliers than the StandardScaler. In that case, why not just use the RobustScaler always?

Levon
- 433
- 7
- 16
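The difference the question refers to can be sketched without scikit-learn. The two helper functions below are hypothetical stand-ins, assuming that standard scaling uses the mean and population standard deviation and that robust scaling uses the median and interquartile range; they illustrate how a single outlier affects each, and are not the library's implementation.

```python
from statistics import mean, median, pstdev, quantiles


def standard_scale(values):
    """Center on the mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def robust_scale(values):
    """Center on the median and divide by the interquartile range (Q3 - Q1)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    med = median(values)
    return [(v - med) / (q3 - q1) for v in values]


data = [1.0, 2.0, 3.0, 4.0, 100.0]  # made-up sample with one outlier

standard = standard_scale(data)
robust = robust_scale(data)

# Spread of the four non-outlier points after each scaling.
standard_inlier_spread = standard[3] - standard[0]
robust_inlier_spread = robust[3] - robust[0]
```

Here the outlier inflates the mean and standard deviation, so the four ordinary points are squeezed into a narrow band under standard scaling, while the median and IQR keep them spread out. The sketch only shows the mechanism; it does not settle whether one scaler is preferable in general.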
2
votes
1 answer
Should we apply feature transformation for test data?
I am working on a regression problem. The data contains 13 features (after performing feature selection). To some of these features, I have applied log transformation and box-cox to fix the skewness. For some features, I have also used standard…

Dawood Aijaz
- 21
- 2
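A common pattern for this situation is to estimate any transformation parameters on the training data only and then reuse them unchanged on the test data. The sketch below shows that pattern for standardization; `fit_standardizer`, `transform`, and the sample values are hypothetical, and the same idea would apply to other fitted parameters such as a Box-Cox lambda.

```python
from statistics import mean, pstdev


def fit_standardizer(train_values):
    """Estimate scaling parameters from the training data only."""
    return mean(train_values), pstdev(train_values)


def transform(values, params):
    """Apply previously fitted parameters to any split (train or test)."""
    mu, sigma = params
    return [(v - mu) / sigma for v in values]


train = [2.0, 4.0, 6.0, 8.0]  # made-up training feature
test = [5.0, 10.0]            # made-up test feature

params = fit_standardizer(train)        # fitted once, on train
train_scaled = transform(train, params)
test_scaled = transform(test, params)   # no refitting on test
```

Because the test values are scaled with the training mean and standard deviation, the scaled test set will generally not have zero mean or unit variance, which is expected.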
2
votes
1 answer
Scaling embedding layer's outputs in Tensorflow
I have a neural network that takes categorical and quantitative features as inputs.
The quantitative features are scaled in $[0,1]$. I apply an embedding layer to get a continuous representation of the categorical features and then I concatenate…

Qtip
- 43
- 5
2
votes
1 answer
SGD is sensitive to feature scaling
I am taking a deep learning class, and the class slides state one of SGD's problems as: "Gradient is scaled equally across all dimensions." What is meant by this, I believe, is that when we have d-dimensional features, the learning rate is multiplied…

diane
- 43
- 3
2
votes
2 answers
What is the name for normalization $\leq 1$?
In my current thesis I have two weight components. Since I want to combine those components, weighted by a percentage, I thought about normalizing (or scaling?) each component by its max value.
Component $c1$ will therefore be:
$c1' =…

Goddilein
- 21
- 2
2
votes
1 answer
Does Batch Normalized network still need scaled inputs?
I'm a bit new to this topic. Does Batch Normalization replace feature scaling?
As far as my understanding goes, batch normalization uses an exponential moving average to estimate $\mu$ and $\sigma$ on the fly to normalize batches during the…

tornikeo
- 143
- 7
2
votes
1 answer
Ordinal Feature Encoding (Linear or Nonlinear?)
In most ordinal features, it seems that the scaling is linear, e.g. [1, 2, 3, 4], with a higher score representing a larger effect on the target variables.
But is it possible to encode the feature in a nonlinear fashion, such as [1, 2, 4, 8]? What is the…

K_inverse
- 175
- 6