How can creating bins for a numeric feature enable the model to learn nonlinear relationships within a single feature?

Question

I understand how binning a numerical feature can help capture the relationship between the feature and the target. For example:

For a regression problem, we can bucketize the "population" feature into the following 3 buckets (for instance):

bucket_0 (< 5000): corresponding to less populated blocks
bucket_1 (5000 - 25000): corresponding to mid populated blocks
bucket_2 (> 25000): corresponding to highly populated blocks

Given the preceding bucket definitions, the following population vector:

[[10001], [42004], [2500], [18000]]

becomes the following bucketized feature vector:

[[1], [2], [0], [1]]
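
To make that mapping concrete, here is a minimal sketch of the bucketization in NumPy. Using np.digitize is my own choice for illustration, not something the quoted example prescribes, and how values exactly on a boundary (5000 or 25000) are assigned is an assumption, since the bucket definitions above don't say.

```python
import numpy as np

# Boundaries taken from the bucket definitions above:
# bucket_0: < 5000, bucket_1: 5000 - 25000, bucket_2: > 25000
# Assumption: a value exactly on a boundary goes to the higher bucket
# (np.digitize's default behaviour with right=False).
boundaries = [5000, 25000]

population = np.array([[10001], [42004], [2500], [18000]])

# For each value, np.digitize returns the index of the bucket it falls into;
# the output has the same shape as the input.
bucketized = np.digitize(population, bins=boundaries)

print(bucketized.tolist())
```

Each row is replaced by the index of its bucket, so the model sees only the bucket a value falls into and not the raw population value.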

I took the example from here. In the same setting, they suggest that if we create bins for 3 different features, such as latitude, longitude, and roomsPerPerson, then we enable the model to learn nonlinear relationships within every single feature!

Three separate binned features:

[binned latitude], [binned longitude], [binned roomsPerPerson]

I can't understand how the model learns "nonlinear relationships within every single feature", because for any given training example only one bin per feature is visible to the learning algorithm, not the whole vector (the vector represents the input space). Correct me if I am wrong!

So, how can creating bins for a numeric feature enable the model to learn nonlinear relationships within a single feature?
