I'm new to machine learning. I'm trying to build a regression model to predict the price of a cab ride. I have distance, source, destination, and other independent variables.
Do I need to one-hot encode both the source and destination variables (both categorical) before building the model?
Note: source and destination have the same number of unique values. I also thought of dropping source and destination and keeping only distance. However, some records have different distance values for the same source and destination (because cab drivers take different routes).
I'm also worried about the curse of dimensionality after encoding. How would you proceed with this type of data?
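
To make the question concrete, here is a minimal sketch of the encoding I have in mind, written in plain Python on made-up records. The locations A, B, C, the distances, and the `one_hot` helper are all hypothetical and only for illustration, not taken from my real data. With k unique locations, encoding both columns adds 2k indicator columns next to distance.

```python
# Hypothetical toy records; locations and distances are made up for illustration.
rides = [
    {"source": "A", "destination": "B", "distance": 2.1},
    {"source": "A", "destination": "B", "distance": 2.6},  # same pair, different route
    {"source": "B", "destination": "C", "distance": 4.0},
    {"source": "C", "destination": "A", "distance": 3.3},
]


def one_hot(rows, column):
    """Return (sorted categories, list of 0/1 indicator rows) for one column."""
    categories = sorted({row[column] for row in rows})
    encoded = [
        [1 if row[column] == cat else 0 for cat in categories]
        for row in rows
    ]
    return categories, encoded


# Encode source and destination separately, so each gets its own indicators.
src_cats, src_enc = one_hot(rides, "source")
dst_cats, dst_enc = one_hot(rides, "destination")

# Feature matrix: distance, then source indicators, then destination indicators.
X = [
    [row["distance"]] + s + d
    for row, s, d in zip(rides, src_enc, dst_enc)
]
feature_names = (
    ["distance"]
    + [f"source_{c}" for c in src_cats]
    + [f"destination_{c}" for c in dst_cats]
)

print(feature_names)
print(X[0])                # first ride: A -> B
print(len(feature_names))  # 1 + k + k columns for k unique locations
```

On real data I would presumably use `pandas.get_dummies` or scikit-learn's `OneHotEncoder` rather than a hand-written helper; the sketch is only meant to show how the column count grows with the number of unique locations.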