I'm working on a problem where I need to concatenate features from a ResNet model with a few extracted features in an end-to-end deep learning model.
Model summary:
IMAGE -> ResNet-50 -> 2048 features ---\
                                         -- 2053 features (concatenated) -> Dense -> Softmax
Extracted features (5) -> 5 features --/
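To make the shapes concrete, here is a minimal, framework-free sketch of the forward pass in the diagram above. It is only an illustration: the ResNet-50 output and the extracted features are replaced by random stand-ins, the weights are untrained, and `NUM_CLASSES`, `dense`, `softmax`, and `forward` are names made up for this sketch (the number of classes in particular is an assumption).

```python
import math
import random

IMAGE_DIM = 2048   # size of the ResNet-50 feature vector (from the diagram)
EXTRA_DIM = 5      # number of extracted features (from the diagram)
NUM_CLASSES = 3    # assumption: the actual number of classes is not stated


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def forward(image_features, extra_features, weights, biases):
    """Concatenate both feature vectors, then apply Dense -> Softmax."""
    fused = list(image_features) + list(extra_features)  # 2048 + 5 = 2053
    return softmax(dense(fused, weights, biases))


rng = random.Random(0)
# Random stand-ins for the ResNet-50 output and the extracted features.
image_features = [rng.random() for _ in range(IMAGE_DIM)]
extra_features = [rng.random() for _ in range(EXTRA_DIM)]
# Untrained weights, just to show that each output unit sees 2053 inputs.
weights = [[rng.gauss(0.0, 0.01) for _ in range(IMAGE_DIM + EXTRA_DIM)]
           for _ in range(NUM_CLASSES)]
biases = [0.0] * NUM_CLASSES

probs = forward(image_features, extra_features, weights, biases)
print(len(probs), sum(probs))
```

Each output unit of the dense layer has 2053 weights, of which only 5 are attached to the extracted features.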
The feature vector from the images is very high dimensional (2048), whereas the extracted feature vector is very low dimensional (5). The performance is not very good yet. How can I apply some operations or change the architecture so that the discrepancy in dimensionality between these two feature vectors is resolved?
Pointers to papers that address this problem would be much appreciated.