First of all, I must say I'm in no way a data scientist or anything; I just happened to come across a problem, and I am attempting to use statistics to find the best solution.
The problem is about splitting each of six sets of items in two, where every combination of splits is described by four parameters, and finding the combination that minimizes all four.
Specifically, I'm trying to optimize the book schedule, so that my desk mate and I can bring the required schoolbooks to school with the least effort - that is, with
- minimal total weight for me $W_m$,
- minimal total weight for my desk mate $W_d$,
- minimal swapped books for me $S_m$, and
- minimal swapped books for my desk mate $S_d$.
How should I look for the best split, i.e. the combination that minimizes all four parameters?
My first thought was a simple weighted Euclidean distance: assign a score $S = \sqrt{\alpha(W_m^2+W_d^2)+\beta(S_m^2+S_d^2)}$ to each combination, and then look at the $n$ lowest-scoring combinations for some weights $(\alpha,\:\beta)$. However, my concern is that such a score might give too much weight to suboptimal solutions, no matter the choice of weights. This seems especially true considering that [I read somewhere that] the Euclidean distance tends to favour closer points much more than farther ones.
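To make the scoring idea concrete, here is a minimal sketch in Python. The function names `weighted_score` and `n_lowest`, the default weights of 1, and the candidate tuples are all made up for illustration; they are not real data from my schedule.

```python
import math

def weighted_score(w_m, w_d, s_m, s_d, alpha=1.0, beta=1.0):
    """Weighted Euclidean score sqrt(alpha*(W_m^2 + W_d^2) + beta*(S_m^2 + S_d^2))."""
    return math.sqrt(alpha * (w_m**2 + w_d**2) + beta * (s_m**2 + s_d**2))

def n_lowest(candidates, n, alpha=1.0, beta=1.0):
    """Return the n candidates with the lowest score, best first."""
    ranked = sorted(
        candidates,
        key=lambda c: weighted_score(*c, alpha=alpha, beta=beta),
    )
    return ranked[:n]

# Hypothetical (W_m, W_d, S_m, S_d) tuples, invented for this example only.
candidates = [
    (3.0, 4.0, 0, 0),
    (1.0, 1.0, 2, 2),
    (6.0, 8.0, 1, 1),
]

best = n_lowest(candidates, 2)
```

Since weights and swap counts are measured in different units, the ranking presumably depends heavily on how $\alpha$ and $\beta$ are chosen.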
Perhaps an improvement could be computing the average weight $\bar W=\frac{W_m+W_d}2$ and the average swap number $\bar S=\frac{S_m+S_d}2$, and then combining these into a weighted Euclidean score, $S=\sqrt{\alpha \bar W^2+\beta \bar S^2}$.
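A quick sketch of this averaged variant, again in Python, with a hypothetical function name `averaged_score` and invented numbers. It suggests one possible drawback: two splits with the same averages get the same score, even if one of them is very lopsided between me and my desk mate.

```python
import math

def averaged_score(w_m, w_d, s_m, s_d, alpha=1.0, beta=1.0):
    """Score built from the average weight and the average swap number."""
    w_avg = (w_m + w_d) / 2
    s_avg = (s_m + s_d) / 2
    return math.sqrt(alpha * w_avg**2 + beta * s_avg**2)

# Two invented candidates with identical averages (weight 3, swaps 2).
balanced = averaged_score(3.0, 3.0, 2, 2)   # load shared evenly
lopsided = averaged_score(6.0, 0.0, 4, 0)   # one person carries everything

# Averaging cannot tell these two apart.
same_score = math.isclose(balanced, lopsided)
```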
Any thoughts?