Crew selection: ranking rowers by letting them race against each other

Question

Seat selection is a common practice in competitive rowing, and I am curious about a more solid statistical underpinning for it. There are more rowers in a team than the 8 seats in the crew boat, so the coach repeatedly splits the team into two smaller crews of 4 and lets them race against each other, noting both who wins and the winning margin (say, in seconds). There is prior knowledge from land training about relative strength. The goal is to find the strongest 8 rowers from a team of 12 or so and typically only 2 or 3 seats are really contentious.

I am curious about how to model this, but I don't have a strong background in statistics and hence am looking for a starting point. I was attracted by the simplicity of an answer to "Measuring individual player effectiveness in 2-player per team sports".
The number of races is limited because they are tiring. This suggests that the partitioning of the team for races should be designed carefully so that each race teaches us as much as possible.
I would like an approach where each race adds knowledge about the ranking (or relative contribution). A naive approach would simply use the fraction of races won, but I would expect that more can be learned by taking prior knowledge into account: was the win to be expected or surprising?

Is this only about which individual rowers to include, or could there also be *team effects*, that some are more effective together with some others, or *seat effects*, that some rowers are better at some seat position? — kjetil b halvorsen, Apr 28 '19 at 21:12
There is an aspect of some rowers harmonising better with some than with others. I would be happy to ignore this for now but this is definitely happening. — Christian Lindig, Apr 28 '19 at 21:20
Some seemingly relevant papers: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2886831/, https://www.cesifo-group.de/dms/ifodoc/docs/Akad_Conf/CFP_CONF/CFP_CONF_2016/am16_Gollier/Papers/am16_Upmann.pdf — kjetil b halvorsen, Apr 28 '19 at 22:14
A potential proxy for a rower’s contribution could be his/her power output. This is known from land training. So in order to win, a crew must have in total more power than the losing crew. As we see presumably weaker crews win, we must update our knowledge about the individual power output. This could lead to some least-square error how we assign power to explain the wins and losses. — Christian Lindig, Apr 28 '19 at 22:25
I have worked on the problem and wrote a draft paper https://lindig.github.io/papers/seat-racing-2020-draft.pdf. The paper gives more background and uses a simple statistical model that assigns power to athletes as a measure of their contribution. I would welcome feedback because my stats foo is weak. — Christian Lindig, May 17 '20 at 15:01
I have another paper. I now believe this can be solved purely with linear algebra. This captures that the solution is not unique but still useful. https://lindig.github.io/papers/seat-racing-iv-2020-draft.pdf — Christian Lindig, Aug 09 '20 at 12:32
You should consider answering the Q yourself, based on that paper! — kjetil b halvorsen, Aug 09 '20 at 14:49

score 1 · Answer 1 · answered Apr 28 '19 at 22:35

You should look into the design of experiments. Say there are 12 rowers and 8 seats in the boat. We start with the model $$ Y= \beta_0 + \sum_{i=1}^{12} \beta_i I_i + \epsilon $$ where the $I_i$ are inclusion indicators for the rowers, and $\sum_{i=1}^{12} I_i=8$. I would start with optimal experimental design for this model. The number of possible teams is $\binom{12}{8}=495$, so trying all teams would be prohibitive.

You could just generate all 495 teams and use them as the candidate set for an optimal design algorithm, such as the one in the R package AlgDesign, which implements the Fedorov exchange algorithm using the criterion of D-optimality. I would guess there are theoretical results for this specific kind of model, but I cannot find references.
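To make the candidate-set idea concrete, here is a minimal Python sketch. It is not the Fedorov exchange algorithm from AlgDesign but a simpler greedy sequential D-optimal selection, swapped in for illustration. Assumptions of mine, not from the answer: the intercept is dropped (it is collinear with the indicators because every row sums to 8), a small `ridge` term keeps the information matrix invertible at the start, and the budget `n_runs=14` is arbitrary.

```python
from itertools import combinations

import numpy as np

N_ROWERS, N_SEATS = 12, 8

# Candidate set: one 0/1 inclusion row per possible crew of 8 out of 12.
teams = list(combinations(range(N_ROWERS), N_SEATS))
X = np.zeros((len(teams), N_ROWERS))
for row, team in enumerate(teams):
    X[row, list(team)] = 1.0


def greedy_d_optimal(X, n_runs, ridge=1e-6):
    """Pick n_runs candidate rows one at a time, each maximising det(M + x x^T).

    Greedy sketch only (hypothetical helper); rows may be picked more than once.
    """
    M = ridge * np.eye(X.shape[1])  # regularised information matrix
    chosen = []
    for _ in range(n_runs):
        # Matrix determinant lemma: det(M + x x^T) = det(M) * (1 + x^T M^-1 x),
        # so the best next row has the largest quadratic form x^T M^-1 x.
        gain = np.einsum("ij,ij->i", X, np.linalg.solve(M, X.T).T)
        best = int(np.argmax(gain))
        chosen.append(best)
        M += np.outer(X[best], X[best])
    return chosen


chosen = greedy_d_optimal(X, n_runs=14)
for idx in chosen:
    print(teams[idx])  # rower indices (0-based) of each selected crew
```

A proper exchange algorithm would additionally try swapping selected rows for unselected ones, so treat the output as a starting design rather than an optimal one.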

"... and typically only 2 or 3 seats are really contentious"

With some prior information, you could look into Bayesian design of experiments. There is not much about it on this site; see https://en.wikipedia.org/wiki/Bayesian_experimental_design.

Christian Lindig · Answer 2 · 2020-08-09T17:41:59.223

I don't have a fully general answer but it is a start. This assumes:

8 rowers (identified as 1 to 4 and A to D); 1 to 4 always row on one side and A to D on the other.
2 boats with 4 rowers each: two from 1 to 4 and two from A to D.
6 races of 2 boats each over 1000 m; the time is taken for each boat.
Rowers are placed into boats according to a swap matrix.
From the time of a race the average power of the crew is computed. The general connection is $P = k v^3$, with $k = 2.8 \cdot 4$ a typical drag coefficient for boats with 4 rowers and $v$ the speed of the boat calculated from distance and time.
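The conversion from race time to crew power in the last assumption can be sketched as follows. The function names are hypothetical, and the units (metres and seconds, so presumably watts) are my assumption; the answer only gives the formula and the constant.

```python
DRAG_PER_ROWER = 2.8   # from the answer: k = 2.8 * 4 for a boat of four
CREW_SIZE = 4
DISTANCE_M = 1000.0    # race distance from the answer


def crew_power(time_s, distance_m=DISTANCE_M, crew_size=CREW_SIZE):
    """Average crew power from P = k * v^3 with v = distance / time."""
    k = DRAG_PER_ROWER * crew_size
    v = distance_m / time_s
    return k * v ** 3


def race_time(power, distance_m=DISTANCE_M, crew_size=CREW_SIZE):
    """Inverse of crew_power: the time needed at a given average power."""
    k = DRAG_PER_ROWER * crew_size
    v = (power / k) ** (1.0 / 3.0)
    return distance_m / v


# Example: the time that corresponds to the first crew power used below.
print(race_time(1303.02))
```

Because power grows with the cube of speed, small differences in time translate into noticeably larger relative differences in power.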

The problem now becomes finding a solution for the following equation system (with some crew power values as an example):

$$
\begin{bmatrix}
1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 1 & 0 & 1 & 0 \\
0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 \\
1 & 1 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 1 & 1 & 0 & 1 & 0 & 0 & 1 \\
0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 \\
1 & 0 & 1 & 0 & 0 & 1 & 0 & 1 \\
1 & 0 & 0 & 1 & 1 & 0 & 0 & 1 \\
0 & 0 & 1 & 1 & 1 & 1 & 0 & 0 \\
1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\
1 & 0 & 0 & 1 & 0 & 1 & 1 & 0
\end{bmatrix}
\times
\begin{bmatrix} x_1\\ x_2\\ x_3\\ x_4\\ x_A\\ x_B\\ x_C\\ x_D \end{bmatrix}
=
\begin{bmatrix} 1303.02 \\ 1350.62 \\ 1354.82 \\ 1274.37 \\ 1305.56 \\ 1304.04 \\ 1304.49 \\ 1256.89 \\ 1252.69 \\ 1333.14 \\ 1301.95 \\ 1303.47 \end{bmatrix}
$$
or more succinctly as \begin{equation} \label{eq:sx=p} S x = P \end{equation} where $S$ is the swap matrix, $x$ the power of each rower, and $P$ the observed crew power. The swap matrix $S$ identifies who is racing together in a crew: each row is one crew, and rows $i$ and $i+6$ are the two boats of race $i$.

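Solving $Sx=P$ for $x$ is not straightforward: $S$ is not square and therefore has no inverse. A left inverse $S'$ with $S'S=I$ does not exist either because $\textrm{rank}(S)= 7 < 8$: every crew contains two rowers from each side, so $S$ maps the vector $\begin{bmatrix} +1&+1&+1&+1&-1&-1&-1&-1 \end{bmatrix}^T$ to zero. But $S$ has a unique Moore-Penrose generalised inverse $S^+$, which we can use to describe all solutions.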

\begin{align} S^+ &= \frac{1}{48} \begin{bmatrix} +7&-5&-5&+7&-5&-5&-5&+7&+7&-5&+7&+7\\ +7&+7&+7&+7&+7&+7&-5&-5&-5&-5&-5&-5\\ -5&-5&+7&-5&-5&+7&+7&+7&-5&+7&+7&-5\\ -5&+7&-5&-5&+7&-5&+7&-5&+7&+7&-5&+7\\ +7&+7&-5&-5&-5&+7&-5&-5&+7&+7&+7&-5\\ +7&-5&+7&-5&+7&-5&-5&+7&-5&+7&-5&+7\\ -5&+7&+7&+7&-5&-5&+7&-5&-5&-5&+7&+7\\ -5&-5&-5&+7&+7&+7&+7&+7&+7&-5&-5&-5 \end{bmatrix} \end{align}
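The rank and the pseudoinverse above can be checked numerically. This is a small sketch with NumPy; the crew strings in `CREWS` are just my compact encoding of the rows of $S$, and the explicit tolerances are an assumption to keep the rank decision stable.

```python
import numpy as np

ROWERS = "1234ABCD"
# One string per row of S; rows i and i+6 are the two boats of race i.
CREWS = ["12AB", "24AC", "23BC", "12CD", "24BD", "23AD",
         "34CD", "13BD", "14AD", "34AB", "13AC", "14BC"]

# Swap matrix: S[i, j] = 1 if rower j sits in crew i.
S = np.array([[1.0 if r in crew else 0.0 for r in ROWERS] for crew in CREWS])

rank = np.linalg.matrix_rank(S, tol=1e-10)
print("rank(S) =", rank)

# Moore-Penrose pseudoinverse; rcond cuts off the numerically zero singular value.
S_plus = np.linalg.pinv(S, rcond=1e-10)

# The printed S^+ has entries +7/48 where S^T is 1 and -5/48 where it is 0.
expected = np.where(S.T == 1.0, 7.0, -5.0) / 48.0
print("matches printed S^+:", np.allclose(S_plus, expected))

# Direction that S cannot see: shifting power from one side to the other.
shift = np.array([1, 1, 1, 1, -1, -1, -1, -1], dtype=float)
print("S @ shift =", S @ shift)
```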

All solutions for rower power $x$ given crew power $P$ are given by

\begin{equation} \label{eq:x=sp} x = S^+ P + c \begin{bmatrix} +1&+1&+1&+1&-1&-1&-1&-1 \end{bmatrix}^T \end{equation}

for an arbitrary constant $c$. Possible solutions are

\begin{equation}
\begin{array}{rrrr}
\text{rower} & c=0 & c=10 & c=30 \\
1 & 293.40 & 303.40 & 323.40 \\
2 & 343.42 & 353.42 & 373.42 \\
3 & 334.14 & 344.14 & 364.14 \\
4 & 332.80 & 342.80 & 362.80 \\
A & 331.67 & 321.67 & 301.67 \\
B & 334.53 & 324.53 & 304.53 \\
C & 342.74 & 332.74 & 312.74 \\
D & 294.82 & 284.82 & 264.82 \\
\end{array}
\end{equation}
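The table can be reproduced with a few lines of NumPy. This sketch uses `np.linalg.lstsq`, which returns the minimum-norm least-squares solution for a rank-deficient system and therefore coincides with $S^+P$; the crew-string encoding of $S$ and the `rcond` value are my assumptions. Because the crew powers are given to two decimals only, results may differ from the table in the last digit.

```python
import numpy as np

ROWERS = "1234ABCD"
CREWS = ["12AB", "24AC", "23BC", "12CD", "24BD", "23AD",
         "34CD", "13BD", "14AD", "34AB", "13AC", "14BC"]
S = np.array([[1.0 if r in crew else 0.0 for r in ROWERS] for crew in CREWS])

# Observed crew powers, in the same row order as S.
P = np.array([1303.02, 1350.62, 1354.82, 1274.37, 1305.56, 1304.04,
              1304.49, 1256.89, 1252.69, 1333.14, 1301.95, 1303.47])

# Minimum-norm solution, i.e. the c = 0 column of the table.
x0, *_ = np.linalg.lstsq(S, P, rcond=1e-10)

# Every other solution differs by a multiple of this side-to-side shift.
shift = np.array([1, 1, 1, 1, -1, -1, -1, -1], dtype=float)

print("rower " + "".join(f"{'c=' + str(c):>10}" for c in (0, 10, 30)))
for j, rower in enumerate(ROWERS):
    values = [x0[j] + c * shift[j] for c in (0, 10, 30)]
    print(f"{rower:>5} " + "".join(f"{v:10.2f}" for v in values))
```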

Note that the difference in power between rowers within one side is independent of $c$. This is a feature of the swap matrix $S$ and now can be used to rank rowers per side.

The strongest rowers on starboard are 2, 3, 4, 1 in this order.
The strongest rowers on port side are C, B, A, D in this order.
We can't tell who is the strongest rower across sides because power shifts between sides are not detected by the method. Parameter $c$ basically models this shift. Across all solutions, the difference in power between rowers of one side remains constant and leads to an order independent of $c$.
With $c=0$ the combined power of rowers 1 to 4 equals that of rowers A to D. This makes it somewhat more likely to be the true scenario than a solution with a large $|c|$.
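A short sketch to illustrate the points above, using the $c=0$ values from the table: the order within each side never changes with $c$, while the answer to "who is strongest overall" does. The helper names are hypothetical, and the sign convention (add $c$ to rowers 1 to 4, subtract it from A to D) follows the solution formula.

```python
# Rower powers for c = 0, copied from the table in the answer.
STARBOARD = {"1": 293.40, "2": 343.42, "3": 334.14, "4": 332.80}
PORT = {"A": 331.67, "B": 334.53, "C": 342.74, "D": 294.82}


def shifted(c):
    """Powers of all rowers for a given c: +c on starboard, -c on port."""
    powers = {r: p + c for r, p in STARBOARD.items()}
    powers.update({r: p - c for r, p in PORT.items()})
    return powers


def ranking(powers, side):
    """Rowers of one side, strongest first."""
    return sorted(side, key=lambda r: powers[r], reverse=True)


for c in (-10, 0, 10, 30):
    powers = shifted(c)
    print(c, ranking(powers, "1234"), ranking(powers, "ABCD"),
          "strongest overall:", max(powers, key=powers.get))
```

The per-side rankings are the same in every line of output, whereas the overall strongest rower flips between the sides depending on the sign of $c$, which is exactly what the races cannot determine.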
