I have a tabular dataset with 10,000 rows, each with a RowID and 30 numerical features. Several rows share the same RowID. The aim is to build a model that, at test time, takes 3 rows (without the RowID) and predicts whether at least one of the rows has a different RowID from the rest. Regardless of what the data represents, how does one model this problem? Is there a name for this kind of task?
As a representative example, suppose you have this table of scores obtained by students in 4 subjects. Each exam was taken twice, so each student has two rows:
StudentID  Score1  Score2  Score3  Score4
---------  ------  ------  ------  ------
1          25      50      90      43
1          23      51      93      42
2          87      45      76      67
2          85      51      74      65
3          97      34      65      21
3          96      37      65      29
4          19      32      90      81
4          21      37      99      86
Suppose you want to build a model that takes 3 rows of such scores (without StudentID), not necessarily rows belonging to the students above, and predicts whether all three were obtained by the same student. How would one go about modelling this? Is this a supervised learning problem?
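To make the setup concrete, here is a minimal sketch of how labeled triples might be built from such a table. It is only one possible framing, not a known solution. The function name `make_triples`, the `n_negative` parameter, and the third row for each student are all made up for illustration; the toy table has only two rows per student, so it contains no all-same-student triples on its own.

```python
import random
from collections import defaultdict
from itertools import combinations


def make_triples(rows, n_negative, seed=0):
    """Build (triple, label) pairs from (row_id, features) rows.

    label 1: all three rows share one ID.
    label 0: at least one row has a different ID.
    Hypothetical helper, written only to illustrate the framing.
    """
    rng = random.Random(seed)
    by_id = defaultdict(list)
    for row_id, features in rows:
        by_id[row_id].append(features)

    if n_negative > 0 and len(by_id) < 2:
        raise ValueError("need at least two distinct IDs to build negatives")

    examples = []
    # Positives: every 3-row combination within a single ID.
    for feats in by_id.values():
        for triple in combinations(feats, 3):
            examples.append((triple, 1))

    # Negatives: sample 3 distinct rows and keep them if the IDs are not all equal.
    negatives = 0
    while negatives < n_negative:
        sample = rng.sample(rows, 3)
        if len({row_id for row_id, _ in sample}) > 1:
            examples.append((tuple(f for _, f in sample), 0))
            negatives += 1
    return examples


# Toy data: the first two rows per student come from the table above;
# the third row of each student is invented so that positives exist.
rows = [
    (1, (25, 50, 90, 43)), (1, (23, 51, 93, 42)), (1, (24, 49, 91, 44)),
    (2, (87, 45, 76, 67)), (2, (85, 51, 74, 65)), (2, (86, 47, 75, 66)),
]
examples = make_triples(rows, n_negative=4)
for triple, label in examples:
    print(label, triple)
```

If labels can be produced this way, the task looks like binary classification over sets of three rows, where the order of the rows should not matter. Whether that is the right way to think about it is part of what I am asking.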