
• Pianos have 5 different types of black note and 7 different types of white note (i.e., 5 black and 7 white pitch classes per octave).

• A famous music psychology paper examined the effects of note colour (black vs. white) and instrument (piano vs. keyboard vs. synthesised sound; referred to in the paper as 'timbre') in a two-way ANOVA. It reported the following:

• "It is clear that the white-key notes are responded to more correctly than are the black-key notes for all timbres. An analysis of variance showed that this difference was highly significant [F(1, 30) = 24.16, p < .0002], with no interaction between the pitch-(white-key notes vs. black-key notes) and the timbre [F(2, 30) = 0.489] indicating that the advantage of white-key notes existed across timbres."

• From the degrees of freedom it seems to me that the participants weren't individually scored; rather, the number of times each note was correctly identified was counted and pooled, giving a total score attributed to each of the 12 notes rather than to any person.

• This would mean that, for each of the three instruments, the scores for the 5 black notes fall into the 'black' level of the note-colour factor and the scores for the 7 white notes into the 'white' level. The within-groups (error) degrees of freedom then work out as: 5 black notes = 5 - 1 = 4; 7 white notes = 7 - 1 = 6; 4 + 6 = 10; 10 x 3 (i.e., 3 instruments) = 30. Thus F(1, 30).
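As a cross-check on this arithmetic (my own, not from the paper): with 36 note scores in a fully between-groups 2 x 3 layout, the error degrees of freedom also fall out as the total df minus the df spent on the factors and their interaction:

$$
df_{\text{error}} = (36 - 1) - \underbrace{1}_{\text{colour}} - \underbrace{2}_{\text{timbre}} - \underbrace{2}_{\text{colour} \times \text{timbre}} = 30,
$$

which is consistent with the reported F(1, 30).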

• What are the implications of giving scores to notes (a concept) rather than to the participants that really gained those scores?

• Things that trouble me:

  • If there are 5 black notes vs. 7 white notes for each of 3 instruments, then there are unequal group sizes (n = 15 vs. n = 21). As 21/15 = 1.4 (so, under 1.5), perhaps this isn't an issue?

  • Is this binomial data? A participant's response to a note is either correct or incorrect. However:

a) each participant encounters each note a number of times, so we're dealing with either aggregated counts or averages; and

b) the scores are then given to the notes themselves, rather than to the participants.

So, do points a and b change the nature of the data? (A sketch of a model that takes the binomial structure seriously follows this list.)
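For illustration, here is a minimal sketch of the kind of model that would take points a and b seriously: one row per note, with the response being 'k correct out of n presentations' in a binomial GLM. The data below are placeholders of my own invention (this is not Miyazaki's analysis), and a binomial GLM also copes naturally with the unequal cell sizes (5 vs. 7 notes):

```python
# Sketch only: a binomial GLM on per-note counts (k correct out of n trials).
# All counts here are fabricated; only the 2 x 3 design layout is real.
import numpy as np
import pandas as pd
import patsy
import statsmodels.api as sm

rng = np.random.default_rng(0)
notes = pd.DataFrame({
    "colour": (["black"] * 5 + ["white"] * 7) * 3,     # 12 notes x 3 timbres
    "timbre": ["piano"] * 12 + ["keyboard"] * 12 + ["synth"] * 12,
})
n_trials = 196                                # e.g. 28 attempts x 7 listeners
correct = rng.binomial(n_trials, 0.6, size=len(notes))   # placeholder counts

# Response = (successes, failures); predictors = colour, timbre, interaction
endog = np.column_stack([correct, n_trials - correct])
exog = patsy.dmatrix("C(colour) * C(timbre)", notes, return_type="dataframe")
fit = sm.GLM(endog, exog, family=sm.families.Binomial()).fit()
print(fit.summary())

# A fuller treatment would keep each listener's responses separate and add
# random effects for listeners and notes (a mixed-effects logistic model).
```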

• The way the experiment was conducted was repeated measures: the same people (n = 7) heard every note on every instrument. As the scores are allocated to the notes rather than to the participants, perhaps this isn't an issue? However...

• Even here I think it should be a mixed design. Each note has a fixed colour (between subjects: a white note is never in the black-note group, and vice versa) but is played by all three instruments (within subjects: the same set of notes appears in every instrument condition). I can't remember exactly how this would affect the degrees of freedom, but I don't think they'd come out as (1, 30), as in the paper, were this a mixed design.
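A rough check of that hunch (my own arithmetic, treating the 12 notes as the 'subjects' of a mixed design, with colour between-notes and instrument within-notes):

$$
\begin{aligned}
\text{between-notes error:}\quad & 12 - 2 = 10 \;\Rightarrow\; F_{\text{colour}}(1, 10)\\
\text{within-notes error:}\quad & (3 - 1)(12 - 2) = 20 \;\Rightarrow\; F_{\text{timbre}}(2, 20),\; F_{\text{colour} \times \text{timbre}}(2, 20)
\end{aligned}
$$

So under a mixed design the colour effect would be tested against only 10 error df, not 30.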

FYI, this is the paper I'm referring to: Miyazaki, K. I. (1989). Absolute pitch identification: Effects of timbre and pitch region. Music Perception, 7(1), 1–14.

Does this data analysis trouble you? What troubles you about it?

Also, is there any literature on best practice for assigning scores to non-sentient entities, such as concepts (differently coloured piano notes, in this instance)?


1 Answer


I think I may have found the answer to (at least part of) my own question. Would you agree with the following?

It is easier to imagine 'the same group of people in two different conditions' vs. 'two different groups of people, one in each condition'

than it is to imagine 'the same set of notes being played by two different instruments' vs. 'two different sets of notes, one set played in each condition'.

As such I could reconceptualise the experiment/analysis as follows:

"Thirty six notes walk into a bar, where they meet a scientist. The scientist wants to know if the others in the bar (there are 7 of them) will still recognise these notes in two factors: • Hat colour (black hat vs white hat) • Instrument they're playing (piano vs keyboard vs synthesiser)

• The scientist splits the 36 notes into three groups of twelve.

  • Group A are given a piano to play;
  • group B are given a keyboard to play; and
  • group C are given a synthesiser to play.

• Each group is split so that:

  • five notes wear a black hat
  • seven notes wear a white hat

Each of the seven people (who were in the bar when the notes walked in) is then asked to judge each of the 36 notes (each person has 28 attempts at each note). Each note is therefore given a recognition score out of 196 (28 attempts x 7 people).

The scientist then runs a two-way independent ANOVA with 2 factors: 'hat colour' (2 levels: black vs. white) and 'instrument they're playing' (3 levels: piano vs. keyboard vs. synthesiser).
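As a sanity check on the bar story (a sketch with simulated scores, not the paper's data), one can generate 36 note scores and confirm that a two-way independent ANOVA returns the reported degrees of freedom; since the design is unbalanced, it is also worth comparing sequential (Type I) and marginal (Type III) sums of squares, which need not agree:

```python
# Sketch: simulate the bar story and inspect the ANOVA table's df column.
# Scores are fabricated; only the design (6 cells of 5 or 7 notes) is real.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

rng = np.random.default_rng(1)
notes = pd.DataFrame({
    "hat":    (["black"] * 5 + ["white"] * 7) * 3,
    "timbre": ["piano"] * 12 + ["keyboard"] * 12 + ["synth"] * 12,
})
# Recognition score out of 196 (28 attempts x 7 judges) per note
notes["score"] = rng.binomial(196, 0.6, size=len(notes))

m = ols("score ~ C(hat, Sum) * C(timbre, Sum)", data=notes).fit()
print(sm.stats.anova_lm(m, typ=1))   # sequential SS
print(sm.stats.anova_lm(m, typ=3))   # marginal SS (needs sum-to-zero coding)
# Expect df: hat = 1, timbre = 2, interaction = 2, residual = 30 -> F(1, 30)
```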

One way to get over the conceptual hurdle of scoring objects rather than people is simply to imagine those objects as people. It's hard to accept that the experiment's participants are, in effect, used as the measurement instrument for the experiment's stimuli (the reverse of what we'd expect), but mentally humanising the stimuli makes this easier.

Looking at the setup as I've now described it above, I think that Miyazaki was correct to run a two-way independent ANOVA?

However, what about the binomial data issue (i.e., correct vs. incorrect responses)? He doesn't report any tests for normality or for homogeneity of variance either... I guess I just give him the benefit of the doubt? This is a psychology experiment from 1989; do you think he would have done all of this?
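For what it's worth, here is a minimal sketch (again with placeholder data) of the two unreported checks, run on the 36 per-note scores:

```python
# Sketch: homogeneity of variance across the 6 colour x instrument cells,
# and approximate normality of the cell-centred residuals.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
cells = [rng.binomial(196, 0.6, size=n) / 196 for n in (5, 7) * 3]  # 6 cells

print(stats.levene(*cells))                       # homogeneity of variance
resid = np.concatenate([c - c.mean() for c in cells])
print(stats.shapiro(resid))                       # normality of residuals
```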

--

A further issue that concerns me is that the clean-cut white vs. black note split seems artificial. In the various studies that find this effect (including for 2 of the musical instruments in this study itself), there is often the odd black note or two that is recognised better than the odd white note or two. I think this reflects the experimenter's presumption that piano notes would divide themselves by colour, when in reality there may be more interesting subgroupings. How could he have looked for this in an unbiased way? Factor analysis? Would the data be suitable for that?
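One data-driven possibility I can imagine (a sketch under assumed data, not anything the paper does): rather than imposing the colour split, cluster the 12 notes by their accuracy profiles across the three instruments and see whether a black/white grouping actually emerges. With only 12 notes and 3 instruments a full factor analysis would be very thin, but hierarchical clustering of the profiles is at least feasible:

```python
# Sketch: cluster the 12 notes by accuracy profile (one row per note,
# one column per instrument) and compare the groups to the colour labels.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(3)
profiles = rng.uniform(0.3, 0.9, size=(12, 3))    # placeholder accuracies

Z = linkage(profiles, method="ward")
groups = fcluster(Z, t=2, criterion="maxclust")   # ask for two groups
print(groups)   # do the data-driven groups coincide with black vs. white?
```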
