
I am looking for a library, script, or program that can normalize the transcribed and gold texts when computing the word error rate (WER) of an automatic speech recognition system.

For example, if:

  • the gold transcript is "Without the dataset the article is useless"
  • the predicted transcript is "Without the data set the article's useless"

the texts should be normalized so that the WER is 0 (rather than the 3 or 4 errors counted when the texts aren't normalized).
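To make the intent concrete, here is a minimal, self-contained sketch of the kind of normalization I have in mind, followed by a plain Levenshtein-based WER. The CONTRACTIONS and COMPOUNDS tables and the normalize/wer functions are ad hoc names I made up for illustration; they are not taken from any existing library.

import re

# Hypothetical rule tables for illustration only; a real solution would need
# far more rules (or a smarter approach), which is exactly what I'm asking for.
CONTRACTIONS = {"article's": "article is"}  # "'s" is ambiguous in general (possessive vs. "is")
COMPOUNDS = {"data set": "dataset"}

def normalize(text: str) -> list[str]:
    """Lowercase, expand known contractions, merge known compounds,
    strip punctuation (except apostrophes), and return the token list."""
    text = text.lower()
    for src, dst in CONTRACTIONS.items():
        text = text.replace(src, dst)
    for src, dst in COMPOUNDS.items():
        text = text.replace(src, dst)
    text = re.sub(r"[^\w\s']", " ", text)
    return text.split()

def wer(ref: list[str], hyp: list[str]) -> float:
    """Word error rate via the standard Levenshtein distance over tokens."""
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

gold = "Without the dataset the article is useless"
pred = "Without the data set the article's useless"
print(wer(normalize(gold), normalize(pred)))  # 0.0 after normalization

The question is whether something like this already exists with reasonably complete rule sets, rather than me maintaining the tables by hand.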


I have crossposted the question at:


1 Answer


The hubscr.pl tool from sclite uses GLM files for normalization; you can download an example here or here.

The syntax of GLM files is described here (mirror).

GLM files are essentially a regular-expression style of normalization in which you have to list every possible expansion, so they aren't fully generic (see the parsing sketch after the excerpt below).

Excerpt from en20030506.glm (mirror):

[WINNER'S] => [{WINNER'S / WINNER IS / WINNER HAS}] / [ ] __ [ ]
[WINTER'S] => [{WINTER'S / WINTER IS / WINTER HAS }] / [ ] __ [ ]
[WISCONSIN'S] => [{WISCONSIN'S / WISCONSIN IS / WISCONSIN HAS}] / [ ] __ [ ]
[WIT'S] => [{WIT'S / WIT IS / WIT HAS}] / [ ] __ [ ]
[WOMAN'S] => [{WOMAN'S / WOMAN IS / WOMAN HAS}] / [ ] __ [ ]
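As a rough sketch (this is not sclite itself), the alternation rules of the form shown above can be parsed into a table mapping each surface form to its allowed expansions. The regular expression below only targets this "[X] => [{A / B / C}] / [ ] __ [ ]" pattern; real GLM files contain other rule types that it ignores.

import re

RULE_RE = re.compile(r"\[(?P<lhs>[^\]]+)\]\s*=>\s*\[\{(?P<alts>[^}]+)\}\]")

def parse_glm_rules(lines):
    """Return {surface form: [allowed expansions]} for alternation rules."""
    rules = {}
    for line in lines:
        m = RULE_RE.search(line)
        if m:
            lhs = m.group("lhs").strip()
            alts = [a.strip() for a in m.group("alts").split("/")]
            rules[lhs] = alts
    return rules

excerpt = [
    "[WINNER'S] => [{WINNER'S / WINNER IS / WINNER HAS}] / [ ] __ [ ]",
    "[WIT'S] => [{WIT'S / WIT IS / WIT HAS}] / [ ] __ [ ]",
]
print(parse_glm_rules(excerpt))
# {"WINNER'S": ["WINNER'S", 'WINNER IS', 'WINNER HAS'],
#  "WIT'S": ["WIT'S", 'WIT IS', 'WIT HAS']}

During scoring, any of the listed alternatives is accepted as a correct match, so a reference "WINNER'S" is not penalized whether the hypothesis contains "WINNER'S" or "WINNER IS".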