Google's Speech API offers speech-to-text in multiple languages, including Turkish. Turkish is an interesting case because it is agglutinative: instead of using separate prepositions and other function words, as English does, it strings suffixes onto a stem one after another. This leads to an effectively unbounded vocabulary.
Does anyone know how Google implemented Turkish speech recognition for its API? I find it hard to believe they used the same techniques as for English.
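To make the vocabulary problem concrete, here is a toy sketch of suffix stacking. It assumes a tiny, hand-picked subset of Turkish noun suffixes and only uses stems for which these exact suffix variants apply, so vowel harmony and consonant changes are ignored. The names `surface_forms`, `PLURAL`, `POSSESSIVE`, `CASE` and `STEMS` are made up for illustration.

```python
from itertools import product

# Toy suffix slots (assumption: a small, hand-picked subset of Turkish noun
# morphology; vowel harmony is sidestepped by choosing stems that take
# exactly these suffix variants).
PLURAL = ["", "ler"]
POSSESSIVE = ["", "im", "in", "imiz"]  # my, your, our
CASE = ["", "de", "den", "e"]          # in/at, from, to


def surface_forms(stem):
    """Return every stem + plural + possessive + case combination."""
    return [
        stem + plural + possessive + case
        for plural, possessive, case in product(PLURAL, POSSESSIVE, CASE)
    ]


STEMS = ["ev", "el", "dil"]  # house, hand, tongue/language

forms = surface_forms("ev")
all_forms = [form for stem in STEMS for form in surface_forms(stem)]

print(len(forms))            # word forms generated from the single stem "ev"
print("evlerimde" in forms)  # "in my houses" is a single word
print(len(set(all_forms)))   # distinct vocabulary entries from three stems
```

English covers "in my houses" with three words from a small closed set, while here each combination is its own vocabulary entry. With only three optional suffix slots the count already multiplies per stem, and real Turkish has many more slots, including verbal ones.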
UPDATE
Here's an example transcript that the Google API returned for the following clip on YouTube:
you would have to ask him I have no clue Yahoo answers I was Adam Scott really in Jumanji in The Truman Show I looked him up on iTunes it said under movies her is in was Jumanji and The Truman Show I don't * * * * believe it will listen I'm not in either of those movies so yeah you really shouldn't * * * *
I think the transcription quality is excellent. I played the clip through my beautiful AudioEngine monitors and put a crappy 20-year-old LabTec computer mic in front of them. A truly amateur setup, but that's how these things will be used in practice, i.e. in less-than-ideal conditions.
Here's an example from a Turkish movie scene:
merhaba Temmuz Ben hoş geldin kardeş e nasılsınız keyifler iyidir inşallah İyi valla koşturuyoruz nasıl olsun Hem kardeş lafı uzatmadan konuya girsek anlattı bana ikinci el işçiliği Tabii sen güzel bir şey yapıyor Dernek falan da işte ilişkin bir delikanlı eve gelip gidiyor
This one is basically incomprehensible. It picks up some words here and there, but it's hard to connect them, unlike in the English example.
Does this mean that Google is not using a custom solution for Turkish? Maybe they went for repurposing their English-language engines for Turkish?
Just for fun, I sent a clip of an Azeri speaker. His speech is clearly enunciated, but the API barely got a few words. I used the Turkish setting, so it's not really fair, but the languages are similar:
o akşam Çağlayan Doruk sevgilin kim bu kim baktı Bülent Serttaş çok pis
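For anyone who wants to reproduce the comparison: the language is chosen per request through a language code. Below is a minimal sketch, assuming the Cloud Speech-to-Text v1 REST request shape (`config.languageCode` plus base64-encoded `audio.content`) and 16 kHz LINEAR16 audio. The helper name `build_recognize_request` and the placeholder bytes are hypothetical, and nothing is sent over the network.

```python
import base64
import json


def build_recognize_request(audio_bytes, language_code, sample_rate_hertz=16000):
    """Build a JSON-serializable body for a speech:recognize call.

    Assumption: field names follow the Cloud Speech-to-Text v1 REST API;
    check them against the current reference before relying on this.
    """
    return {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
        },
        "audio": {
            # The REST API expects the raw audio as a base64 string.
            "content": base64.b64encode(audio_bytes).decode("ascii"),
        },
    }


# Placeholder bytes standing in for a real recording (hypothetical data).
fake_audio = b"\x00\x01" * 8

english_body = build_recognize_request(fake_audio, "en-US")
turkish_body = build_recognize_request(fake_audio, "tr-TR")

print(json.dumps(turkish_body, indent=2))
```

The two request bodies differ only in `languageCode`, so any difference in transcript quality comes from whatever model sits behind each language code.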