Convert IPA Symbols to Speech Sounds Online (MP3)

For a long time, I had been curious what an IPA-to-speech (MP3 audio) converter would sound like. Google doesn't list any free tools, so today I wrote a free converter in Python. The code is extremely basic and not a good example of programming, but it does illustrate why so few converters probably exist: the results are nearly worthless, at least with such simple logic (i.e., just playing each IPA sound in order as transcribed). Here is a zip file of the sounds I used, which you can obviously re-record (and improve) on your own. Note that for the "pipe" example, I purposely recorded a diphthong MP3 called ai.mp3 and added it to the replacements list to improve the results.

judge d͡ʒʌd͡ʒ judge.mp3
pie pʰaɪ pie.mp3
pipe pʰaɪpʰ pipe.mp3

IPA Transcription to Sounds code

To use it, put your IPA transcription in the i variable and adjust the path (the p variable) accordingly. You'll need ffmpeg. The code maps d͡ʒ in the i variable to d͡ʒ.mp3 and ʌ to ʌ.mp3, then combines the individual MP3s into one MP3 called output.mp3. The hard part is representing each IPA sound as a single character, which is difficult with affricates like d͡ʒ and t͡ʃ, so the code temporarily converts them into a digit (1, 2, 3, 4) before splitting the string into a list of elements.

from subprocess import call

i = u"phaɪph"
p = "/tmp/s/"
r = {
        u"tʃ": u"1",
        u"dʒ": u"2",
        u"t͡ʃ": u"1",
        u"d͡ʒ": u"2",
        #u"ph": u"3",
        u"ph": u"3",
        u"aɪ": u"4",
    } # replacements dictionary

i = i.replace(u" ",u"0") # replace spaces with 0

#input_str = raw_input("Enter something: ")

# http://stackoverflow.com/questions/17295776/how-to-replace-elements-in-a-list-using-dictionary-lookup
b = {v: k for k, v in r.items()} # reverse the dict (items() works on Python 2 and 3)

# make 2 character sounds represented by 1
for e in r:
    i = i.replace(e,str(r[e]))
l = list(i)
   
# map placeholder digits back to their IPA sounds; other characters pass through
f = [b.get(e, e) for e in l]


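# ffmpeg's concat: protocol takes the input files separated by |
c = u"|".join(p + s + u".mp3" for s in f)
# pass the arguments as a list so no shell quoting is needed
o = ["/usr/bin/ffmpeg", "-i", u"concat:" + c,
     "-y",                # overwrite output.mp3 if it already exists
     "-acodec", "copy",   # copy the audio streams without re-encoding
     p + "output.mp3"]
print(u" ".join(o))
call(o)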
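
The digit-placeholder trick above can also be written as a longest-match tokenizer. What follows is only a minimal sketch, not the code I actually ran: the MULTI list, the tokenize and concat_input names, and the choice to map a space to "0" (mirroring the replace step above) are all assumptions made for illustration.

```python
import re

# Multi-character sounds to treat as one unit (an illustrative list; extend as needed).
MULTI = ["t͡ʃ", "d͡ʒ", "tʃ", "dʒ", "pʰ", "aɪ"]

def tokenize(ipa, multi=MULTI):
    """Split an IPA string into sound units, trying longer units first."""
    alternatives = sorted(multi, key=len, reverse=True)
    # Known multi-character units first, then fall back to any single character.
    pattern = re.compile("|".join(re.escape(a) for a in alternatives) + "|.")
    # A space becomes "0", as in the script above.
    return ["0" if t == " " else t for t in pattern.findall(ipa)]

def concat_input(tokens, sound_dir):
    """Build the value for ffmpeg's -i option using the concat: protocol."""
    return "concat:" + "|".join(sound_dir + t + ".mp3" for t in tokens)

if __name__ == "__main__":
    units = tokenize("pʰaɪpʰ")
    print(units)
    print(concat_input(units, "/tmp/s/"))
```

Because the units are matched directly, there is no need to reserve digits as placeholders, and adding a new multi-character sound only means adding it to the list.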

Improving Things

To improve this, you need a more sophisticated approach. First, download TextAloud. Then download an IVONA voice. IVONA voices accept PLS (Pronunciation Lexicon Specification) files (open the IVONA MiniReader and look for the PLS tab). PLS, by the way, is a W3C specification. In a PLS file, you tell IVONA how to pronounce words and, crucially, you can specify IPA as the transcription alphabet. Here is an example.

<?xml version="1.0" encoding="UTF-8"?>
<lexicon
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
     alphabet="x-sampa" xml:lang="en-US">
  <lexeme>
    <grapheme>judge</grapheme>
    <phoneme alphabet="ipa">d͡ʒʌd͡ʒ</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>pie</grapheme>
    <phoneme alphabet="ipa">ˈpʰaɪ</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>pipe</grapheme>
    <phoneme alphabet="ipa">pʰaɪp</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>affricates</grapheme>
    <phoneme alphabet="ipa">ˈæfrəkəts</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>affricate</grapheme>
    <phoneme alphabet="ipa">ˈæfrəkət</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>dz</grapheme>
    <phoneme alphabet="ipa">d͡ʒə:</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>je</grapheme>
    <phoneme alphabet="ipa">ʒə:ː</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>test</grapheme>
    <phoneme alphabet="ipa">a:ɪ wɛnt tə ðə stɔɚ ən bɑ?t ə næɪs bɑɾɫ ə wɑɪn</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>alveolar</grapheme>
    <phoneme alphabet="ipa">ˌalviˈolər</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>postalveolar</grapheme>
    <phoneme alphabet="ipa">ˈpoʊs ˌdalviˈolər</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>chello</grapheme>
    <phoneme alphabet="ipa">ˈt͡ʃɛlo</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>test3</grapheme>
    <phoneme alphabet="ipa">ˈpʰaɪp</phoneme>
  </lexeme>
</lexicon>
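
If you have many words, writing lexeme entries by hand gets tedious. As a rough sketch (not something from my original workflow), a lexicon like the one above could be generated from a Python dictionary with the standard library's xml.etree.ElementTree. The build_pls name and the entries argument are hypothetical. The version="1.0" attribute follows my reading of the W3C PLS specification rather than anything I have verified against IVONA.

```python
import xml.etree.ElementTree as ET

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
XML_NS = "http://www.w3.org/XML/1998/namespace"

def build_pls(entries, lang="en-US"):
    """Return a PLS lexicon (as a string) mapping each word to an IPA transcription."""
    # Serialize PLS elements in the default namespace, without a prefix.
    ET.register_namespace("", PLS_NS)
    root = ET.Element("{%s}lexicon" % PLS_NS, {
        "version": "1.0",            # assumed from the W3C spec; untested with IVONA
        "alphabet": "ipa",           # default alphabet for every phoneme element
        "{%s}lang" % XML_NS: lang,   # serialized as xml:lang
    })
    for word, ipa in entries.items():
        lexeme = ET.SubElement(root, "{%s}lexeme" % PLS_NS)
        ET.SubElement(lexeme, "{%s}grapheme" % PLS_NS).text = word
        ET.SubElement(lexeme, "{%s}phoneme" % PLS_NS).text = ipa
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    print(build_pls({"judge": "d͡ʒʌd͡ʒ", "pie": "ˈpʰaɪ", "pipe": "pʰaɪp"}))
```

Setting alphabet="ipa" once on the lexicon element means the individual phoneme elements do not need their own alphabet attribute, whereas the hand-written file above declares x-sampa as the default and overrides it per phoneme.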

IVONA voice: IPA to Sounds transcription using a PLS file

Listen to the difference.

judge d͡ʒʌd͡ʒ judge.mp3
pie pʰaɪ pie.mp3
pipe pʰaɪp pipe.mp3

Comments

I've been searching for software that converts IPA to a sound file but can't seem to find one. I understand that it isn't a straightforward problem but it would be useful given that IPA is used in Wikipedia so much

Sean, I got inspired by this article and tried using IVONA to convert IPA to speech for my local language. It turned out that these [...] In the final lexeme tag you have mapped 'test3' to 'phaip'. That is not happening in my case. :(
