Well, I finally have a program that takes a speech clip and breaks it down into its phonemes, with a start time and duration for each phoneme. The JSON file looks like this:
    }, {
      "word": "letter",
      "start": 945,
      "duration": 36,
      "phonemes": [
        { "phoneme": "L", "start": 945, "duration": 6 },
        { "phoneme": "EH", "start": 951, "duration": 7 },
        { "phoneme": "T", "start": 958, "duration": 8 },
        { "phoneme": "ER", "start": 966, "duration": 16 }
      ]
    }]
The next step is to take this information and convert it into animation data for a ManuelBastioniLab character. The code is here. It still needs work to make it compile and build easily. (Linux only... sorry.)
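As a rough sketch of where that conversion could start, the Python below flattens the per-word phoneme data into a single time-ordered list that an animation step could walk through. It is only an illustration, not the actual converter: the names `flatten_phonemes` and `timeline` are made up for this example, it assumes the top level of the JSON file is a list of word objects shaped like the excerpt above, and it passes the `start` and `duration` values through unchanged because their unit is not stated here.

```python
import json


def flatten_phonemes(words):
    """Return a time-ordered list of (phoneme, start, end) tuples.

    `words` is assumed to be a list of word objects, each with a
    "phonemes" list whose entries have "phoneme", "start" and "duration".
    """
    timeline = []
    for word in words:
        for entry in word["phonemes"]:
            start = entry["start"]
            end = start + entry["duration"]  # same (unspecified) unit as the input
            timeline.append((entry["phoneme"], start, end))
    # Sort by start time in case words or phonemes arrive out of order.
    timeline.sort(key=lambda item: item[1])
    return timeline


# The "letter" entry from the excerpt, wrapped in a list so it parses on its own.
sample = json.loads("""
[{ "word": "letter", "start": 945, "duration": 36, "phonemes": [
    { "phoneme": "L", "start": 945, "duration": 6 },
    { "phoneme": "EH", "start": 951, "duration": 7 },
    { "phoneme": "T", "start": 958, "duration": 8 },
    { "phoneme": "ER", "start": 966, "duration": 16 }] }]
""")

timeline = flatten_phonemes(sample)
for phoneme, start, end in timeline:
    print(phoneme, start, end)
```

From a list like this, each phoneme could then be mapped to a mouth shape and keyed at its start time. That mapping depends on the character rig, so it is left out here.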