Saturday, April 21, 2007

 

Avatars on the cheap: sculptoris

If used sensibly, avatars have the potential to enhance the user interaction with your web site and the retention and fun factor of your e-learning modules.

Today I want to discuss one cheap way to include high quality talking avatars. I've been playing around with the Sculptoris Voice Lite program. On their site (www.sculptoris.com) you are greeted by one of the characters, and you can download a 30 day trial version. After that the license is only $99, and you don't need to pay for redistribution rights on your character files, which are exported as Flash 5 files. Flash has about 99% coverage if I believe Adobe, so it should play almost everywhere.

The program could not be easier to use:

1- Choose the sound file (.wav) that contains the voice. It can be your own recording or a Text-To-Speech voice.

2- Optionally select the text file (.txt) that contains the text spoken in step 1. If you provide the file, the lip synchronisation will be more accurate.

3- Select the character to use. There are several built-in characters: male, female or even animal. If you purchase the additional Voice Character Creation Development Kit, you can make your own avatars or derive them from the free source files of the default ones.

4- Select the output flash file location and name.

5- Press 'Build Character'. Done. You will find the flash (.swf) and html preview file in the folder you selected. To include in your site, just copy/paste the HTML code and upload the flash file.
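
For readers who prefer to generate the embed code themselves rather than copy it from the preview file, here is a minimal Python sketch. It produces generic `<object>`/`<embed>` markup for a Flash file. It is not the HTML that Sculptoris writes; the function name `flash_embed_html`, the file name `otto.swf` and the default size are my own assumptions for illustration.

```python
from html import escape


def flash_embed_html(swf_url: str, width: int = 320, height: int = 240) -> str:
    """Return generic <object>/<embed> markup for a Flash movie.

    Hypothetical helper: the markup is a common pattern for embedding
    .swf files, not the exact output of the Sculptoris preview page.
    """
    # Escape the URL so characters like & or " cannot break the attributes.
    url = escape(swf_url, quote=True)
    return (
        f'<object type="application/x-shockwave-flash" data="{url}" '
        f'width="{width}" height="{height}">\n'
        f'  <param name="movie" value="{url}">\n'
        f'  <embed src="{url}" type="application/x-shockwave-flash" '
        f'width="{width}" height="{height}">\n'
        f'</object>'
    )


if __name__ == "__main__":
    # "otto.swf" is an assumed file name for the exported character.
    print(flash_embed_html("otto.swf"))
```

Paste the printed markup into your page and upload the .swf file next to it, adjusting the width and height to the size of your character.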

I've made two examples, one with Text-To-Speech synthetic voice, and one with my own voice.

The first example can be viewed here.
The avatar (I named him Otto) asks one of the questions that I received from Lut for my birthday after my call in this blog post. The question is in English.

The Text-To-Speech voice was made with TextAloud, as described in this previous post.




The second example can be viewed here.
It is my own voice and gives you another question donated by my neighbour and brother-in-law Hans, this time in Dutch.
The recording is very basic and noisy because it was made with the default Sound Recorder tool found in Windows. There are probably much better tools available, but I'm not an expert on sound engineering and background noise reduction.

I think Sculptoris is a good and relatively cheap way to add some fun and interaction to your site or e-learning, and I'm considering moderate use of the flash avatars on about2findout.com.



Wednesday, April 11, 2007

 

Text-To-Speech and education

I've been toying with Text-To-Speech (TTS). These products allow you to input a text (from a document, typed in, from a web page,...) and hear it out loud via a synthetic, artificial voice.

The million dollar question: is TTS technology good enough today to include in e-learning? The answer: no. There has been remarkable progress, and some premium voices sound quite natural. But you can still tell the difference, which distracts a learner from the content. They are OK for short periods, but you don't want to listen to a TTS voice for 15 minutes unless you really have to.

The usage

That doesn't mean TTS cannot be used for learning. I suggest these usages:
Before we discuss some products, first some terminology:
A TTS solution comes in two parts:
- The software to generate the sound output from the text input
- The voices

The software

There are many text readers available. For our usage we need a tool that can also export the sound to files, preferably in a batch or automated process. For complete automation, the tool should support an API (interface) or command line that other programs can call.

I recommend TextAloud from Nextup.com. It's a popular, good shareware tool that is very cheap ($29.95, with a discount of $5 when you purchase some voices as well). You can try it out for 30 days. It has an easy interface, supports both SAPI4 and SAPI5 voices and allows you to change the pitch, tone and volume of a voice. Out of the box it exports to mp3 and wav file formats, and when you install a free extra ActiveX encoder it also exports to wma files. But the most interesting feature is the batch conversion. Just put the text files in a folder, point to it, and the tool creates the corresponding voice files. Other features I like are the possibility to change voice within a text and to add your own vocabulary.
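
To make the batch idea concrete (one text file in, one sound file out), here is a small Python sketch. It is not TextAloud's API or command line, which I am not showing here. The names `batch_convert` and `write_silence` are hypothetical, and the stand-in "synthesizer" only writes silence so the sketch runs without any TTS engine installed. In practice you would pass in a function that calls your real TTS tool.

```python
import wave
from pathlib import Path
from typing import Callable, List


def write_silence(text: str, target: Path) -> None:
    """Stand-in for a real TTS call: writes a silent mono 16-bit WAV
    whose length grows with the text (0.05 s per character, assumed)."""
    frame_rate = 8000
    n_frames = int(len(text) * 0.05 * frame_rate)
    with wave.open(str(target), "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)          # 2 bytes = 16-bit samples
        w.setframerate(frame_rate)
        w.writeframes(b"\x00\x00" * n_frames)


def batch_convert(text_dir: Path, out_dir: Path,
                  synthesize: Callable[[str, Path], None],
                  ext: str = ".wav") -> List[Path]:
    """Create one sound file in out_dir for every .txt file in text_dir."""
    out_dir.mkdir(parents=True, exist_ok=True)
    produced = []
    for txt in sorted(text_dir.glob("*.txt")):
        # question1.txt becomes question1.wav, and so on.
        target = out_dir / (txt.stem + ext)
        synthesize(txt.read_text(encoding="utf-8"), target)
        produced.append(target)
    return produced
```

Calling `batch_convert(Path("texts"), Path("voices"), write_silence)` would then leave one .wav file in `voices` for each .txt file in `texts`; both folder names are just examples.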

TextAloud also has an API and a command line interface, but for that you need to pay an extra license of 250$.

Other tools I came across:
- Ultra Hal reader; comes with the NeoSpeech voices Kate and Paul for only 24.95$ which makes it a cheaper package. No batch export or any automation.
- TextSound from ByteCool: another shareware tool, with a command line tool, for about $29.95. You can download a trial that lasts 50 conversions.
- Balabolka: (thanks for the link Ralph!) a freeware tool that reads text in SAPI4 or SAPI5 voices and can export to wav. No automation or batch. But totally free and good.

My recommendation for the tool: Buy TextAloud. All samples below are created with it.

The voices

Regardless of the tool you use for conversion, you want good voices. Together with MS Agent, Microsoft released the voices 'Sam', 'Mike' and 'Mary'. That was 5 years ago, and you can tell by listening to the samples below. They are the typical robot-like monotone voices we have come to hate. Most tools ship with them because they are free. They are available in SAPI4 and SAPI5 versions.

Sample of Microsoft Sam (wav)
Sample of Microsoft Mike (wav)
Sample of Microsoft Mary (mp3)

Microsoft also included the L&H TruVoice voice engines for free download. (See the section 'free voices' at the end of this link for download.) Without going into the dramatic national story of the Belgian pride Lernout&Hauspie Speech Technologies going bust, these voices are what is left from that era. They are equally old now but better, and they include non-English voices. I guess they are free because who will sue you for using them? They are available in SAPI4 versions.

Sample of TruVoice CarolUK (mp3)
Sample of TruVoice PeterUK (mp3)

Now we get to the acceptable voices. The two examples below are 'special voices', also free, that represent a whisper and a robot voice. Since neither is meant to sound like natural speech, it doesn't matter so much that the voice is synthetic.

Sample of male whisper (mp3)
Sample of robot voice (wma)

That covers the free voices, which are also available on http://www.bytecool.com/voices.htm.

There are also many 'premium voices' available from companies like AT&T, Cepstral, NeoSpeech, Acapela and others. They charge for voices, but the quality is much better. These companies have invested millions in their voices, so voices for commercial use can become quite expensive.

Sample of Cepstral Amy (mp3) - unregistered voice includes a registration message
Sample of Cepstral David (mp3) - unregistered voice includes a registration message

You can buy voices here. They cost between $30 and $50 each for personal use.
Very nice and recommended voices are available from NeoSpeech. You can try your own text in these online demos:

NeoSpeech demo: http://www.neospeech.com/demo/demo_text.php
Acapela demo: http://demo.acapela-group.com

But license restrictions apply: you cannot distribute the sound files unless you pay extra. In general, AT&T Natural Voices licensing can be very expensive, and Cepstral and NeoSpeech are more reasonable, but none of the redistribution licenses start below $1500. The only affordable distribution licenses are available on the Cepstral web store, which offers an Audio Distribution License for $199 per voice.

My recommendation: buy the NeoSpeech Paul and Kate voices ($35) for personal use. For commercial use or distribution, buy Cepstral voices.


