Festival Speech Synthesis System - 23 Audio output
If you have never heard any audio on your machine, you must
first work out whether you have the appropriate hardware. If you do, you
also need the appropriate software to drive it. Festival can interface
directly with a number of audio systems or use external
methods for playing audio.
The currently supported audio methods are
- `NAS'
NCD's NAS is a network-transparent audio system (formerly called
netaudio). If you already run NAS servers on your machines, you
simply need to ensure your
AUDIOSERVER environment variable
is set (or your DISPLAY variable if your audio output device is the
same as your X Windows display).
You may set NAS as your audio output method by the command
(Parameter.set 'Audio_Method 'netaudio)
- `/dev/audio'
On many systems `/dev/audio' offers a simple low level method for
audio output. It is limited to mu-law encoding at 8 kHz. Some
implementations of `/dev/audio' allow other sample rates and sample
types, but as that is non-standard this method only uses the common
format. Typical systems that offer this are Sun, Linux and FreeBSD
machines. You may set direct `/dev/audio' access as your audio
method by the command
(Parameter.set 'Audio_Method 'sunaudio)
- `/dev/audio (16bit)'
Later Sun Microsystems workstations support 16 bit
linear audio at various sample rates. Festival supports this form
of audio output as a compile time option (it requires include
files that only exist on Sun machines). If
your installation supports it (check the members of the list
*modules*) you can select 16 bit audio output on
Suns by the command
(Parameter.set 'Audio_Method 'sun16audio)
Note that this sends audio to the machine where the festival binary
is running, which might not be the one you are sitting next to. That is
why we recommend netaudio. A hacky solution to playing audio on a local
machine from a remote machine without using netaudio is described
in section 6 Installation.
- `/dev/dsp (voxware)'
Both FreeBSD and Linux have a very similar audio interface through
`/dev/dsp'. There is compile time support for these in the speech
tools and when compiled with that option Festival may utilise it.
Check the value of the variable
*modules* to see which audio
devices are directly supported. On FreeBSD, if supported, you
may select local 16 bit linear audio by the command
(Parameter.set 'Audio_Method 'freebsd16audio)
While under Linux, if supported, you may use the command
(Parameter.set 'Audio_Method 'linux16audio)
Some earlier (and smaller) machines only have 8 bit audio even though
they include a `/dev/dsp' (Soundblaster PRO for example). This was
not dealt with properly in earlier versions of the system, but now the
support automatically checks which sample width is supported and uses
it accordingly. 8 bit audio at sample rates higher than 8 kHz sounds better than
straight 8 kHz ulaw, so this feature is useful.
- `mplayer'
Under Windows NT or 95 you can use the `mplayer' command, which
we have found requires special treatment to get its parameters right.
Rather than using
Audio_Command you can select this on a
Windows machine with the following command
(Parameter.set 'Audio_Method 'mplayeraudio)
Alternatively, built-in audio output is available with
(Parameter.set 'Audio_Method 'win32audio)
- `SGI IRIX'
Built-in audio output is now available for SGI's IRIX 6.2 using
the command
(Parameter.set 'Audio_Method 'irixaudio)
- `Audio Command'
Alternatively the user can provide a command that can play an audio
file. Festival will execute that command in an environment where the
shell variable SR is set to the sample rate (in Hz) and the shell
variable FILE is set to the name of a file which, by default, is an
unheadered raw 16 bit file containing the synthesized waveform in the
byte order of the machine Festival is running on. You can specify your
audio play command, and that you wish Festival to execute that command,
through the following commands
(Parameter.set 'Audio_Command "sun16play -f $SR $FILE")
(Parameter.set 'Audio_Method 'Audio_Command)
On SGI machines under IRIX the equivalent would be
(Parameter.set 'Audio_Command
"sfplay -i integer 16 2scomp rate $SR end $FILE")
(Parameter.set 'Audio_Method 'Audio_Command)
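As a further illustration of the default raw output, a Linux machine with the ALSA utilities installed could play the file with aplay. This is a sketch only: aplay is not mentioned in the text above, and the S16_LE sample format assumes a little-endian machine, since FILE is written in the byte order of the machine Festival is running on.

```scheme
;; Assumption: ALSA's aplay is installed; it is not part of Festival.
;; -t raw: unheadered input, -f S16_LE: 16 bit signed little-endian,
;; -c 1: single channel, -r $SR: sample rate supplied by Festival,
;; -q: suppress aplay's status messages.
(Parameter.set 'Audio_Command "aplay -q -c 1 -t raw -f S16_LE -r $SR $FILE")
(Parameter.set 'Audio_Method 'Audio_Command)
```

On a big-endian machine the sample format argument would need to change to match.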
With the Audio_Command method of playing waveforms, Festival supports
two additional audio parameters. Audio_Required_Rate allows you
to use Festival's internal sample rate conversion function to convert
to any desired rate. Note this may not be as good as playing the waveform at the
sample rate it was originally created in, but as some hardware devices
are restrictive in what sample rates they support, or have naive
resample functions, this could be the best option. The second additional
audio parameter is Audio_Required_Format, which can be
used to specify the desired output format of the file. The default
is unheadered raw, but this may be any of the values supported by
the speech tools (including nist, esps, snd, riff, aiff, audlab, raw
and, if you really want it, ascii). For example, suppose you
have a program that only plays Sun headered files at 16000 Hz. You can
set up audio output as
(Parameter.set 'Audio_Method 'Audio_Command)
(Parameter.set 'Audio_Required_Rate 16000)
(Parameter.set 'Audio_Required_Format 'snd)
(Parameter.set 'Audio_Command "sunplay $FILE")
Where the audio method supports it, you can specify an alternative audio
device for machines that have more than one audio device.
(Parameter.set 'Audio_Device "/dev/dsp2")
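Since the direct audio methods are compile time options, a site initialization file might check *modules* before selecting one. The following is a minimal sketch, assuming the module names listed in *modules* are the same symbols as the Audio_Method names used above; check the value of *modules* in your own installation before relying on it.

```scheme
;; Pick the first direct 16 bit method this build supports.
;; Assumption: module names in *modules* match the Audio_Method symbols.
(cond
 ((member 'linux16audio *modules*)
  (Parameter.set 'Audio_Method 'linux16audio))
 ((member 'freebsd16audio *modules*)
  (Parameter.set 'Audio_Method 'freebsd16audio))
 ((member 'sun16audio *modules*)
  (Parameter.set 'Audio_Method 'sun16audio))
 (t
  ;; No direct 16 bit support found: fall back to 8 kHz mu-law /dev/audio.
  (Parameter.set 'Audio_Method 'sunaudio)))
```

The order of the clauses is arbitrary here, as a given build will normally include at most one of these platform-specific modules.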
If Netaudio is not available and you need to play audio on a
machine different from the one Festival is running on, we have
had reports that `snack' (http://www.speech.kth.se/snack/)
is a possible solution. It allows remote play and, importantly,
also supports Windows 95/NT based clients.
Because you do not want to wait for a whole file to be synthesized
before you can play it, Festival also offers an audio spooler
that allows the playing of audio files while continuing to synthesize
the following utterances. On reasonable workstations this allows the
breaks between utterances to be as short as your hardware allows them
to be.
The audio spooler may be started by selecting asynchronous
mode
(audio_mode 'async)
This is switched on by default by the function tts.
You may put Festival back into synchronous mode (i.e. the utt.play
command will wait until the audio has finished playing before returning)
by the command
(audio_mode 'sync)
Additional related commands are
(audio_mode 'close)
Close the audio server down but wait until it is cleared. This is
useful in scripts etc. when you wish to only exit when all audio is
complete.
(audio_mode 'shutup)
Close the audio down now, stopping the current file being played and
any in the queue. Note that this may take some time to take effect
depending on which audio method you use. Sometimes there can be
hundreds of milliseconds of audio in the device itself which cannot
be stopped.
(audio_mode 'query)
Lists the size of each waveform currently in the queue.
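The spooler commands above might be combined in a script as follows. This is a sketch only: it uses Festival's SayText function, which is not described in this section, to synthesize and play each utterance, and the sentences are placeholders.

```scheme
;; Start the spooler so each utterance plays while the next is synthesized.
(audio_mode 'async)
(SayText "This is the first utterance.")
(SayText "This is the second utterance.")
;; Wait for all queued audio to finish playing, then close the spooler,
;; so the script does not exit while audio is still pending.
(audio_mode 'close)
```

Replacing the final command with (audio_mode 'shutup) would instead discard whatever is still queued.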