Festival Speech Synthesis System - 17 Phrase breaks
There are two methods for predicting phrase breaks in Festival, one
simple and one sophisticated. The method is selected through
the parameter Phrase_Method, and phrasing is carried out by the
module Phrasify.

The first method is by CART tree. If the parameter Phrase_Method is
cart_tree, the CART tree in the variable phrase_cart_tree
is applied to each word to see if a break should be inserted or not.
The tree should predict the categories BB (for big break), B
(for break) or NB (for no break). A simple example of a tree to
predict phrase breaks is given in the file `lib/phrase.scm'.
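(set! simple_phrase_cart_tree
'
;; Sentence-final punctuation on the parent token: big break
((R:Token.parent.punc in ("?" "." ":"))
  ((BB))
  ;; Other punctuation: ordinary break
  ((R:Token.parent.punc in ("'" "\"" "," ";"))
   ((B))
   ;; No following word (end of utterance): big break, otherwise none
   ((n.name is 0)
    ((BB))
    ((NB))))))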
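
As a minimal sketch of how the CART method might be switched on, the following assumes that `lib/phrase.scm' has already been loaded, so that simple_phrase_cart_tree is defined. It copies the example tree into the variable that Phrasify consults and then selects the method.

```scheme
;; Use the example tree as the active phrasing tree.
(set! phrase_cart_tree simple_phrase_cart_tree)

;; Ask the Phrasify module to use the CART tree method.
(Parameter.set 'Phrase_Method 'cart_tree)
```

A voice that needs different behaviour would assign its own tree to phrase_cart_tree in the same way.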
The second and more elaborate method of phrase break prediction is used
when the parameter Phrase_Method is prob_models. In this
case a probabilistic model is used, which gives the probability of a
break after a word based on the part of speech of the neighbouring words
and the previous word. This is combined with an ngram model of the
distribution of breaks and non-breaks, using a Viterbi decoder to find
the optimal phrasing of the utterance. This technique gives good
results, including on unseen data from other researchers' phrase break
tests (see black97b). However, it does sometimes sound wrong,
suggesting that further work is still required.
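
One way to judge the predicted phrasing is to synthesize a test sentence and list the break assigned to each word. The sketch below assumes an English voice is loaded, that the English parameter list english_phr_break_params from `lib/phrase.scm' (described next) is available, and that the predicted break is stored on each word as the feature pbreak. The test sentence is only an example.

```scheme
;; Select the probabilistic method with the English parameters.
(set! phr_break_params english_phr_break_params)
(Parameter.set 'Phrase_Method 'prob_models)

;; Synthesize a test sentence (the text is illustrative only).
(set! utt1
      (utt.synth
       (Utterance Text "After the meeting, we went home and slept.")))

;; Pair each word with its predicted break.
;; Assumption: the break is held in the word feature "pbreak".
(mapcar
 (lambda (w) (list (item.name w) (item.feat w "pbreak")))
 (utt.relation.items utt1 'Word))
```

Running the same sentence with Phrase_Method set to cart_tree gives a quick comparison of the two methods.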
Parameters for this module are set through the feature list held
in the variable phr_break_params, an example of which
for English is set in english_phr_break_params in
the file `lib/phrase.scm'. The feature names and their meanings are:
pos_ngram_name
    The name of a loaded ngram that gives probability distributions of
    B/NB given the previous, current and next part of speech.
pos_ngram_filename
    The name of the file containing the ngram pos_ngram_name.
break_ngram_name
    The name of a loaded ngram of B/NB distributions. This is typically
    a 6 or 7-gram.
break_ngram_filename
    The name of the file containing the ngram break_ngram_name.
gram_scale_s
    A weighting factor for breaks in the break/non-break ngram.
    Increasing the value inserts more breaks; reducing it causes fewer
    breaks to be inserted.
phrase_type_tree
    A CART tree that is used to predict the type of break given the
    predicted break position. This (rather crude) technique is currently
    used to distinguish major and minor breaks.
break_tags
    A list of the break tags (typically (B NB)).
pos_map
    A part of speech map used to map the pos feature of words
    into a smaller tagset used by the phrase predictor.
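
As a sketch of how these parameters might be adjusted, the following copies the English list and replaces only gram_scale_s. It assumes that each entry in the list is a (name value) pair; the name my_phr_break_params and the value 0.8 are hypothetical and chosen purely for illustration.

```scheme
;; Copy the English parameter list, replacing only gram_scale_s.
;; Assumption: each entry has the form (name value).
(set! my_phr_break_params
      (mapcar
       (lambda (entry)
         (if (eq? (car entry) 'gram_scale_s)
             (list 'gram_scale_s 0.8)   ; illustrative value only
             entry))
       english_phr_break_params))

;; Make the modified list the active one and select the method.
(set! phr_break_params my_phr_break_params)
(Parameter.set 'Phrase_Method 'prob_models)
```

Copying the list rather than editing english_phr_break_params in place leaves the original settings available for comparison.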