AzurePersonalVoice Class

Azure personal voice configuration.

Constructor

AzurePersonalVoice(*args: Any, **kwargs: Any)

Variables

Name Description
type
str or azure.ai.voicelive.models.AZURE_PERSONAL

Required. Azure personal voice.

name
str

Required. The voice name; cannot be empty.

temperature
float

Temperature must be between 0.0 and 1.0.

model
str or PersonalVoiceModels

Required. Underlying neural model to use for personal voice. Known values are: "DragonLatestNeural", "PhoenixLatestNeural", "PhoenixV2Neural", "DragonHDOmniLatestNeural", and "MAI-Voice-1".

custom_lexicon_url
str

URL of a custom lexicon file for pronunciation customization.

custom_text_normalization_url
str

URL of a custom text normalization endpoint.

prefer_locales
list[str]

Preferred locales in BCP-47 format that change the accent used for each language. If not set, TTS uses the default accent for each language (e.g., American English for English, Mexican Spanish for Spanish). Setting this to ["en-GB", "es-ES"] changes the English accent to British English and the Spanish accent to European Spanish, while TTS can still speak other languages, such as French or Chinese, with their default accents.

locale
str

Enforced locale in BCP-47 format for TTS output. If set, TTS will always use the specified locale to speak. For example, setting locale to en-US forces American English accent for all text content, even if the text is in another language, and TTS will output silence for unsupported languages (e.g., Chinese text with en-US locale). If not set, TTS automatically detects the language from the text content.

style
str

Speaking style for the voice (e.g., 'cheerful', 'sad').

pitch
str

Pitch adjustment for the voice output. Follows the same rules as the pitch attribute of the SSML prosody element (see https://dotnet.territoriali.olinfo.it/azure/ai-services/speech-service/speech-synthesis-markup-voice#adjust-prosody). Typical values: a named level (x-low, low, medium, high, x-high, default), a relative change (e.g., +10%, -5%, +50Hz, -2st), or an absolute frequency (e.g., 200Hz).

rate
str

Speaking rate adjustment for the voice output. Follows the same rules as the rate attribute of the SSML prosody element (see https://dotnet.territoriali.olinfo.it/azure/ai-services/speech-service/speech-synthesis-markup-voice#adjust-prosody). Typical values: a named level (x-slow, slow, medium, fast, x-fast, default), a relative percentage (e.g., +20%, -10%), or a non-negative multiplier (e.g., 0.5, 1.5).

volume
str

Volume adjustment for the voice output. Follows the same rules as the volume attribute of the SSML prosody element (see https://dotnet.territoriali.olinfo.it/azure/ai-services/speech-service/speech-synthesis-markup-voice#adjust-prosody). Typical values: a named level (silent, x-soft, soft, medium, loud, x-loud, default), an absolute number from 0.0 to 100.0, or a relative change (e.g., +10, -6dB).
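The constraints in the table above (non-empty name, required model, temperature between 0.0 and 1.0) can be checked before a configuration is sent. The following is a minimal sketch, not part of the SDK: build_personal_voice_config is a hypothetical helper, and the dictionary keys simply mirror the variable names on this page, which may differ from the field names the service actually expects.

```python
from typing import Any


def build_personal_voice_config(name: str, model: str, **optional: Any) -> dict[str, Any]:
    """Hypothetical helper: apply the documented constraints client-side."""
    if not name:
        raise ValueError("name is required and cannot be empty")
    if not model:
        raise ValueError("model is required")

    temperature = optional.get("temperature")
    if temperature is not None and not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")

    # 'azure-personal' is the type value shown for this class.
    config: dict[str, Any] = {"type": "azure-personal", "name": name, "model": model}
    # Optional settings left as None are simply omitted (an assumption of this sketch).
    config.update({key: value for key, value in optional.items() if value is not None})
    return config


config = build_personal_voice_config(
    "my-voice",                # placeholder voice name
    "PhoenixLatestNeural",     # one of the known model values listed above
    temperature=0.7,
    prefer_locales=["en-GB", "es-ES"],
)
```

With the SDK installed, the same keyword arguments would presumably be passed to AzurePersonalVoice directly, since the constructor accepts arbitrary keyword arguments.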

Methods

as_dict

Return a dict that can be serialized to JSON with json.dump.

clear

Remove all items from D.

copy
get

Get the value for key if key is in the dictionary, else default.

items
keys
pop

Remove the specified key and return the corresponding value. Raises KeyError if key is not found and default is not given.

popitem

Remove and return a (key, value) pair. Raises KeyError if D is empty.

setdefault

Return the value for key if key is in the dictionary; otherwise set D[key] to default and return default.

update

Update D from mapping/iterable E and F.

values

as_dict

Return a dict that can be serialized to JSON with json.dump.

as_dict(*, exclude_readonly: bool = False) -> dict[str, Any]

Keyword-Only Parameters

Name Description
exclude_readonly

Whether to remove the readonly properties.

Default value: False

Returns

Type Description
dict[str, Any]

A JSON-compatible dict.
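To illustrate what "JSON-compatible" means in practice, the sketch below serializes a plain dict of the kind as_dict is described as returning. The payload here is a hand-written stand-in, not actual SDK output, and the voice name is a placeholder.

```python
import json

# Stand-in for the result of as_dict(); the keys mirror attribute names on this page.
payload = {
    "type": "azure-personal",
    "name": "my-voice",
    "model": "PhoenixLatestNeural",
    "prefer_locales": ["en-GB", "es-ES"],
}

# A JSON-compatible dict survives a round trip through the json module unchanged.
text = json.dumps(payload, sort_keys=True)
restored = json.loads(text)
```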

clear

Remove all items from D.

clear() -> None

copy

copy() -> Model

get

Get the value for key if key is in the dictionary, else default.

get(key: str, default: Any = None) -> Any

Parameters

Name Description
key
Required
str

The key to look up.

default
any

The value to return if key is not in the dictionary.

Default value: None

Returns

Type Description
any

The value for key if key is in the dictionary, else default.

items

items() -> ItemsView[str, Any]

Returns

Type Description

set-like object providing a view on D's items

keys

keys() -> KeysView[str]

Returns

Type Description

a set-like object providing a view on D's keys

pop

Remove the specified key and return the corresponding value.

pop(key: str, default: Any = <object object>) -> Any

Parameters

Name Description
key
Required
str

The key to pop.

default
any

The value to return if key is not in the dictionary.

Returns

Type Description
any

The value corresponding to the key.

Exceptions

Type Description
KeyError

If key is not found and default is not given.

popitem

Remove and return a (key, value) pair.

popitem() -> tuple[str, Any]

Returns

Type Description
tuple

The (key, value) pair.

Exceptions

Type Description
KeyError

If D is empty.

setdefault

Return the value for key if key is in the dictionary; otherwise set D[key] to default and return default.

setdefault(key: str, default: Any = <object object>) -> Any

Parameters

Name Description
key
Required
str

The key to look up.

default
any

The value to set if key is not in the dictionary.

Returns

Type Description
any

The value for key if key is in the dictionary, else default.

update

Update D from mapping/iterable E and F.

update(*args: Any, **kwargs: Any) -> None

Parameters

Name Description
args
any

Either a mapping object or an iterable of key-value pairs.

values

values() -> ValuesView[Any]

Returns

Type Description

an object providing a view on D's values
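The mapping-style methods above are described in the same terms as the built-in dict methods. The sketch below uses a plain dict as a stand-in to show those semantics; it assumes, based only on the descriptions on this page, that the model behaves the same way, and it does not call the SDK.

```python
# Plain dict standing in for an AzurePersonalVoice instance (assumption of this sketch).
voice = {"type": "azure-personal", "name": "my-voice", "model": "PhoenixLatestNeural"}

# get: a missing key yields the default (None) instead of raising.
style = voice.get("style")

# setdefault: stores the default when the key is missing, then returns it.
rate = voice.setdefault("rate", "medium")

# pop: removes the key and returns its value; a default avoids KeyError.
removed_rate = voice.pop("rate")
missing_pitch = voice.pop("pitch", None)

# update: accepts a mapping and/or keyword arguments.
voice.update({"style": "cheerful"}, volume="+10")

keys = sorted(voice.keys())
```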

Attributes

custom_lexicon_url

URL of a custom lexicon file for pronunciation customization.

custom_lexicon_url: str | None

custom_text_normalization_url

URL of a custom text normalization endpoint.

custom_text_normalization_url: str | None

locale

Enforced locale in BCP-47 format for TTS output. If set, TTS will always use the specified locale to speak. For example, setting locale to en-US forces American English accent for all text content, even if the text is in another language, and TTS will output silence for unsupported languages (e.g., Chinese text with en-US locale). If not set, TTS automatically detects the language from the text content.

locale: str | None

model

Underlying neural model to use for personal voice. Required. Known values are: "DragonLatestNeural", "PhoenixLatestNeural", "PhoenixV2Neural", "DragonHDOmniLatestNeural", and "MAI-Voice-1".

model: str | _models.PersonalVoiceModels

name

Required. The voice name; cannot be empty.

name: str

pitch

Pitch adjustment for the voice output. Follows the same rules as the pitch attribute of the SSML prosody element (see https://dotnet.territoriali.olinfo.it/azure/ai-services/speech-service/speech-synthesis-markup-voice#adjust-prosody). Typical values: a named level (x-low, low, medium, high, x-high, default), a relative change (e.g., +10%, -5%, +50Hz, -2st), or an absolute frequency (e.g., 200Hz).

pitch: str | None

prefer_locales

Preferred locales in BCP-47 format that change the accents of languages. If not set, TTS uses the default accent for each language (e.g., American English for English, Mexican Spanish for Spanish). Setting this to ["en-GB", "es-ES"] changes the English accent to British English and the Spanish accent to European Spanish, while TTS can still speak other languages like French or Chinese with their default accents.

prefer_locales: list[str] | None
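The accent selection described for prefer_locales can be modeled as a simple lookup. This is a toy illustration of the documented behavior, not the service's implementation: pick_accent and DEFAULT_ACCENTS are hypothetical names, and only the two default accents named in the description (American English, Mexican Spanish) are included.

```python
from typing import Optional

# Default accents named in the description; defaults for other languages are not listed there.
DEFAULT_ACCENTS = {"en": "en-US", "es": "es-MX"}


def pick_accent(language: str, prefer_locales: Optional[list[str]] = None) -> Optional[str]:
    """Return the preferred locale for a language, else the known default, else None."""
    language = language.lower()
    for locale in prefer_locales or []:
        # Compare the primary language subtag of the BCP-47 tag, e.g. "en" in "en-GB".
        if locale.split("-")[0].lower() == language:
            return locale
    return DEFAULT_ACCENTS.get(language)


british = pick_accent("en", ["en-GB", "es-ES"])   # preferred accent wins
mexican = pick_accent("es")                       # no preference, documented default
```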

rate

Speaking rate adjustment for the voice output. Follows the same rules as the rate attribute of the SSML prosody element (see https://dotnet.territoriali.olinfo.it/azure/ai-services/speech-service/speech-synthesis-markup-voice#adjust-prosody). Typical values: a named level (x-slow, slow, medium, fast, x-fast, default), a relative percentage (e.g., +20%, -10%), or a non-negative multiplier (e.g., 0.5, 1.5).

rate: str | None

style

Speaking style for the voice (e.g., 'cheerful', 'sad').

style: str | None

temperature

Temperature must be between 0.0 and 1.0.

temperature: float | None

type

Required. Azure personal voice.

type: AZURE_PERSONAL ('azure-personal')

volume

Volume adjustment for the voice output. Follows the same rules as the volume attribute of the SSML prosody element (see https://dotnet.territoriali.olinfo.it/azure/ai-services/speech-service/speech-synthesis-markup-voice#adjust-prosody). Typical values: a named level (silent, x-soft, soft, medium, loud, x-loud, default), an absolute number from 0.0 to 100.0, or a relative change (e.g., +10, -6dB).

volume: str | None
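Because pitch, rate, and volume are free-form strings, a quick client-side check can catch obvious typos before a request is made. The sketch below covers only the typical value shapes listed on this page; the helper names are hypothetical, the full SSML prosody grammar may accept more forms, and this is not the validation the service performs.

```python
import re

NUM = r"\d+(?:\.\d+)?"  # unsigned integer or decimal

PITCH = re.compile(rf"x-low|low|medium|high|x-high|default|[+-]{NUM}(?:%|Hz|st)|{NUM}Hz")
RATE = re.compile(rf"x-slow|slow|medium|fast|x-fast|default|[+-]{NUM}%|{NUM}")
VOLUME_LEVELS = {"silent", "x-soft", "soft", "medium", "loud", "x-loud", "default"}


def looks_like_pitch(value: str) -> bool:
    """Named level, relative change (+10%, +50Hz, -2st), or absolute frequency (200Hz)."""
    return PITCH.fullmatch(value) is not None


def looks_like_rate(value: str) -> bool:
    """Named level, relative percentage (+20%), or non-negative multiplier (1.5)."""
    return RATE.fullmatch(value) is not None


def looks_like_volume(value: str) -> bool:
    """Named level, relative change (+10, -6dB), or absolute number from 0.0 to 100.0."""
    if value in VOLUME_LEVELS:
        return True
    if re.fullmatch(rf"[+-]{NUM}(?:dB)?", value):
        return True
    if re.fullmatch(NUM, value):
        return 0.0 <= float(value) <= 100.0
    return False
```

For example, looks_like_volume("150") is rejected because an unsigned number is treated as an absolute volume, which the description limits to 0.0 through 100.0.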