
Azure Custom Avatar Model Deployment through API

DARSHIL SHAH7 60 Reputation points
2026-04-07T08:33:32.55+00:00

I recently trained a Custom Avatar using Azure Speech Service (Speech Studio).
Once training was done, I deployed the model from the Speech Service portal itself.
Since my usage is low, I want to optimize costs by deploying the model only when it's in use.
I don't want to open the portal every time I wish to deploy a model or delete a deployment; rather, I'd like to do it through an API or another service provided by Azure.

I found API parameters for Custom Voice, but could not find any API parameters in the documentation for Custom Avatar, and would like a solution to this problem.

Please let me know how I can proceed.
Is there any API documentation available for this?
Is there any other service available for this?
Or is custom avatar model deployment/deletion only possible through the Azure Speech Studio portal for now?

Azure AI Speech

An Azure service that integrates speech processing into apps and services.


Answer accepted by question author
  1. SRILAKSHMI C 16,625 Reputation points Microsoft External Staff Moderator
    2026-04-07T14:17:47.3433333+00:00

    Hello DARSHIL SHAH7,

    Thank you for reaching out to Microsoft Q&A,

    What you’re trying to do makes perfect sense from a cost perspective—treat the avatar deployment like a resource you can spin up when needed and shut down when idle. The challenge is that Custom Avatar in Azure Speech doesn’t behave like other Azure resources yet.

    Right now, after you train and deploy your avatar in Azure Speech Studio, that deployment lifecycle (create / delete / start / stop) is only controllable through the portal UI. There isn’t a public REST API, SDK method, CLI command, or ARM/Bicep support available to manage it programmatically.

    This is where the confusion usually comes in. If you look at documentation under Foundry or OpenAI-style APIs, you’ll see endpoints like:

    • /openai/deployments/.../audio/speech

    Those are inference-only APIs. In simple terms, they let you use the avatar once it's already deployed; they don't let you deploy it, scale it, or delete it. So even though it looks like there's API coverage, it stops at runtime usage, not lifecycle management.
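    To make the distinction concrete, here is a minimal sketch of what a call to that inference endpoint looks like. The resource name, deployment name, and `api-version` below are placeholders, not values from this thread; check the current REST reference before using them.

```python
# Hypothetical sketch of calling the *inference* endpoint.
# Resource, deployment name, and api-version are placeholders.
import json

resource = "https://my-resource.openai.azure.com"  # placeholder resource
deployment = "my-tts-deployment"                   # placeholder deployment name
api_version = "2025-01-01-preview"                 # assumption; verify in the docs

# This request only *uses* an existing deployment; nothing here creates,
# scales, or deletes it -- which is exactly the gap described above.
url = f"{resource}/openai/deployments/{deployment}/audio/speech?api-version={api_version}"
payload = {"model": "tts", "input": "Hello from an already-deployed model.", "voice": "alloy"}

print("POST", url)
print(json.dumps(payload))
```

    Note that every path segment after `/deployments/{name}` is runtime usage; there is no documented sibling path for creating or deleting the avatar deployment itself.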

    Internally, Custom Avatar is still not exposed as a fully “resource-managed” service like Custom Voice or Azure AI Search. That’s why you see a mismatch: other Speech features have APIs for automation, but Avatar still relies on the portal for provisioning actions.
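    For contrast, Custom Voice does expose lifecycle-style REST endpoints (creating and deleting a deployment endpoint by ID). The sketch below only illustrates the shape of those calls; the region, GUID, and `api-version` are placeholders and should be verified against the Custom Voice API reference. Custom Avatar has no published equivalent of these URLs.

```python
# Hypothetical sketch of the *lifecycle* calls that exist for Custom Voice
# (create/delete a deployment endpoint) but have no Custom Avatar equivalent.
# Region, endpoint ID, and api-version are placeholders -- verify in the docs.

region = "eastus"                                     # placeholder region
endpoint_id = "9f50c644-2121-40e3-9f73-20499cc63573"  # placeholder GUID
api_version = "2024-02-01-preview"                    # assumption

base = f"https://{region}.api.cognitive.microsoft.com/customvoice"

# Create a Custom Voice endpoint (deployment):  PUT    /endpoints/{id}
create_url = f"{base}/endpoints/{endpoint_id}?api-version={api_version}"
# Tear it down when idle:                       DELETE /endpoints/{id}
delete_url = create_url  # same URL, different HTTP verb

print("PUT   ", create_url)
print("DELETE", delete_url)
```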

    So if you try to find a way to:

    • deploy avatar via API → not available
    • delete deployment via API → not available
    • automate via ARM/Terraform → not available

    you’ll keep hitting a dead end, because those capabilities simply haven’t been released publicly yet.

    If automation is absolutely required, there are only two realistic directions today.

    One option is to go through Microsoft and ask to be onboarded into the limited-access / managed-customer program. There are internal or private APIs for avatar lifecycle, but they’re not generally available. Customers with specific needs (like cost optimization or large-scale deployments) can sometimes get early access through a support request or account team.

    The other option some teams experiment with is UI automation: scripting the portal with tools like Playwright or Selenium. It works in a basic sense, but it's fragile and not something you'd want to rely on in production. Any UI change can break your flow, and it's not an officially supported approach.

    Because of these limitations, most teams take a slightly different approach: instead of trying to bring the deployment up and down, they optimize how and when the avatar is used. That means keeping the deployment active but tightly controlling when inference calls are made: avoiding long-running sessions, triggering avatar generation only when needed, and shutting down usage at the application level rather than the infrastructure level. It's not as ideal as true start/stop control, but it's the only stable option right now.
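    The application-level pattern above can be sketched as a small gate that opens an avatar session lazily on first use and reaps it after an idle timeout. `open_session` and `close_session` here are hypothetical stand-ins for whatever real avatar-session calls your app makes; the deployment itself stays up, only the usage is gated.

```python
# Minimal sketch of application-level gating: the avatar *deployment* stays
# deployed, but the app holds an avatar session only while there is recent
# demand, closing it after an idle timeout so no long-running session lingers.
import time

IDLE_TIMEOUT = 30.0  # seconds of inactivity before the session is dropped

class AvatarGate:
    def __init__(self, open_session, close_session):
        self._open = open_session       # hypothetical "start avatar session" call
        self._close = close_session     # hypothetical "end avatar session" call
        self._session = None
        self._last_used = 0.0

    def speak(self, text):
        # Lazily open a session only when a request actually arrives.
        if self._session is None:
            self._session = self._open()
        self._last_used = time.monotonic()
        return f"session {self._session}: {text}"  # stand-in for real synthesis

    def reap_if_idle(self):
        # Call periodically (e.g. from a timer) to shut down idle usage.
        if self._session is not None and time.monotonic() - self._last_used > IDLE_TIMEOUT:
            self._close(self._session)
            self._session = None

# Demo with fake session handles:
gate = AvatarGate(open_session=lambda: "s1", close_session=lambda s: None)
print(gate.speak("hello"))  # opens a session on first use
```

    The same idea extends to cost control at the request layer: batch avatar generations, cap session lengths, and never keep a synthesis connection open "just in case".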

    Please refer to these resources:

    • Troubleshoot & Guidance for Accessing Custom Neural Voice & Custom Avatar https://dotnet.territoriali.olinfo.it/azure/ai-services/cognitive-services-limited-access#what-is-limited-access

    • Create & Deploy Custom Video Avatar (Foundry) https://dotnet.territoriali.olinfo.it/azure/ai-services/speech-service/text-to-speech-avatar/custom-avatar-create?pivots=ai-foundry-portal

    • Create & Deploy Custom Avatar (Speech Studio) https://dotnet.territoriali.olinfo.it/azure/ai-services/speech-service/text-to-speech-avatar/custom-avatar-create?wt.mc_id=knowledgesearch_inproduct_azure-cxp-community-insider#step-5-deploy-and-use-your-avatar-model

    • Azure OpenAI in Microsoft Foundry Models REST API https://dotnet.territoriali.olinfo.it/azure/foundry/openai/reference-preview#speech---create

    I hope this helps. Do let me know if you have any further queries.

    Thank you!

    1 person found this answer helpful.

0 additional answers
