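Supervisor API (Beta)

Important

This feature is in Beta. Workspace admins can enable it from the Previews page. See Manage Azure Databricks previews.

The Supervisor API simplifies building custom agents on Azure Databricks and supports a background mode for long-running tasks. In a single request to an OpenResponses-compatible endpoint (POST ai-gateway/mlflow/v1/responses), you define the model, tools, and instructions, and Azure Databricks runs the agent loop for you: it repeatedly calls the model, selects and executes tools, and synthesizes a final response.

There are three ways to build a custom tool-calling agent on Azure Databricks:

Agent Bricks Supervisor Agent (recommended): Fully declarative, and optimized with human feedback for the highest quality.
Supervisor API: Build custom agents programmatically: choose the model at runtime, control which tools a request uses, or iterate during development. It is the right choice when you want to offload agent loop management to Azure Databricks but still need control over model selection.
AI Gateway unified or native APIs: Write your own agent loop. Azure Databricks provides only the LLM inference layer. Use the unified API wherever possible, for example when porting existing code to Azure Databricks. Use the provider-specific native APIs (/openai, /anthropic, /gemini) only when you need provider-specific features.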

Requirements

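AI governance with Unity AI Gateway enabled for your account. See Manage Azure Databricks previews.
- Because the Supervisor API runs through Unity AI Gateway, AI Gateway features such as inference tables, rate limits, and fallbacks apply. Usage tracking is not supported in this Beta.
Storing OpenTelemetry traces in Unity Catalog enabled for your account. See Manage Azure Databricks previews.
- This stores traces from the Supervisor API agent loop in Unity Catalog tables.
An Azure Databricks workspace in a supported region.
Unity Catalog enabled for your workspace. See Enable a workspace for Unity Catalog.
The tools you pass (Genie spaces, Unity Catalog functions, MCP servers, Knowledge Assistants, apps) must already be configured and accessible.
The databricks-openai package installed: pip install databricks-openai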

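Step 1: Create a single-turn LLM call

Start with a basic call without tools. The DatabricksOpenAI client automatically configures the base URL and authentication for your workspace: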

from databricks_openai import DatabricksOpenAI

client = DatabricksOpenAI(use_ai_gateway=True)

response = client.responses.create(
  model="databricks-claude-sonnet-4-5",
  input=[{"type": "message", "role": "user", "content": "Tell me about Databricks"}],
  stream=False
)

print(response.output_text)

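Step 2: Add hosted tools to run the agent loop

When you include tools in the request, Azure Databricks manages the multi-turn loop on your behalf: the model decides which tools to call, Azure Databricks executes them and feeds the results back to the model, and the cycle repeats until the model produces a final answer.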

response = client.responses.create(
  model="databricks-claude-sonnet-4-5",
  input=[{"type": "message", "role": "user", "content": "Summarize recent customer reviews and flag any urgent issues."}],
  tools=[
    {
      "type": "genie_space",
      "name": "Customer reviews",
      "description": "Answers customer review questions using SQL",
      "genie_space": {"space_id": "<genie-space-id>"}
    },
    {
      "type": "dashboard",
      "name": "Customer reviews dashboard",
      "description": "Answers questions about the customer reviews dashboard",
      "dashboard": {"dashboard_id": "<dashboard-id>"}
    },
    {
      "type": "uc_function",
      "name": "Flag urgent review",
      "description": "Flags a review as requiring urgent attention",
      "uc_function": {"name": "<catalog>.<schema>.<function_name>"}
    },
    {
      "type": "table",
      "table": {
        "name": "<catalog>.<schema>.<table_name>",
        "description": "Reads from the customer reviews table"
      }
    },
    {
      "type": "vector_search_index",
      "vector_search_index": {
        "name": "<catalog>.<schema>.<index_name>",
        "description": "Searches the product documentation index for relevant passages"
      }
    },
    {
      "type": "knowledge_assistant",
      "name": "Internal docs",
      "description": "Answers questions from internal documentation",
      "knowledge_assistant": {"knowledge_assistant_id": "<knowledge-assistant-id>"}
    },
    {
      "type": "serving_endpoint",
      "name": "Custom agent",
      "description": "Calls a custom agent served from a Databricks model serving endpoint",
      "serving_endpoint": {"name": "<serving-endpoint-name>"}
    },
    {
      "type": "vector_search_index",
      "name": "Product docs",
      "description": "Looks up product documentation by semantic search",
      "vector_search_index": {
        "name": "<catalog>.<schema>.<index>",
        "columns": ["title", "content"]
      }
    },
    {
      "type": "app",
      "name": "Support agent",
      "description": "Custom application endpoint",
      "app": {"name": "<app-name>"}
    },
    {
      "type": "uc_connection",
      "name": "GitHub",
      "description": "Searches GitHub for issues and pull requests",
      "uc_connection": {"name": "<uc-connection-name>"}
    },
    {
      "type": "web_search",
      "name": "Web search",
      "description": "Searches the public web for current information and returns a synthesized answer with citations",
      "web_search": {}
    },
    {
      "type": "volume",
      "volume": {
        "name": "<catalog>.<schema>.<volume>",
        "description": "Searches files in a Unity Catalog volume"
      }
    },
  ],
  stream=True
)

for event in response:
  print(event)
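
Printing raw events is useful for debugging, but most applications only need the generated text. The following is a minimal sketch, assuming the stream uses the OpenAI Responses event types (text chunks arriving as `response.output_text.delta` events with a `delta` string). The `collect_text` helper and the stand-in events are hypothetical and are not part of databricks-openai.

```python
from types import SimpleNamespace


def collect_text(events):
    """Concatenate text deltas from a stream of Responses events.

    Assumes OpenAI-style events: objects with a `type` attribute, where
    text chunks arrive as `response.output_text.delta` with a `delta` string.
    Other event types (tool calls, lifecycle events) are skipped.
    """
    chunks = []
    for event in events:
        if getattr(event, "type", None) == "response.output_text.delta":
            chunks.append(event.delta)
    return "".join(chunks)


# Stand-in events for illustration only. In practice, pass the iterator
# returned by client.responses.create(..., stream=True).
fake_events = [
    SimpleNamespace(type="response.created"),
    SimpleNamespace(type="response.output_text.delta", delta="Hello, "),
    SimpleNamespace(type="response.output_text.delta", delta="world"),
    SimpleNamespace(type="response.completed"),
]
print(collect_text(fake_events))
```

A stream can be consumed only once, so call a helper like this instead of the print loop rather than after it.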

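Step 3 (optional): Connect to third-party services with system-managed connections

Azure Databricks provides system-managed connections for popular third-party services such as Google Drive, GitHub, Atlassian, SharePoint, and Glean. These connections are a quick alternative to setting up your own external MCP server. You can still use the uc_connection tool type to connect to any external MCP server that you configure yourself.

System-managed connections require the Third-party connectors for agents Beta to be enabled in the workspace. See Manage Azure Databricks previews.

The following connectors are supported:

Connector	Description
`system_ai_agent_google_drive`	Search and read files from Google Drive.
`system_ai_agent_github_mcp`	Access GitHub repositories, issues, and pull requests.
`system_ai_agent_atlassian_mcp`	Search and manage Atlassian resources (Jira, Confluence).
`system_ai_agent_sharepoint`	Search and read files from SharePoint.
`system_ai_agent_glean_mcp`	Search enterprise content indexed by Glean.

Pass the connector in the tools array using the uc_connection tool type, and set the name field to the connector name: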

response = client.responses.create(
  model="databricks-claude-sonnet-4-5",
  input=[{"type": "message", "role": "user", "content": "List my open GitHub pull requests."}],
  tools=[
    {
      "type": "uc_connection",
      "uc_connection": {
        "name": "system_ai_agent_github_mcp"
      }
    }
  ],
)

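User-to-machine (U2M) authentication

Each user authenticates individually. OAuth tokens are not shared between users. On the first request that uses a connector, the user is not yet authenticated, so the response completes with status: "failed" and an error that contains an OAuth login URL: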

{
  "status": "failed",
  "error": {
    "code": "oauth",
    "message": "Failed request to <connector>. Please login first at <login-url>."
  }
}

Open the URL in a browser, complete the OAuth flow, then rerun the same request.
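
An application can detect this failure and surface the login link to the user. The following is a rough sketch, assuming the error message embeds the URL in the same way as the sample response above. The `extract_login_url` helper and the example.com URL are made up for illustration.

```python
import re


def extract_login_url(response):
    """Return the OAuth login URL from a failed response, or None.

    `response` is the response as a plain dict (for example, the result of
    response.model_dump()). Assumes the error message embeds the URL as in
    the sample: "... Please login first at <login-url>."
    """
    if response.get("status") != "failed":
        return None
    error = response.get("error") or {}
    if error.get("code") != "oauth":
        return None
    match = re.search(r"https?://\S+", error.get("message") or "")
    # Drop the sentence-ending period that follows the URL in the message.
    return match.group(0).rstrip(".") if match else None


# Placeholder payload shaped like the sample response.
sample = {
    "status": "failed",
    "error": {
        "code": "oauth",
        "message": "Failed request to github. Please login first at "
                   "https://example.com/oauth/login?state=abc.",
    },
}
print(extract_login_url(sample))
```

Because the message format is not a documented contract, treat a `None` result as "show the full error message to the user" rather than as "no login required".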

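Step 4 (optional): Add client-side function tools

Use function tools when you want your application to run custom logic alongside the tools hosted by Azure Databricks. Declare a function tool with type: "function", a name, an optional description, and a JSON Schema parameters object: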

response = client.responses.create(
  model="databricks-claude-sonnet-4-5",
  input=[{"type": "message", "role": "user", "content": "<user prompt>"}],
  tools=[
    {
      "type": "function",
      "name": "<client-side-function-name>",
      "description": "<description of what this function does>",
      "parameters": {
        "type": "object",
        "properties": {"<param-name>": {"type": "string"}},
        "required": ["<param-names>"],
        "additionalProperties": False,
      },
    }
  ],
)

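The Supervisor API doesn't store conversation state between requests, so a client-side function call takes two turns:

Turn 1. The model returns a function_call item (for example, "call get_weather with location=Paris") instead of a final answer.
Your code runs the function locally and produces a result.
Turn 2. Call responses.create() again, passing the original input, the model's function_call, and a new function_call_output that contains your result. The model uses the result to produce the final answer.

Client-side function tool example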

import json
from databricks_openai import DatabricksOpenAI

client = DatabricksOpenAI(use_ai_gateway=True)
MODEL = "databricks-claude-sonnet-4-5"

GET_WEATHER = {
    "type": "function",
    "name": "get_weather",
    "description": "Get the current weather for a location.",
    "parameters": {
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
        "additionalProperties": False,
    },
}

def run_get_weather(args):
    return json.dumps({
        "location": args["location"],
        "temp_c": 18,
        "condition": "sunny",
    })

CLIENT_TOOLS = {"get_weather": run_get_weather}
TOOLS = [GET_WEATHER]

input_list = [{"role": "user", "content": "What's the weather in Paris?"}]

# Turn 1 — model emits a function_call
resp = client.responses.create(model=MODEL, input=input_list, tools=TOOLS)

# Echo the model's turn into history, then execute pending client function_calls
input_list += [item.model_dump() for item in resp.output]
for item in resp.output:
    if item.type == "function_call" and item.name in CLIENT_TOOLS:
        args = json.loads(item.arguments)
        # Execute the client-side function with the model's arguments
        # and append the result so the model can use it on the next turn.
        tool_output = CLIENT_TOOLS[item.name](args)
        input_list.append({
            "type": "function_call_output",
            "call_id": item.call_id,
            "output": tool_output,
        })

# Turn 2 — model produces the final answer using the tool result
final = client.responses.create(model=MODEL, input=input_list, tools=TOOLS)
print(final.output_text)

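For more patterns (streaming, hosted and client-side tools, MCP approvals, troubleshooting), see the Supervisor API client-side function calling skill.

Step 5: Enable tracing

Pass trace_destination in the request body to send traces from the agent loop to Unity Catalog tables. Each request produces one trace that captures the full sequence of model calls and tool executions. If trace_destination is not set, no traces are written. For setup details, see Store OpenTelemetry traces in Unity Catalog.

With the databricks-openai Python client, pass it through extra_body: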

response = client.responses.create(
  model="databricks-claude-sonnet-4-5",
  input=[{"type": "message", "role": "user", "content": "Tell me about Databricks"}],
  tools=[...],
  extra_body={
    "trace_destination": {
      "catalog_name": "<catalog>",
      "schema_name": "<schema>",
      "table_prefix": "<table-prefix>"
    }
  }
)

To return the trace directly in the API response, pass "databricks_options": {"return_trace": True} in extra_body.
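
If several call sites need the same tracing options, a small builder keeps the extra_body shape in one place. This is only a sketch: the field names come from this page, but the `build_extra_body` helper is hypothetical, the catalog, schema, and prefix values are placeholders, and it assumes both options can be sent in the same request.

```python
def build_extra_body(catalog, schema, table_prefix, return_trace=False):
    """Build the extra_body dict that carries the tracing options.

    Field names (trace_destination, databricks_options, return_trace) are
    taken from this page; the helper itself is illustrative.
    """
    extra_body = {
        "trace_destination": {
            "catalog_name": catalog,
            "schema_name": schema,
            "table_prefix": table_prefix,
        }
    }
    if return_trace:
        # Assumption: return_trace may be combined with trace_destination.
        extra_body["databricks_options"] = {"return_trace": True}
    return extra_body


# Placeholder names; substitute your own catalog, schema, and prefix.
body = build_extra_body("main", "agents", "supervisor", return_trace=True)
print(body)
```

The result is passed as `extra_body=body` in the `client.responses.create()` call.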

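You can also use MLflow distributed tracing to combine traces from your application code and the Supervisor API agent loop into a single end-to-end trace. Use the extra_headers field to propagate the trace context headers: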

import mlflow
from mlflow.tracing import get_tracing_context_headers_for_http_request

with mlflow.start_span("client-root") as root_span:
  root_span.set_inputs({"input": "Tell me about Databricks"})

  trace_headers = get_tracing_context_headers_for_http_request()

  response = client.responses.create(
    model="databricks-claude-sonnet-4-5",
    input=[{"type": "message", "role": "user", "content": "Tell me about Databricks"}],
    tools=[...],
    extra_body={
      "trace_destination": {
        "catalog_name": "<catalog>",
        "schema_name": "<schema>",
        "table_prefix": "<table-prefix>"
      }
    },
    extra_headers=trace_headers,
  )

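Background mode

Use background mode to run long-running agent workflows that involve multiple tool calls and complex reasoning without waiting for them to complete synchronously. Submit the request with background=True, receive a response ID immediately, and poll for the result until it is ready. This is especially useful for agents that query multiple data sources or chain multiple tools together in a single request.

Create a background request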

response = client.responses.create(
  model="databricks-claude-sonnet-4-5",
  input=[{"type": "message", "role": "user", "content": "Tell me about Databricks"}],
  tools=[...],
  background=True,
)

print(response.id)     # Use this ID to poll for the result
print(response.status) # "queued" or "in_progress"

Poll for results

Use responses.retrieve() to check the status until it reaches a terminal state:

from time import sleep

while response.status in {"queued", "in_progress"}:
  sleep(2)
  response = client.responses.retrieve(response.id)

print(response.output_text)
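
The loop above polls indefinitely. A variant with a deadline may be safer in production code. The sketch below is one possible shape, not an official client feature: `wait_for_response` is a hypothetical helper, the default timeout simply mirrors the 30-minute background-mode limit listed under Limitations, and `fake_retrieve` stands in for `client.responses.retrieve`.

```python
import time
from types import SimpleNamespace

PENDING = {"queued", "in_progress"}


def wait_for_response(retrieve, response_id, timeout_s=1800, interval_s=2,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll until the response leaves the pending states or time runs out.

    `retrieve` is a callable such as client.responses.retrieve. `sleep` and
    `clock` are injectable so the loop can be exercised without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        response = retrieve(response_id)
        if response.status not in PENDING:
            return response
        if clock() >= deadline:
            raise TimeoutError(f"Response {response_id} is still {response.status}")
        sleep(interval_s)


# Stand-in for client.responses.retrieve that completes on the third poll.
_statuses = iter(["queued", "in_progress", "completed"])


def fake_retrieve(response_id):
    return SimpleNamespace(id=response_id, status=next(_statuses))


result = wait_for_response(fake_retrieve, "resp_123", sleep=lambda s: None)
print(result.status)
```

A terminal status is not necessarily a success (a response can finish with status "failed"), so check the status before reading output_text.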

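Background mode with MCP

For security, the Supervisor API requires explicit user approval before it executes any MCP tool call in background mode. When the agent loop selects an MCP tool, the response completes with an mcp_approval_request. You can review the tool name, the server label, and the arguments that the model wants to pass: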

{
  "type": "mcp_approval_request",
  "id": "<tool-call-id>",
  "arguments": "{\"query\": \"what is Databricks\", \"count\": 5}",
  "name": "you-search",
  "server_label": "<server-label>",
  "status": "completed"
}

To approve the tool call and continue the agent loop, send an mcp_approval_response in the input field together with the full conversation history:

{
  "type": "mcp_approval_response",
  "id": "<tool-call-id>",
  "approval_request_id": "<tool-call-id>",
  "approve": true
}
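
The approval round trip can be sketched as a helper that builds the next input list. This is illustrative only: `approve_mcp_requests` is a hypothetical function, the item shapes follow the two JSON examples above, and echoing the response's output items back into the input is an assumption based on the client-side function tool example in Step 4.

```python
def approve_mcp_requests(history, output_items, approve=True):
    """Return a new input list that answers every mcp_approval_request.

    `history` is the input list sent so far and `output_items` are the items
    from the completed response, both as plain dicts.
    """
    # Keep the full conversation: prior input plus the model's latest output.
    new_input = list(history) + list(output_items)
    for item in output_items:
        if item.get("type") == "mcp_approval_request":
            new_input.append({
                "type": "mcp_approval_response",
                "id": item["id"],
                "approval_request_id": item["id"],
                "approve": approve,
            })
    return new_input


# Placeholder conversation and approval request for illustration.
history = [{"role": "user", "content": "Search for Databricks"}]
output = [{
    "type": "mcp_approval_request",
    "id": "call_1",
    "arguments": '{"query": "what is Databricks", "count": 5}',
    "name": "you-search",
    "server_label": "search",
    "status": "completed",
}]
next_input = approve_mcp_requests(history, output)
print(next_input[-1])
```

The returned list is then sent as `input` in a new `responses.create()` call with the same model and tools. Review `name`, `server_label`, and `arguments` before approving; passing `approve=False` builds a rejection instead.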

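Note

Background mode responses are retained in the database for up to 30 days.

Supported tools

You define tools in the tools array of the request. Every tool object shares three top-level fields:

type (string, required): Discriminator that selects the tool type.
name (string, optional): The name shown to the model.
description (string, optional): Hint that tells the model when to call this tool.

In addition, each tool object carries a nested configuration object whose key matches the value of type. The following table documents the nested configuration for each supported tool type.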

Tool type	Example	Scope
`genie_space`	`{"type": "genie_space", "name": "Customer reviews", "genie_space": {"space_id": "<id>"}}`	`genie`
`dashboard`	`{"type": "dashboard", "name": "Sales dashboard", "dashboard": {"dashboard_id": "<id>"}}`	`dashboards`
`uc_function`	`{"type": "uc_function", "name": "Flag urgent review", "uc_function": {"name": "<catalog>.<schema>.<function>"}}`	`unity-catalog`
`table`	`{"type": "table", "name": "Customer reviews", "table": {"name": "<catalog>.<schema>.<table_name>"}}`	`unity-catalog`
`knowledge_assistant`	`{"type": "knowledge_assistant", "name": "Internal docs", "knowledge_assistant": {"knowledge_assistant_id": "<id>"}}`	`model-serving`
`serving_endpoint`	`{"type": "serving_endpoint", "name": "Custom agent", "serving_endpoint": {"name": "<endpoint-name>"}}`	`model-serving`
`web_search`	`{"type": "web_search", "name": "Web search", "web_search": {}}`	`model-serving`
`vector_search_index`	`{"type": "vector_search_index", "name": "Product docs", "vector_search_index": {"name": "<catalog>.<schema>.<index>", "columns": ["title", "content"]}}`	`vector-search`
`volume`	`{"type": "volume", "volume": {"name": "<catalog>.<schema>.<volume>", "description": "Searches files in a Unity Catalog volume"}}`	`unity-catalog`
`app`	`{"type": "app", "name": "Support agent", "app": {"name": "<app-name>"}}`	`apps`
`uc_connection`	`{"type": "uc_connection", "name": "GitHub", "uc_connection": {"name": "system_ai_agent_github_mcp"}}`	`unity-catalog`
`function`	`{"type": "function", "name": "get_weather", "description": "Get the current weather for a location.", "parameters": {"type": "object", "properties": {"location": {"type": "string"}}, "required": ["location"]}}`	None

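For serving_endpoint, only ResponseAgent, ChatCompletions, and ChatAgent endpoints are supported.

For app, only MCP apps (with the mcp- prefix) and custom ResponseAgent apps (with the agent- prefix) are supported.

For uc_connection, use the name of a connection created for an external MCP server, or a system_ai_agent_* system-managed connector (see Step 3 (optional): Connect to third-party services with system-managed connections). Custom MCP servers on apps are not supported.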
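
Because every hosted tool follows the same pattern (shared top-level fields plus a nested object keyed by the tool type), a small builder can reduce repetition. The `make_tool` helper below is a hypothetical convenience, not part of databricks-openai, and it does not validate the nested configuration.

```python
def make_tool(tool_type, config, name=None, description=None):
    """Build a hosted tool object.

    The nested configuration is stored under a key equal to the tool type,
    and the optional name and description are added only when provided.
    """
    tool = {"type": tool_type, tool_type: dict(config)}
    if name is not None:
        tool["name"] = name
    if description is not None:
        tool["description"] = description
    return tool


tools = [
    make_tool(
        "genie_space",
        {"space_id": "<genie-space-id>"},
        name="Customer reviews",
        description="Answers customer review questions using SQL",
    ),
    make_tool("web_search", {}, name="Web search"),
]
print(tools[0])
```

Client-side function tools do not fit this shape, because their parameters object sits at the top level, so declare them directly as shown in Step 4.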

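Code execution

When a request requires computation, the Supervisor runs model-generated code in a sandboxed serverless compute session to analyze data, transform files, or run calculations. It supports Python (the default), SQL, and shell commands. The Supervisor writes and runs the code itself as needed, so you don't enable, configure, or supply any code.

Code execution runs in a locked-down sandbox with:

No internet access. All outbound network traffic is blocked, regardless of workspace network policies, so code running in the sandbox cannot reach external endpoints.
Azure Databricks access only. The sandbox has no data access of its own. It can read Unity Catalog tables that are declared with the table tool in the same request.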

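Supported parameters

Every request to the Supervisor API accepts the following parameters.

model: One of the following supported models. Change this field to switch providers without changing the rest of your code.
- Claude Haiku 4.5 (databricks-claude-haiku-4-5)
- Claude Opus 4.1 (databricks-claude-opus-4-1)
- Claude Opus 4.5 (databricks-claude-opus-4-5)
- Claude Opus 4.6 (databricks-claude-opus-4-6)
- Claude Sonnet 4 (databricks-claude-sonnet-4)
- Claude Sonnet 4.5 (databricks-claude-sonnet-4-5)
- Claude Sonnet 4.6 (databricks-claude-sonnet-4-6)

input: The conversation messages to send.
tools: Hosted tool definitions (genie_space, dashboard, uc_function, table, knowledge_assistant, serving_endpoint, web_search, vector_search_index, volume, app, uc_connection) and client-side function tools (function). See Step 4 (optional): Add client-side function tools.
instructions: A system prompt that guides the Supervisor's behavior.
stream: Set to true to stream the response.
background: Set to true to run the request asynchronously. Returns a response ID for polling with responses.retrieve(). See Background mode.
trace_destination: Optional object with catalog_name, schema_name, and table_prefix fields. When set, the Supervisor API writes a trace of the full agent loop to the specified Unity Catalog tables. In the Python client, pass it through extra_body.

The API doesn't support inference parameters such as temperature. The server manages these internally.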
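
A pre-flight check can catch unsupported combinations before a request is sent. The following sketch covers only two rules stated on this page: inference parameters such as temperature are not accepted, and stream cannot be combined with background (see Limitations). The `check_request` helper is hypothetical. The `UNSUPPORTED_PARAMS` set lists only temperature because that is the only inference parameter this page names.

```python
# Only `temperature` is named on this page; extend the set if needed.
UNSUPPORTED_PARAMS = {"temperature"}


def check_request(params):
    """Return a list of problems found in a dict of request parameters."""
    problems = []
    for key in params:
        if key in UNSUPPORTED_PARAMS:
            problems.append(f"'{key}' is not supported; the server manages it")
    if params.get("stream") and params.get("background"):
        problems.append("'stream' and 'background' cannot be combined")
    return problems


request = {
    "model": "databricks-claude-sonnet-4-5",
    "input": [{"type": "message", "role": "user", "content": "Tell me about Databricks"}],
    "stream": True,
    "background": True,
    "temperature": 0.2,
}
for problem in check_request(request):
    print(problem)
```

When the list is empty, the same dict can be unpacked into the call as `client.responses.create(**request)`.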

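Authorization

The Supervisor API runs the agent loop with the caller's credentials, so the tools it calls respect the caller's Unity Catalog permissions. When you call the API directly, the DatabricksOpenAI client authenticates as you.

When you call the Supervisor API from an Azure Databricks app, tools can run with the identity of either the app's service principal (app authorization) or the user making the request (user authorization). For app authorization, grant the app's service principal permissions on each tool. For user authorization, forward the user's token to the DatabricksOpenAI client and add the required user authorization scopes. See Run tools as the requesting user.

Limitations

The Supervisor API has the following limitations:

Background mode runtime: Background mode requests can run for at most 30 minutes.
Streaming in background mode: stream and background cannot both be set in the same request.
Durable execution: Exactly-once execution guarantees for the agent loop are not supported, so the loop cannot automatically recover from failures or interruptions.
Web search workspace eligibility: The web_search tool is not available for workspaces with HIPAA/BAA compliance enabled. It is available only in regions that have a native model supporting web search or that have cross-geo processing enabled. Requests that include web_search from ineligible workspaces are rejected.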

Last updated on 2026-07-01
