OpenAI

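OpenAITools lets an agent interact with OpenAI models to transcribe audio, generate images, and convert text to speech.

Prerequisites

Before using OpenAITools, make sure the openai library is installed and your OpenAI API key is configured.

Install the library: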
```
pip install -U openai
```
Set the API key: obtain an API key from OpenAI and set it as an environment variable.
export OPENAI_API_KEY="your-openai-api-key"
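
Because the toolkit relies on this environment variable, it can help to fail fast when the variable is missing rather than hit an authentication error mid-run. The following is a minimal sketch using only the standard library; `require_api_key` is a hypothetical helper name for illustration, not part of agno or openai.

```python
import os
from typing import Mapping


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the OpenAI API key from the environment, or raise if it is unset.

    Hypothetical helper for illustration; not part of agno or openai.
    """
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before creating the agent."
        )
    return key
```

You could call `require_api_key()` once at startup, before constructing the Agent.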

Initialization

Import OpenAITools and add it to the Agent's list of tools.

from agno.agent import Agent
from agno.tools.openai import OpenAITools

agent = Agent(
    name="OpenAI Agent",
    tools=[OpenAITools()],
    show_tool_calls=True,
    markdown=True,
)

Usage Examples

1. Transcribe Audio

This example shows an Agent that transcribes an audio file.

transcription_agent.py

from pathlib import Path
from agno.agent import Agent
from agno.tools.openai import OpenAITools
from agno.utils.media import download_file

audio_url = "https://agno-public.s3.amazonaws.com/demo_data/sample_conversation.wav"

local_audio_path = Path("tmp/sample_conversation.wav")
download_file(audio_url, local_audio_path)

agent = Agent(
    name="OpenAI Transcription Agent",
    tools=[OpenAITools(transcription_model="whisper-1")],
    show_tool_calls=True,
    markdown=True,
)

agent.print_response(f"Transcribe the audio file located at '{local_audio_path}'")
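
Before handing a local path to the agent, a quick pre-flight check can catch a missing or unsuitable file early. The sketch below is an assumption-laden illustration: `check_audio_file` is a hypothetical helper, and the suffix list and 25 MB limit reflect OpenAI's published transcription limits as I understand them, so verify them against the current OpenAI documentation.

```python
from pathlib import Path

# Assumption: formats and size limit as published for OpenAI transcription;
# confirm against the current OpenAI docs before relying on them.
SUPPORTED_AUDIO_SUFFIXES = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}
MAX_AUDIO_BYTES = 25 * 1024 * 1024


def check_audio_file(path: Path) -> Path:
    """Return the path unchanged, or raise if the file is missing, unsupported, or too large."""
    if not path.is_file():
        raise FileNotFoundError(f"Audio file not found: {path}")
    if path.suffix.lower() not in SUPPORTED_AUDIO_SUFFIXES:
        raise ValueError(f"Unsupported audio format: {path.suffix}")
    if path.stat().st_size > MAX_AUDIO_BYTES:
        raise ValueError(f"Audio file exceeds {MAX_AUDIO_BYTES} bytes: {path}")
    return path
```

In the example above, this would be called as `check_audio_file(local_audio_path)` after the download completes.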

2. Generate an Image

This example shows an Agent that generates an image from a text prompt.

image_generation_agent.py

from agno.agent import Agent
from agno.tools.openai import OpenAITools
from agno.utils.media import save_base64_data

agent = Agent(
    name="OpenAI Image Generation Agent",
    tools=[OpenAITools(image_model="dall-e-3")],
    show_tool_calls=True,
    markdown=True,
)

response = agent.run("Generate a photorealistic image of a cozy coffee shop interior")

if response.images:
    save_base64_data(response.images[0].content, "tmp/coffee_shop.png")
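
To make the save step less opaque: assuming the image content is a base64-encoded string (as the name `save_base64_data` suggests), saving it amounts to decoding the string and writing the bytes to disk. The following standard-library sketch shows that idea; `write_base64_file` is a hypothetical name, and this is not the actual implementation of the agno helper.

```python
import base64
from pathlib import Path


def write_base64_file(data: str, output_path: str) -> Path:
    """Decode a base64 string and write the resulting bytes to output_path."""
    path = Path(output_path)
    # Create the parent directory (for example tmp/) if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(base64.b64decode(data))
    return path
```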

3. Generate Speech

This example shows an Agent that generates speech from text.

speech_synthesis_agent.py

from agno.agent import Agent
from agno.tools.openai import OpenAITools
from agno.utils.media import save_base64_data

agent = Agent(
    name="OpenAI Speech Agent",
    tools=[OpenAITools(
        text_to_speech_model="tts-1",
        text_to_speech_voice="alloy",
        text_to_speech_format="mp3"
    )],
    show_tool_calls=True,
    markdown=True,
)

agent.print_response("Generate audio for the text: 'Hello, this is a synthesized voice example.'")

response = agent.run_response
if response and response.audio:
    save_base64_data(response.audio[0].base64_audio, "tmp/hello.mp3")

See more examples here.

Customization

You can customize the underlying OpenAI models used for transcription, image generation, and TTS:

OpenAITools(
    transcription_model="whisper-1",
    image_model="dall-e-3",
    text_to_speech_model="tts-1-hd",
    text_to_speech_voice="nova",
    text_to_speech_format="wav"
)
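
If these settings come from user input or a config file, validating them before constructing the toolkit gives clearer errors. The sketch below is illustrative only: `tts_settings` is a hypothetical helper, and the voice and format sets are assumptions based on the options OpenAI has documented for its TTS models, so they may be incomplete for newer models.

```python
# Assumption: voices and formats documented for OpenAI TTS at the time of
# writing; newer models may accept additional values.
KNOWN_VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
KNOWN_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}


def tts_settings(model: str = "tts-1", voice: str = "alloy", fmt: str = "mp3") -> dict:
    """Validate TTS options and return them as keyword arguments for OpenAITools."""
    if voice not in KNOWN_VOICES:
        raise ValueError(f"Unknown voice: {voice!r}")
    if fmt not in KNOWN_FORMATS:
        raise ValueError(f"Unknown audio format: {fmt!r}")
    return {
        "text_to_speech_model": model,
        "text_to_speech_voice": voice,
        "text_to_speech_format": fmt,
    }
```

The returned dictionary uses the parameter names from the example above, so it can be unpacked as `OpenAITools(**tts_settings(model="tts-1-hd", voice="nova", fmt="wav"))`.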

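Toolkit Functions

The OpenAITools toolkit provides the following functions:

| Function | Description |
| --- | --- |
| `transcribe_audio` | Transcribes audio from a local file path or a public URL |
| `generate_image` | Generates an image from a text prompt |
| `generate_speech` | Synthesizes speech from text |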
