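OpenAITools lets agents interact with OpenAI models to transcribe audio, generate images, and convert text to speech.
Prerequisites
Before using OpenAITools, make sure the openai library is installed and your OpenAI API key is configured.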
- Install the library:
pip install openai
- Set the API key: get your API key from OpenAI and set it as an environment variable.
export OPENAI_API_KEY="your-openai-api-key"
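Because the toolkit relies on the key being present in the environment, a small pre-flight check can fail fast with a clear message. The sketch below uses only the standard library; `require_api_key` is a hypothetical helper name for illustration, not part of Agno or the openai library.

```python
import os
from typing import Mapping, Optional


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return OPENAI_API_KEY from the environment or raise a clear error.

    Hypothetical helper for illustration; not part of Agno or openai.
    """
    # Default to the real process environment; a mapping can be passed for testing.
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before creating the agent."
        )
    return key
```

Calling `require_api_key()` once at startup, before constructing the Agent, surfaces a missing key immediately rather than at the first tool call.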
Initialization
Import OpenAITools and add it to your Agent's tool list.
from agno.agent import Agent
from agno.tools.openai import OpenAITools
agent = Agent(
    name="OpenAI Agent",
    tools=[OpenAITools()],
    show_tool_calls=True,
    markdown=True,
)
Usage Examples
1. Transcribing Audio
This example shows an Agent that transcribes an audio file.
from pathlib import Path

from agno.agent import Agent
from agno.tools.openai import OpenAITools
from agno.utils.media import download_file

# Download the sample recording to a local path the agent can read.
audio_url = "https://agno-public.s3.amazonaws.com/demo_data/sample_conversation.wav"
local_audio_path = Path("tmp/sample_conversation.wav")
# Create tmp/ up front; harmless if download_file already does this.
local_audio_path.parent.mkdir(parents=True, exist_ok=True)
download_file(audio_url, local_audio_path)

agent = Agent(
    name="OpenAI Transcription Agent",
    tools=[OpenAITools(transcription_model="whisper-1")],
    show_tool_calls=True,
    markdown=True,
)
agent.print_response(f"Transcribe the audio file located at '{local_audio_path}'")
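Before handing a local file to the agent, it can help to check that the file exists and looks like something the transcription endpoint accepts. The following is a minimal sketch under stated assumptions: the extension list and the 25 MB limit are taken from OpenAI's speech-to-text documentation at the time of writing and may change, and `check_audio_file` is a hypothetical helper, not part of Agno. It applies only to local paths, not to public URLs.

```python
from pathlib import Path
from typing import List

# Assumption: formats and size limit as listed in OpenAI's speech-to-text
# documentation at the time of writing; verify against the current docs.
SUPPORTED_AUDIO_SUFFIXES = {
    ".flac", ".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".ogg", ".wav", ".webm",
}
MAX_AUDIO_BYTES = 25 * 1024 * 1024


def check_audio_file(path: Path) -> List[str]:
    """Return a list of problems found; an empty list means the file looks OK."""
    problems = []
    if path.suffix.lower() not in SUPPORTED_AUDIO_SUFFIXES:
        problems.append(f"unsupported extension: {path.suffix or '(none)'}")
    if not path.is_file():
        problems.append(f"file not found: {path}")
    elif path.stat().st_size > MAX_AUDIO_BYTES:
        problems.append(f"file exceeds {MAX_AUDIO_BYTES} bytes")
    return problems
```

In the example above, `check_audio_file(local_audio_path)` could run right after the download, and the prompt sent only if the returned list is empty.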
2. Generating Images
This example shows an Agent that generates an image from a text prompt.
image_generation_agent.py
from agno.agent import Agent
from agno.tools.openai import OpenAITools
from agno.utils.media import save_base64_data
agent = Agent(
    name="OpenAI Image Generation Agent",
    tools=[OpenAITools(image_model="dall-e-3")],
    show_tool_calls=True,
    markdown=True,
)
response = agent.run("Generate a photorealistic image of a cozy coffee shop interior")
# Save the first generated image, if the run produced one.
if response.images:
    save_base64_data(response.images[0].content, "tmp/coffee_shop.png")
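The example passes the image content to save_base64_data. Assuming that value is a base64-encoded string, as the helper's name suggests, saving it amounts to decoding the string and writing the raw bytes. The sketch below illustrates that idea with the standard library only; `write_base64_file` is a hypothetical stand-in and is not Agno's implementation.

```python
import base64
from pathlib import Path


def write_base64_file(base64_data: str, output_path: str) -> Path:
    """Decode a base64 string and write the raw bytes to output_path.

    Illustrative stand-in, not Agno's save_base64_data implementation.
    """
    path = Path(output_path)
    # Create the parent directory (for example tmp/) if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(base64.b64decode(base64_data))
    return path
```

Returning the Path makes it easy to log or open the saved file afterwards.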
3. Generating Speech
This example shows an Agent that generates speech from text.
speech_synthesis_agent.py
from agno.agent import Agent
from agno.tools.openai import OpenAITools
from agno.utils.media import save_base64_data
agent = Agent(
    name="OpenAI Speech Agent",
    tools=[OpenAITools(
        text_to_speech_model="tts-1",
        text_to_speech_voice="alloy",
        text_to_speech_format="mp3"
    )],
    show_tool_calls=True,
    markdown=True,
)
agent.print_response("Generate audio for the text: 'Hello, this is a synthesized voice example.'")
response = agent.run_response
# Save the first generated audio clip, if the run produced one.
if response and response.audio:
    save_base64_data(response.audio[0].base64_audio, "tmp/hello.mp3")
Customization
You can customize the underlying OpenAI models used for transcription, image generation, and TTS:
OpenAITools(
    transcription_model="whisper-1",
    image_model="dall-e-3",
    text_to_speech_model="tts-1-hd",
    text_to_speech_voice="nova",
    text_to_speech_format="wav"
)
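When the speech format is changed (here from mp3 to wav), the file name used when saving the audio should change with it. The sketch below shows one way to derive the output path from the same format string passed as text_to_speech_format; `speech_output_path` is a hypothetical helper and the default directory "tmp" is an assumption borrowed from the examples.

```python
from pathlib import Path


def speech_output_path(stem: str, audio_format: str, directory: str = "tmp") -> Path:
    """Build an output path whose extension matches the chosen audio format.

    Hypothetical helper; audio_format is the same string you pass as
    text_to_speech_format (for example "mp3" or "wav").
    """
    # Normalize inputs such as ".MP3" or " wav " to a bare lowercase extension.
    fmt = audio_format.strip().lower().lstrip(".")
    if not fmt:
        raise ValueError("audio_format must not be empty")
    return Path(directory) / f"{stem}.{fmt}"
```

With the configuration shown here, `speech_output_path("hello", "wav")` yields tmp/hello.wav, so the saved file's extension matches its contents.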
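Toolkit Functions
The OpenAITools toolkit provides the following functions:
| Function | Description |
| --- | --- |
| transcribe_audio | Transcribes audio from a local file path or a public URL |
| generate_image | Generates an image from a text prompt |
| generate_speech | Synthesizes speech from text |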
Developer Resources