Cartesia

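CartesiaTools enables an Agent to perform text-to-speech, list available voices, and localize voices using Cartesia.

Prerequisites

The following examples require the cartesia library and an API key.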

pip install cartesia

export CARTESIA_API_KEY="your_api_key_here"
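Before running the examples, it can help to confirm that the key is actually visible to the Python process. The following is a minimal sketch using only the standard library; `require_cartesia_api_key` is a hypothetical helper name introduced here for illustration and is not part of agno or cartesia.

```python
import os
from typing import Mapping


def require_cartesia_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the Cartesia API key from the environment, or fail early.

    Hypothetical helper: it only checks that CARTESIA_API_KEY is set and
    non-empty; it does not validate the key against the Cartesia service.
    """
    key = env.get("CARTESIA_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "CARTESIA_API_KEY is not set; export it before running the examples."
        )
    return key
```

Calling `require_cartesia_api_key()` at the top of a script surfaces a missing key as a clear error before any agent is constructed.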

Example

from agno.agent import Agent
from agno.tools.cartesia import CartesiaTools
from agno.utils.audio import write_audio_to_file

agent = Agent(
    name="Cartesia TTS Agent",
    description="An agent that uses Cartesia for text-to-speech",
    tools=[CartesiaTools()],
    show_tool_calls=True,
)

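# Ask the agent to synthesize a short greeting with the text-to-speech tool.
response = agent.run(
    "Generate a simple greeting using Text-to-Speech: Say \"Welcome to Cartesia, the advanced speech synthesis platform. This speech is generated by an agent.\""
)
# Save the first returned audio clip, if the run produced any audio.
if response.audio:
    write_audio_to_file(
        response.audio[0].base64_audio,
        filename="greeting.mp3",
    )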
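The example passes a `base64_audio` value to `write_audio_to_file`. As a rough illustration of what saving such a payload involves, here is a standard-library sketch. It assumes the payload is a plain base64-encoded string of audio bytes; `save_base64_audio` is a hypothetical stand-in written for this page, not the agno helper itself.

```python
import base64
from pathlib import Path


def save_base64_audio(b64_audio: str, filename: str) -> int:
    """Decode a base64 string and write the raw bytes to `filename`.

    Hypothetical illustration only. Returns the number of bytes written.
    """
    audio_bytes = base64.b64decode(b64_audio)
    path = Path(filename)
    # Create the target directory if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    return path.write_bytes(audio_bytes)
```

In the example above, the equivalent call would be `save_base64_audio(response.audio[0].base64_audio, "greeting.mp3")`, under the same assumption about the payload format.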

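Advanced example: translation and voice localization

This example shows how to use CartesiaTools to translate text, analyze its emotion, localize a new voice, and generate a voice note.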

from textwrap import dedent
from agno.agent import Agent
from agno.models.openai import OpenAIChat
from agno.tools.cartesia import CartesiaTools
from agno.utils.audio import write_audio_to_file

agent_instructions = dedent(
    """Follow these steps SEQUENTIALLY to translate text and generate a localized voice note:
    1. Identify the text to translate and the target language from the user request.
    2. Translate the text accurately to the target language.
    3. Analyze the emotion conveyed by the translated text.
    4. Call `list_voices` to retrieve available voices.
    5. Select a base voice matching the language and emotion.
    6. Call `localize_voice` to create a new localized voice.
    7. Call `text_to_speech` to generate the final audio.
    """
)

agent = Agent(
    name="Emotion-Aware Translator Agent",
    description="Translates text, analyzes emotion, selects a suitable voice, creates a localized voice, and generates a voice note (audio file) using Cartesia TTS tools.",
    instructions=agent_instructions,
    model=OpenAIChat(id="gpt-4o"),
    tools=[CartesiaTools(voice_localize_enabled=True)],
    show_tool_calls=True,
)

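agent.print_response(
    "Translate 'Hello! How are you? Tell me more about the weather in Paris?' to French and create a voice note."
)
# Retrieve the response from the run that print_response just completed.
response = agent.run_response

# Save the generated voice note, if the run produced any audio.
if response.audio:
    write_audio_to_file(
        response.audio[0].base64_audio,
        filename="french_weather.mp3",
    )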

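Toolkit Params

Parameter	Type	Default	Description
`api_key`	`str`	`None`	The Cartesia API key used for authentication. If not provided, the `CARTESIA_API_KEY` environment variable is used.
`model_id`	`str`	`sonic-2`	The model ID to use for text-to-speech.
`default_voice_id`	`str`	`78ab82d5-25be-4f7d-82b3-7ad64e5b85b2`	The default voice ID to use for text-to-speech and voice localization.
`text_to_speech_enabled`	`bool`	`True`	Enables the text-to-speech function.
`list_voices_enabled`	`bool`	`True`	Enables the function that lists available voices.
`voice_localize_enabled`	`bool`	`False`	Enables the voice localization function.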

Toolkit Functions

Function	Description
`list_voices`	Lists the voices available from Cartesia.
`text_to_speech`	Converts text to speech.
`localize_voice`	Creates a new localized voice.
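The three `*_enabled` parameters appear to correspond one-to-one with the three functions listed above. The sketch below models that mapping in plain Python, using the defaults from the parameter table. It is an illustration of the documented behavior, not the toolkit's actual implementation, and `enabled_functions` is a hypothetical name.

```python
from typing import List


def enabled_functions(
    text_to_speech_enabled: bool = True,
    list_voices_enabled: bool = True,
    voice_localize_enabled: bool = False,
) -> List[str]:
    """Return the function names an agent would see for the given flags.

    Hypothetical model of the flag-to-function mapping described in the
    tables; the defaults mirror the documented parameter defaults.
    """
    flags = [
        ("list_voices", list_voices_enabled),
        ("text_to_speech", text_to_speech_enabled),
        ("localize_voice", voice_localize_enabled),
    ]
    return [name for name, is_on in flags if is_on]


# With the defaults, localization is off, which is why the advanced
# example passes voice_localize_enabled=True explicitly.
print(enabled_functions())
print(enabled_functions(voice_localize_enabled=True))
```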

Developer Resources

View Tools
View Cookbook
