Prompt Caching - Agno

Prompt caching can reduce both processing time and cost. Consider it whenever the same prompt is sent multiple times within a flow.

You can read more about prompt caching with Anthropic models in Anthropic's documentation.

Usage

To use prompt caching in your Agno setup, pass the cache_system_prompt argument when initializing the Claude model:

from agno.agent import Agent
from agno.models.anthropic import Claude

agent = Agent(
    model=Claude(
        id="claude-3-5-sonnet-20241022",
        cache_system_prompt=True,
    ),
)

Note that a prompt must reach a minimum length before it is eligible for caching. Anthropic's documentation lists the exact requirements.
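As a rough pre-flight check for the length requirement above, the sketch below estimates whether a system prompt is long enough to be cached. Everything in it is an assumption for illustration: the helper names are hypothetical, the thresholds (1024 tokens for Sonnet and Opus models, 2048 for Haiku models) may differ by model version, and the four-characters-per-token ratio is only a crude heuristic for English text. Check Anthropic's documentation for the authoritative values.

```python
# Hypothetical helper: not part of Agno or the Anthropic SDK.
# ASSUMPTION: minimum cacheable prompt lengths per model family, in tokens.
MIN_CACHEABLE_TOKENS = {
    "sonnet": 1024,
    "opus": 1024,
    "haiku": 2048,
}

# ASSUMPTION: crude average for English text; real tokenization differs.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN


def likely_cacheable(prompt: str, model_id: str) -> bool:
    """Guess whether `prompt` meets the minimum cacheable length for `model_id`."""
    for family, minimum in MIN_CACHEABLE_TOKENS.items():
        if family in model_id:
            return estimate_tokens(prompt) >= minimum
    # Unknown model family: be conservative and use the largest threshold.
    return estimate_tokens(prompt) >= max(MIN_CACHEABLE_TOKENS.values())


if __name__ == "__main__":
    short_prompt = "You are a helpful assistant."
    print(likely_cacheable(short_prompt, "claude-3-5-sonnet-20241022"))
```

A prompt that fails this check is still sent and answered normally; it is just unlikely to produce cache writes or reads.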

Extended cache

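You can also use Anthropic's extended cache beta feature, which extends the cache lifetime from 5 minutes to 1 hour. To enable it, pass the extended_cache_time argument together with the following beta header: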

from agno.agent import Agent
from agno.models.anthropic import Claude

agent = Agent(
    model=Claude(
        id="claude-3-5-sonnet-20241022",
        default_headers={"anthropic-beta": "extended-cache-ttl-2025-04-11"},
        cache_system_prompt=True,
        extended_cache_time=True,
    ),
)
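For orientation, the sketch below shows the shape of a cacheable system block as accepted by Anthropic's Messages API, where a `cache_control` entry marks the text for caching and an optional `ttl` of `"1h"` requests the extended lifetime. The helper name `build_system_blocks` is hypothetical, and it is an assumption that Agno produces an equivalent payload from the two flags above; this is not Agno's actual implementation.

```python
from typing import Any


def build_system_blocks(text: str, extended: bool = False) -> list[dict[str, Any]]:
    """Build a system prompt as a list of content blocks with caching enabled.

    Hypothetical helper for illustration only.
    """
    cache_control: dict[str, str] = {"type": "ephemeral"}
    if extended:
        # Request the 1-hour lifetime instead of the default 5 minutes.
        cache_control["ttl"] = "1h"
    return [{"type": "text", "text": text, "cache_control": cache_control}]


if __name__ == "__main__":
    print(build_system_blocks("You are a helpful assistant.", extended=True))
```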

Working example

cookbook/models/anthropic/prompt_caching_extended.py

from pathlib import Path

from agno.agent import Agent
from agno.models.anthropic import Claude
from agno.utils.media import download_file

# Load an example large system message from S3. A large prompt like this benefits from caching.
txt_path = Path(__file__).parent.joinpath("system_promt.txt")
download_file(
    "https://agno-public.s3.amazonaws.com/prompts/system_promt.txt",
    str(txt_path),
)
system_message = txt_path.read_text()

agent = Agent(
    model=Claude(
        id="claude-sonnet-4-20250514",
        default_headers={"anthropic-beta": "extended-cache-ttl-2025-04-11"},  # Set the beta header to use the extended cache time
        system_prompt=system_message,
        cache_system_prompt=True,  # Enable Anthropic prompt caching for the system prompt
        extended_cache_time=True,  # Extend the cache time from the default 5 minutes to 1 hour
    ),
    system_message=system_message,
    markdown=True,
)


# First run: this creates the cache
response = agent.run(
    "Explain the difference between REST and GraphQL APIs with examples"
)
print(f"First run cache write tokens = {response.metrics['cache_write_tokens']}")  # type: ignore

# Second run: this uses the cached system prompt
response = agent.run(
    "What are the key principles of clean code and how do I apply them in Python?"
)
print(f"Second run cache read tokens = {response.metrics['cached_tokens']}")  # type: ignore
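To put the cache write and cache read token counts in context, the sketch below estimates how the input cost of a cached prompt prefix compares with sending it uncached on every call. The multipliers are assumptions based on Anthropic's published pricing model (a cache write costs about 1.25x the base input price for the 5-minute cache and 2x for the 1-hour cache, and a cache read costs about 0.1x). The function name is hypothetical, and the model assumes every call after the first lands within the cache lifetime.

```python
# ASSUMPTION: price multipliers relative to the base input token price.
WRITE_MULTIPLIER_5M = 1.25
WRITE_MULTIPLIER_1H = 2.0
READ_MULTIPLIER = 0.1


def relative_cached_cost(n_calls: int, extended: bool = False) -> float:
    """Return cached cost divided by uncached cost for the shared prompt prefix.

    The first call writes the cache; the remaining calls read from it.
    A value below 1.0 means caching is cheaper for that number of calls.
    """
    if n_calls < 1:
        raise ValueError("n_calls must be at least 1")
    write = WRITE_MULTIPLIER_1H if extended else WRITE_MULTIPLIER_5M
    cached = write + READ_MULTIPLIER * (n_calls - 1)
    uncached = float(n_calls)
    return cached / uncached


if __name__ == "__main__":
    for calls in (1, 2, 3, 10):
        ratio_5m = relative_cached_cost(calls)
        ratio_1h = relative_cached_cost(calls, extended=True)
        print(f"{calls} calls: 5m ratio {ratio_5m:.3f}, 1h ratio {ratio_1h:.3f}")
```

Under these assumed multipliers, the 5-minute cache pays for itself from the second call and the 1-hour cache from the third, so the extended cache makes most sense when calls are spaced more than 5 minutes apart.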
