MLX Transcribe

MLX Transcribe is a toolkit that transcribes audio files using MLX Whisper.

Prerequisites

Install ffmpeg
- macOS: brew install ffmpeg
- Ubuntu: sudo apt-get install ffmpeg
- Windows: download from https://ffmpeg.org/download.html
Install the mlx-whisper library
```
pip install mlx-whisper
```
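Prepare audio files
- Create a `storage/audio` directory
- Place your audio files in this directory
- Supported formats: mp3, mp4, wav, etc.
Download sample audio (optional)
- Go to audio-samples (as an example) and save an audio file to the `storage/audio` directory.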

Example

The following agent uses MLX Transcribe to transcribe audio files.

cookbook/tools/mlx_transcribe_tools.py


from pathlib import Path
from agno.agent import Agent
from agno.models.openai import OpenAIChat
from agno.tools.mlx_transcribe import MLXTranscribeTools

# Get audio files from the storage/audio directory
agno_root_dir = Path(__file__).parent.parent.parent.resolve()
audio_storage_dir = agno_root_dir.joinpath("storage/audio")
if not audio_storage_dir.exists():
    audio_storage_dir.mkdir(exist_ok=True, parents=True)

agent = Agent(
    name="Transcription Agent",
    model=OpenAIChat(id="gpt-4o"),
    tools=[MLXTranscribeTools(base_dir=audio_storage_dir)],
    instructions=[
        "To transcribe an audio file, use the `transcribe` tool with the name of the audio file as the argument.",
        "You can find all available audio files using the `read_files` tool.",
    ],
    markdown=True,
)

agent.print_response("Summarize the Reid Hoffman TED talk and split it into sections", stream=True)

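Toolkit Params

Parameter	Type	Default	Description
`base_dir`	`Path`	`Path.cwd()`	Base directory containing the audio files
`read_files_in_base_dir`	`bool`	`True`	Whether to register the `read_files` function
`path_or_hf_repo`	`str`	`"mlx-community/whisper-large-v3-turbo"`	Local path or Hugging Face repo of the model
`verbose`	`bool`	`None`	Enables verbose output
`temperature`	`float` or `Tuple[float, ...]`	`None`	Sampling temperature
`compression_ratio_threshold`	`float`	`None`	Compression ratio threshold
`logprob_threshold`	`float`	`None`	Log probability threshold
`no_speech_threshold`	`float`	`None`	No-speech threshold
`condition_on_previous_text`	`bool`	`None`	Whether to condition on previously decoded text
`initial_prompt`	`str`	`None`	Initial prompt for the transcription
`word_timestamps`	`bool`	`None`	Enables word-level timestamps
`prepend_punctuations`	`str`	`None`	Punctuation marks to attach to the following word
`append_punctuations`	`str`	`None`	Punctuation marks to attach to the preceding word
`clip_timestamps`	`str` or `List[float]`	`None`	Clip timestamps
`hallucination_silence_threshold`	`float`	`None`	Hallucination silence threshold
`decode_options`	`dict`	`None`	Additional decoding options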
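
Most parameters in the table default to `None`. One way to work with that convention is to collect only the options you set explicitly and leave the rest at their defaults. The sketch below illustrates the idea; `build_transcribe_options` is a hypothetical helper, not a toolkit function, and it covers only four of the parameters listed above.

```python
from typing import Any, Dict, Optional, Tuple, Union


def build_transcribe_options(
    temperature: Optional[Union[float, Tuple[float, ...]]] = None,
    no_speech_threshold: Optional[float] = None,
    initial_prompt: Optional[str] = None,
    word_timestamps: Optional[bool] = None,
) -> Dict[str, Any]:
    """Return only the options that were set explicitly (hypothetical helper)."""
    candidates = {
        "temperature": temperature,
        "no_speech_threshold": no_speech_threshold,
        "initial_prompt": initial_prompt,
        "word_timestamps": word_timestamps,
    }
    # Drop None values so unset options fall back to their defaults.
    # False and 0.0 are kept because they are explicit choices.
    return {name: value for name, value in candidates.items() if value is not None}


options = build_transcribe_options(
    word_timestamps=True,
    initial_prompt="A TED talk about technology.",
)
print(options)
```

Because the table lists these names as toolkit parameters, the resulting dictionary could presumably be unpacked into the constructor, for example `MLXTranscribeTools(base_dir=audio_storage_dir, **options)`.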

Toolkit Functions

Function	Description
`transcribe`	Transcribes an audio file using MLX Whisper
`read_files`	Lists all audio files in the base directory

Developer Resources

View Tools
View Cookbook
