Azure Cognitive Services Speech SDK Samples: Multi-Platform Speech Recognition and Synthesis

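This article introduces the official sample repository for the Azure Cognitive Services Speech SDK, a collection of resources spanning multiple platforms and programming languages that helps developers quickly integrate speech recognition, synthesis, and translation.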

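Project Overview

Azure Cognitive Services Speech SDK Samples is Microsoft's official sample code repository. At the time of writing it has 3.4k stars on GitHub. C# makes up the largest share of the code, and the repository also includes C++, Java, JavaScript, Python, Swift, and Objective-C. The samples show how to add speech features to an application with the Microsoft Cognitive Services Speech SDK.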

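Core Features

Speech recognition (Speech-to-Text): real-time recognition from a microphone or an audio file
Speech synthesis (Text-to-Speech): converts text into natural-sounding speech
Speech translation (Speech Translation): translates speech into a target language in real time
Voice assistants (Voice Assistant): builds conversational voice assistants with DialogServiceConnector
Batch transcription: batch speech transcription and synthesis through the REST API
Device enumeration tool: retrieves microphone and speaker device IDs
Custom Speech data: training data for custom speech models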
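To make the speech translation feature more concrete, here is a minimal Python sketch of a single translation pass from the default microphone. It assumes the `azure-cognitiveservices-speech` package is installed; the function names `translate_once` and `format_translations`, the source language `en-US`, and the target languages `de` and `fr` are illustrative choices for this sketch and do not come from the repository.

```python
from typing import Dict, List, Mapping, Sequence


def format_translations(translations: Mapping[str, str]) -> List[str]:
    """Render {language: text} pairs as 'lang: text' lines, sorted by language."""
    return [f"{lang}: {text}" for lang, text in sorted(translations.items())]


def translate_once(
    key: str,
    region: str,
    source: str = "en-US",               # assumed source language
    targets: Sequence[str] = ("de", "fr"),  # assumed target languages
) -> Dict[str, str]:
    """Translate one utterance from the default microphone.

    Returns a {target_language: translated_text} dict, or an empty dict
    when nothing was translated.
    """
    # Imported lazily so the formatting helper works without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    config = speechsdk.translation.SpeechTranslationConfig(
        subscription=key, region=region
    )
    config.speech_recognition_language = source
    for lang in targets:
        config.add_target_language(lang)

    recognizer = speechsdk.translation.TranslationRecognizer(
        translation_config=config
    )
    result = recognizer.recognize_once()
    if result.reason == speechsdk.ResultReason.TranslatedSpeech:
        return dict(result.translations)
    return {}
```

A caller might run `print("\n".join(format_translations(translate_once(key, region))))` to list each translation on its own line.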

Tech Stack

Primary language: C# (39.3%)
Other languages: C++, Java, JavaScript/Node.js, Python, Objective-C, Swift
Supported platforms: Windows, Linux, macOS, Android, iOS, Web
Cloud service: Azure Cognitive Services

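Supported Languages and Platforms

The samples cover nearly every mainstream combination of development platform and language:

Language	Supported platforms
C++	Windows, Linux, macOS
C#	Windows (.NET/UWP), Linux, macOS (.NET Core)
Java	Android, Windows, Linux, macOS (JRE)
JavaScript	Web browsers
Node.js	Node.js
Python	Windows, Linux, macOS
Objective-C	iOS, macOS
Swift	iOS, macOS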

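Quick Start

Prerequisites

Obtain an Azure Cognitive Services subscription key (a free trial is available)
Install the Speech SDK for your platform
Clone or download the sample code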
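Once you have a key, one common pattern is to keep it out of source code and read it from the environment. The sketch below illustrates that idea for Python, where the SDK is typically installed with `pip install azure-cognitiveservices-speech`. The environment variable names `SPEECH_KEY` and `SPEECH_REGION` and the helper names are assumptions made for this example, not something the repository mandates.

```python
import os
from typing import Mapping, NamedTuple, Optional


class SpeechSettings(NamedTuple):
    key: str
    region: str


def load_speech_settings(env: Optional[Mapping[str, str]] = None) -> SpeechSettings:
    """Read the subscription key and region from environment variables.

    SPEECH_KEY and SPEECH_REGION are names chosen for this sketch.
    """
    env = os.environ if env is None else env
    key = env.get("SPEECH_KEY", "").strip()
    region = env.get("SPEECH_REGION", "").strip()
    missing = [
        name
        for name, value in (("SPEECH_KEY", key), ("SPEECH_REGION", region))
        if not value
    ]
    if missing:
        raise RuntimeError("Missing environment variable(s): " + ", ".join(missing))
    return SpeechSettings(key=key, region=region)


def make_speech_config(settings: SpeechSettings):
    """Build a SpeechConfig from the loaded settings (requires the Speech SDK)."""
    # Imported lazily so load_speech_settings works without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    return speechsdk.SpeechConfig(subscription=settings.key, region=settings.region)
```

With both variables exported in the shell, `make_speech_config(load_speech_settings())` yields a configuration object that can replace the hard-coded placeholder strings in the samples.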

Get the Sample Code

bash

# Clone the repository
git clone https://github.com/Azure-Samples/cognitive-services-speech-sdk.git

# Or download the ZIP archive directly
# https://github.com/Azure-Samples/cognitive-services-speech-sdk/archive/master.zip

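Speech Recognition Example (Python)

python

import azure.cognitiveservices.speech as speechsdk

# Configure the subscription key and region
speech_config = speechsdk.SpeechConfig(
    subscription="YourSubscriptionKey",
    region="YourServiceRegion"
)

# Recognize speech from the default microphone
speech_recognizer = speechsdk.SpeechRecognizer(
    speech_config=speech_config
)

print("Speak now...")
# recognize_once() blocks until a single utterance has been processed
result = speech_recognizer.recognize_once()

if result.reason == speechsdk.ResultReason.RecognizedSpeech:
    print(f"Recognized: {result.text}")
elif result.reason == speechsdk.ResultReason.NoMatch:
    print("No speech could be recognized")
elif result.reason == speechsdk.ResultReason.Canceled:
    cancellation = result.cancellation_details
    print(f"Recognition canceled: {cancellation.reason}")
    if cancellation.reason == speechsdk.CancellationReason.Error:
        print(f"Error details: {cancellation.error_details}")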
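The feature list also mentions recognition from audio files. A hedged sketch of that variant follows: it passes a WAV file to the recognizer through `speechsdk.audio.AudioConfig(filename=...)` and first checks the file format with the standard `wave` module. The check assumes the SDK's default WAV input expectation of mono, 16-bit PCM at 8 kHz or 16 kHz; the helper names `wav_format`, `is_default_input_format`, and `recognize_wav` are made up for this example.

```python
import wave
from typing import Tuple


def wav_format(path: str) -> Tuple[int, int, int]:
    """Return (channels, bits_per_sample, sample_rate_hz) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        return wav.getnchannels(), wav.getsampwidth() * 8, wav.getframerate()


def is_default_input_format(path: str) -> bool:
    """Check for mono, 16-bit PCM at 8 kHz or 16 kHz (assumed SDK default)."""
    channels, bits, rate = wav_format(path)
    return channels == 1 and bits == 16 and rate in (8000, 16000)


def recognize_wav(path: str, key: str, region: str) -> str:
    """Recognize a single utterance from a WAV file and return its text."""
    # Imported lazily so the format helpers work without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    if not is_default_input_format(path):
        raise ValueError(f"Unexpected WAV format: {wav_format(path)}")

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    audio_config = speechsdk.audio.AudioConfig(filename=path)
    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config, audio_config=audio_config
    )

    result = recognizer.recognize_once()
    if result.reason == speechsdk.ResultReason.RecognizedSpeech:
        return result.text
    if result.reason == speechsdk.ResultReason.Canceled:
        details = result.cancellation_details
        raise RuntimeError(f"Canceled: {details.reason} {details.error_details}")
    return ""  # NoMatch: no recognizable speech in the audio
```

Like the microphone example, `recognize_once()` handles only one utterance, so this sketch suits short clips; longer recordings would need continuous recognition instead.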

Speech Synthesis Example (C#)

csharp

using System;
using Microsoft.CognitiveServices.Speech;

// Configure the subscription key and region
var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
using var synthesizer = new SpeechSynthesizer(config);

// Synthesize the text and play it on the default speaker
var result = await synthesizer.SpeakTextAsync("Hello, this is Azure Speech SDK");

if (result.Reason == ResultReason.SynthesizingAudioCompleted) {
    Console.WriteLine("Speech synthesis completed");
} else if (result.Reason == ResultReason.Canceled) {
    var cancellation = SpeechSynthesisCancellationDetails.FromResult(result);
    Console.WriteLine($"Synthesis canceled: {cancellation.Reason}");
}

Quickstart Index

The repository organizes a large number of quickstart samples by feature:

Speech recognition (microphone input):

C++: Windows / Linux / macOS
C#: .NET / .NET Core / UWP
Java: Android / JRE
JavaScript: browser / Node.js
Python: Windows / Linux / macOS
Objective-C / Swift: iOS / macOS

Speech translation:

C++ / C# (.NET / .NET Core / UWP)
Java JRE

Speech synthesis:

C++ / C# / Java / Python / JavaScript
Supports both plain-text and SSML input
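As a rough illustration of the difference between plain-text and SSML synthesis, the Python sketch below wraps text in a minimal SSML document and sends it through `speak_ssml_async`. The voice name `en-US-JennyNeural`, the default language, and the helper names `build_ssml` and `speak_ssml` are assumptions for this example; substitute any voice available in your region.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text: str, voice: str = "en-US-JennyNeural", lang: str = "en-US") -> str:
    """Wrap plain text in a minimal single-voice SSML document.

    The text is XML-escaped so characters such as & and < stay valid.
    """
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>{escape(text)}</voice>"
        "</speak>"
    )


def speak_ssml(ssml: str, key: str, region: str) -> bool:
    """Synthesize SSML on the default speaker; True when synthesis completed."""
    # Imported lazily so build_ssml works without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    config = speechsdk.SpeechConfig(subscription=key, region=region)
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=config)
    result = synthesizer.speak_ssml_async(ssml).get()
    return result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted
```

SSML is worth the extra markup when you need to pick a specific voice or control pronunciation and pacing; for simple output, the plain-text call shown in the C# example is usually enough.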

Project Links

GitHub repository: https://github.com/Azure-Samples/cognitive-services-speech-sdk
Speech SDK documentation: https://aka.ms/csspeech
Release notes: https://aka.ms/csspeech/whatsnew

字节笔记本