Add dart API for SenseVoice #1159

Merged (2 commits) on Jul 21, 2024
4 changes: 4 additions & 0 deletions .github/scripts/test-dart.sh
@@ -6,6 +6,10 @@ cd dart-api-examples

pushd non-streaming-asr

echo '----------SenseVoice----------'
./run-sense-voice.sh
rm -rf sherpa-onnx-*

echo '----------NeMo transducer----------'
./run-nemo-transducer.sh
rm -rf sherpa-onnx-*
1 change: 1 addition & 0 deletions dart-api-examples/non-streaming-asr/README.md
@@ -11,4 +11,5 @@ This folder contains examples for non-streaming ASR with Dart API.
|[./bin/whisper.dart](./bin/whisper.dart)| Use whisper for speech recognition. See [./run-whisper.sh](./run-whisper.sh)|
|[./bin/zipformer-transducer.dart](./bin/zipformer-transducer.dart)| Use a zipformer transducer for speech recognition. See [./run-zipformer-transducer.sh](./run-zipformer-transducer.sh)|
|[./bin/vad-with-paraformer.dart](./bin/vad-with-paraformer.dart)| Use a [silero-vad](https://github.com/snakers4/silero-vad) with paraformer for speech recognition. See [./run-vad-with-paraformer.sh](./run-vad-with-paraformer.sh)|
|[./bin/sense-voice.dart](./bin/sense-voice.dart)| Use a SenseVoice CTC model for speech recognition. See [./run-sense-voice.sh](./run-sense-voice.sh)|

61 changes: 61 additions & 0 deletions dart-api-examples/non-streaming-asr/bin/sense-voice.dart
@@ -0,0 +1,61 @@
// Copyright (c) 2024 Xiaomi Corporation
import 'dart:io';
import 'dart:typed_data';

import 'package:args/args.dart';
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

import './init.dart';

void main(List<String> arguments) async {
await initSherpaOnnx();

final parser = ArgParser()
..addOption('model', help: 'Path to the SenseVoice model')
..addOption('tokens', help: 'Path to tokens.txt')
..addOption('language',
help: 'auto, zh, en, ja, ko, yue, or leave it empty to use auto',
defaultsTo: '')
..addOption('use-itn',
help: 'true to use inverse text normalization', defaultsTo: 'false')
..addOption('input-wav', help: 'Path to input.wav to transcribe');

final res = parser.parse(arguments);
if (res['model'] == null ||
res['tokens'] == null ||
res['input-wav'] == null) {
print(parser.usage);
exit(1);
}

final model = res['model'] as String;
final tokens = res['tokens'] as String;
final inputWav = res['input-wav'] as String;
final language = res['language'] as String;
final useItn = (res['use-itn'] as String).toLowerCase() == 'true';

final senseVoice = sherpa_onnx.OfflineSenseVoiceModelConfig(
model: model, language: language, useInverseTextNormalization: useItn);

final modelConfig = sherpa_onnx.OfflineModelConfig(
senseVoice: senseVoice,
tokens: tokens,
debug: true,
numThreads: 1,
);
final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);
final recognizer = sherpa_onnx.OfflineRecognizer(config);

final waveData = sherpa_onnx.readWave(inputWav);
final stream = recognizer.createStream();

stream.acceptWaveform(
samples: waveData.samples, sampleRate: waveData.sampleRate);
recognizer.decode(stream);

final result = recognizer.getResult(stream);
print(result.text);

stream.free();
recognizer.free();
}
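
The example above passes `--language` and `--use-itn` through as plain strings, so a typo such as `--language zn` reaches the model config unchecked. A minimal sketch of validating both values up front follows. `normalizeLanguage`, `parseUseItn`, and `supportedLanguages` are hypothetical helpers, not part of `sherpa_onnx`; the accepted codes are taken from the option's help text.

```dart
// Hypothetical helpers for illustration; they are not part of sherpa_onnx.

// Language codes listed in the --language help text.
const supportedLanguages = {'auto', 'zh', 'en', 'ja', 'ko', 'yue'};

/// Returns a lower-cased language code, mapping an empty value to 'auto'.
/// Throws an [ArgumentError] for anything outside [supportedLanguages].
String normalizeLanguage(String raw) {
  final lang = raw.trim().toLowerCase();
  if (lang.isEmpty) return 'auto';
  if (!supportedLanguages.contains(lang)) {
    throw ArgumentError.value(
        raw, 'language', 'expected one of ${supportedLanguages.join(', ')}');
  }
  return lang;
}

/// Mirrors the example's parsing: only 'true' (any case) enables ITN.
bool parseUseItn(String raw) => raw.toLowerCase() == 'true';

void main() {
  print(normalizeLanguage('')); // auto
  print(normalizeLanguage(' YUE ')); // yue
  print(parseUseItn('True')); // true
  print(parseUseItn('yes')); // false
}
```

Whether the native library itself rejects or silently ignores an unknown language code is not stated in this change, so the check here is purely a convenience on the Dart side.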
2 changes: 1 addition & 1 deletion dart-api-examples/non-streaming-asr/pubspec.yaml
@@ -10,7 +10,7 @@ environment:

# Add regular dependencies here.
dependencies:
sherpa_onnx: ^1.10.16
sherpa_onnx: ^1.10.17
path: ^1.9.0
args: ^2.5.0

18 changes: 18 additions & 0 deletions dart-api-examples/non-streaming-asr/run-sense-voice.sh
@@ -0,0 +1,18 @@
#!/usr/bin/env bash

set -ex

dart pub get

if [ ! -f ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt ]; then
curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2
tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2
rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2
fi

dart run \
./bin/sense-voice.dart \
--model ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx \
--tokens ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt \
--use-itn true \
--input-wav ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/test_wavs/zh.wav
2 changes: 1 addition & 1 deletion dart-api-examples/streaming-asr/pubspec.yaml
@@ -11,7 +11,7 @@ environment:

# Add regular dependencies here.
dependencies:
sherpa_onnx: ^1.10.16
sherpa_onnx: ^1.10.17
path: ^1.9.0
args: ^2.5.0

2 changes: 1 addition & 1 deletion dart-api-examples/tts/pubspec.yaml
@@ -8,7 +8,7 @@ environment:

# Add regular dependencies here.
dependencies:
sherpa_onnx: ^1.10.16
sherpa_onnx: ^1.10.17
path: ^1.9.0
args: ^2.5.0

2 changes: 1 addition & 1 deletion dart-api-examples/vad/pubspec.yaml
@@ -9,7 +9,7 @@ environment:
sdk: ^3.4.0

dependencies:
sherpa_onnx: ^1.10.16
sherpa_onnx: ^1.10.17
path: ^1.9.0
args: ^2.5.0

4 changes: 2 additions & 2 deletions flutter-examples/streaming_asr/pubspec.yaml
@@ -5,7 +5,7 @@ description: >

publish_to: 'none'

version: 1.10.16
version: 1.10.17

topics:
- speech-recognition
@@ -30,7 +30,7 @@ dependencies:
record: ^5.1.0
url_launcher: ^6.2.6

sherpa_onnx: ^1.10.16
sherpa_onnx: ^1.10.17
# sherpa_onnx:
# path: ../../flutter/sherpa_onnx

4 changes: 2 additions & 2 deletions flutter-examples/tts/pubspec.yaml
@@ -5,7 +5,7 @@ description: >

publish_to: 'none' # Remove this line if you wish to publish to pub.dev

version: 1.10.16
version: 1.10.17

environment:
sdk: '>=3.4.0 <4.0.0'
@@ -17,7 +17,7 @@ dependencies:
cupertino_icons: ^1.0.6
path_provider: ^2.1.3
path: ^1.9.0
sherpa_onnx: ^1.10.16
sherpa_onnx: ^1.10.17
url_launcher: ^6.2.6
audioplayers: ^5.0.0

31 changes: 30 additions & 1 deletion flutter/sherpa_onnx/lib/src/offline_recognizer.dart
@@ -79,6 +79,23 @@ class OfflineTdnnModelConfig {
final String model;
}

class OfflineSenseVoiceModelConfig {
const OfflineSenseVoiceModelConfig({
this.model = '',
this.language = '',
this.useInverseTextNormalization = false,
});

@override
String toString() {
return 'OfflineSenseVoiceModelConfig(model: $model, language: $language, useInverseTextNormalization: $useInverseTextNormalization)';
}

final String model;
final String language;
final bool useInverseTextNormalization;
}

class OfflineLMConfig {
const OfflineLMConfig({this.model = '', this.scale = 1.0});

@@ -98,6 +115,7 @@ class OfflineModelConfig {
this.nemoCtc = const OfflineNemoEncDecCtcModelConfig(),
this.whisper = const OfflineWhisperModelConfig(),
this.tdnn = const OfflineTdnnModelConfig(),
this.senseVoice = const OfflineSenseVoiceModelConfig(),
required this.tokens,
this.numThreads = 1,
this.debug = true,
@@ -110,14 +128,15 @@ class OfflineModelConfig {

@override
String toString() {
return 'OfflineModelConfig(transducer: $transducer, paraformer: $paraformer, nemoCtc: $nemoCtc, whisper: $whisper, tdnn: $tdnn, tokens: $tokens, numThreads: $numThreads, debug: $debug, provider: $provider, modelType: $modelType, modelingUnit: $modelingUnit, bpeVocab: $bpeVocab, telespeechCtc: $telespeechCtc)';
return 'OfflineModelConfig(transducer: $transducer, paraformer: $paraformer, nemoCtc: $nemoCtc, whisper: $whisper, tdnn: $tdnn, senseVoice: $senseVoice, tokens: $tokens, numThreads: $numThreads, debug: $debug, provider: $provider, modelType: $modelType, modelingUnit: $modelingUnit, bpeVocab: $bpeVocab, telespeechCtc: $telespeechCtc)';
}

final OfflineTransducerModelConfig transducer;
final OfflineParaformerModelConfig paraformer;
final OfflineNemoEncDecCtcModelConfig nemoCtc;
final OfflineWhisperModelConfig whisper;
final OfflineTdnnModelConfig tdnn;
final OfflineSenseVoiceModelConfig senseVoice;

final String tokens;
final int numThreads;
@@ -219,6 +238,14 @@ class OfflineRecognizer {

c.ref.model.tdnn.model = config.model.tdnn.model.toNativeUtf8();

c.ref.model.senseVoice.model = config.model.senseVoice.model.toNativeUtf8();

c.ref.model.senseVoice.language =
config.model.senseVoice.language.toNativeUtf8();

c.ref.model.senseVoice.useInverseTextNormalization =
config.model.senseVoice.useInverseTextNormalization ? 1 : 0;

c.ref.model.tokens = config.model.tokens.toNativeUtf8();

c.ref.model.numThreads = config.model.numThreads;
@@ -254,6 +281,8 @@ class OfflineRecognizer {
calloc.free(c.ref.model.modelType);
calloc.free(c.ref.model.provider);
calloc.free(c.ref.model.tokens);
calloc.free(c.ref.model.senseVoice.language);
calloc.free(c.ref.model.senseVoice.model);
calloc.free(c.ref.model.tdnn.model);
calloc.free(c.ref.model.whisper.task);
calloc.free(c.ref.model.whisper.language);
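
The hunks above follow one pattern: each Dart string is copied into native memory with `toNativeUtf8()`, the `bool` is flattened to an `Int32`, and every allocated pointer is released once the native call is done. A minimal sketch of that round trip follows. It assumes `package:ffi` is listed in `pubspec.yaml`. `DemoSenseVoiceConfig` and `fill` are hypothetical stand-ins that mirror the field layout of `SherpaOnnxOfflineSenseVoiceModelConfig`, so no native sherpa-onnx library is needed.

```dart
import 'dart:ffi';

import 'package:ffi/ffi.dart';

// Hypothetical stand-in for SherpaOnnxOfflineSenseVoiceModelConfig.
final class DemoSenseVoiceConfig extends Struct {
  external Pointer<Utf8> model;
  external Pointer<Utf8> language;

  @Int32()
  external int useInverseTextNormalization;
}

// Copies Dart values into the native struct (hypothetical helper).
void fill(Pointer<DemoSenseVoiceConfig> c,
    {required String model, required String language, required bool useItn}) {
  // Passing `allocator: calloc` makes the pairing with calloc.free explicit.
  c.ref.model = model.toNativeUtf8(allocator: calloc);
  c.ref.language = language.toNativeUtf8(allocator: calloc);
  // C has no bool in this struct, so the flag travels as 0 or 1.
  c.ref.useInverseTextNormalization = useItn ? 1 : 0;
}

void main() {
  final c = calloc<DemoSenseVoiceConfig>();
  fill(c, model: 'model.int8.onnx', language: 'zh', useItn: true);

  // Read the values back the way native code would see them.
  print(c.ref.model.toDartString()); // model.int8.onnx
  print(c.ref.language.toDartString()); // zh
  print(c.ref.useInverseTextNormalization); // 1

  // Free every string that was allocated, then the struct itself.
  calloc.free(c.ref.language);
  calloc.free(c.ref.model);
  calloc.free(c);
}
```

Forgetting one of the `calloc.free` calls does not crash anything; it leaks the string each time a recognizer is created, which is why the diff adds a matching free for both new `Pointer<Utf8>` fields.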
10 changes: 10 additions & 0 deletions flutter/sherpa_onnx/lib/src/sherpa_onnx_bindings.dart
@@ -87,6 +87,14 @@ final class SherpaOnnxOfflineTdnnModelConfig extends Struct {
external Pointer<Utf8> model;
}

final class SherpaOnnxOfflineSenseVoiceModelConfig extends Struct {
external Pointer<Utf8> model;
external Pointer<Utf8> language;

@Int32()
external int useInverseTextNormalization;
}

final class SherpaOnnxOfflineLMConfig extends Struct {
external Pointer<Utf8> model;

@@ -115,6 +123,8 @@ final class SherpaOnnxOfflineModelConfig extends Struct {
external Pointer<Utf8> modelingUnit;
external Pointer<Utf8> bpeVocab;
external Pointer<Utf8> telespeechCtc;

external SherpaOnnxOfflineSenseVoiceModelConfig senseVoice;
}

final class SherpaOnnxOfflineRecognizerConfig extends Struct {
12 changes: 6 additions & 6 deletions flutter/sherpa_onnx/pubspec.yaml
@@ -17,7 +17,7 @@ topics:
- voice-activity-detection

# remember to change the version in ../sherpa_onnx_macos/macos/sherpa_onnx_macos.podspec
version: 1.10.16
version: 1.10.17

homepage: https://github.com/k2-fsa/sherpa-onnx

@@ -30,19 +30,19 @@ dependencies:
flutter:
sdk: flutter

sherpa_onnx_android: ^1.10.16
sherpa_onnx_android: ^1.10.17
# path: ../sherpa_onnx_android

sherpa_onnx_macos: ^1.10.16
sherpa_onnx_macos: ^1.10.17
# path: ../sherpa_onnx_macos

sherpa_onnx_linux: ^1.10.16
sherpa_onnx_linux: ^1.10.17
# path: ../sherpa_onnx_linux
#
sherpa_onnx_windows: ^1.10.16
sherpa_onnx_windows: ^1.10.17
# path: ../sherpa_onnx_windows

sherpa_onnx_ios: ^1.10.16
sherpa_onnx_ios: ^1.10.17
# sherpa_onnx_ios:
# path: ../sherpa_onnx_ios

2 changes: 1 addition & 1 deletion flutter/sherpa_onnx_ios/ios/sherpa_onnx_ios.podspec
@@ -7,7 +7,7 @@
# https://groups.google.com/g/dart-ffi/c/nUATMBy7r0c
Pod::Spec.new do |s|
s.name = 'sherpa_onnx_ios'
s.version = '1.10.16'
s.version = '1.10.17'
s.summary = 'A new Flutter FFI plugin project.'
s.description = <<-DESC
A new Flutter FFI plugin project.
2 changes: 1 addition & 1 deletion flutter/sherpa_onnx_macos/macos/sherpa_onnx_macos.podspec
@@ -4,7 +4,7 @@
#
Pod::Spec.new do |s|
s.name = 'sherpa_onnx_macos'
s.version = '1.10.16'
s.version = '1.10.17'
s.summary = 'sherpa-onnx Flutter FFI plugin project.'
s.description = <<-DESC
sherpa-onnx Flutter FFI plugin project.
2 changes: 1 addition & 1 deletion scripts/dart/sherpa-onnx-pubspec.yaml
@@ -17,7 +17,7 @@ topics:
- voice-activity-detection

# remember to change the version in ../sherpa_onnx_macos/macos/sherpa_onnx.podspec
version: 1.10.16
version: 1.10.17

homepage: https://github.com/k2-fsa/sherpa-onnx
