Cantonese to Mandarin voice translator

12/20/2023

- 2022.03.28: PaddleSpeech CLI is available for Speaker Verification.
- 2022.05.06: PaddleSpeech Server is available for Audio Classification, Automatic Speech Recognition, Text-to-Speech, Speaker Verification, and Punctuation Restoration.
- 2022.05.06: PaddleSpeech Streaming Server is available for Streaming ASR (with Punctuation Restoration and Token Timestamp) and Text-to-Speech.
- 2022.06.22: All TTS models support ONNX format.
- 2022.08.03: Add ONNXRuntime infer for TTS CLI.
- 2022.08.09: Release Chinese English mixed TTS.
- 2022.08.15: Add g2pW into TTS Chinese Text Frontend.
- 2022.08.22: Add ERNIE-SAT models: ERNIE-SAT-vctk, ERNIE-SAT-aishell3, ERNIE-SAT-zh_en.
- 2022.08.25: Release TTS finetune example.
- 2022.09.09: Add AISHELL-3 Voice Cloning example with ECAPA-TDNN speaker encoder.
- 2022.09.26: Add Voice Cloning, TTS finetune, and ERNIE-SAT in PaddleSpeech Web Demo.
- 2022.10.11: Add Wav2vec2ASR-en, wav2vec2.0 fine-tuning for ASR on LibriSpeech.
- 2022.10.21: Add SSML for TTS Chinese Text Frontend.
- 2022.10.26: Add Prosody Prediction for TTS.
- 2022.11.01: Add Adversarial Loss for Chinese English mixed TTS.
- 2022.11.07: Add U2/U2++ C++ High Performance Streaming ASR Deployment.
- 2022.11.18: Add Wav2vec2 CLI and Demos, supporting ASR and Feature Extraction.
- 2022.11.18: Add Whisper CLI and Demos, supporting multi-language recognition and translation.
- 2022.11.28: PP-TTS and PP-ASR demos are available in AIStudio and on the official website.
- 2022.12.02: Add end-to-end Prosody Prediction pipeline (including using prosody labels in the Acoustic Model).
- 2023.01.10: Add code-switch ASR CLI and Demos.
- 2023.03.03: Add Voice Conversion StarGANv2-VC synthesize pipeline.
- 2023.03.07: Add TTS ARM Linux C++ Demo (with C++ Chinese Text Frontend).
- 2023.03.14: Add SVS (Singing Voice Synthesis) examples with the Opencpop dataset, including DiffSinger, PWGAN, and HiFiGAN; the effect is continuously optimized.
- 2023.04.06: Add subtitle file (.srt format) generation example.
- 2023.04.28: Fix 0-d tensor handling; with the upgrade to paddlepaddle 2.5, the problem of modifying 0-d tensors has been solved.
- 2023.05.04: Add HuBERT ASR-en, HuBERT fine-tuning for ASR on LibriSpeech.
- 2023.05.31: Add WavLM ASR-en, WavLM fine-tuning for ASR on LibriSpeech.

Through an easy-to-use, efficient, flexible, and scalable implementation, our vision is to empower both industrial applications and academic research, including training, inference and testing modules, and the deployment process. More specifically, this toolkit features:

- Ease of Use: a low barrier to installation; CLI, Server, and Streaming Server are available to quick-start your journey.
- Align to the State-of-the-Art: we provide high-speed and ultra-lightweight models, as well as cutting-edge technology.
- Streaming ASR and TTS System: we provide production-ready streaming ASR and streaming TTS systems.
- Rule-based Chinese frontend: our frontend contains Text Normalization and Grapheme-to-Phoneme (G2P, including Polyphone and Tone Sandhi). Moreover, we use self-defined linguistic rules to adapt to the Chinese context.
- Varieties of Functions that Vitalize both Industry and Academia:
  - Implementation of critical audio tasks: this toolkit contains audio functions such as Automatic Speech Recognition, Text-to-Speech Synthesis, Speaker Verification, Keyword Spotting, Audio Classification, and Speech Translation.
  - Integration of mainstream models and datasets: the toolkit implements modules that participate in the whole pipeline of the speech tasks, and uses mainstream datasets such as LibriSpeech, LJSpeech, AIShell, and CSMSC.
  - Cascaded models application: as an extension of the typical traditional audio tasks, we combine the workflows of the aforementioned tasks with other fields such as Natural Language Processing (NLP) and Computer Vision (CV).
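The cascaded-models idea mentioned in this post can be pictured as a chain of stages: speech recognition, then text translation, then speech synthesis. The sketch below is only an illustration of that chaining for a Cantonese-to-Mandarin translator. Every name in it (`make_cascade`, `toy_recognize`, `toy_translate`, `toy_synthesize`, `TOY_LEXICON`) is hypothetical and none of it is a PaddleSpeech API; the toy stages just pass text around as bytes so the example runs without any model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CascadeResult:
    source_text: str   # transcript produced by the recognition stage
    target_text: str   # translation produced by the translation stage
    audio: bytes       # output of the synthesis stage


def make_cascade(
    recognize: Callable[[bytes], str],
    translate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> Callable[[bytes], CascadeResult]:
    """Chain three independent stages into one speech-to-speech function."""
    def run(audio: bytes) -> CascadeResult:
        source_text = recognize(audio)
        target_text = translate(source_text)
        return CascadeResult(source_text, target_text, synthesize(target_text))
    return run


# Hypothetical toy stages, standing in for real ASR, translation and TTS models.
TOY_LEXICON = {"唔該": "谢谢", "係": "是", "佢": "他", "老師": "老师"}


def toy_recognize(audio: bytes) -> str:
    # Assumption: the "audio" is just UTF-8 text, so decoding plays the role of ASR.
    return audio.decode("utf-8")


def toy_translate(text: str) -> str:
    # Word-for-word substitution; real Cantonese-to-Mandarin translation is far richer.
    for cantonese, mandarin in TOY_LEXICON.items():
        text = text.replace(cantonese, mandarin)
    return text


def toy_synthesize(text: str) -> bytes:
    # Assumption: encoding the text stands in for producing a waveform.
    return text.encode("utf-8")


translator = make_cascade(toy_recognize, toy_translate, toy_synthesize)
result = translator("佢係老師".encode("utf-8"))
print(result.source_text, "->", result.target_text)
```

Because each stage is passed in as a plain function, a real recognizer or synthesizer could be swapped in later without changing `make_cascade`.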

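To give a feel for what a rule-based text normalization step does before text reaches a TTS model, here is a toy rule written for this post. It is not the toolkit's frontend: the function name `read_year_digits` and the single rule it applies (a four-digit year before 年 is read digit by digit) are assumptions chosen only for illustration.

```python
import re

# Chinese numerals indexed by digit value.
DIGITS = "零一二三四五六七八九"


def read_year_digits(text: str) -> str:
    """Rewrite a four-digit year followed by 年 as digit-by-digit Chinese numerals."""
    def spell(match: re.Match) -> str:
        return "".join(DIGITS[int(d)] for d in match.group(1)) + "年"
    return re.sub(r"(\d{4})年", spell, text)


print(read_year_digits("2023年发布"))  # prints 二零二三年发布
```

A real frontend would need many more rules (dates, currency, phone numbers, measure words) plus the G2P step, which is why the rule here is kept deliberately narrow.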