[Other] Notes on starting the coqui-ai/TTS HTTP server

Posted by HelloWorld 4 days ago

Last edited by HelloWorld on 2025-4-18 19:10

Reference, the official tutorial: https://docs.coqui.ai/en/latest/docker_images.html#start-a-server

Run the container persistently (detached, restarting automatically):
  docker run -d --restart always -p 5002:5002 --entrypoint python3 ghcr.io/coqui-ai/tts-cpu TTS/server/server.py --model_name tts_models/en/ljspeech/vits


Once it is running, open the web page directly to test it: http://your_ip:5002/
The WAV audio file is available at: http://your_ip:5002/api/tts?speaker_id=p364&text=nice%20to%20meet%20you
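For scripting, the same request can be made from Python. This is a minimal sketch using only the standard library, assuming the server started above is reachable. The names build_tts_url and save_wav are illustrative helpers, not part of coqui-ai/TTS, and your_ip is the same placeholder as in the links above.

```python
from typing import Optional
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def build_tts_url(host: str, text: str, speaker_id: Optional[str] = None, port: int = 5002) -> str:
    """Build an /api/tts URL in the same shape as the link in the post."""
    params = {}
    if speaker_id:
        params["speaker_id"] = speaker_id
    params["text"] = text
    # quote_via=quote encodes spaces as %20 rather than "+"
    return f"http://{host}:{port}/api/tts?{urlencode(params, quote_via=quote)}"


def save_wav(url: str, path: str, timeout: float = 60.0) -> int:
    """Download the synthesized audio to `path`; returns the number of bytes written."""
    with urlopen(url, timeout=timeout) as resp:
        data = resp.read()
    with open(path, "wb") as f:
        f.write(data)
    return len(data)


if __name__ == "__main__":
    url = build_tts_url("your_ip", "nice to meet you", speaker_id="p364")
    print(url)
    # save_wav(url, "hello.wav")  # uncomment after replacing "your_ip" with a real host
```

Keeping URL construction separate from the download makes it easy to check the encoded text before sending anything to the server.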

The image is large, so be patient while it pulls (screenshot: Snipaste_2025-04-15_23-05-22.png).

If you want to serve multiple models, run one container per model and map each to a different host port.
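One way to keep the port assignments straight is to generate the commands. The sketch below only prints docker run command lines modeled on the one earlier in this post; it does not start anything. The helper names and the choice of consecutive host ports starting at 5002 are assumptions for illustration.

```python
import shlex
from typing import List

IMAGE = "ghcr.io/coqui-ai/tts-cpu"
CONTAINER_PORT = 5002  # port the server listens on inside the container, as in the command above


def docker_run_args(model_name: str, host_port: int) -> List[str]:
    """Argument list for one persistent container serving one model."""
    return [
        "docker", "run", "-d", "--restart", "always",
        "-p", f"{host_port}:{CONTAINER_PORT}",  # only the host side of the mapping changes
        "--entrypoint", "python3",
        IMAGE,
        "TTS/server/server.py", "--model_name", model_name,
    ]


def plan_containers(models: List[str], first_host_port: int = 5002) -> List[str]:
    """One shell-quoted command per model, on consecutive host ports."""
    return [shlex.join(docker_run_args(m, first_host_port + i)) for i, m in enumerate(models)]


if __name__ == "__main__":
    for cmd in plan_containers(["tts_models/en/ljspeech/vits", "tts_models/en/vctk/vits"]):
        print(cmd)
```

Each model would then be reachable on its own host port, for example http://your_ip:5002/ and http://your_ip:5003/.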

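Some models do not support server mode, for example tts_models/en/multi-dataset/tortoise-v2. So if you start a container and cannot reach it through the port, the model is probably unsupported.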
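Before concluding that a model is unsupported, it may be worth ruling out that the container is simply still starting up. The sketch below polls the TCP port for a while; wait_for_port is an illustrative helper, and the timeout values are arbitrary assumptions.

```python
import socket
import time


def wait_for_port(host: str, port: int, total_timeout: float = 60.0, interval: float = 2.0) -> bool:
    """Return True as soon as a TCP connection succeeds, False once total_timeout has elapsed."""
    deadline = time.monotonic() + total_timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            pass  # refused, unreachable, or timed out: try again until the deadline
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Example (replace "your_ip" with a real host before running):
# print(wait_for_port("your_ip", 5002, total_timeout=120))
```

If this still returns False after a generous timeout, the output of docker logs for the container is the next place to look.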

If you want to test on an ARM machine, emulate the amd64 (x86-64) platform like this:
  docker pull ghcr.io/coqui-ai/tts-cpu --platform linux/amd64
  docker run --rm -it -p 5002:5002 --platform linux/amd64 ghcr.io/coqui-ai/tts-cpu
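If the same script has to work on both ARM and x86-64 hosts, the --platform flag can be added conditionally. This is a rough sketch that assumes ARM hosts report "arm64" or "aarch64" from platform.machine(); the helper names are illustrative.

```python
import platform
from typing import List

ARM_MACHINES = {"arm64", "aarch64"}  # assumed values reported by ARM hosts


def platform_flags(machine: str) -> List[str]:
    """Extra docker flags needed to run the amd64 image on the given machine type."""
    return ["--platform", "linux/amd64"] if machine.lower() in ARM_MACHINES else []


def docker_test_run_args(machine: str) -> List[str]:
    """The interactive test command from the post, with emulation added only when needed."""
    return ["docker", "run", "--rm", "-it", "-p", "5002:5002",
            *platform_flags(machine), "ghcr.io/coqui-ai/tts-cpu"]


if __name__ == "__main__":
    print(" ".join(docker_test_run_args(platform.machine())))
```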


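Below is the model list I printed. The available models may differ between image versions.

I don't recommend models ending in /fast_pitch. Probably because they prioritize speed, the audio sounds heavily "pixelated", with very noticeable artifacts.

After comparing several models, this one sounded best: tts_models/en/ljspeech/vits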

  python3 TTS/server/server.py --list_models

  Name format: type/language/dataset/model
  1: tts_models/multilingual/multi-dataset/xtts_v2
  2: tts_models/multilingual/multi-dataset/xtts_v1.1
  3: tts_models/multilingual/multi-dataset/your_tts
  4: tts_models/multilingual/multi-dataset/bark
  5: tts_models/bg/cv/vits
  6: tts_models/cs/cv/vits
  7: tts_models/da/cv/vits
  8: tts_models/et/cv/vits
  9: tts_models/ga/cv/vits
  10: tts_models/en/ek1/tacotron2
  11: tts_models/en/ljspeech/tacotron2-DDC
  12: tts_models/en/ljspeech/tacotron2-DDC_ph
  13: tts_models/en/ljspeech/glow-tts
  14: tts_models/en/ljspeech/speedy-speech
  15: tts_models/en/ljspeech/tacotron2-DCA
  16: tts_models/en/ljspeech/vits
  17: tts_models/en/ljspeech/vits--neon
  18: tts_models/en/ljspeech/fast_pitch
  19: tts_models/en/ljspeech/overflow
  20: tts_models/en/ljspeech/neural_hmm
  21: tts_models/en/vctk/vits
  22: tts_models/en/vctk/fast_pitch
  23: tts_models/en/sam/tacotron-DDC
  24: tts_models/en/blizzard2013/capacitron-t2-c50
  25: tts_models/en/blizzard2013/capacitron-t2-c150_v2
  26: tts_models/en/multi-dataset/tortoise-v2
  27: tts_models/en/jenny/jenny
  28: tts_models/es/mai/tacotron2-DDC
  29: tts_models/es/css10/vits
  30: tts_models/fr/mai/tacotron2-DDC
  31: tts_models/fr/css10/vits
  32: tts_models/uk/mai/glow-tts
  33: tts_models/uk/mai/vits
  34: tts_models/zh-CN/baker/tacotron2-DDC-GST
  35: tts_models/nl/mai/tacotron2-DDC
  36: tts_models/nl/css10/vits
  37: tts_models/de/thorsten/tacotron2-DCA
  38: tts_models/de/thorsten/vits
  39: tts_models/de/thorsten/tacotron2-DDC
  40: tts_models/de/css10/vits-neon
  41: tts_models/ja/kokoro/tacotron2-DDC
  42: tts_models/tr/common-voice/glow-tts
  43: tts_models/it/mai_female/glow-tts
  44: tts_models/it/mai_female/vits
  45: tts_models/it/mai_male/glow-tts
  46: tts_models/it/mai_male/vits
  47: tts_models/ewe/openbible/vits
  48: tts_models/hau/openbible/vits
  49: tts_models/lin/openbible/vits
  50: tts_models/tw_akuapem/openbible/vits
  51: tts_models/tw_asante/openbible/vits
  52: tts_models/yor/openbible/vits
  53: tts_models/hu/css10/vits
  54: tts_models/el/cv/vits
  55: tts_models/fi/css10/vits
  56: tts_models/hr/cv/vits
  57: tts_models/lt/cv/vits
  58: tts_models/lv/cv/vits
  59: tts_models/mt/cv/vits
  60: tts_models/pl/mai_female/vits
  61: tts_models/pt/cv/vits
  62: tts_models/ro/cv/vits
  63: tts_models/sk/cv/vits
  64: tts_models/sl/cv/vits
  65: tts_models/sv/cv/vits
  66: tts_models/ca/custom/vits
  67: tts_models/fa/custom/glow-tts
  68: tts_models/bn/custom/vits-male
  69: tts_models/bn/custom/vits-female
  70: tts_models/be/common-voice/glow-tts

  Name format: type/language/dataset/model
  1: vocoder_models/universal/libri-tts/wavegrad
  2: vocoder_models/universal/libri-tts/fullband-melgan
  3: vocoder_models/en/ek1/wavegrad
  4: vocoder_models/en/ljspeech/multiband-melgan
  5: vocoder_models/en/ljspeech/hifigan_v2
  6: vocoder_models/en/ljspeech/univnet
  7: vocoder_models/en/blizzard2013/hifigan_v2
  8: vocoder_models/en/vctk/hifigan_v2
  9: vocoder_models/en/sam/hifigan_v2
  10: vocoder_models/nl/mai/parallel-wavegan
  11: vocoder_models/de/thorsten/wavegrad
  12: vocoder_models/de/thorsten/fullband-melgan
  13: vocoder_models/de/thorsten/hifigan_v1
  14: vocoder_models/ja/kokoro/hifigan_v1
  15: vocoder_models/uk/mai/multiband-melgan
  16: vocoder_models/tr/common-voice/hifigan
  17: vocoder_models/be/common-voice/hifigan

  Name format: type/language/dataset/model
  1: voice_conversion_models/multilingual/vctk/freevc24
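Since every entry in the list above follows the type/language/dataset/model format, the output is easy to filter in a script. The sketch below parses a few sample lines copied from the list and keeps the English TTS models that are not fast_pitch. ModelName and parse_model_list are illustrative names, and the field name kind stands in for "type".

```python
import re
from typing import List, NamedTuple


class ModelName(NamedTuple):
    kind: str      # "type" in the printed format, e.g. tts_models
    language: str
    dataset: str
    model: str

    def __str__(self) -> str:
        return "/".join(self)


# Matches lines such as "16: tts_models/en/ljspeech/vits"
LINE_RE = re.compile(r"^\s*\d+:\s*(\S+)\s*$")


def parse_model_list(output: str) -> List[ModelName]:
    """Extract model names from --list_models output, skipping header and blank lines."""
    models = []
    for line in output.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        parts = match.group(1).split("/")
        if len(parts) == 4:
            models.append(ModelName(*parts))
    return models


# A few lines copied from the list above
SAMPLE = """\
Name format: type/language/dataset/model
16: tts_models/en/ljspeech/vits
18: tts_models/en/ljspeech/fast_pitch
21: tts_models/en/vctk/vits
34: tts_models/zh-CN/baker/tacotron2-DDC-GST
5: vocoder_models/en/ljspeech/hifigan_v2
"""

if __name__ == "__main__":
    for m in parse_model_list(SAMPLE):
        if m.kind == "tts_models" and m.language == "en" and m.model != "fast_pitch":
            print(m)
```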