[00:00:00.000 --> 00:00:07.680] 现在来你跟我说两句话
[00:00:07.680 --> 00:00:10.240] 我看一次
[00:00:10.240 --> 00:00:11.780] 两句话
[00:00:11.780 --> 00:00:13.320] 好
[00:00:13.320 --> 00:00:17.400] 请一下落音效果怎么样
[00:00:17.400 --> 00:00:20.480] 你这三两句话
[00:00:20.480 --> 00:00:23.040] 好
[00:00:29.180 --> 00:00:30.180] 好多吗?

whisper_print_timings: load time = 46.42 ms
whisper_print_timings: fallbacks = 1 p / 0 h
whisper_print_timings: mel time = 23.61 ms
whisper_print_timings: sample time = 82.86 ms / 282 runs ( 0.29 ms per run)
whisper_print_timings: encode time = 706.36 ms / 2 runs ( 353.18 ms per run)
whisper_print_timings: decode time = 1.88 ms / 1 runs ( 1.88 ms per run)
whisper_print_timings: batchd time = 193.39 ms / 269 runs ( 0.72 ms per run)
whisper_print_timings: prompt time = 50.64 ms / 96 runs ( 0.53 ms per run)
whisper_print_timings: total time = 1128.47 ms
base test
If no language is specified, the output looks like this:
[00:00:00.000 --> 00:00:02.000] "What do you want to say?"
[00:00:02.000 --> 00:00:05.000] "What do you want to say to me?"
[00:00:05.000 --> 00:00:06.000] "What do you want to say to me?"
[00:00:06.000 --> 00:00:07.000] "What do you want to say to me?"
[00:00:07.000 --> 00:00:08.000] "What do you want to say to me?"
[00:00:08.000 --> 00:00:09.000] "What do you want to say to me?"
[00:00:09.000 --> 00:00:10.000] "What do you want to say to me?"
[00:00:10.000 --> 00:00:11.000] "What do you want to say to me?"
[00:00:11.000 --> 00:00:12.000] "What do you want to say to me?"
[00:00:12.000 --> 00:00:13.000] "What do you want to say to me?"
[00:00:13.000 --> 00:00:14.000] "What do you want to say to me?"
[00:00:14.000 --> 00:00:15.000] "What do you want to say to me?"
[00:00:15.000 --> 00:00:16.000] "What do you want to say to me?"
[00:00:16.000 --> 00:00:17.000] "What do you want to say to me?"
[00:00:17.000 --> 00:00:18.000] "What do you want to say to me?"
[00:00:18.000 --> 00:00:19.000] "What do you want to say to me?"
[00:00:19.000 --> 00:00:20.000] "What do you want to say to me?"
[00:00:20.000 --> 00:00:21.000] "What do you want to say to me?"
[00:00:21.000 --> 00:00:36.000] "What do you want to say to me?"
whisper_print_timings: load time = 208.93 ms
whisper_print_timings: fallbacks = 1 p / 0 h
whisper_print_timings: mel time = 24.02 ms
whisper_print_timings: sample time = 515.76 ms / 940 runs ( 0.55 ms per run)
whisper_print_timings: encode time = 5609.63 ms / 2 runs ( 2804.82 ms per run)
whisper_print_timings: decode time = 0.00 ms / 1 runs ( 0.00 ms per run)
whisper_print_timings: batchd time = 3176.09 ms / 928 runs ( 3.42 ms per run)
whisper_print_timings: prompt time = 231.88 ms / 96 runs ( 2.42 ms per run)
whisper_print_timings: total time = 9850.00 ms
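The run without a language setting repeats the same English sentence for almost every segment. A minimal sketch for flagging that kind of repetition automatically is shown here. It assumes the transcript has one `[start --> end] text` segment per line; `segment_texts` and `longest_repeat` are hypothetical helper names for illustration and are not part of whisper.cpp.

```python
import re
from itertools import groupby

# One transcript segment per line: "[start --> end] text"
SEGMENT_RE = re.compile(r"^\[(\S+) --> (\S+)\]\s*(.*)$")


def segment_texts(transcript: str) -> list[str]:
    """Return the text of every timestamped segment, in order."""
    texts = []
    for line in transcript.splitlines():
        match = SEGMENT_RE.match(line.strip())
        if match:
            texts.append(match.group(3).strip())
    return texts


def longest_repeat(texts: list[str]) -> tuple[str, int]:
    """Return (text, count) for the longest run of identical consecutive segments."""
    best_text, best_len = "", 0
    for text, group in groupby(texts):
        run_len = len(list(group))
        if run_len > best_len:
            best_text, best_len = text, run_len
    return best_text, best_len


# Shortened excerpt of the output above, used only as sample input.
sample = "\n".join([
    '[00:00:00.000 --> 00:00:02.000] "What do you want to say?"',
    '[00:00:02.000 --> 00:00:05.000] "What do you want to say to me?"',
    '[00:00:05.000 --> 00:00:06.000] "What do you want to say to me?"',
    '[00:00:06.000 --> 00:00:07.000] "What do you want to say to me?"',
])

text, count = longest_repeat(segment_texts(sample))
if count >= 3:  # threshold is an arbitrary assumption
    print(f"suspicious repetition: {text!r} x{count}")
```

A long run of identical segments is only a heuristic signal; a speaker can legitimately repeat a phrase, so the threshold should be tuned to the material.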
medium
[00:00:00.000 --> 00:00:08.000] 你跟我說兩句話
[00:00:08.000 --> 00:00:11.000] 好你試
[00:00:11.000 --> 00:00:13.000] 你想說啥
[00:00:13.000 --> 00:00:15.000] 好
[00:00:15.000 --> 00:00:18.000] 聽一下錄音效果怎麼樣
[00:00:18.000 --> 00:00:21.000] 你再說兩句話
[00:00:21.000 --> 00:00:23.000] 好
[00:00:23.000 --> 00:00:33.000] 他怎麼了

whisper_print_timings: load time = 594.03 ms
whisper_print_timings: fallbacks = 0 p / 0 h
whisper_print_timings: mel time = 24.38 ms
whisper_print_timings: sample time = 80.54 ms / 253 runs ( 0.32 ms per run)
whisper_print_timings: encode time = 18115.92 ms / 2 runs ( 9057.96 ms per run)
whisper_print_timings: decode time = 62.90 ms / 3 runs ( 20.97 ms per run)
whisper_print_timings: batchd time = 2026.68 ms / 243 runs ( 8.34 ms per run)
whisper_print_timings: prompt time = 329.93 ms / 48 runs ( 6.87 ms per run)
whisper_print_timings: total time = 21424.55 ms
whisper_print_timings: load time = 1698.18 ms
whisper_print_timings: fallbacks = 0 p / 0 h
whisper_print_timings: mel time = 29.04 ms
whisper_print_timings: sample time = 78.72 ms / 232 runs ( 0.34 ms per run)
whisper_print_timings: encode time = 34552.24 ms / 2 runs ( 17276.12 ms per run)
whisper_print_timings: decode time = 77.53 ms / 2 runs ( 38.77 ms per run)
whisper_print_timings: batchd time = 3319.69 ms / 223 runs ( 14.89 ms per run)
whisper_print_timings: prompt time = 547.42 ms / 43 runs ( 12.73 ms per run)
whisper_print_timings: total time = 40676.03 ms

whisper_print_timings: load time = 1131.77 ms
whisper_print_timings: fallbacks = 0 p / 0 h
whisper_print_timings: mel time = 27.35 ms
whisper_print_timings: sample time = 68.56 ms / 208 runs ( 0.33 ms per run)
whisper_print_timings: encode time = 31061.01 ms / 2 runs ( 15530.50 ms per run)
whisper_print_timings: decode time = 0.00 ms / 1 runs ( 0.00 ms per run)
whisper_print_timings: batchd time = 527.58 ms / 201 runs ( 2.62 ms per run)
whisper_print_timings: prompt time = 102.62 ms / 44 runs ( 2.33 ms per run)
whisper_print_timings: total time = 32967.73 ms
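
To compare runs like the ones in this section without reading each log by eye, the `whisper_print_timings` lines can be parsed into numbers. The following is a rough sketch under the assumption that every stage is reported as `<name> time = <value> ms`, as in the logs above; `parse_timings` and `encode_share` are hypothetical helper names, not whisper.cpp functions.

```python
import re

# Matches entries such as
# "whisper_print_timings: encode time = 706.36 ms / 2 runs ( 353.18 ms per run)".
# The "fallbacks" line has no "time" field, so it is skipped on purpose.
TIMING_RE = re.compile(r"whisper_print_timings:\s*(\w+) time\s*=\s*([\d.]+) ms")


def parse_timings(log: str) -> dict[str, float]:
    """Return {stage: milliseconds} for every 'X time = N ms' entry in the log."""
    return {stage: float(ms) for stage, ms in TIMING_RE.findall(log)}


def encode_share(timings: dict[str, float]) -> float:
    """Fraction of the total run time spent in the encoder."""
    return timings["encode"] / timings["total"]


# Values copied from the first log in this section; the parser works on
# flattened single-line logs as well as one entry per line.
log = (
    "whisper_print_timings: load time = 46.42 ms "
    "whisper_print_timings: fallbacks = 1 p / 0 h "
    "whisper_print_timings: mel time = 23.61 ms "
    "whisper_print_timings: encode time = 706.36 ms / 2 runs ( 353.18 ms per run) "
    "whisper_print_timings: total time = 1128.47 ms"
)

timings = parse_timings(log)
print(f"total: {timings['total']:.2f} ms, encoder share: {encode_share(timings):.0%}")
```

Running the same helper over each log makes it easy to see that the encoder accounts for most of the total time in these runs, and that its share grows in the slower ones.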