Speedy Offline Speech to Text using Node.js, TypeScript, and Vosk, an offline open source speech recognition toolkit.
Clone the repository
You can find the code at https://github.com/EdwardKrayer/typescript-vosk-demo, or follow the instructions below for a short demo. Open PowerShell, clone the repository with git clone, and change into the new project directory.
Installation
Install the package dependencies from the project's base directory using your package manager (for example, npm install).
You can find many models at Alpha Cephei. Download one and extract it into “models” in the project's base directory; this example uses vosk-model-small-en-us-0.15. Open PowerShell in your project's base directory to download and extract the model.
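Before running the demo, it can help to confirm that the model ended up where the code expects it. The sketch below is a hypothetical helper and is not taken from the repository. It assumes the model folder name used in this example and a "models" folder under the project's base directory; the names modelPath and modelIsInstalled are illustrative only.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Model folder name used in this article's example.
const MODEL_NAME = "vosk-model-small-en-us-0.15";

// Build the expected model location: <baseDir>/models/<modelName>.
// (Hypothetical helper; the layout is an assumption based on the text above.)
function modelPath(baseDir: string, modelName: string = MODEL_NAME): string {
  return path.join(baseDir, "models", modelName);
}

// Return true only if the expected model path exists and is a directory.
function modelIsInstalled(baseDir: string, modelName: string = MODEL_NAME): boolean {
  const dir = modelPath(baseDir, modelName);
  return fs.existsSync(dir) && fs.statSync(dir).isDirectory();
}

// Usage: warn early instead of failing later when the model is loaded.
if (!modelIsInstalled(process.cwd())) {
  console.error(
    `Model not found at ${modelPath(process.cwd())}; download and extract it first.`
  );
}
```

Checking up front gives a clearer message than a failure from inside the recognizer when the folder is missing or misnamed.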
Compile
Compile the TypeScript sources from your project's base directory.
Test
Run the demo from your project's base directory.
If all goes well, you should see something similar to the output below. Enjoy!
LOG (VoskAPI:ReadDataFiles():model.cc:213) Decoding params beam=10 max-active=3000 lattice-beam=2
LOG (VoskAPI:ReadDataFiles():model.cc:216) Silence phones 1:2:3:4:5:6:7:8:9:10
LOG (VoskAPI:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 0 orphan nodes.
LOG (VoskAPI:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 0 orphan components.
LOG (VoskAPI:CompileLooped():nnet-compile-looped.cc:345) Spent 0.0320191 seconds in looped compilation.
LOG (VoskAPI:ReadDataFiles():model.cc:248) Loading i-vector extractor from ../model/ivector/final.ie
LOG (VoskAPI:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG (VoskAPI:ComputeDerivedVars():ivector-extractor.cc:204) Done.
LOG (VoskAPI:ReadDataFiles():model.cc:281) Loading HCL and G from ../model/graph/HCLr.fst ../model/graph/Gr.fst
LOG (VoskAPI:ReadDataFiles():model.cc:302) Loading winfo ../model/graph/phones/word_boundary.int
VLOG[2] (VoskAPI:AccStats():sausages.cc:197) L = 11.1962
VLOG[2] (VoskAPI:MbrDecode():sausages.cc:51) Changing word 152188 to 55014
VLOG[2] (VoskAPI:MbrDecode():sausages.cc:98) Iter = 0, delta-Q = -0.236686
VLOG[2] (VoskAPI:AccStats():sausages.cc:197) L = 10.9595
VLOG[2] (VoskAPI:MbrDecode():sausages.cc:98) Iter = 1, delta-Q = 0
VLOG[2] (VoskAPI:PrintDiagnostics():online-ivector-feature.cc:369) By the end of the utterance, objf change/frame from estimating iVector (vs. default) was 35.3905 and iVector length was 11.8675
{
  "text" : "what if somebody decides to break it be careful that you keep adequate coverage but look for places to save money maybe it's taking longer to get things squared away than the bankers expected hiring the wife for one company may when her tax say that retirement income to boost is helpful but inadequate new self the saving rags or hurriedly tossed on the to naked bones what a discussion can ensue when the title of this type of song is in question there is no dying or waxing or gassing need a paperweight may be person last known back barclays leather hard place work on a flat surface and smooth out the simplest kind of separate system uses a single self contained unit the old shop added still holds a good mechanic is usually a bad boss so figures would go higher in later years some make beautiful chairs cabinets chest doll houses etc"
}
Time Taken to execute = 4.747618900000118 seconds
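The last two parts of the output above are a JSON object with a "text" field and an elapsed-time line. As a rough sketch of how a script might handle both, the example below parses such a JSON string and times a piece of work with performance.now() from Node's perf_hooks module. This is an assumption about one reasonable approach, not the repository's actual code; parseTranscript and timeInSeconds are hypothetical names, and the sample string stands in for a real recognizer result.

```typescript
import { performance } from "node:perf_hooks";

// Shape of the JSON result shown in the output above.
interface TranscriptResult {
  text: string;
}

// Parse a JSON result string and return the transcript text.
// Throws if the input is not an object with a string "text" field.
function parseTranscript(json: string): string {
  const parsed: unknown = JSON.parse(json);
  if (
    typeof parsed !== "object" ||
    parsed === null ||
    typeof (parsed as TranscriptResult).text !== "string"
  ) {
    throw new Error('Expected a JSON object with a string "text" field');
  }
  return (parsed as TranscriptResult).text;
}

// Run a function and report how long it took, in seconds.
function timeInSeconds<T>(work: () => T): { result: T; seconds: number } {
  const start = performance.now(); // high-resolution timestamp in milliseconds
  const result = work();
  const seconds = (performance.now() - start) / 1000;
  return { result, seconds };
}

// Usage with a stand-in result string (not real recognizer output).
const sampleJson = '{ "text" : "what if somebody decides to break it" }';
const demo = timeInSeconds(() => parseTranscript(sampleJson));
console.log(demo.result);
console.log(`Time Taken to execute = ${demo.seconds} seconds`);
```

In a real run, the timed function would wrap the model loading and recognition calls rather than the JSON parsing shown here.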