Sphinx Decoders

Mosur K. Ravishankar (aka Ravi Mosur)
Sphinx Speech Group
School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213

What's Around and Where

Name and Remarks
Sphinx-3.3 (fast decoder)
  • Location: Open source, release available.
  • Fast Sphinx-3 decoder using lextree organization:
    • Runs at 5-10x real time (about 5-10 seconds of processing per second of speech) on large vocabulary tasks
    • Continuous density acoustic models only
    • Batch-mode or live operation
  • gausubvq: Sub-vector clustered acoustic model building
    • Needed for fast acoustic model evaluation
Sphinx-3.2
  • Location: Open source, module archive_s3/s3.2 in cvs tree.
  • Same features as Sphinx-3.3, but supports batch-mode operation only.
Sphinx-3 (slow decoder)
  • Location: Open source, module archive_s3/s3 in cvs tree.
  • Original Sphinx-3 decoder
  • Slow: runs at 50-100x real time (about 50-100 seconds of processing per second of speech) on large vocabulary tasks
  • Any kind of acoustic model (discrete, semi-continuous, continuous, others)
  • Major applications:
    • s3decode and s3decode-anytopo: Speech-to-text Decoding
    • s3align: Forced alignment
    • s3allphone: Allphone decoding
    • s3astar: A* search, N-best generation
    • s3dag: Shortest-path search
  • Other utilities:
    • stseg-read: State-segmentation binary file reader
    • sen2s2: Sphinx-II "sendump" file creation from Sphinx-3 acoustic model
Sphinx-2 (fbs8)
  • Location: Open source, release available.
  • Sphinx-II decoder
  • Real-time operation
  • Semi-continuous acoustic models only (Sphinx-II format)
  • Support for user applications:
    • Compiled into a library with a straightforward API for building speech-enabled applications
    • Continuous-listening support
    • Dynamic language model loading and switching
  • Several test applications:
    • Basic dictation with and without "push-to-talk"
    • Basic audio recording and playback
    • Audio segmentation using the continuous listener
  • Additional recognition modes:
    • Forced alignment
    • Allphone decoding
    • A* search, N-best generation
    • Shortest-path search
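
The "x real time" speeds quoted above for the Sphinx-3 decoders can be turned into rough processing-time estimates. The sketch below assumes the usual convention that the real-time factor is processing time divided by audio duration, so larger values mean slower decoding. The function names are illustrative only and are not part of any Sphinx release.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration (the 'x real time' figure)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def processing_time_range(audio_seconds: float, low_factor: float, high_factor: float):
    """Return (min, max) processing time in seconds for a given real-time factor range."""
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    if low_factor > high_factor:
        raise ValueError("low_factor must not exceed high_factor")
    return audio_seconds * low_factor, audio_seconds * high_factor


if __name__ == "__main__":
    utterance = 60.0  # hypothetical one-minute recording

    # Ranges taken from the list above: 5-10x (fast decoder), 50-100x (slow decoder).
    fast_lo, fast_hi = processing_time_range(utterance, 5, 10)
    slow_lo, slow_hi = processing_time_range(utterance, 50, 100)

    print(f"Fast decoder: {fast_lo:.0f}-{fast_hi:.0f} s of processing")
    print(f"Slow decoder: {slow_lo:.0f}-{slow_hi:.0f} s of processing")
```

Under this reading, a decoder described as "real-time" (as Sphinx-2 is above) would have a factor of about 1 or less.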
Maintained by Evandro B. Gouvêa
Last modified: Mon Nov 25 18:25:40 EST 2002