Group: AutomaticTranscription

From LibrePlanet

Revision as of 04:19, 7 October 2011

Automatic transcription is an FSF Priority Project. We need free software that is capable of transcribing recordings. YouTube is starting to offer this service, but this is a kind of computing we should be doing on our systems with free software.

If you are interested, please join this group (start by leaving your name here). We should start by surveying any existing free software that is in or close to this area, and making a list of features that are needed.

[UniversalSubtitles.org] is not automated, but provides a great user interface for creating transcriptions and subtitles for online video (and audio). It is ideal for public content. It is a collaborative platform (one could call it a "wiki with a UI dedicated to subtitling").

CMU Sphinx (http://cmusphinx.sourceforge.net) might be usable for transcription. If I (cwebber) remember correctly, there was some effort to look into video transcription for PyCon using CMU Sphinx, but it didn't become production-ready.

Seems like mostly a "GUI problem"; shouldn't it be possible to clone universalsubtitles and run Sphinx on the audio track of whatever video is uploaded to create a first-pass transcription? After that, the main problem is creating LMs (language models) for new languages, but that's an upstream issue …
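
A rough sketch of what that first-pass pipeline might look like, in case it helps the discussion. Everything here is an assumption rather than anything universalsubtitles or Sphinx actually ships: the function names are made up for illustration, SubRip (.srt) is just one possible subtitle format, and the recognizer is left as a pluggable callable (hypothetical) so that Sphinx or any other free engine could be dropped in. The only external tool invoked is `ffmpeg`, used to pull a 16 kHz mono WAV out of the uploaded video, on the assumption that this is the input the recognizer's acoustic model expects.

```python
import subprocess
from typing import Callable, Iterable, List, Tuple

# One recognized chunk of speech: (start seconds, end seconds, text).
Segment = Tuple[float, float, str]


def ffmpeg_extract_command(video_path: str, wav_path: str) -> List[str]:
    """Build an ffmpeg command that writes the audio track as 16 kHz mono WAV."""
    return [
        "ffmpeg", "-y",      # overwrite the output file if it exists
        "-i", video_path,    # uploaded video
        "-vn",               # drop the video stream
        "-ac", "1",          # mono
        "-ar", "16000",      # 16 kHz sample rate (assumed recognizer input)
        wav_path,
    ]


def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: Iterable[Segment]) -> str:
    """Turn recognized segments into SRT text that humans can then correct."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)


def first_pass_transcription(
    video_path: str,
    wav_path: str,
    recognize: Callable[[str], Iterable[Segment]],
) -> str:
    """Extract the audio, run the supplied recognizer, return draft subtitles.

    `recognize` is a hypothetical hook: it takes the WAV path and yields
    segments. Wiring it to Sphinx is left open here.
    """
    subprocess.run(ffmpeg_extract_command(video_path, wav_path), check=True)
    return to_srt(recognize(wav_path))
```

The draft subtitles would then be loaded into the collaborative editor for people to fix, which keeps the recognizer's job limited to producing a starting point.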

I think simon (http://simon-listens.org) might be worth a look when considering speech recognition solutions. It may focus on command and control right now, but the underlying technology should allow for dictation with the right speech model. In that regard, I'd also like to point to Voxforge (http://voxforge.org). (Disclaimer: I am a developer of simon)
