You are viewing a single comment's thread from:

RE: Automating Multi-Lingual and Multi-Speaker Closed-Captioning and Transcripting Workflow with srt2vtt

in #beyondbitcoin • 8 years ago

Well, it just makes no sense to pre-filter when Speech-to-Text could show you where the actual words fall in time, letting you filter the noise out accordingly. That's what you can see in the YouTube web app, but unfortunately it doesn't let you edit the audio. If any multitrack editor could show the words alongside the audio like the YT web app does, that would be excellent.


Not sure I understand this, but exporting a YouTube caption via WebVTT gives me the individual timings of each word shown in the auto-captioning. The multitrack editing (cutting across all tracks) phase would take place BEFORE any captioning takes place, so you would have already removed any of those undesirable "artifacts".
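For reference, the per-word timings in a YouTube WebVTT export show up as inline timestamp tags inside each cue, roughly like this (times and words here are made up, just to illustrate the shape):

```
00:00:01.000 --> 00:00:04.000
the<00:00:01.300><c> actual</c><00:00:01.900><c> words</c>
```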

Interestingly, though, you might be able to do it that way too, in reverse: get the combined transcript and convert it into Audacity labels: http://wiki.audacityteam.org/wiki/Movie_subtitles_(*.SRT)
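A minimal sketch of that conversion in Python (this is not part of srt2vtt; the file names are placeholders). The Audacity label format is just tab-separated start/end times in seconds plus the label text, one label per line:

```python
import re

def srt_time_to_seconds(t):
    # "HH:MM:SS,mmm" -> float seconds
    h, m, rest = t.split(":")
    s, ms = rest.split(",")
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000.0

def srt_to_audacity_labels(srt_path, labels_path):
    # An Audacity label line is: start<TAB>end<TAB>text (times in seconds)
    cue = re.compile(
        r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})\s*\n(.*?)(?:\n\s*\n|\Z)",
        re.S,
    )
    with open(srt_path, encoding="utf-8") as f:
        srt = f.read()
    with open(labels_path, "w", encoding="utf-8") as out:
        for start, end, text in cue.findall(srt):
            label = " ".join(text.strip().splitlines())  # collapse multi-line cues
            out.write(f"{srt_time_to_seconds(start)}\t"
                      f"{srt_time_to_seconds(end)}\t{label}\n")

srt_to_audacity_labels("transcript.srt", "transcript.txt")
```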

Even if not all the artifacts are captured in the transcript, it might still make it easier to find some of them to cut out.

btw, I also added a feature to srt2vtt that converts caption files to the Audacity label format (sample output shown after the usage line):

  • srt2vtt audacity transcript.{srt|vtt}
    outputs Audacity-compatible text labels from a captions file
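For illustration, the output is plain tab-separated labels, something like this (times and text are made up):

```
12.345	15.678	Welcome everyone to the hangout
15.678	19.012	Today we're talking about the workflow
```

A file like that should load straight into Audacity via File > Import > Labels.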