Subtitle Workflow: OCR or download & timeshift?

So I have a lot of movies & shows with image-based subtitles (e.g. PGS, VobSub). I’m interested in converting them to text-based formats (probably SRT and VTT).

Option 1: OCR
I discovered Subtitle Edit yesterday (thank you @OttoKerner for this excellent intro). I went through the process for one movie, did a lot of clean-up and finally got some great results. The great part is that, because the subtitles came from my own rip, the timings are perfect. The downside is that this was a rather significant manual effort for one movie, and I need to do this for hundreds.

Option 2: Download and Timeshift
I have tried Plex’s built-in ability to search for and download subtitles from external sources. I abandoned it because I have never found 3rd party subtitles where the timings aligned with my media. The further you get into the movie, the more misaligned the subtitles are.

I’ve heard of (and saw in Subtitle Edit yesterday) functionality that allows you to adjust timings on subtitles. I’m assuming that, while you can adjust every timing entry manually, you don’t have to: you can probably pick a beginning & ending anchor point and then stretch/shrink all the timings accordingly. Then, in theory, you get properly aligned subtitles and have bypassed the OCR process.
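
For anyone curious what that stretch/shrink amounts to, here is a minimal Python sketch of a two-anchor linear resync for SRT text. It is only an illustration of the idea, not how Subtitle Edit implements it. The function names (`to_ms`, `to_stamp`, `make_remap`, `resync_srt`) are made up for this example, and it assumes standard SRT timestamps of the form `HH:MM:SS,mmm`.

```python
import re

# SRT timestamps look like 01:02:03,456
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def to_ms(stamp: str) -> int:
    """Convert 'HH:MM:SS,mmm' to milliseconds."""
    h, m, s, ms = map(int, TIMESTAMP.fullmatch(stamp).groups())
    return ((h * 60 + m) * 60 + s) * 1000 + ms


def to_stamp(ms: float) -> str:
    """Convert milliseconds back to 'HH:MM:SS,mmm' (clamped at zero)."""
    ms = max(0, round(ms))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def make_remap(old_first: int, new_first: int, old_last: int, new_last: int):
    """Build a linear mapping from two anchor points (all values in ms).

    old_* are the times in the downloaded subtitle, new_* are the times
    the same lines should have in your own media.
    """
    scale = (new_last - new_first) / (old_last - old_first)
    return lambda t: new_first + (t - old_first) * scale


def resync_srt(text: str, remap) -> str:
    """Apply the mapping to every timing line, leaving dialogue untouched."""
    out = []
    for line in text.splitlines(keepends=True):
        if "-->" in line:  # only timing lines contain the arrow
            line = TIMESTAMP.sub(
                lambda m: to_stamp(remap(to_ms(m.group(0)))), line
            )
        out.append(line)
    return "".join(out)
```

To use it, you would look up the start time of an early line and a late line in the downloaded file, note where those same lines actually occur in your media, and pass the four values to `make_remap`. A single scale plus offset can only correct a constant shift and a steady drift; it will not help if the download was made for a different cut.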

Which brings me to my question: before I embark on this journey, have any of you gone down these paths, and what recommendations would you have?

It seems like more clients are now supporting direct playback of those image-based subtitles… raising the question of why to go through the effort :wink:

Totally hear you. Part of my motivation is that some of the subtitle images are really crappy (from older DVDs). So if there’s a low-effort, good-results path to be taken, I’d love to know. But if the only options folks are aware of require a lot of time and manual effort per movie, I’m happy to live with crappy subtitle images.

Based on my experience, crappy subtitles take more effort to clean up after OCR’ing. That being said… my experience goes back a few years and might be a bit outdated. I still consider OCR’ing your own subtitles to give you better quality than manually tinkering with the timings of downloaded subtitles (unless you’re e.g. from the US looking for US/English subtitles).

Yeah, that is possible in Subtitle Edit. I use it all the time, exactly like you describe it.
You can save some time on OCR and on editing the results if you just OCR the first and last 10 lines of a graphical subtitle.
Then use “Point Sync via another subtitle” to sync a downloaded subtitle to that. It usually works very well (unless you downloaded a subtitle that was made for a different “cut” of the movie).

There are some finer details when doing this.
If you try to sync a different language or a different translation, the results are not always perfect. Sometimes one subtitle combines several lines into one line. So you need to dig in a little bit more and find a few shorter lines that are relatively isolated, so you can be certain that both subtitles are in sync around those lines.
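
As a rough illustration of what “shorter, relatively isolated lines” could mean in practice, here is a small Python sketch that filters a list of cues down to likely sync candidates. The `Cue` type, the function name and both thresholds (`max_chars`, `min_gap_ms`) are assumptions chosen for this example, not anything Subtitle Edit exposes; you would still confirm each candidate by eye.

```python
from typing import List, NamedTuple


class Cue(NamedTuple):
    start_ms: int
    end_ms: int
    text: str


def isolated_short_cues(
    cues: List[Cue], max_chars: int = 30, min_gap_ms: int = 3000
) -> List[Cue]:
    """Return single-line cues that are short and have silence on both sides.

    The first and last cue have no neighbour on one side, so that side
    is treated as quiet enough.
    """
    picks = []
    for i, cue in enumerate(cues):
        gap_before = cue.start_ms - cues[i - 1].end_ms if i > 0 else min_gap_ms
        gap_after = (
            cues[i + 1].start_ms - cue.end_ms if i + 1 < len(cues) else min_gap_ms
        )
        is_short = len(cue.text) <= max_chars and "\n" not in cue.text
        if is_short and gap_before >= min_gap_ms and gap_after >= min_gap_ms:
            picks.append(cue)
    return picks
```

Cues that pass this filter are less likely to have been merged with a neighbour in the other translation, which makes them safer sync points.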
