doc:appunti:linux:video:vobcopy
Differences
This shows you the differences between two versions of the page.
doc:appunti:linux:video:vobcopy [2021/02/16 16:04] – [How to rip subtitles from the DVD] niccolo | doc:appunti:linux:video:vobcopy [2021/02/19 17:13] (current) – [Converting a DVD with subtitles to MKV using ffmpeg] niccolo | ||
---|---|---|---|
Line 60: | Line 60: | ||
</ | </ | ||
- | FIXME The .IFO file is required to know the palette to apply to the bitmaps. | + | The .IFO file is required to know the palette to apply to the bitmaps. |
- | The following command will do the OCR on each subtitle image using **tesseract** (it requires several minutes to run): | + | The following command, working on the two files **vobsubs-it.idx** and **vobsubs-it.sub**, |
< | < | ||
vobsub2srt \ | vobsub2srt \ | ||
--ifo / | --ifo / | ||
- | --index 0 \ | ||
--dump-images \ | --dump-images \ | ||
--tesseract-lang ita \ | --tesseract-lang ita \ | ||
Line 74: | Line 73: | ||
The result will be a **vobsubs-it.srt** text file, containing the subtitles text and timing information. If you want to keep **one pgm image** file for each subtitle, add the **%%--dump-images%%** option. | The result will be a **vobsubs-it.srt** text file, containing the subtitles text and timing information. If you want to keep **one pgm image** file for each subtitle, add the **%%--dump-images%%** option. | ||
+ | |||
+ | ===== Converting a DVD with subtitles to MKV using ffmpeg ===== | ||
+ | |||
+ | I had a rather complicated DVD to rip; the main problems were: |
+ | |
+ | * Subtitles are in **dvdsub** format (normal for a DVD), which needs **palette** info to be displayed correctly. |
+ | * Different subtitle streams **start at different times**; some start only **after several minutes**. The automatic detection performed by '' |
+ | * **Subtitle languages** are not automatically detected. |
+ | |||
+ | === Inspect the disc === |
+ | |
+ | Using lsdvd directly on the DVD disc, you can see the available **video** tracks, **audio** streams and **subtitles**: |
+ | |||
+ | < | ||
+ | lsdvd -s /dev/dvd | ||
+ | Disc Title: FREEDOMDOWNTIME | ||
+ | Title: 01, Length: 02: | ||
+ | Subtitle: 01, Language: da - Dansk, Content: Undefined, Stream id: 0x20, | ||
+ | Subtitle: 02, Language: de - Deutsch, Content: Undefined, Stream id: 0x21, | ||
+ | ... | ||
+ | Title: 02, Length: 01: | ||
+ | Title: 03, Length: 00: | ||
+ | Title: 04, Length: 00: | ||
+ | </ | ||
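Since the subtitle languages are not detected automatically later on, it can help to pull the stream ids and language codes out of the lsdvd listing. The following is only a minimal sketch: the function name `list_subtitle_streams` is hypothetical, and it assumes the subtitle lines look exactly like the ones shown above (other lsdvd versions may print them differently).

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of lsdvd): read lsdvd output from stdin and
# print "stream-id language-code" for every subtitle line it recognizes.
# Assumes lines of the form:
#   Subtitle: 01, Language: da - Dansk, Content: Undefined, Stream id: 0x20,
list_subtitle_streams() {
    sed -n 's/^.*Subtitle: *[0-9][0-9]*, Language: \([a-z][a-z]*\) .*Stream id: \(0x[0-9a-fA-F][0-9a-fA-F]*\).*$/\2 \1/p'
}
```

It could be used as `lsdvd -s /dev/dvd | list_subtitle_streams`, which prints one line per subtitle stream and silently skips every other line of the listing.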
+ | |||
+ | === Rip the track === | ||
+ | |||
+ | First of all I **ripped the first track** (the only one I'm really interested in) from the DVD into a directory: | ||
+ | |||
+ | < | ||
+ | vobcopy -n 1 -i /dev/dvd --large-file -o ./track1/ | ||
+ | </ | ||
+ | |||
+ | Using the **mediainfo** tool you can inspect the resulting vob file to verify that **video**, **audio** and **text** (subtitles) streams are the ones we expect. | ||
+ | |||
+ | === Get subtitles palette info === | ||
+ | |||
+ | Then I extracted the first (#0) **dvdsub stream** (there are 22!) from the DVD: | ||
+ | |||
+ | < | ||
+ | mencoder -dvd-device /dev/dvd dvd:// | ||
+ | -nosound \ | ||
+ | -ovc ' | ||
+ | -ifo / | ||
+ | -sid 0 -vobsubout vobsubs-sid0 | ||
+ | </ | ||
+ | |||
+ | This command will produce two files: **vobsubs-sid0.idx** | ||
+ | |||
+ | < | ||
+ | palette: d7410d, 101010, 0e00d7, d5ccc9, d4b1cb, aac5d0, abd3af, d5ff0c, | ||
+ | | ||
+ | </ | ||
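The palette line can also be extracted from the .idx file with a small shell function instead of copying it by hand. This is a rough sketch under two assumptions: the function name `idx_palette` is made up for illustration, and the .idx file contains a single line starting with `palette:` as in the excerpt above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the palette stored in a VobSub .idx file as a
# comma-separated list of hex colors with no spaces.
# Assumes the file has a line of the form "palette: d7410d, 101010, ...".
idx_palette() {
    local idx_file=$1
    # Keep only the first "palette:" line, drop the key, then remove
    # blanks and any carriage return left over from DOS line endings.
    sed -n 's/^palette:[[:space:]]*//p' "$idx_file" | head -n 1 | tr -d ' \t\r'
}
```

For example `idx_palette vobsubs-sid0.idx` prints the colors on one line, ready to be pasted into a command line; whether the spaces really need to be removed depends on the tool that consumes the value.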
+ | |||
+ | As an alternative you can get the **.IFO** of the track (for the first track it is **VIDEO_TS/ |
+ | |||
+ | **lsdvd** should also be able to print the palette with the **-P** option, but in my tests it produced a palette with different color values, which displayed incorrectly in the final result. |
+ | |||
+ | === Transcode with ffmpeg === | ||
+ | |||
+ | Finally I launched the **ffmpeg** incantation: | ||
+ | |||
+ | <code bash> | ||
+ | ffmpeg -probesize 500M -analyzeduration 500M \ | ||
+ | -palette ' | ||
+ | -i ' | ||
+ | -map ' | ||
+ | -map ' | ||
+ | -map ' | ||
+ | -map ' | ||
+ | -map ' | ||
+ | -map ' | ||
+ | -map ' | ||
+ | -map ' | ||
+ | -metadata: | ||
+ | -metadata: | ||
+ | -metadata title=' | ||
+ | -metadata: | ||
+ | -metadata: | ||
+ | -metadata: | ||
+ | -metadata: | ||
+ | -metadata: | ||
+ | -metadata: | ||
+ | -metadata: | ||
+ | -metadata: | ||
+ | -metadata: | ||
+ | -metadata: | ||
+ | -metadata: | ||
+ | -metadata: | ||
+ | -metadata: | ||
+ | -metadata: | ||
+ | -metadata: | ||
+ | -metadata: | ||
+ | -metadata: | ||
+ | -metadata: | ||
+ | -metadata: | ||
+ | -metadata: | ||
+ | -metadata: | ||
+ | -metadata: | ||
+ | -metadata: | ||
+ | -metadata: | ||
+ | -codec:s ' | ||
+ | -vf yadif \ | ||
+ | -codec:v ' | ||
+ | -b:v ' | ||
+ | -ac 2 -codec:a ' | ||
+ | ' | ||
+ | </ | ||
+ | |||
+ | Without the **%%-probesize%%** and **%%-analyzeduration%%** options (both are required), '' | ||
+ | |||
+ | < | ||
+ | Stream map ' | ||
+ | </ | ||
+ | |||
+ | If you don't explicitly map the streams, you will get only a warning message during the transcode: | ||
+ | |||
+ | < | ||
+ | New subtitle stream 0:27 at pos:8284174 and DTS: | ||
+ | </ | ||
+ | |||
+ | I mapped (i.e. selected for inclusion in the output) the **video track**, then **two audio tracks** (there were four), and finally **24 text subtitle tracks** (they are actually bitmaps in dvdsub format). The order of the **%%-map%%** options is used to re-arrange the position of the subtitles, overriding the autodetect performed by '' |
+ | |||
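Typing one language tag per subtitle stream by hand is tedious, so the per-stream metadata options can be generated from a list of language codes. This is a sketch only: the function name `subtitle_metadata_args` and the codes `dan` and `ita` in the usage note are illustrative and are not taken from the command above. It relies on the ffmpeg option form `-metadata:s:s:N language=CODE`, where N is the index of the subtitle stream in the output file, so the codes must be given in the same order as the subtitle `-map` options.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print one ffmpeg "-metadata:s:s:N language=CODE"
# option per argument, numbering the output subtitle streams from 0.
# The language codes are passed in the order the subtitles are mapped.
subtitle_metadata_args() {
    local index=0 lang
    for lang in "$@"; do
        printf -- '-metadata:s:s:%d language=%s\n' "$index" "$lang"
        index=$((index + 1))
    done
}
```

Calling `subtitle_metadata_args dan ita` prints two options, one for output subtitle stream 0 and one for stream 1, which can then be pasted into the ffmpeg command line.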
+ | It is mandatory to use the **%%-codec: | ||
+ | |||
+ | Yes, the source video has annoying **interlacing artifacts**, | ||
+ |
doc/appunti/linux/video/vobcopy.1613487843.txt.gz · Last modified: 2021/02/16 16:04 by niccolo