doc:appunti:linux:video:subtitleripper
Differences
This shows you the differences between two versions of the page.
doc:appunti:linux:video:subtitleripper [2024/02/01 11:13] – [OCR the images from the .sub file] niccolo | doc:appunti:linux:video:subtitleripper [2024/02/01 11:56] (current) – [How to rip DVD subtitles with vobsub2srt] niccolo | ||
---|---|---|---|
Line 9: | Line 9: | ||
* **lsdvd** - From the official Debian repository. | * **lsdvd** - From the official Debian repository. | ||
* **vobcopy** - From the official Debian repository. | * **vobcopy** - From the official Debian repository. | ||
+ | * **mediainfo** - From the official Debian repository. | ||
* **mkvtoolnix** - From the official Debian repository. | * **mkvtoolnix** - From the official Debian repository. | ||
* **vobsub2srt** - From the Deb Multimedia repository. | * **vobsub2srt** - From the Deb Multimedia repository. | ||
===== Ripping the .vob from the DVD ===== | ===== Ripping the .vob from the DVD ===== | ||
+ | |||
+ | A DVD can contain several **titles**, and you should identify which one you want to rip; generally it is the longest one or the one with the most chapters. We check the DVD content using the **lsdvd** tool: | ||
+ | |||
+ | < | ||
+ | lsdvd /dev/sr0 | ||
+ | Disc Title: DVD_TITLE | ||
+ | Title: 01, Length: 01: | ||
+ | Title: 02, Length: 00: | ||
+ | Title: 03, Length: 00: | ||
+ | Title: 04, Length: 00: | ||
+ | Title: 05, Length: 00: | ||
+ | Title: 06, Length: 00: | ||
+ | Longest track: 01 | ||
+ | </ | ||
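The title number can also be read from the ''Longest track:'' line by a script instead of by eye. This is a minimal sketch, assuming the output format shown in the listing above; the function name ''longest_title'' is hypothetical and not part of lsdvd.

```bash
#!/bin/bash
# Hypothetical helper: read lsdvd output on stdin and print the number
# reported on the "Longest track:" line (format assumed from the listing above).
longest_title() {
    awk -F': *' '/^Longest track:/ {
        sub(/[[:space:]]+$/, "", $2)   # drop any trailing whitespace
        print $2
        exit
    }'
}

# Usage on a real disc (device path as in the example above):
#   title=$(lsdvd /dev/sr0 | longest_title)
```

The value keeps its leading zero (for example ''01''), so strip it if the next tool expects a plain integer.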
+ | |||
+ | The longest title is **#1**, so we extract it using **vobcopy**: | ||
<code bash> | <code bash> | ||
vobcopy -n ' | vobcopy -n ' | ||
</ | </ | ||
+ | |||
+ | The resulting file will be saved into the working directory (as specified by the **%%-o%%** option) and it will be named after the DVD title, something like **DVD_TITLE.vob**. | ||
+ | |||
+ | You can inspect the content of the file using the **mediainfo** tool; in our case the file contains one video stream, two audio streams and three subtitle streams. The subtitles are in the standard DVD format, VobSub, which is an image (bitmap) format, not text. | ||
+ | |||
===== Converting the .vob into .mkv format ===== | ===== Converting the .vob into .mkv format ===== | ||
Line 22: | Line 44: | ||
As far as I know, there is no tool capable of extracting the VobSub subtitles directly from the vob file; we might hope that **ffmpeg** could do this, but it seems it cannot. | As far as I know, there is no tool capable of extracting the VobSub subtitles directly from the vob file; we might hope that **ffmpeg** could do this, but it seems it cannot. | ||
- | Fortunately the **mkvextract** can extract the VobSub stream from a //mkv// file, so we first use ffmpeg to convert the //vob// into //mkv//. In the following example all the streams are copied, without re-encoding. At this step you may want to re-encode the video to squeeze the MPEG2 stream into the more efficient H264 format. | + | Fortunately the **mkvextract** |
<code bash> | <code bash> | ||
Line 37: | Line 59: | ||
===== Extracting .sub and .idx files from the .vob ===== | ===== Extracting .sub and .idx files from the .vob ===== | ||
+ | |||
+ | From the //mkv// file it is now possible to create **two files** (.sub and .idx) for each subtitle stream. The stream numbering expected by '' | ||
<code bash> | <code bash> | ||
mkvextract ' | mkvextract ' | ||
</ | </ | ||
+ | |||
+ | The result will be two files: **subtitles-3.sub** and **subtitles-3.idx**. It is possible to repeat the command to extract the other subtitles (**#4** and **#5** in our example). | ||
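Rather than repeating the command by hand, the three subtitle tracks can be requested in a single run. This is a sketch under stated assumptions: the file name ''DVD_TITLE.mkv'' and the helper ''track_specs'' are hypothetical, and the track IDs 3, 4 and 5 are taken from the example above. Recent mkvextract releases take the source file before the ''tracks'' mode; older ones may expect ''tracks'' first.

```bash
#!/bin/bash
# Hypothetical names: adjust the .mkv path and the track IDs to your own file.
MKV='DVD_TITLE.mkv'
TRACKS=(3 4 5)

# Print one "ID:output" extraction spec per line, e.g. 3:subtitles-3.sub
track_specs() {
    local id
    for id in "$@"; do
        printf '%s:subtitles-%s.sub\n' "$id" "$id"
    done
}

# Call mkvextract once with all specs; skipped when the tool or the file is missing.
if command -v mkvextract >/dev/null 2>&1 && [ -f "$MKV" ]; then
    mapfile -t specs < <(track_specs "${TRACKS[@]}")
    mkvextract "$MKV" tracks "${specs[@]}"
fi
```

Each spec should produce a .sub/.idx pair, as described for track 3.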
===== OCR the images from the .sub file ===== | ===== OCR the images from the .sub file ===== | ||
<code bash> | <code bash> | ||
- | vobsub2srt --ifo ' | + | vobsub2srt --ifo ' |
</ | </ | ||
The .IFO file is used to get the correct palette, width and height, but it is not mandatory. | The .IFO file is used to get the correct palette, width and height, but it is not mandatory. | ||
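Because the .IFO file is optional, a script can pass ''%%--ifo%%'' only when the file is actually present. This is a minimal sketch; the helper name ''ifo_args'' and the paths in the usage comment are placeholders, not taken from the page.

```bash
#!/bin/bash
# Hypothetical helper: print the "--ifo" option and its path (one per line)
# only when the given .IFO file exists, so the same command line works
# with or without it.
ifo_args() {
    if [ -f "$1" ]; then
        printf '%s\n' '--ifo' "$1"
    fi
}

# Usage sketch (paths are placeholders, adjust to your own files):
#   mapfile -t args < <(ifo_args '/path/to/VTS_01_0.IFO')
#   vobsub2srt "${args[@]}" subtitles-3
```

If the file is missing the array stays empty and vobsub2srt runs without the option.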
doc/appunti/linux/video/subtitleripper.1706782423.txt.gz · Last modified: 2024/02/01 11:13 by niccolo