Differences
This shows you the differences between two versions of the page.
doc:appunti:linux:video:ripping_dvds_with_mencoder [2017/10/12 10:20] – [Extracting the subtitles] niccolo | doc:appunti:linux:video:ripping_dvds_with_mencoder [2020/04/21 17:05] (current) – [OCRing] niccolo | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== Ripping DVDs with Mencoder ====== | ====== Ripping DVDs with Mencoder ====== | ||
+ | :!: For a simple recipe to rip (extract) the content of a DVD using Debian 10, see **[[vobcopy]]**. | ||
===== Install the necessary programs ===== | ===== Install the necessary programs ===== | ||
Line 200: | Line 201: | ||
===== Extract Subtitles with transcode ===== | ===== Extract Subtitles with transcode ===== | ||
+ | |||
+ | FIXME The following programs are **missing in Debian 10 Buster**: **tcextract**, | ||
DVDs have subtitles stored as images. There are some options for dealing with them: | DVDs have subtitles stored as images. There are some options for dealing with them: | ||
Line 264: | Line 267: | ||
< | < | ||
- | subtitle2pgm | + | cat subtitles_stream.ps1 | subtitle2pgm |
</ | </ | ||
- | Each subtitle should now be one pgm file, and a srtx file will be created | + | If you want to control how grey levels are converted, try to use the **%%-c%%** option of subtitle2pgm, |
- | Now to ocr all that with gocr (using a nice wrapper for the job): | + | Each subtitle should now be one file named like **movie_subtitle0003.pgm**, |
+ | |||
+ | === With Tesseract OCR === | ||
+ | |||
+ | <code bash> | ||
+ | #!/bin/sh | ||
+ | find . -type f -name ' | ||
+ | echo -n " | ||
+ | tesseract -l eng --psm 4 " | ||
+ | done | ||
+ | </ | ||
+ | |||
+ | === With Gocr === | ||
+ | |||
+ | **NOTICE**: Don't use the following, because Gocr is not the best tool for OCR. Use **Tesseract OCR** instead. | ||
+ | |||
+ | To OCR all the .pgm images with **gocr** (using a nice wrapper for the job): | ||
< | < | ||
- | pgm2txt | + | pgm2txt |
</ | </ | ||
It will prompt you for tons of characters that it doesn' | It will prompt you for tons of characters that it doesn' | ||
- | We will re-merge all these text files produced into a big subtitle file: | + | ==== Make a single .srt file ==== |
+ | |||
+ | Now we will merge all the text files produced into one big subtitle file: | ||
< | < | ||
- | srttool -s -w < english.srtx > english.srt | + | srttool -s -w < movie_subtitle.srtx > movie_subtitle.srt |
</ | </ | ||
Line 302: | Line 323: | ||
You can now add english.srt onto the end of your '' | You can now add english.srt onto the end of your '' | ||
+ | ==== Fixing time, etc ==== | ||
+ | |||
+ | Finally, you can proofread the final .srt file using the graphical interface of **Gaupol**, a full-featured subtitle editor. It can handle some of the most common operations required: | ||
+ | |||
+ | * **Shift times**, from //Tools//, //Shift Positions...// | ||
+ | * **Renumber subtitles**, | ||
===== Links ===== | ===== Links ===== | ||
doc/appunti/linux/video/ripping_dvds_with_mencoder.1507796420.txt.gz · Last modified: 2017/10/12 10:20 by niccolo