doc:appunti:linux:video:ripping_dvds_with_mencoder
Differences
This shows you the differences between two versions of the page.
| doc:appunti:linux:video:ripping_dvds_with_mencoder [2017/10/12 23:34] – [OCRing] niccolo | doc:appunti:linux:video:ripping_dvds_with_mencoder [2020/04/21 17:05] (current) – [OCRing] niccolo | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| ====== Ripping DVDs with Mencoder ====== | ====== Ripping DVDs with Mencoder ====== | ||
| + | :!: For a simple recipe to rip (extract) the content of a DVD using Debian 10, see **[[vobcopy]]**. | ||
| ===== Install the necessary programs ===== | ===== Install the necessary programs ===== | ||
| Line 200: | Line 201: | ||
| ===== Extract Subtitles with transcode ===== | ===== Extract Subtitles with transcode ===== | ||
| + | |||
| + | FIXME The following programs are **missing in Debian 10 Buster**: **tcextract**, | ||
| DVDs have subtitles stored as images. There are some options for dealing with them: | DVDs have subtitles stored as images. There are some options for dealing with them: | ||
| Line 264: | Line 267: | ||
| < | < | ||
| - | cat subtitles_stream.ps1 | subtitle2pgm | + | cat subtitles_stream.ps1 | subtitle2pgm |
| </ | </ | ||
| + | |||
| + | If you want to control how grey levels are converted, try to use the **%%-c%%** option of subtitle2pgm, | ||
| Each subtitle should now be one file named like **movie_subtitle0003.pgm**, | Each subtitle should now be one file named like **movie_subtitle0003.pgm**, | ||
| - | Now to ocr all that with gocr (using a nice wrapper for the job): | + | === With Tesseract OCR === |
| + | |||
| + | <code bash> | ||
| + | #!/bin/sh | ||
| + | find . -type f -name ' | ||
| + | echo -n " | ||
| + | tesseract -l eng --psm 4 " | ||
| + | done | ||
| + | </ | ||
| + | |||
| + | === With Gocr === | ||
| + | |||
| + | **NOTICE**: Don't use the following, because Gocr is not the best tool for OCR. Use **Tesseract OCR** instead. | ||
| + | |||
| + | To OCR all the .pgm images with **gocr** (using a nice wrapper for the job): | ||
| < | < | ||
| Line 277: | Line 296: | ||
| It will prompt you for tons of characters that it doesn' | It will prompt you for tons of characters that it doesn' | ||
| - | We will re-merge all these text files produced into a big subtitle file: | + | ==== Make a single .srt file ==== |
| + | |||
| + | Now we will merge all the text files produced into one big subtitle file: | ||
| < | < | ||
| Line 302: | Line 323: | ||
| You can now add english.srt onto the end of your '' | You can now add english.srt onto the end of your '' | ||
| + | ==== Fixing time, etc ==== | ||
| + | |||
| + | Finally, you can proofread the final .srt file in the graphical interface of **Gaupol**, a full-featured subtitle editor. It can handle some of the most common operations required: | ||
| + | |||
| + | * **Shift times**, from //Tools//, //Shift Positions...// | ||
| + | * **Renumber subtitles**, | ||
| ===== Links ===== | ===== Links ===== | ||
doc/appunti/linux/video/ripping_dvds_with_mencoder.1507844059.txt.gz · Last modified: by niccolo
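
The Tesseract step added in this revision appears only partially in the diff, so here is a minimal sketch of what such a loop might look like. It is not the page author's script. The helper name `ocr_subtitles` is hypothetical, and the sketch assumes the subtitle images are the `.pgm` files produced by subtitle2pgm (named like `movie_subtitle0003.pgm`) and that English language data for Tesseract is installed.

```shell
#!/bin/sh
# Sketch only: OCR every .pgm subtitle image below a directory with Tesseract.
# "ocr_subtitles" is a hypothetical helper name, not part of any package.
ocr_subtitles() {
    dir="${1:-.}"
    find "$dir" -type f -name '*.pgm' | sort | while IFS= read -r img; do
        # Tesseract appends ".txt" to the output base name by itself.
        base="${img%.pgm}"
        printf 'OCR %s\n' "$img"
        # -l eng   : use the English language data (assumed to be installed)
        # --psm 4  : page segmentation mode 4, a single column of text
        tesseract "$img" "$base" -l eng --psm 4 >/dev/null 2>&1 ||
            printf 'failed: %s\n' "$img" >&2
    done
}
```

Calling `ocr_subtitles .` in the directory that holds the images should leave one `.txt` file next to each `.pgm` file. Any image Tesseract cannot process is reported on standard error.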
