Differences

This shows you the differences between two versions of the page.

--- doc:appunti:linux:video:ffmpeg [2018/07/27 07:58] – [Rotate and Normalize] niccolo
+++ doc:appunti:linux:video:ffmpeg [2024/07/25 08:12] (current) – [AVC (x264) is better than ASP (xvid4)] niccolo
@@ Line 33: / Line 33: @@
 Other available muxer brands are: **mp41**, **mp42**, **isom**, **iso2**, ... (discovered by inspecting the binary file with ''strings''; are there alternatives?).
-===== Normalize (Resize and Re-encode) =====
+===== Mixing videos from different sources =====
+Sometimes we needed to mix video clips originating from different sources. To apply our simple cut-and-paste recipe (which preserves the original video quality, without re-encoding), we normalized all the "different" video clips to the same format as the one produced by the Xiaomi Yi camera.
+One annoying effect of videos created by joining clips with different encoding parameters is seen in **mplayer**: if you switch playback to **full screen**, the video automatically jumps back to **non-full screen** at each such clip join.
+==== Normalize (Resize and Re-encode) ====
 If we want to mix videos from **different sources** (e.g. a smartphone and the Xiaomi Yi camera), we first convert all the clips into **the same format**.
@@ Line 65: / Line 70: @@
 ^ Frame rate                 | 29.970 (30000/1001) FPS     | -r 30000/1001                    |
-===== Rotate and Normalize =====
+==== Rotate ====
-Sometimes it is necessary to **rotate a video by 180 degrees** (e.g. a video made with a smartphone, using the wrong orientation). We can apply the same normalize recipe adding the **transpose** filter to the **%%-vf%%** option.
+Sometimes it is necessary to **rotate a video by 180 degrees** (e.g. a video made with a smartphone held in the wrong orientation). The **video rotation metadata must be removed** beforehand, because ffmpeg does not seem able to remove it and apply the video filter in a single pass.
-It is necessary to remove the video rotation metadata beforehand, because ffmpeg does not seem able to remove them and apply the video filter in a single pass.
 <code>
@@ Line 75: / Line 78: @@
 </code>
-FIXME Add the complete recipe! Explain the transpose=1 filter.
+Then the ffmpeg **transpose** video filter is required: to rotate a video by 180 degrees we need to apply ''transpose=1'' twice:
+<code>
+ffmpeg -i tmp_video.mp4 -vf transpose=1,transpose=1 rotated_video.mp4
+</code>
+You can apply the transpose and the normalization (if required, see the above paragraph) in a single pass: just add the transpose operation to the whole normalization recipe above.
+**INFO**: The number accepted by the transpose filter means:
+  * 0 = Rotate 90 deg. counterclockwise and do a vertical flip (default)
+  * 1 = Rotate 90 deg. clockwise
+  * 2 = Rotate 90 deg. counterclockwise
+  * 3 = Rotate 90 deg. clockwise and do a vertical flip
+A 90-degree rotation swaps width and height, so the aspect ratio changes and a more complex recipe is needed, such as the one described on the page: **[[fix_smartphone_portrait_videos]]**.
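+As a rough sketch of how the values listed above combine, the following shell helper maps a clockwise rotation angle to a ''-vf'' filter string. The function name ''rotation_filter'' is hypothetical (it is not part of ffmpeg); it only builds the string and does not address the aspect ratio issue of the 90-degree cases.

```shell
# Hypothetical helper: print the transpose filter chain for a clockwise rotation.
rotation_filter() {
    case "$1" in
        90)  echo "transpose=1" ;;              # 90 deg. clockwise
        180) echo "transpose=1,transpose=1" ;;  # two 90 deg. clockwise steps
        270) echo "transpose=2" ;;              # 90 deg. counterclockwise
        *)   echo "unsupported angle: $1" >&2; return 1 ;;
    esac
}

# Example usage (file names are placeholders):
#   ffmpeg -i tmp_video.mp4 -vf "$(rotation_filter 180)" rotated_video.mp4
```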
+==== Add an Audio Track ====
+The **time-lapse videos** taken by the Xiaomi Yi Action camera **do not have an audio track**. This causes problems when cutting and concatenating clips with the recipes presented below: playback (with **mplayer**) **freezes for several seconds** at the joining point between a clip with no audio and a clip with audio.
+So I add a silent audio track to these clips with **ffmpeg**:
+<code>
+ffmpeg -f lavfi -i anullsrc=sample_rate=48000 -i timelapse_video.mp4 \
+    -shortest -c:a libfdk_aac -b:a 128k -c:v copy timelapse_video_silence.mp4
+</code>
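+To decide which clips need the silent track, one possible approach is to ask ''ffprobe'' whether a file contains any audio stream. This is only a sketch: the function name ''has_audio'' and the ''FFPROBE'' override variable are assumptions made for illustration.

```shell
# Assumption: FFPROBE can be overridden to point at a specific ffprobe binary.
FFPROBE="${FFPROBE:-ffprobe}"

# Hypothetical helper: succeed if the file has at least one audio stream.
has_audio() {
    # Print the index of each audio stream, one per line; empty if there are none.
    streams=$("$FFPROBE" -v error -select_streams a \
        -show_entries stream=index -of csv=p=0 "$1")
    [ -n "$streams" ]
}

# Example usage (file name is a placeholder):
#   has_audio timelapse_video.mp4 || echo "needs a silent audio track"
```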
 ===== Concatenate the Clips =====
@@ Line 236: / Line 264: @@
 </code>
-====== Final rendering (re-encoding) ======
-The video stream recorded by the Xiaomi Yi camera is **1920x1080 pixels** at a variable bitrate of **12.0 Mb/s**. Because we watch it on a simple TV set capable only of 1366x768 pixels, we re-encode it with the following settings:
+====== Re-encoding with tonal correction ======
-^ Video codec    | XViD4  |
+We had some video clips recorded with an **SJCAM SJ8 Pro** camera with **bad color balance and saturation** due to some bad tables [[..:..:hardware:sjcam-8pro-custom-firmware|loaded into the firmware]]. It is possible to re-encode all the video clips applying an equalization filter while keeping all the encoding parameters as similar as possible to the original ones.
-^ Video filter   | swresize, 1366x768, Bilinear  |
-^ Video bitrate  | Two Pass, average bitrate 6000 kb/s  |
+The video clips were **extracted from the original MP4 container** as **[[wp>MPEG transport stream|MPEG-TS]]** snippets containing only video (no audio). To re-encode each clip we used the following **ffmpeg** recipe:
-^ Audio codec    | Lame MP3  |
-^ Audio bitrate  | CBR 192  |
+<code bash>
+#!/bin/sh
+#
+# Re-encode video clips in MPEG transport stream (MPEG-TS) format applying
+# some saturation and gamma correction.
+#
+# saturation:           In range 0.0 to 3.0. The default value is "1".
+# gamma_{r|g|b}         In range 0.1 to 10.0. The default value is "1".
+INPUT="$1"
+OUTPUT="$INPUT.eq.ts"
+EQ_FILTER="eq=saturation=0.88:gamma_r=0.917:gamma_g=1.007:gamma_b=1.297"
+# Produces MPEG segments like the ones produced by the SJCAM SJ8Pro:
+ffmpeg -i "$INPUT" \
+    -vf "$EQ_FILTER" \
+    -codec:v libx264 \
+    -preset veryslow -profile:v main -level:v 4.2 -pix_fmt yuvj420p \
+    -x264-params 'vbv-maxrate=38000:vbv_bufsize=20000:nal-hrd=vbr:force-cfr=1:keyint=8:bframes=0:scenecut=-1:ref=1' \
+    -keyint_min 8 -brand avc1 -f 3gp \
+    -bsf:v h264_mp4toannexb -f mpegts \
+    "$OUTPUT"
+</code>
+The gamma correction for the three RGB channels was determined with the GIMP, using the //Colors// => //Levels// => //Pick the gray point for all channels// tool. Using MPEG-TS clips made it possible to assemble the final video by simply concatenating them.
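+If the correction values have to change for another batch of clips, the filter string can be built from parameters instead of being hard-coded. The helper below is a minimal sketch; the name ''eq_filter'' is hypothetical, while the ''eq'' option names are the ones already used in the script above.

```shell
# Hypothetical helper: build the "eq" filter string from a saturation value
# and the three per-channel gamma values (red, green, blue).
eq_filter() {
    printf 'eq=saturation=%s:gamma_r=%s:gamma_g=%s:gamma_b=%s\n' "$1" "$2" "$3" "$4"
}

# Example usage, reproducing the values of the script above:
#   EQ_FILTER="$(eq_filter 0.88 0.917 1.007 1.297)"
```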
+===== AVC (x264) is better than ASP (xvid4) =====
+See this page: **[[https://www.avidemux.org/admWiki/doku.php?id=general:common_myths|Common myths]]** to understand the differences between formats (standards) and codecs (pieces of software). Read also this simple page: **[[https://www.cyberlink.com/support/product-faq-content.do?id=1901|Difference between MPEG-4 AVC and MPEG-4 ASP]]**. See also the Wikipedia article about **[[wp>Advanced Video Coding]]**.
+  * **MPEG-4 Part 2 ASP** (Advanced Simple Profile)
+    * Supports only 16x16 block size.
+    * Implemented by the Xvid library/codec.
+  * **MPEG-4 Part 10 AVC** (Advanced Video Coding)
+    * Supports variable motion compensation block sizes.
+    * Implemented by the x264 library.
+If you want to tweak the x264 codec options, here are some hints on the meaning of the parameters:
+  * **Preset**: Use the **slow** preset (or a slower one) to achieve the best compression ratio, at the expense of a longer encoding time.
+  * **Tuning**: Use the **film** option for real life scenes.
+  * **Profile**: The **High** profile is OK for high-definition television (e.g. 1280×720, or Blu-ray disc); //High 10// adds support for 10 bits per sample, //High 444// for up to 14 bits per sample (both for [[wp>High dynamic range|HDR]] videos). Do not use //Main//, which is for standard-definition TV, e.g. 720×576.
+  * **IDC Level**: Leave this parameter set to **Auto**; setting it to higher values does not increase the video quality, but imposes higher requirements on the decoder. E.g. IDC level 3.1 is sufficient to decode a 1280×720@30 stream, while level 4 is required to decode a 1920×1080@30 stream (max 20 Mbps) (see [[wp>Advanced Video Coding]]).
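+On the ffmpeg command line the hints above could translate into something like the following sketch. The function name ''encode_x264'', the ''FFMPEG'' override variable and the CRF value are assumptions for illustration; no ''-level'' option is given, so the level is left to the encoder.

```shell
# Assumption: FFMPEG can be overridden to point at a specific ffmpeg binary.
FFMPEG="${FFMPEG:-ffmpeg}"

# Hypothetical helper: re-encode the video with libx264 (slow preset, film
# tuning, High profile) and copy the audio unchanged.
# The CRF value 20 is an arbitrary example, not a recommendation from this page.
encode_x264() {
    "$FFMPEG" -i "$1" \
        -codec:v libx264 -preset slow -tune film -profile:v high \
        -crf 20 -codec:a copy "$2"
}

# Example usage (file names are placeholders):
#   encode_x264 input-video.mkv output-video.mkv
```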
 ====== More on ffmpeg Command Line ======
@@ Line 336: / Line 405: @@
 </code>
-the CSV format can be controlled by several options, e.g. if you want to know the key for each filed use:
+the CSV format can be controlled by several options, e.g. if you want to print each field as a **key=val** pair, use:
 <code>
@@ Line 402: / Line 471: @@
 A test with a very shaky video (hand-held camera on a moving motorcycle) with **zoom=1** and **smoothing=30** reduced the footage to about 77% of the original in both dimensions. Of the original 1920 pixels of width, about 1480 were retained (the video is scaled anyway to keep the original size). Setting **zoom=0.5** recovered about 15 linear pixels (reduction to 78% of the original).
-====== Audio dubbing with Ardour ======
+===== Adding subtitles =====
-To dub a video by replacing or mixing the original audio with music, etc., you can use the Ardour program. These notes describe one possible workflow.
+If you have a subtitle file for a video, you can mux it in:
-===== Extracting the original audio track =====
+<code>
+ffmpeg -i movie.mkv -i movie.sub -c copy \
+    -sub_charenc UTF-8 -c:s srt -metadata:s:s:0 language=eng \
+    movie_sub.mkv
+</code>
-The original audio track will be used as a reference to align the music pieces to be matched with the video.
+====== Deinterlace ======
-**WARNING** about converting the file to WAV format: check with ''mediainfo'' that the duration is exactly that of the original video, to the second! It can happen that extracting with ''ffmpeg'' and then converting to WAV (e.g. with Audacity) produces a file a few seconds shorter (1 second for every 10 minutes of duration).
+We can use the video filter named **yadif** (//yet another deinterlacing filter//). In this example the result was encoded in MPEG-4 AVC using the libx264 library, forcing one keyframe every 8 frames:
 <code>
-ffmpeg -i video.mp4 -vn -c:a copy audio.m4a
+ffmpeg -i input-video.mkv -codec:a copy -vf yadif -codec:v libx264 -preset slow \
-ffmpeg -i 2018-05_portogallo.m4a -af aresample=async=1 2018-05_portogallo.fix.wav
+        -x264-params 'force-cfr=1:keyint=8:bframes=0:ref=1' \
+        output-video.mkv
 </code>
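+Before deinterlacing, it can be useful to check whether the source is actually interlaced. One possible approach, sketched below, is to run ffmpeg's **idet** filter on the first frames and read the frame counts it reports in the log. The function name ''detect_interlace'', the ''FFMPEG'' override variable and the 500-frame limit are assumptions for illustration.

```shell
# Assumption: FFMPEG can be overridden to point at a specific ffmpeg binary.
FFMPEG="${FFMPEG:-ffmpeg}"

# Hypothetical helper: analyze the first 500 video frames with the idet
# filter, discarding the output; the detection statistics go to the log.
detect_interlace() {
    "$FFMPEG" -i "$1" -vf idet -frames:v 500 -an -f null -
}

# Example usage (file name is a placeholder):
#   detect_interlace input-video.mkv
```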
-===== Converting the audio pieces =====
+====== Problem in MKV Remux ======
-To import the music pieces into Ardour, those not directly supported must be converted to WAV (e.g. MP3 files and the M4A tracks usually extracted from MP4 videos).
+It seems there is a bug in ffmpeg **[[https://trac.ffmpeg.org/ticket/6037|#6037 mkv muxing not broken]]**: muxing two working files into an mkv produces a broken file: seeking around can break (mute) audio. I experienced this bug (with ffmpeg 4.1.6) trying to mux one mkv file containing one audio and one subtitle stream into another mkv file containing video and audio. The resulting file did not play well in mplayer: seeking into the file caused audio or video to stop playing.
-===== Handling tracks and regions =====
+This was the first attempt at a command line:
-With Ardour you start a **new project**, setting **48 kHz, 32 bit float, stereo**, which are the most common parameters in high-resolution MP4 videos. Once the project is started (**Start** button), the ALSA audio system is held exclusively and other programs cannot use the audio.
+<code>
+# The resulting video is broken.
+ffmpeg -i input_file1.mkv -i input_file2.mkv \
+    -map '0:v:0' -map '0:a:0' \
+    -map '1:a:0' -map '1:s:0' \
+    -codec:v copy -codec:a copy -codec:s copy \
+    output_file.mkv
+</code>
-With Ardour you import the various **music tracks** (menu //Session// => //Import//); if necessary, the program automatically converts the sample rate.
+The workaround was to extract each individual stream and mux them together:
-Each **track** can be manipulated separately (e.g. moved along the timeline, etc.). Initially the track consists of a single **region**, but you can, for example, make a cut to obtain two regions that can be manipulated separately. The object being manipulated is in fact the single region, not the whole track.
+<code>
+ffmpeg -i input_file1.mkv -map 0:v:0 -codec:v copy input-v_env.mkv
+ffmpeg -i input_file1.mkv -map 0:a:0 -codec:a copy input-a_ita.mkv
+ffmpeg -i input_file2.mkv -map 0:a:0 -codec:a copy input-a_eng.mkv
+ffmpeg -i input_file2.mkv -map 0:s:0 -codec:s copy input-s_eng.mkv
+ffmpeg \
+    -i input-v_env.mkv \
+    -i input-a_ita.mkv \
+    -i input-a_eng.mkv \
+    -i input-s_eng.mkv \
+    -codec:v copy -codec:a copy -codec:s copy \
+    -map '0' -map '1' -map '2' -map '3' \
+    output_file.mkv
+</code>
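+After the remux, a quick sanity check is to list the streams of the resulting file with ''ffprobe''. The sketch below assumes nothing beyond standard ffprobe options; the function name ''list_streams'' and the ''FFPROBE'' override variable are hypothetical.

```shell
# Assumption: FFPROBE can be overridden to point at a specific ffprobe binary.
FFPROBE="${FFPROBE:-ffprobe}"

# Hypothetical helper: print one CSV line per stream with its index,
# codec name and codec type.
list_streams() {
    "$FFPROBE" -v error \
        -show_entries stream=index,codec_type,codec_name \
        -of csv=p=0 "$1"
}

# Example usage:
#   list_streams output_file.mkv
```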
-=== Moving ===
+====== ffmpeg: reading the VOB sequence from a DVD ======
-=== Cutting out a piece ===
+In the **VIDEO_TS** directory of a DVD, the main track is normally split into sequentially numbered files, for example: ''VTS_01_0.VOB'', ''VTS_01_1.VOB'', ...
-  * Scissors tool
+In theory it is enough to concatenate the files into a single destination file and then treat it as a normal audio/video file. However, the individual files can be given as input, without using additional disk space, with this syntax:
-  * Suitable zoom
-  * Click on the point: the track is cut immediately
-  * Selection tool, click on the piece to remove, menu //Region// => //Remove//
-=== Locking a single region ===
+<code bash>
+SOURCE="concat:VTS_01_1.VOB|VTS_01_2.VOB|VTS_01_3.VOB|VTS_01_4.VOB|VTS_01_5.VOB"
+ffmpeg -i "$SOURCE" ...
+</code>
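+Typing the list by hand gets tedious with many files. A minimal sketch of a helper that builds the ''concat:'' source string from its arguments follows; the function name ''concat_source'' is hypothetical.

```shell
# Hypothetical helper: join the given file names into a "concat:" source
# string, separated by "|".
concat_source() {
    src="concat:"
    sep=""
    for f in "$@"; do
        src="$src$sep$f"
        sep="|"
    done
    printf '%s\n' "$src"
}

# Example usage: the glob skips VTS_01_0.VOB, as in the example above.
#   ffmpeg -i "$(concat_source VTS_01_[1-9].VOB)" ...
```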
-  * After selecting it, menu //Region// => //Position// => //Lock//
+====== ffmpeg: setting a subtitle delay during muxing ======
-=== Fade in/out ===
+If a subtitle stream (e.g. in the picture-based DVD format) does not correctly indicate its initial playback offset, ffmpeg can be told to set it appropriately at muxing time. In this example the first subtitle appears at 44.5 seconds:
-  * With the "Grab mode" tool, click on a region
+<code bash>
-  * Drag the small square that appears at the end
+ffmpeg -i video-stream.mkv -i audio-stream.mkv -itsoffset 44.5 -i subtitles-stream.mkv ...
+</code>
-===  Exporting the whole work to WAV ===
+In general it should be possible to discover the offset when ffmpeg reads the whole stream: when it finds the first subtitle frame, it prints something like this on the console:
+<code>
+[mpeg @ 0x55f98bb2c6c0] New subtitle stream 0:7 at pos:14755854 and DTS:44.5s
+</code>
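+The offset can be pulled out of such a log line with ''sed''. The sketch below assumes the message has exactly the form shown above; the function name ''subtitle_offset'' is hypothetical.

```shell
# Hypothetical helper: read ffmpeg log lines from standard input and print
# the DTS value (in seconds) of every "New subtitle stream" message.
subtitle_offset() {
    sed -n 's/.*New subtitle stream.*DTS:\([0-9.]*\)s.*/\1/p'
}

# Example usage (ffmpeg writes its log to stderr; the file name is a placeholder):
#   ffmpeg -i input.vob -f null - 2>&1 | subtitle_offset
```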
+====== Parameters for final rendering ======
+See the page **[[ffmpeg_final_rendering]]**.
+====== Audio dubbing with Ardour ======
-  * //Session// => //Export// => //Export to Audio File(s)//
+See the dedicated page: **[[ardour_dubbing]]**.
-    * File formats: set WAV 16 bit, 48 kHz
-    * Time Span: check that it covers the whole original length
-    * Channels: select all, except the "Master" of the live-recorded audio (FIXME: is it really necessary to set it this way?)