High quality MPEG-4 transcoding with Mencoder

Rather than just provide a script, I've decided to present this solution as a sort of informal thesis, so hopefully it will help others to reinterpret it for their own purposes. As of 10th Dec 2011 this article is still a work in progress, so if it seems incomplete then please have patience, and come back later.

Objective: Transcode video from any source to MPEG-4 ASP (note: this is DivX 4/5, not MPEG-4 AVC/H.264), for playback on most standalone devices, keeping the file size reasonable whilst retaining as much quality as possible, but without any regard to transcoding time or CPU utilisation. In this case I'm also going to hardsub (render subtitles directly onto the output video) an SRT subtitles file previously ripped from the source's forced subs (subtitles that only appear when a foreign language is spoken, in a soundtrack that is otherwise in your locale's language). You can rip your own subtitles files using SubRip (Windows, also works under Wine) or Avidemux (multi-platform), or just download them from places like opensubtitles.org.

Note: The method used here is extremely CPU intensive, which may cause your PC to die of exhaustion, and you to die of boredom. You have been warned. :) However, the result is worth it IMHO, as the video quality is exceptional. H.264 generally produces better results at lower bitrates (or so Messiah Jobs keeps telling us), but unfortunately it's not compatible with nearly as many devices as DivX/XviD (e.g. most Smart TVs will play MPEG-4 ASP DivX/XviD files, but not MPEG-4 AVC H.264 files, mainly because H.264 is infested with a ton of nasty patents that make it difficult and expensive for manufacturers to implement). Also for compatibility reasons I've chosen the AVI 2.0 container format, and forced the FourCC from FMP4 to DX50.

This is a two pass transcode, using the turbo video codec parameter for the first pass, which basically just turns off all CPU intensive flags that aren't needed to properly analyse the source. I'll also omit the audio parameters and use the -nosound switch on the first pass, since audio isn't needed for video analysis, and output the video to /dev/null, since the only output we actually need from the first pass is the passlogfile (usually called divx2pass.log). The second pass will omit the turbo and nosound flags, add the audio and misc parameters (described below), and output to a file.

If you see the message ‘Using all of requested bitrate is not necessary for this video with these parameters’, that just means the opening sequence of the source is mostly blank (usually black), and Mencoder thinks it could transcode that part at a lower bitrate without loss of quality, but you've specified a high bitrate. The message is pointless, since it fails to take into consideration the whole source, so you can safely ignore it.

If you don't care about having huge files, you can omit the vbitrate flag completely, and allow Mencoder to automatically determine the optimal bitrates using vqmin and vqmax (the default values are 2 and 31 respectively, which is already optimal, and you don't need to set them). This will also suppress the above annoying message.

The final sequence of commands looks like this:

#!/bin/sh
# TODO: use args instead of embedded variables

basedir="isos/rips/District 9"
logfile="\"${basedir}/District 9.log\""
sourcefile="dvd://1 -dvd-device \"$basedir\""
outfile="\"${basedir}/District 9.avi\""
subsfile="\"${basedir}/District 9.srt\""
bitrate="1500"

cropargs=$(echo "mplayer -vf cropdetect $sourcefile -vo null -ao null \
-ss 100 -frames 100 | grep 'vf crop' | cut -d '(' -f2 | cut -d ')' -f1 | \
tail -n 1" | sh 2>/dev/null)

videoopts="-ovc lavc -lavcopts vcodec=mpeg4:vbitrate=${bitrate}:\
last_pred=2:lumi_mask=0.07:dark_mask=0.2:scplx_mask=0.1:tcplx_mask=0.1:\
mv0:cbp:mbd=2:v4mv:precmp=256:cmp=256:subcmp=256:naq:qpel:trell:qns=3:\
bidir_refine=4:autoaspect:psnr:vrc_eq=\(\(mcVar*mv*10^3+tex\)*isP\)^0.7:\
vpass="

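# A repeated -vf replaces the earlier one rather than adding to it, so the
# crop and scale filters are joined into a single chain (crop=...,scale)
cropnscale="-vf ${cropargs#-vf }${cropargs:+,}scale -zoom -xy 640 -sws 9"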
sync="-mc 0 -noskip"
audioopts="-oac mp3lame -lameopts abr:br=160:aq=1"
subs="-nosub -fontconfig -sub $subsfile -subfont-text-scale 3.3"
miscopts="-ffourcc DX50 -alang en"

echo "rm -f $logfile" | sh
# NOTE: -frames 500 limits both passes to the first 500 frames (a quick
# test run); remove it to transcode the whole film
echo "mencoder -passlogfile $logfile -frames 500 $sourcefile $subs $cropnscale \
$sync ${videoopts}1:vb_strategy=2:turbo -nosound -o /dev/null" | sh
echo "mencoder -passlogfile $logfile -frames 500 $sourcefile $subs $cropnscale \
$sync ${videoopts}2 $audioopts $miscopts -o $outfile" | sh
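
The TODO at the top of the script could be handled along these lines. This is only a minimal sketch: the function name parse_args, the script name transcode.sh and the argument order (directory, title, optional bitrate) are my own assumptions, not part of the script above.

```shell
#!/bin/sh
# Hypothetical sketch: take the settings from arguments instead of editing
# the script. Assumed usage: transcode.sh <basedir> <title> [bitrate]

parse_args() {
    if [ "$#" -lt 2 ]; then
        echo "usage: transcode.sh <basedir> <title> [bitrate]" >&2
        return 1
    fi
    basedir=$1
    title=$2
    bitrate=${3:-1500}   # same default as the embedded value

    # Same quoting convention as the original script: the escaped quotes
    # survive into the command string that is later piped to sh.
    logfile="\"${basedir}/${title}.log\""
    sourcefile="dvd://1 -dvd-device \"${basedir}\""
    outfile="\"${basedir}/${title}.avi\""
    subsfile="\"${basedir}/${title}.srt\""
}

# Example: reproduce the values hard-coded in the script
parse_args "isos/rips/District 9" "District 9" && echo "$outfile at ${bitrate} kbit/s"
```

In a real script the call would be parse_args "$@" || exit 1, followed by the unchanged option variables and mencoder commands.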


Parameter descriptions

  • -ovc lavc

    The output video codec, in this case lavc, which is short for libavcodec. Strictly speaking ‘lavc’ is actually a library containing various codecs, which is why you must further specify the codec to use in the vcodec parameter (see below)

  • -lavcopts

    The options passed to libavcodec

  • vcodec=mpeg4

    The MPEG-4 Part 2 codec, usually denoted as MPEG-4/(A)SP, DivX, XviD or just MPEG4. Note this is different from the H.264 codec, which is MPEG-4 Part 10, also denoted as MPEG-4/AVC or just H264. These ‘DivX’ videos are still the most compatible for most purposes

  • vbitrate=1500

    This is the main setting that determines the overall quality and size of your target video. It's the average bitrate in kilobits per second (1 kilobit = 10^3 bits = 1000 bits, not 1024 bits). The higher the number, the better the quality, but the bigger the file. A setting of 1500 on a 1h 50m film will produce a good quality video of about 1.4GB, or 2 CDs worth, which seems to be the preferred size these days (formerly 700MB, or 1 CD).

    For an idea of how to adjust this setting to suit different file sizes, please see the Videohelp bitrate calculator.

    (From Wikipedia) ‘Average bitrate (ABR) refers to the average amount of data transferred per unit of time, usually measured per second. This is commonly referred to for digital music or video. An MP3 file, for example, that has an average bit rate of 128 kbit/s transfers, on average, 128,000 bits every second. It can have higher bitrate and lower bitrate parts, and the average bitrate for a certain timeframe is obtained by dividing the number of bits used during the timeframe by the number of seconds in the timeframe. Bitrate is not reliable as a standalone measure of audio/video quality, since more efficient compression methods use lower bitrates to encode material at a similar quality.’

    ‘Average bitrate can also refer to a form of variable bitrate (VBR) encoding in which the encoder will try to reach a target average bitrate or file size while allowing the bitrate to vary between different parts of the audio or video. As it is a form of variable bitrate, this allows more complex portions of the material to use more bits and less complex areas to use fewer bits. However, bitrate will not vary as much as in variable bitrate encoding. At a given bitrate, VBR is usually higher quality than ABR, which is higher quality than CBR (constant bitrate). ABR encoding is desirable for users who want the general benefits of VBR encoding (an optimum bitrate from frame to frame) but with a relatively predictable file size. Two-pass encoding is usually needed for accurate ABR encoding, as on the first pass the encoder has no way of knowing what parts of the audio or video need the highest bitrates to be encoded’
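
The arithmetic behind that 1.4GB figure is easy to check. The sketch below is a rough estimate only: it assumes 1 MB = 10^6 bytes, ignores container overhead, and the function names are made up for illustration.

```shell
#!/bin/sh
# Rough size estimate: (video + audio kbit/s) * seconds / 8 = kilobytes.
# Container overhead is ignored, so real files come out slightly larger.

# estimate_size_mb VIDEO_KBPS AUDIO_KBPS SECONDS -> size in MB (10^6 bytes)
estimate_size_mb() {
    echo $(( ($1 + $2) * $3 / 8 / 1000 ))
}

# video_bitrate_for TARGET_MB AUDIO_KBPS SECONDS -> video kbit/s that fits
video_bitrate_for() {
    echo $(( $1 * 8 * 1000 / $3 - $2 ))
}

# 1h 50m = 6600 seconds, with the 160 kbit/s audio used in the script
estimate_size_mb 1500 160 6600     # roughly the 1.4GB quoted for vbitrate=1500
video_bitrate_for 700 160 6600     # what a single 700MB target would allow
```

Integer shell arithmetic truncates, which is fine at this level of precision; a proper bitrate calculator will also account for the container.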

  • last_pred=2

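    According to the manpage, this is the number of motion predictors taken from the previous frame. The default of 0 disables them; a value of a uses a (2a+1) x (2a+1) square of macroblock motion vector predictors from the previous frame, so 2 means a 5x5 square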

  • lumi_mask=0.07

    (From the manpage) ‘Luminance masking is a “psychosensory” setting that is supposed to make use of the fact that the human eye tends to notice fewer details in very bright parts of the picture. Luminance masking compresses bright areas stronger than medium ones, so it will save bits that can be spent again on other frames, raising overall subjective quality, while possibly reducing PSNR.’

    The manpage recommends values of 0.0 to 0.3 as a ‘sane range’, but I'm insane, so I've set it to 0.5 [edit: I changed it to 0.07, 'cause I like Bond, James Bond]. If you don't get good results, or aren't insane, you might consider reducing that value, or removing the parameter altogether. My results have been (subjectively) excellent so far though

  • dark_mask=0.2

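    Darkness masking, the counterpart of lumi_mask: according to the manpage it compresses very dark areas more strongly than medium ones, on the theory that the eye notices fewer details there. The manpage gives the same ‘sane range’ of 0.0 to 0.3, and warns that large values may look fine on one monitor but horrible on another monitor or TV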

  • scplx_mask=0.1

    (From the manpage) ‘Spatial complexity masking. Larger values help against blockiness, if no deblocking filter is used for decoding, which is maybe not a good idea. Imagine a scene with grass (which usually has great spatial complexity), a blue sky and a house; scplx_mask will raise the quantizers of the grass' macroblocks, thus decreasing its quality, in order to spend more bits on the sky and the house.’

    I've used a ‘sane’ value, but omitted a deblocking filter, so I'm probably still insane. However, once again, the results are (subjectively) excellent, and actually better than deblocking filters, IMO

  • tcplx_mask=0.1

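    Temporal complexity masking. The manpage's example is a bird flying across the scene: tcplx_mask raises the quantizers of the bird's macroblocks (decreasing their quality), since the eye doesn't have time to take in a fast moving object's details. The catch is that if the object stops moving, it may look bad for a moment until the encoder catches up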

  • mv0

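    According to the manpage, this tries to encode each macroblock with a motion vector of <0,0> as well, and chooses the better result. It has no effect if mbd=0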

  • cbp

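    Rate distorted optimal coded block pattern. According to the manpage, this selects the coded block pattern that minimises distortion + lambda*rate, and can only be used together with trellis quantization (trell, see below)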

  • mbd=2

    (From the MPlayer manpage) ‘Macroblock decision algorithm (high quality mode), encode each macro block in all modes and choose the best. This is slow but results in better quality and file size. When mbd is set to 1 or 2, the value of mbcmp is ignored when comparing macroblocks (the mbcmp value is still used in other places though, in particular the motion search algorithms). If any comparison setting (precmp, subcmp, cmp, or mbcmp) is nonzero, however, a slower but better half-pel motion search will be used, regardless of what mbd is set to. If qpel is set, quarter-pel motion search will be used regardless.’

    0 = Use comparison function given by mbcmp (default).
    1 = Select the MB mode which needs the fewest bits (=vhq).
    2 = Select the MB mode which has the best rate distortion.

    I use ‘2’ for best results.

    (From Wikipedia) ‘Macroblock is an image compression component and technique based on discrete cosine transform used on still images and video frames. Macroblocks are usually composed of two or more blocks of pixels. In the JPEG standard macroblocks are called MCU blocks. The size of a block depends on the codec and is usually a multiple of 4. In MPEG2 and other early codecs the size is fixed at blocks of 8x8 pixels. In more modern codecs such as h.263 and h.264 the overarching macroblock size is fixed at 16x16 pixels, but this is broken down into smaller blocks or partitions which are either 4, 8, 12 or 16 pixels by 4, 8, 12 or 16 pixels. (Combinations of these smaller partitions must combine to form 16x16 macroblocks.)’

  • v4mv

    (From the manpage) ‘Allow 4 motion vectors per macroblock (slightly better quality). Works better if used with mbd>0.’

    (From Wikipedia) ‘In video compression, a motion vector is the key element in the motion estimation process. It is used to represent a macroblock in a picture based on the position of this macroblock (or a similar one) in another picture, called the reference picture. The H.264/MPEG-4 AVC standard defines motion vector as: A two-dimensional vector used for inter prediction that provides an offset from the coordinates in the decoded picture to the coordinates in a reference picture’

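  • precmp=256

    According to the manpage, this sets the comparison function for the motion estimation pre-pass. It takes the same values as cmp (see below), so 256 again means ‘also use chroma’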

  • cmp=256

    (From the manpage) ‘Sets the comparison function for full pel (pixel) motion estimation (see mbcmp for available comparison functions). And mbcmp sets the comparison function for the macroblock decision, has only an effect if mbd=0. This is also used for some motion search functions, in which case it has an effect regardless of mbd setting.’

    The specified setting of 256 means ‘also use chroma,’ but apparently it ‘currently does not work (correctly) with B-frames’. But I'm insane, remember, so who cares? To learn more about comparison algorithms, you could start by reading this article on Wikipedia, but be warned it's only the start of a very, very long journey into the dark side of mathematics

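  • subcmp=256

    According to the manpage, this sets the comparison function for sub-pel (sub-pixel) motion estimation. It takes the same values as cmp (see above), so 256 again means ‘also use chroma’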

  • naq

    (From the manpage) ‘Normalize adaptive quantization (experimental). When using adaptive quantization (*_mask), the average per-MB quantizer may no longer match the requested frame-level quantizer. Naq will attempt to adjust the per-MB quantizers to maintain the proper average.’

    (From Wikipedia) ‘Quantization (of digital data) is, essentially, the process of reducing the accuracy of a signal, by dividing it into some larger step size (i.e. finding the nearest multiple, and discarding the remainder/modulus).’

    ‘The frame-level quantizer is a number from 0 to 31 (although encoders will usually omit/disable some of the extreme values) which determines how much information will be removed from a given frame. The frame-level quantizer is either dynamically selected by the encoder to maintain a certain user-specified bitrate, or (much less commonly) directly specified by the user.’ In this script the quantizer is determined by the codec, in this case MPEG-4 ASP, which uses the H.263 quantizer matrix by default. The frame-level quantizer is then determined adaptively according to the requested average bitrate and other parameters, such as trellis and spatial masking, in reference to the quantization matrix.

    ‘A quantization matrix is a string of 64-numbers (0-255) which tells the encoder how relatively important or unimportant each piece of visual information is. Each number in the matrix corresponds to a certain frequency component of the video image. Quantization is performed by taking each of the 64 frequency values of the DCT block, dividing them by the frame-level quantizer, then dividing them by their corresponding values in the quantization matrix. Finally, the result is rounded down. This significantly reduces, or completely eliminates, the information in some frequency components of the picture.’

    Did you understand all that? No, neither did I
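
A few lines of arithmetic make the description above a little less mysterious. This is a toy model of the quoted text only, not the real MPEG-4 arithmetic: the function names and the sample coefficient are hypothetical, and it only handles positive values.

```shell
#!/bin/sh
# Toy model of the description above, NOT the real MPEG-4 arithmetic:
# divide a DCT coefficient by the frame-level quantizer and by its
# quantization matrix entry, discarding the remainder.

# quantize COEFF QUANTIZER MATRIX_ENTRY
quantize() {
    echo $(( $1 / ($2 * $3) ))
}

# dequantize LEVEL QUANTIZER MATRIX_ENTRY (what a decoder could recover)
dequantize() {
    echo $(( $1 * $2 * $3 ))
}

# Hypothetical coefficient 300 with matrix entry 16, at quantizers 2 and 31
for q in 2 31; do
    level=$(quantize 300 "$q" 16)
    echo "q=$q level=$level restored=$(dequantize "$level" "$q" 16)"
done
```

At quantizer 2 the coefficient comes back slightly reduced; at quantizer 31 it rounds down to zero and is gone, which is the ‘completely eliminates’ case from the quote.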

  • qpel

    (From the manpage) ‘Use quarter pel motion compensation (mutually exclusive with ilme). HINT: This seems only useful for high bitrate encodings.’

    (From Wikipedia) ‘Quarter pixel (also known as Q-pel or Qpel) refers to a quarter of a standard pixel. It is used in many modern video encoding standards such as MPEG-4 ASP and H.264/AVC to refer to quarter pixel precision in motion estimation and motion compensation. Though higher precision motion vectors take more bits to encode, they can result in more efficient compression overall, both through the decreased bit cost of the residual and the increased quality of the macroblock’

  • trell

    (From the manpage) ‘Trellis searched quantization. This will find the optimal encoding for each 8x8 block. Trellis searched quantization is quite simply an optimal quantization in the PSNR versus bitrate sense (Assuming that there would be no rounding errors introduced by the IDCT, which is obviously not the case.). It simply finds a block for the minimum of error and lambda*bits.’

    (From Wikipedia) ‘Trellis quantization is an algorithm that can improve data compression in DCT-based encoding methods. It is used to optimize residual DCT coefficients after motion estimation in lossy video compression encoders such as Xvid and x264. Trellis quantization reduces the size of some DCT coefficients while recovering others to take their place. This process can increase quality because coefficients chosen by Trellis have the lowest rate-distortion ratio. Trellis quantization effectively finds the optimal quantization for each block to maximize the PSNR relative to bitrate. It has varying effectiveness depending on the input data and compression method’

  • qns

    (From the manpage) ‘Quantizer noise shaping. Rather than choosing quantization to most closely match the source video in the PSNR sense, it chooses quantization such that noise (usually ringing) will be masked by similar-frequency content in the image. Larger values are slower but may not result in better quality. This can and should be used together with trellis quantization, in which case the trellis quantization (optimal for constant weight) will be used as startpoint for the iterative search.’

    0 = disabled (default)
    1 = Only lower the absolute value of coefficients
    2 = Only change coefficients before the last non-zero coefficient + 1
    3 = Try all.

    I've set this to 3 because I'm insane. This is also the setting most likely to cause your PC to die from stress-related illness or depression.

    (From Wikipedia) ‘Noise shaping is a technique typically used in digital audio, image, and video processing, usually in combination with dithering, as part of the process of quantization or bit-depth reduction of a digital signal. Its purpose is to increase the apparent signal to noise ratio of the resultant signal by altering the spectral shape of the error that is introduced by dithering and quantization such that the noise power is at a lower level in frequency bands at which noise is perceived to be more undesirable and at a correspondingly higher level in bands where it is perceived to be less undesirable.’

    Not to be confused with nose shaping (or rhinoplasty), which is the process of having your gigantic, ugly honker beautified by a plastic surgeon, or in the case of Michael Jackson, completely removed. Another way to reshape or remove your nose is to snort copious amounts of cocaine, like Danniella Westbrook (formerly of Eastenders fame) did, although this isn't generally recommended, unless you're insane, or suffering from stress-related illness or depression

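  • bidir_refine=4

    According to the manpage, this refines the two motion vectors used in bidirectional macroblocks, rather than re-using the vectors from the forward and backward searches. It has no effect without B-frames, and since the script never enables B-frames (there is no vmax_b_frames option), it probably does nothing here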

  • autoaspect

    (From the manpage) ‘Same as the aspect option, but automatically computes aspect, taking into account all the adjustments (crop/expand/scale/etc.) made in the filter chain.’

    The ‘aspect’ parameter is further explained: ‘Store movie aspect internally, just like with MPEG files. Much nicer than rescaling, because quality is not decreased.’

    The manpage also claims that ‘only MPlayer will play these files correctly, other players will display them with wrong aspect’, however I've found that not to be the case in any of my standalone equipment, probably because the files are already cropped and scaled, and not anamorphic widescreen squeezed into a fullscreen box.

    The other possibility is that all my embedded devices run MPlayer internally, which I suppose is not beyond the realm of possibility, given that manufacturers shamelessly steal Free Software and embed it into their proprietary firmware, without releasing the sources to their derivative work, as required by the GPL.

    The aspect parameter can be given as a ratio or a floating point number, or in this case is calculated for you automatically, then inserted into the OpenDML (odml) video properties header (vprp) of the AVI 2.0 file. Think ID3 tags for video. See Wikipedia for more on aspect ratios. If this option gives you problems, simply remove it, and allow your player to scale the image according to its cropped size (usually works)

  • psnr

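    According to the manpage, this prints the PSNR (peak signal to noise ratio) for the whole video after encoding, and stores the per-frame PSNR in a file with a name like psnr_hhmmss.log. Values are in dB, and higher is better. It's a measurement only, and doesn't change the encode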

  • vrc_eq

    The video rate control equation is by far the scariest thing in Mencoder, FFmpeg, or frankly anything known to mankind. Just reading the manpage for this parameter has been known to cause death by stress-related illness or depression. Although it doesn't actually explain much, it just lists the various equation variables, which is probably just as well.

    For those brave enough to venture into this terrifying landscape of mathematical insanity, you could try reading the Wikipedia article on rate-distortion theory, but personally I recommend you just curl-up on the sofa with a bar of chocolate and a hot cup of coffee, and watch your freshly-transcoded video in blissful ignorance of the scientific horrors that produced it. All you need to know is it was invented by someone called Claude Shannon, who coincidentally invented something variously called the ‘Ultimate Machine’ or ‘Useless Machine’, depending on who you ask.

  • The Super-Duper Rate Control Equation

    The formula ‘((mcVar*mv*10^3+tex)*isP)^0.7’ is a fixed version (missing opening parenthesis) of the one featured here, which is a slightly expanded version of the one described here. Neither I nor the other two people concerned have any idea what it means, and the only person in the world (the aforementioned Claude Shannon) who actually understands this gibberish was not available for comment at the time of going to press (because he's been dead for 10 years), therefore I hereby arbitrarily declare this to be the best damned rate control algorithm ever, so help me God. Probably.

    Anyway, the results are good, so that's all that matters, eh? Oh, and I'm insane, so I get a free pass.

    If anyone can actually validate the verisimilitude, rationale, correctness, prudence, sensibility or even sanity of this equation, please leave a comment explaining why, then accept a big hug from us mere mortals

  • vpass=<positive integer>

    (From the manpage) ‘Activates internal two (or more) pass mode, only specify if you wish to use two (or more) pass encoding. The first pass (vpass=1) writes the statistics file. You might want to deactivate some CPU-hungry options, like "turbo" mode does. In two pass mode, the second pass (vpass=2) reads the statistics file and bases ratecontrol decisions on it.’

    (From Wikipedia) ‘Single-pass encoding analyzes and encodes the data "on the fly" and it is also used in the constant bitrate encoding. Single-pass encoding is used when the encoding speed is most important - e.g. for real-time encoding.’

    ‘Single-pass VBR encoding is usually controlled by the fixed quality setting or by the bitrate range (minimum and maximum allowed bitrate) or by the average bitrate setting. Multi-pass encoding is used when the encoding quality is most important. Multi-pass encoding cannot be used in real-time encoding, live broadcast or live streaming. Multi-pass encoding takes much longer than single-pass encoding, because every pass means one pass through the input data (usually through the whole input file).’

    ‘Multi-pass encoding is used only for VBR encoding, because CBR encoding doesn't offer any flexibility to change the bitrate. The most common multi-pass encoding is two-pass encoding. In the first pass of two-pass encoding, the input data are being analyzed and the result is stored in a log file. In the second pass, the collected data from the first pass are used to achieve the best encoding quality. In a video encoding, two-pass encoding is usually controlled by the average bitrate setting or by the bitrate range setting (minimal and maximal allowed bitrate) or by the target video file size setting.’
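
To see how the two passes differ on the command line, here is a dry run that only prints the options rather than invoking mencoder. It's a sketch: the function name is hypothetical, and only a handful of the script's codec options are repeated.

```shell
#!/bin/sh
# Dry run: print the per-pass options rather than invoking mencoder.
# Only a few of the script's codec options are repeated here for brevity.

base="vcodec=mpeg4:vbitrate=1500:mbd=2:trell:vpass="

# lavcopts_for_pass 1|2
lavcopts_for_pass() {
    case $1 in
        1) echo "-ovc lavc -lavcopts ${base}1:vb_strategy=2:turbo -nosound -o /dev/null" ;;
        2) echo "-ovc lavc -lavcopts ${base}2" ;;
        *) echo "pass must be 1 or 2" >&2; return 1 ;;
    esac
}

lavcopts_for_pass 1   # analysis pass: turbo, no audio, output discarded
lavcopts_for_pass 2   # real pass: reads the passlogfile written by pass 1
```

Leaving vpass as the last option in the shared string is the same trick the script uses: the pass number is simply appended.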

  • -vf crop[=w:h:x:y],scale

    Video Filter, in this case one called ‘crop’ and one called ‘scale’, just two of many. The scale parameter uses whatever scaler you specify with the sws parameter (see below), in this case Lanczos (or number 9, as MPlayer devs like to call it). Cropping is cutting bits off that you don't want any more, like the black borders round an image, getting a haircut, removing your appendix, or amputating your legs. Scaling is making things bigger or smaller, in a clever Lanczos sort of a way, which nobody understands, but which apparently produces very ‘good’ results. I know because somebody else told me ... and through trial and error ... and because I'm insane
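
The crop values themselves come from the cropdetect pipeline in the script. The sketch below runs that same pipeline over canned text instead of real mplayer output; the sample lines are my assumption of what cropdetect prints, and the numbers are illustrative.

```shell
#!/bin/sh
# Assumed sample of the lines cropdetect prints; the numbers are made up.
sample='[CROP] Crop area: X: 0..719  Y: 80..495  (-vf crop=720:416:0:80).
[CROP] Crop area: X: 0..719  Y: 74..501  (-vf crop=720:432:0:72).'

# Same pipeline as the script: keep the text between the parentheses
# and use the last detection reported.
cropargs=$(printf '%s\n' "$sample" | grep 'vf crop' | cut -d '(' -f2 | cut -d ')' -f1 | tail -n 1)
echo "$cropargs"
```

The result is a ready-made ‘-vf crop=w:h:x:y’ argument, which is why the script can drop it straight into the mencoder command line.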

  • -zoom

    This is just a flag that enables software scaling. Frankly it seems a bit redundant, given that I've already specifically asked for the video to be scaled with the ‘-vf scale’ filter, and even specified the clever Lanczos software scaler algorithm. It's like one of those annoying ‘Are you sure?’ prompts that used to plague Windows users, before they gave up and bought Android devices instead. Bizarre

  • -xy <integer>

    For values less than or equal to 8, scale image by factor <value>. For everything else, set width to <value> and calculate height to keep the correct aspect ratio. Finally, something I can understand! :)
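
Since this one is understandable, it can also be sketched. The function below follows the rule as described, assuming square pixels and simple rounding; MPlayer's own rounding and aspect handling may well differ, and the name xy_size is made up.

```shell
#!/bin/sh
# Sketch of the -xy rule described above, assuming square pixels and plain
# rounding; MPlayer's own rounding and aspect handling may differ.

# xy_size SRC_W SRC_H XY -> prints WIDTHxHEIGHT
xy_size() {
    awk -v w="$1" -v h="$2" -v xy="$3" 'BEGIN {
        if (xy <= 8) {            # small values are a scale factor
            printf "%dx%d\n", w * xy + 0.5, h * xy + 0.5
        } else {                  # otherwise xy is the target width
            printf "%dx%d\n", xy, h * xy / w + 0.5
        }
    }'
}

xy_size 1280 544 640   # width fixed at 640, height follows the aspect ratio
xy_size 720 416 2      # values up to 8 are treated as a factor
```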

  • -sws 9

    SoftWare Scaler, in this case number 9, which corresponds to Lanczos. Beyond that, the MPlayer manpage doesn't really say much, but Wikipedia claims: ‘The Lanczos algorithm is an iterative algorithm invented by Cornelius Lanczos that is an adaptation of power methods to find eigenvalues and eigenvectors of a square matrix or the singular value decomposition of a rectangular matrix. It is particularly useful for finding decompositions of very large sparse matrices. In latent semantic indexing, for instance, matrices relating millions of documents to hundreds of thousands of terms must be reduced to singular-value form.’

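    Well there you have it, whatever ‘it’ is. (Strictly speaking, that quote describes Lanczos's eigenvalue algorithm; the scaler uses Lanczos resampling, a windowed sinc interpolation filter named after the same man.)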

    This may account for why the MPlayer manpage is eerily quiet on the subject, because the only two people in the world who actually understand it were busy the day the manpage was written. This may also explain why the documentation for FFmpeg (the project that MPlayer is largely based on) is so sparse and incomplete, because it'd take two rocket scientists: one to write it, and the other to read it. Mostly I think you're supposed to just take others' word for it that it's ‘good’, and be happy

  • -mc 0

    (From the manpage) ‘maximum A-V sync correction per frame (in seconds). -mc 0 should always be combined with -noskip for mencoder, otherwise it will almost certainly cause A-V desync.’

    I needed this for my source material - a heavily copy-protected (i.e. sabotaged) DVD, but it may not always be necessary. YMMV

  • -noskip

    Do not skip frames, because some DVD video producers like to sabotage their videos with DRM (Digital Restrictions Mangling), in a futile attempt to stop people ripping and making backup copies of their legally purchased property. This would cause audio sync issues, and other problems, were it not for software that circumvents this sabotage, like Mencoder

  • -oac

    Output Audio Codec. (From the manpage) ‘Encode with the given audio codec (no default set)’

  • mp3lame

    The ironically titled ‘LAME Ain't an Mp3 Encoder’ (although this used to be true).

    MP3 (or MPEG-1 Audio Layer III, to give its proper nomenclature), is the de facto standard for audio compression, thanks mainly to the original Napster, which started the MP3 revolution by making it easy for people to share their music, as God intended. Then Metallica came along and beat Napster over the head with some blunt, heavy lawyers (because Nothing Else Matters to Metallica ... except greed), and Napster was transformed into a two-bit collection agency for the RIAA (the Racketeering Industry Ass. of America). But file sharing, and MP3, marched on regardless. MP3 is heavily patented by some twit called ‘Fraunhofer’ (and about a billion other Intellectual Monopolists), apparently, but people care even less about that than music copyrights, so everyone who listens to MP3 music uses LAME without a care in the world, as the racketeers at the RIAA and elsewhere grind their teeth, foam at the mouth, pop their bloodshot eyeballs, throb their pulsating temples, stroke their bulging wallets, and risk dying of stress-related illness and depression.

    (From the LAME website) ‘Today, LAME is considered the best MP3 encoder at mid-high bitrates and at VBR, mostly thanks to the dedicated work of its developers and the open source licensing model that allowed the project to tap into engineering resources from all around the world. Both quality and speed improvements are still happening, probably making LAME the only MP3 encoder still being actively developed.’

    (From Wikipedia) ‘The use in MP3 of a lossy compression algorithm is designed to greatly reduce the amount of data required to represent the audio recording and still sound like a faithful reproduction of the original uncompressed audio for most listeners. An MP3 file that is created using the setting of 128 kbit/s will result in a file that is about 11 times smaller than the CD file created from the original audio source. An MP3 file can also be constructed at higher or lower bit rates, with higher or lower resulting quality. The compression works by reducing accuracy of certain parts of sound that are considered to be beyond the auditory resolution ability of most people. This method is commonly referred to as perceptual coding. It uses psychoacoustic models to discard or reduce precision of components less audible to human hearing, and then records the remaining information in an efficient manner’

  • -lameopts

    The options passed to the LAME encoder, in the same way that -lavcopts passes options to libavcodec

  • abr

    Average Bit Rate, in this case for audio. (From the LAME manpage) ‘Turns on encoding with a targeted average bitrate of n kbits, allowing to use frames of different sizes. The allowed range of n is 8 - 310, you can use any integer value within that range.’ I've used 160 kilobits per second, because despite the ‘near CD quality’ label of 128 kb/s, anything lower than 160 kb/s on MP3 sucks ass, unlike Ogg Vorbis (which I prefer, but which is unfortunately not supported on much standalone equipment, for some mysterious reason)

  • aq

    Algorithmic Quality. (From the LAME manpage) ‘Bitrate is of course the main influence on quality. The higher the bitrate, the higher the quality. But for a given bitrate, we have a choice of algorithms to determine the best scalefactors and Huffman encoding (noise shaping). 0 = slowest & best possible version of all algorithms. 0 and 1 are slow and may not produce significantly higher quality.’

    I use 1 because I'm insane. For more information about LAME's quality algorithms, and psychoacoustic modelling in general, I recommend you read the LAME sources, then get a PhD in Mathematics, in reverse order. Or you could just read the Wikipedia page on psychoacoustics

  • -nosub

    (From the manpage) ‘Disables any otherwise auto-selected internal subtitles (as e.g. the Matroska/mkv demuxer supports).’ This includes DVDs, and presumably other media too. This is necessary when you either don't want subtitles at all, or you're using external subtitles files, as I am here

  • -fontconfig

    (From the manpage) ‘Enables the usage of fontconfig managed fonts.’

    (From Wikipedia) ‘Fontconfig (or fontconfig) is a computer program library designed to provide system-wide font configuration, customization, and application access. Fontconfig is written and was originally maintained by Keith Packard’ (of Hewlett Packard, formerly the world's biggest PC retailer, before people got fed up with Windows and stopped buying PCs, thus causing HP to exit the PC business). ‘Its current maintainer is Behdad Esfahbod’ (which looks like a drunken typo, but isn't). ‘Fontconfig is free software distributed under a permissive free software license. Applications can use fontconfig in two ways: 1. by querying it for the available fonts on the system, or 2. by asking it for a font matching certain parameters (pattern). Fontconfig will then return a font whose properties match those specified in the pattern as closely as possible.’

    In short, it's the GNU/Linux font system

  • -sub <file>

    Use <file> for subtitles.

    (From the manpage) ‘Mplayer/Mencoder supports 12 subtitle formats (MicroDVD, SubRip, OGM, SubViewer, Sami, VPlayer, RT, SSA, AQTitle, JACOsub, PJS and our own: MPsub) and DVD subtitles (SPU streams, VOBsub and Closed Captions).’

    (From Wikipedia) ‘SubRip is a software program for Windows which "rips" (extracts) subtitles and their timings from video. It is free software, released under the GNU GPL. SubRip is also the name of the widely used and broadly compatible subtitle text file format created by this software.’

    (Also from Wikipedia) ‘Avidemux has built-in subtitle processing, both for Optical Character Recognition of DVD subtitles and for rendering hard subtitle. Avidemux supports various subtitle formats, including MicroDVD (.SUB), SubStation Alpha (.SSA), Advanced SubStation Alpha (.ASS) and SubRip (.SRT)’
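
For anyone who hasn't looked inside one, a SubRip file is just numbered text entries with timings. The sketch below writes a tiny one; the file name and dialogue are placeholders of my own.

```shell
#!/bin/sh
# Write a tiny SubRip file: a counter, a time range (note the comma before
# the milliseconds), the text, then a blank line between entries.
# The file name and dialogue are placeholders.
cat > example.srt <<'EOF'
1
00:01:02,500 --> 00:01:05,000
We come in peace.

2
00:01:06,250 --> 00:01:08,000
(Probably.)
EOF

# Count the entries by counting the timing lines
grep -c -- '-->' example.srt
```

A file like this is what gets passed to the -sub option.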

  • -subfont-text-scale <real number>

    (From the manpage) ‘Sets the subtitle text autoscale coefficient as percentage of the screen size (default: 5).’

    I use 3.3 because it looks better, IMO, and because I'm insane

  • -ffourcc DX50

    Force the FourCC to be something else, in this case DX50, because some manufacturers claim to have never heard of FMP4, and thus don't support it, despite the fact that many of them steal FFMpeg to use in their proprietary firmware, get caught red handed, but still refuse to release the sources, in violation of the GPL license that FFMpeg is distributed under.

    (From Wikipedia) ‘A FourCC (literally, four-character code) is a sequence of four bytes used to uniquely identify data formats. The concept originated in the OSType scheme used in the Macintosh system software and was adopted for the Amiga/Electronic Arts Interchange File Format and derivatives. The idea was later reused to identify compressed data types in QuickTime and DirectShow. One of the most well-known uses of FourCCs is to identify the video codec used in AVI files. Common identifiers include DIVX, XVID, and H264’
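
If you want to check which FourCC a file carries, something like the following may do. It's a rough sketch under several assumptions: that the AVI stream header sits within the first 4KB, that the four bytes after the ‘vids’ stream type are the handler FourCC, and that your grep supports -a and -o. A player may equally consult the copy stored in the stream format chunk, which this does not read. The function name is made up.

```shell
#!/bin/sh
# Sketch: print the handler FourCC that follows the "vids" stream type in
# an AVI stream header. Assumes the header sits in the first 4KB and that
# grep supports -a and -o (GNU and BSD grep do).

# avi_fourcc FILE
avi_fourcc() {
    head -c 4096 "$1" | LC_ALL=C grep -a -o 'vids....' | head -n 1 | cut -c5-8
}

# Demonstration on a fake header fragment rather than a real video
printf 'strhXXXXvidsDX50' > fake.avi
avi_fourcc fake.avi
rm -f fake.avi
```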

  • -alang <language code>

    (From the manpage) ‘Specify a priority list of audio languages to use. Different container formats employ different language codes. DVDs use ISO 639-1 two letter language codes, Matroska, MPEG-TS and NUT use ISO 639-2 three letter language codes while OGM uses a free-form identifier. MPlayer prints the available languages when run in verbose (-v) mode’

That's it. If you managed to read all that without falling asleep, you win a cookie, a PhD in mathematics, and an autographed copy of my new book: ‘How Not To Die From Stress-related Illness or Depression’.