What does video compression actually do to your media?
by Brian Reisdorf, on Oct 21, 2019 12:29:54 PM
The term "compression" gives most video professionals anxiety at a level similar to watching your aunt shoot vertical video with her new iPhone 11 Pro that she got because it 'came with the best emojis'. There's just nothing good about compression, right?
The reality is that we know compression is a necessary fact of a modern video workflow due to the size of the media we're working with. Working with uncompressed video of any real quality (1080p and up) is something virtually no one is actually capable of doing in any useful way. In the last 10 years of building and installing high performance video servers, I've seen only a handful of systems that could even stand a chance at editing a single stream of uncompressed 4K, let alone the 5K, 6K, and 8K footage that we're seeing more and more of these days. Why so few? Because we compress everything. Cameras compress on capture, software often compresses on ingest and sometimes again on final render, and then it's compressed again for delivery. We don't need to spend insane amounts of money on ultra-high-end systems because good, modern compression techniques are astoundingly good at preserving visual data, and that's a really good thing.
So what is actually happening when you compress something? The simple answer is we're throwing away minute detail for space savings in the file. There's quite a bit more nuance in what compression is doing, but that's the most direct way to look at it, and this one fact has several major benefits. With smaller file sizes you can store more footage on a camera card/SD card so you can shoot longer, you can transfer that data between systems and backup hard drives quicker, and you can edit that footage on a desktop that doesn't come with a 5-digit price tag (or even a laptop with some formats). The size saving for most compression is absolutely huge, and the loss of visual data is often unnoticeable. To see how big the savings are, start with a quick formula for the uncompressed data rate:
data rate = color depth * vertical resolution * horizontal resolution * framerate
So if you're recording some 10-bit 1080p30 footage, you're looking at this:
10 (bit depth) * 1080 (V. res) * 1920 (H. res) * 30 = 622,080,000b/s = 77.76MB/s
Nearly 80MB/s for a pretty basic 1080p format is no joke. Now watch what happens if you punch in the numbers that a Canon C300 shoots at in 4K:
12 (bit depth) * 2160 (V. res) * 4096 (H. res) * 60 = 6,370,099,200b/s = 796MB/s
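If you'd rather not punch these numbers into a calculator, the formula above is easy to sketch in a few lines of Python. This is only an illustration of the article's formula as written; the function name `uncompressed_data_rate` is made up for this example, and megabytes are taken as 1,000,000 bytes to match the figures above.

```python
def uncompressed_data_rate(bit_depth, width, height, fps):
    """Apply the article's formula and return (bits per second, megabytes per second)."""
    bits_per_second = bit_depth * height * width * fps
    # 8 bits per byte, 1,000,000 bytes per megabyte (decimal MB, as used in the article)
    megabytes_per_second = bits_per_second / 8 / 1_000_000
    return bits_per_second, megabytes_per_second


# 10-bit 1080p30, the first example above
hd_bits, hd_mb = uncompressed_data_rate(10, 1920, 1080, 30)
print(f"1080p30 10-bit: {hd_bits:,} b/s = {hd_mb:.2f} MB/s")

# 12-bit 4096x2160 at 60 frames per second, the second example above
uhd_bits, uhd_mb = uncompressed_data_rate(12, 4096, 2160, 60)
print(f"4K60 12-bit:    {uhd_bits:,} b/s = {uhd_mb:.2f} MB/s")
```

Running it prints 622,080,000 b/s = 77.76 MB/s for the 1080p case and 6,370,099,200 b/s = 796.26 MB/s for the 4K case, matching the hand calculations.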
While a lot of new systems could technically play back an 800MB/s stream (thank you NVMe drives!), the reality of actually working with streams that size in a production environment is far more harsh. When you begin to consider multistream timelines, effects/color processing on huge frame sizes, and the fact that 1 minute of footage takes up nearly 50GB per stream, it becomes apparent quickly that you'd need significant hardware power to be successful with a real-time uncompressed workflow.
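To put the "nearly 50GB per minute" figure in context, here is a rough sketch of the storage math. The helper name `storage_gb` and the three-stream timeline are assumptions for illustration only, and gigabytes are decimal (1GB = 1,000MB).

```python
def storage_gb(megabytes_per_second, seconds, streams=1):
    """Storage consumed, in decimal gigabytes, by one or more streams of a given length."""
    return megabytes_per_second * seconds * streams / 1000


UNCOMPRESSED_4K_MB_S = 796.2624  # uncompressed 12-bit 4096x2160 at 60fps, from the formula above

one_minute = storage_gb(UNCOMPRESSED_4K_MB_S, 60)
one_hour = storage_gb(UNCOMPRESSED_4K_MB_S, 60 * 60)
three_streams = storage_gb(UNCOMPRESSED_4K_MB_S, 60, streams=3)  # hypothetical 3-stream timeline

print(f"1 minute, 1 stream:  {one_minute:.1f} GB")
print(f"1 hour, 1 stream:    {one_hour:.1f} GB")
print(f"1 minute, 3 streams: {three_streams:.1f} GB")
```

One minute comes out at roughly 47.8GB, which is where the "nearly 50GB" number comes from. An hour of a single stream lands in the multi-terabyte range.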
In contrast, if you were to compress that same 4K footage down into ProRes 422, you get a comparatively fantastic data rate of 157MB/s. At that rate we can easily play it off an external SSD, or even some fast SD cards, and with a 10GbE network connection you can easily run 2-3 streams of it at the same time from most video-tuned network storage, which is perfect for most typical production workflows.
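A quick back-of-the-envelope sketch shows how those two data rates compare on a 10GbE link. The 1,250MB/s figure is just the raw line rate (10,000Mb/s divided by 8), so treat it as a ceiling. Real throughput is lower once protocol overhead, storage performance, and other users are involved. The `usable_fraction` parameter is a hypothetical knob for that, and the 0.5 used below is an arbitrary illustration, not a measurement.

```python
import math

UNCOMPRESSED_MB_S = 796.26   # uncompressed 4K figure from the article
PRORES_422_MB_S = 157.0      # ProRes 422 figure from the article
LINK_MB_S = 10_000 / 8       # 10GbE raw line rate: 10,000 Mb/s = 1,250 MB/s


def max_streams(stream_mb_s, link_mb_s, usable_fraction=1.0):
    """Whole streams that fit in the link, given an assumed usable share of the line rate."""
    return math.floor(link_mb_s * usable_fraction / stream_mb_s)


ratio = UNCOMPRESSED_MB_S / PRORES_422_MB_S
print(f"Uncompressed is about {ratio:.1f}x the size of ProRes 422")
print("Uncompressed streams at line rate:", max_streams(UNCOMPRESSED_MB_S, LINK_MB_S))
print("ProRes 422 streams at line rate:  ", max_streams(PRORES_422_MB_S, LINK_MB_S))
print("ProRes 422 streams at 50% usable: ", max_streams(PRORES_422_MB_S, LINK_MB_S, 0.5))
```

Even at the theoretical line rate only one uncompressed stream fits, while ProRes 422 leaves room for several.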
So what does this compression cost us?
When something is compressed with a codec like ProRes, we're mostly trading color information for file size savings. What does that actually mean? In video, every pixel in the image starts out with its own values for Red, Green, and Blue, good old RGB. Using an uncompressed or "heavy" codec, these values all remain unique in the video file. More compressed codecs, like ProRes 422, store the picture as one brightness (luma) channel and two color (chroma) channels instead, then remove some of the chroma samples and re-generate them on the fly from an average of the surrounding ones. The 4:2:2 in ProRes 422 describes this chroma subsampling, not a bit depth. In the case of 4:2:2, for every 4 luma samples recorded, the codec only records 2 samples for each of the two chroma channels and then fills in the missing ones from surrounding samples. This works because our eyes are far more sensitive to changes in brightness than to changes in color.
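Here is a deliberately simplified sketch of that "drop some color samples, then rebuild them from the neighbors" idea, applied to a single row of one color channel. It is not how ProRes or any particular codec is implemented. Real encoders and decoders use their own filters, and the function names and sample values below are invented for illustration.

```python
def subsample_422(color_row):
    """Keep every second color sample in a row (half the horizontal color resolution)."""
    return color_row[::2]


def reconstruct_422(kept, original_length):
    """Rebuild a full-length row: each missing sample is the average of its stored neighbors."""
    rebuilt = []
    for i in range(original_length):
        left = kept[i // 2]
        if i % 2 == 0:
            rebuilt.append(left)  # this sample was stored as-is
        else:
            # Missing sample: average the stored samples on either side
            # (at the end of the row there is no right neighbor, so reuse the left one).
            right = kept[i // 2 + 1] if i // 2 + 1 < len(kept) else left
            rebuilt.append((left + right) / 2)
    return rebuilt


# A smooth gradient followed by a hard color edge (hypothetical 8-bit values)
color_row = [100, 104, 108, 112, 200, 60, 64, 68]
kept = subsample_422(color_row)
rebuilt = reconstruct_422(kept, len(color_row))
errors = [abs(a - b) for a, b in zip(color_row, rebuilt)]

print("stored: ", kept)
print("rebuilt:", rebuilt)
print("error:  ", errors)
```

In the smooth part of the row the rebuilt values match the originals almost exactly, but around the hard edge (112 to 200 to 60) the averaged values are well off. That is the kind of smearing that rarely shows in normal footage but matters when you need clean edges.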
Is there a loss in detail? Yes, but it's very small and mostly imperceptible to the naked eye. Where this can be a problem, however, is during VFX or compositing work, where full color resolution is needed for accurate chroma keying and tracking. A codec with heavier chroma subsampling such as 4:1:1 or 4:2:0 can make for some ugly greenscreen results, as edges become a little harder to define cleanly due to the averaging of some colors. Generally, if that kind of VFX work is part of your workflow, you'll want to stick with at least a 4:2:2 capable codec, if not a 4:4:4 one, to retain that detail and make the VFX process easier (and maybe a tiny bit cheaper).
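To compare the schemes mentioned above, it helps to count how many samples each one actually stores. The usual reading of the J:a:b notation is over a block J pixels wide and 2 rows tall: J brightness samples per row, `a` color samples (per color channel) in the first row, and `b` in the second. The sketch below just does that counting. The function name is made up, and it says nothing about the rest of what a codec does to the data.

```python
def samples_per_block(j, a, b):
    """Total stored samples in a J-wide, 2-row block under J:a:b subsampling."""
    brightness = 2 * j        # one brightness sample per pixel, two rows
    color_per_channel = a + b  # color samples in row 1 plus row 2
    return brightness + 2 * color_per_channel  # two color channels


full = samples_per_block(4, 4, 4)
for name, (j, a, b) in {
    "4:4:4": (4, 4, 4),
    "4:2:2": (4, 2, 2),
    "4:2:0": (4, 2, 0),
    "4:1:1": (4, 1, 1),
}.items():
    total = samples_per_block(j, a, b)
    print(f"{name}: {total} samples per block, {total / full:.0%} of 4:4:4")
```

4:2:2 keeps two thirds of the samples of 4:4:4, while 4:2:0 and 4:1:1 both keep half. All of the savings come out of the color channels, which is why keying suffers first.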