S-Log. A Further In Depth Look.

Well, I posted here a few days ago about how data is distributed across the S-Log curve. David Williams (thanks David) questioned some of the things in my post, raising some valid questions over its accuracy, so I withdrew the post in order to review it further. While the general principles within the post were correct (to the best of my knowledge and research) and I stand by them, some of the numbers given and the data/exposure chart were not quite right.

Before going further let’s consider the differences between the way a video sensor works and the way our eyes work. A video sensor is a linear device while our own visual system is a logarithmic system. Imagine you are in a room with 8 light fittings, each one with the same power and light output. You start with one lamp on, then turn on another. When you turn on the second lamp the room does not appear to get twice as bright, even though the amount of light in the room has actually doubled. Now with two lamps on, what happens when you turn on a third? Well, you wouldn’t actually notice much of a change. To see a significant change you would need to turn on 2 more lamps. Now with 4 lamps on, to see a significant difference you would need to turn on a further 4 lamps. Only adding one or two would make little visual difference. This is because our visual system is essentially a logarithmic system.

Now let’s think about F-Stops. An f-stop (or T-stop) is a doubling or halving of exposure, so again this is a logarithmic system. If your scene is lit by one light bulb, then to increase the scene brightness by one stop you must double the amount of light, so you would add another light bulb. To increase the scene brightness by a further stop you would have to take your existing two light bulbs and double them again to 4 light bulbs, and so on… 2, 4, 8, 16, 32, 64…
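To put the light bulb example into numbers, here is a minimal Python sketch. The function names are my own and purely illustrative; the only thing it relies on is that one stop is a doubling of light.

```python
import math

def stops_between(light_a, light_b):
    """Number of stops separating two light levels (positive if b is brighter)."""
    return math.log2(light_b / light_a)

def bulbs_for_stops(stops, start_bulbs=1):
    """Bulbs needed to raise the scene by `stops` stops, starting from `start_bulbs`."""
    return start_bulbs * 2 ** stops

for n in range(1, 7):
    print(n, "stops up from one bulb ->", bulbs_for_stops(n), "bulbs")

# Going from 4 lamps to 8 lamps is a single stop, just like going from 1 to 2.
print(stops_between(4, 8), stops_between(1, 2))
```

Adding a fixed number of bulbs is worth less and less in stops as the room gets brighter, which is the same thing the lamp example describes.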

Now going back to a video sensor, take a look at the illustrative graph below. The horizontal scale is the number of lightbulbs in our hypothetical room and the vertical scale is the video output from an imaginary video sensor in percent. Please note that I am trying to illustrate a point and the numbers are not accurate; I’m just trying to explain something that is perhaps misunderstood by many, simply because it is difficult to understand or poorly explained elsewhere. The important thing to note is that the plotted blue line is a straight line, not a curve, because the sensor is a linear device.

Now look at this very similar chart. The only difference now is that I have added an f-stop scale to the horizontal axis. Remember that one f-stop is a doubling of the amount of light, not simply one more lightbulb. I have also changed the vertical scale to data bits. To keep things simple I’m going to use something close to 10 bit recording, which actually has 956 usable steps or code values (values 64 to 1019 out of 1024). I’m loosely calling these steps data bits here, so let’s just round that up to 1000 data bits to keep life simple for this example.

So we can see that this imaginary video sensor uses bits 0-50 for the first stop, 50-100 for the second stop, 100-200 for the third stop, 200-400 for the fourth and 400-800 for the fifth. So it is easy to see that each extra stop of exposure needs twice as much data as the stop before it. The brighter the image, the more data is required. Clearly if you want to record a wide dynamic range using a linear system you need massive numbers of data bits for the highlights, while the all important mid tones and shadow areas have relatively little data allocated to them. This is obviously not a desirable situation with current data limited recording systems; you really want to have sufficient data allocated to your mid tones so that in post production you can grade them satisfactorily.
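The doubling pattern in the chart can be reproduced with a few lines of Python. This is only a sketch of the imaginary sensor above: the starting value of 50 and the 5 stop range come from the illustration, not from any real camera.

```python
def linear_stop_ranges(first_stop_top=50, stops=5):
    """Data range used by each stop on a linear sensor.

    The first stop is taken to run from 0 to `first_stop_top`, as in the
    illustrative chart; every later stop ends at double the previous top.
    """
    ranges = []
    low, high = 0, first_stop_top
    for _ in range(stops):
        ranges.append((low, high))
        low, high = high, high * 2
    return ranges

for stop, (low, high) in enumerate(linear_stop_ranges(), start=1):
    print(f"stop {stop}: {low}-{high} ({high - low} steps)")
```

The brightest stop on its own takes 400 steps, as many as the four stops below it put together.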

Now look what happens if we allocate the same amount of data to each stop of exposure. The green line is what you get if, in our imaginary camera, we use 200 data bits to record each of our 5 stops of dynamic range. Does the shape of this curve look familiar to anyone? The important note here is that, compared to the sensor’s linear output (the blue line), as the image brightness increases less and less data is being used to record the highlights. This mimics the way we see the world and helps ensure that in the mid ranges where skin tones normally reside there is lots of data to play with in post. Our visual system is most acute in the mid range. That’s because some of the most important things that we see are natural tones: plants, fauna and people. We tend to pay much less attention to highlights as these are rarely of interest to us. Because of this we can afford to reduce the amount of information in video highlights without the end user really noticing. This technique is used by most video cameras when the knee kicks in and compresses highlights. It’s also used by extended gamma curves such as Cinegammas and Hypergammas.
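Here is the same imaginary camera with equal data per stop, set side by side with the linear figures quoted earlier. Again this is just a sketch with my illustrative numbers (1000 steps, 5 stops); the function name is made up for the example.

```python
def equal_stop_ranges(total_steps=1000, stops=5):
    """Give every stop the same share of the available data."""
    per_stop = total_steps // stops
    return [(i * per_stop, (i + 1) * per_stop) for i in range(stops)]

# Linear figures from the imaginary sensor described earlier.
linear = [(0, 50), (50, 100), (100, 200), (200, 400), (400, 800)]
equal = equal_stop_ranges()

print("stop  linear steps  equal steps")
for stop, ((l_lo, l_hi), (e_lo, e_hi)) in enumerate(zip(linear, equal), start=1):
    print(f"{stop:>4}  {l_hi - l_lo:>12}  {e_hi - e_lo:>11}")
```

The darkest stop goes from 50 steps to 200, while the brightest drops from 400 to 200, which is the trade the green line is making.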

Anyone that’s seen a hypergamma curve or cinegamma curve plot will have seen a similar shape of curve. Hypergammas and Cinegammas also use less and less data to record highlights (compared to a linear response) and in many ways achieve a similar improvement in the captured dynamic range.

Hypergammas are not the same as S-Log however. Hypergammas are designed to be useable without grading, even if it’s not ideal. Because of this they stay close to standard gammas in the mid range and it’s only really the highlights that are compressed. This also helps with grading if recording using only an 8 bit codec, as the amount of pushing and pulling required to get a natural image is less extreme. However, because the Hypergammas allocate more data in the 60 to 90 percent exposure range to stay close to standard gamma, the highlights have to be more highly compressed than with S-Log, so there is less highlight data to work with. If we look at the plot below, which now includes an approximate S-Log curve (pink line), you can see that log recording has a much larger difference from a standard gamma in the mid ranges, so heavy grading will be required to get a natural looking image.

Because of the amount of grading that will normally be done with S-Log, recording the output using a 10 bit recorder is all but essential.

When I wrote this article I spent a lot of time studying the Sony S-Log white paper and reading up on S-Log and gamma curves all over the place. One thing that I believe leads to some confusion is the way Sony presents the S-Log data curve in the document. The exposure is plotted against the data bits using stops as opposed to image brightness. This is a little confusing if you are used to seeing traditional plots of gamma curves like the ones I have presented above that plot output against percentage light input. It’s confusing because using stops as the horizontal scale makes that scale a log scale, and this makes the S-Log “curve” appear to be a near straight line.
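A quick way to convince yourself of this is to take a toy log curve and look at it both ways. The curve below is NOT Sony’s S-Log formula; it is a made-up curve that simply gives 200 steps to every stop, like my imaginary camera.

```python
import math

def toy_log_encode(light, steps_per_stop=200):
    """Toy log curve. `light` is relative exposure, 1.0 = bottom of the range."""
    return steps_per_stop * math.log2(light)

lights = [1, 2, 4, 8, 16, 32]                  # each entry is one stop brighter
codes = [toy_log_encode(x) for x in lights]

# Plotted against stops: the same rise for every stop, i.e. a straight line.
rise_per_stop = [b - a for a, b in zip(codes, codes[1:])]

# Plotted against light level: the slope halves with every stop, i.e. a curve.
slope_vs_light = [(b - a) / (lb - la)
                  for a, b, la, lb in zip(codes, codes[1:], lights, lights[1:])]

print("rise per stop:      ", rise_per_stop)
print("slope against light:", slope_vs_light)
```

Same data, two different looking plots, purely because of the choice of horizontal scale.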

I have not used S-Log on an F3 yet. It will be interesting to see how it compares to Hypergamma in the real world. I’m sure it will bring some advantages as it allows for an 800% exposure range. I welcome any comments or corrections to this article.

More on S-Log and Gamma Curves

A lot of the issues with any camera and the dynamic range it can record are not due to limitations of the camera’s hardware but to the need to retain compatibility with existing display technologies, in particular the good old fashioned TV set that has been around for half a century. The issue is that in order for all TV owners to see a picture that looks “natural” there has to be a common standard for the signal sent to the TVs that will work with all sets from the very oldest to the most recent.

Most modern cameras, not just the XDCAMs, simply ignore highlight information beyond what can be recorded. This results in the image getting clipped at a given point depending on the gamma curve being used. Interestingly, using negative gain on a camcorder can act as a low end clip, as very small brightness changes will be reduced by the negative gain, possibly to the point where they are no longer visible. This normally results in a reduction in dynamic range (as well as noise). I suspect this is why the F3 has less noise using standard gammas: the sensor has excess dynamic range for these curves and good sensitivity, so Sony can afford to set the arbitrary 0dB point in negative space without impacting the recorded DR but giving a low noise floor benefit. For S-Log, however, it’s possible to record a greater dynamic range, so 0dB is returned to true zero and as a result the noise floor increases a little.
LUTs are just a reverse gamma curve applied to the S-Log curve to restore the curve to one that approximates a standard gamma, normally REC-709. They are there for convenience, to provide an approximation of what the finished image might look like. However, applying an off the shelf LUT will impact the dynamic range, as an assumption has to be made as to which parts of the image to keep and which to discard: we are back to squeezing 12 stops into 7 stops. As every project, possibly every shot, will have differing requirements, you would need an infinite number of LUTs to be able to simply hit an “add LUT” button to restore your footage to something sensible. Instead it is more usual for the colorist or grader to generate their own curves to apply to the footage. Most NLEs already have the filters to do this; it’s simply a case of using a curves filter or gamma curve correction to generate your own curves that can be applied to your clips in lieu of a LUT.
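To show what “an assumption has to be made” means in practice, here is a rough Python sketch of a 1D LUT. Everything in it is hypothetical: the log curve is the toy 200-steps-per-stop curve rather than real S-Log, the display curve is a plain power law with an assumed exponent, and `stops_kept` is a name I have invented for the choice of where white should sit.

```python
def toy_log_decode(code, steps_per_stop=200):
    """Undo the toy log curve: code value -> relative scene light (1.0 at code 0)."""
    return 2 ** (code / steps_per_stop)

def build_lut(size=1001, stops_kept=5, steps_per_stop=200, gamma=1 / 2.2):
    """Table mapping every toy log code value to a display level in percent."""
    white = 2 ** stops_kept                 # scene light that will be shown as 100%
    lut = []
    for code in range(size):
        linear = toy_log_decode(code, steps_per_stop) / white
        linear = min(linear, 1.0)           # anything brighter than `white` is clipped
        lut.append(100 * linear ** gamma)
    return lut

full_range = build_lut(stops_kept=5)        # keeps all 5 recorded stops
punchier = build_lut(stops_kept=3)          # brighter mids, top 2 stops thrown away

clipped = sum(1 for level in punchier if level == 100.0)
print(f"{clipped} of {len(punchier)} code values clip to white in the 3 stop LUT")
```

Neither table is “right”; each one bakes in a decision about what to keep, which is why a single off the shelf LUT cannot suit every shot.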

Understanding Gamma, Cinegamma, Hypergamma and S-Log

The graph to the left shows an idealised, normal gamma curve for a video production chain. The main thing to observe is that the curve is in fact pretty close to a straight line (actual gamma curves are very gentle, slight curves). This is important, as it means that when the filmed scene gets twice as bright the output shown on the display also appears twice as bright, so the image we see on the display looks natural and normal. This is the type of gamma curve that would often be referred to as a standard gamma and it is very much what you see is what you get. In reality there are small variations of these standard gamma curves designed to suit different television standards, but those slight variations only make a small difference to the final viewed image. Standard gammas are typically restricted to around a 7 stop exposure range. These days this limited range is not so much to do with the latitude of the camera as with the inability of most monitors and TV display systems to accurately reproduce more than a 7 stop range, and with the need to ensure that all viewers, whether they have a 20 year old TV or an ultra modern display, get a sensible looking picture. This means that we have a problem. Modern cameras can capture great brightness ranges, helping the video maker or cinematographer capture high contrast scenes, but simply taking a 12 stop scene and showing it on a 7 stop display isn’t going to work. This is where modified gamma curves come into play.

The second graph here shows a modified type of gamma curve. This is similar to the Hypergamma or Cinegamma curves found on many professional camcorders. What does the graph tell us? Well, first of all we can see that the range of brightness or latitude is greater, as the curve extends out towards a range of 10 T stops compared to the 7 stops the standard gamma offers. Each additional stop is a doubling of the brightness range. This means that a camera set up with this type of gamma curve can capture a far greater contrast range, but it’s not quite as simple as that.

Un-natural image response area

Look at the area shaded red on the graph. This is the area where the camera’s capture gamma curve deviates from the standard gamma curve used not just for image capture but also for image display. What this means is that the area of the image shaded in red will not look natural, because where something in that part of the filmed scene gets 100% brighter it will only be displayed as getting 50% brighter, for example. In practice this means that while you are capturing a greater brightness range you will also need to grade or correct this range somewhat in the post production process to make the image look natural. Generally scenes shot using Hypergammas or Cinegammas can look a little washed out or flat. Cinegammas and Hypergammas keep the important central exposure range nice and linear, so the region from black up to around 75% is much like a standard gamma curve. Faces, skin, flora and fauna therefore tend to have a natural contrast range; it is only really highlights such as the sky that are getting compressed, and we don’t tend to notice this much in the end picture. This is because our visual system is very good at discerning fine detail in shadow and mid tones but less accurate in highlights, so we tend not to find this highlight compression objectionable.

S-Log Gamma Curve

Taking things a step further, this even more extreme gamma curve is similar to Sony’s S-Log gamma curve. As you can see, this deviates greatly from the standard gamma curve. Now the entire linear output of the sensor is sampled using a logarithmic scale. This allows more of the data to be allocated to the shadows and midtones where the eye is most sensitive. The end result is a huge improvement in the recorded dynamic range (greater than 12 stops) combined with less data being used for highlights and more being used where it counts. However, when viewed on a standard monitor with no correction, the image looks very washed out, lacks contrast and generally looks incredibly flat and uninteresting.

Red area indicates where image will not look natural with S-Log without LUT

In fact the uncorrected image is so flat and washed out that it can make judging the optimum exposure difficult, and crews using S-Log will often use traditional light meters to set the exposure rather than a monitor, or rely on zebras and known references such as grey cards. For on set monitoring with S-Log you need to apply a LUT (Look Up Table) to the camera’s output. A LUT is in effect a reverse gamma curve that cancels out the S-Log curve so that the image you see on the monitor is closer to a standard gamma image or your desired final pictures. The problem with this though is that the monitor is now no longer showing the full contrast range being captured and recorded, so accurate exposure assessment can be tricky, as you may want to bias your exposure range towards light or dark depending on how you will grade the final production. In addition, because you absolutely must adjust the image in post production quite heavily to get an acceptable and pleasing image, it is vital that the recording method is up to the job. Highly compressed 8 bit codecs are not good enough for S-Log. That’s why S-Log is normally recorded using 10 bit 4:4:4 with very low compression ratios. Any compression artefacts can become exaggerated when the image is manipulated and pushed and pulled in the grade to give a pleasing image. You could use 4:2:2 10 bit at a push, but the chroma sub sampling may lead to banding in highly saturated areas. Really, Hypergammas and Cinegammas are better suited to 4:2:2 and S-Log is best reserved for 4:4:4.

Brewing up a Scene File: Gamma and Knee

Before anyone complains that I have missed stuff out or that some technical detail is not quite right, one of the things I’m trying to do here is simplify the hows and whys to try and make it easier for the less technical people out there. Let’s face it, this is an art form, not a science (well, actually a bit of both really).
So what is a gamma curve anyway? Well, the good old fashioned cathode ray tube television was a very non-linear device. You put 1 unit of power in and get one unit of light out. You put 2 units in and get 1.5 units out, put 3 in and get 2 out… and so on. So in order to get a natural picture the output of the camera also has to be modified to compensate for this. This compensation is the gamma curve, an artificial modification of the output signal from the camera to make it match TVs and monitors around the world. See Wikipedia for a fuller explanation: http://en.wikipedia.org/wiki/Gamma_correction
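The compensation idea can be sketched in a few lines of Python. This assumes the display follows a simple power law with an exponent of 2.2, which is a commonly quoted figure and an assumption on my part; the unit numbers in the paragraph above are only there to illustrate non-linearity, and real broadcast curves have extra details that are left out here.

```python
DISPLAY_GAMMA = 2.2   # assumed exponent, for illustration only

def display_response(signal):
    """Light given out by the toy display for a 0..1 input signal."""
    return signal ** DISPLAY_GAMMA

def camera_gamma(scene_light):
    """Pre-correction applied in the camera: the inverse of the display curve."""
    return scene_light ** (1 / DISPLAY_GAMMA)

for scene in (0.1, 0.25, 0.5, 1.0):
    signal = camera_gamma(scene)
    shown = display_response(signal)
    print(f"scene {scene:.2f} -> signal {signal:.3f} -> displayed {shown:.3f}")
```

The signal in the middle column is not linear, but once the display has had its say the displayed light matches the scene again.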
So, all video cameras will have a gamma curve, whether you can adjust it or not is another matter. Certainly most pro level cameras allow you some form of gamma adjustment.
The PMW-350 has 6 standard gamma curves. These are all pretty similar (they have to be, otherwise the pictures wouldn’t look right), but small changes in the curve affect the relationship between dark and bright parts of the pictures. Today’s modern cameras have a far greater dynamic range (range of dark to bright) than older cameras. This means that the full dynamic range of the sensor no longer fits within the gamma curves used for TVs and monitors. In broadcast television any signal that goes over 100% gets clipped off and is discarded, so the camera’s entire brightness range has to be squeezed into 0 to 100%. The PMW-350 sensors are capable of far more than this (at least 600%), so what can you do?
The older and simpler solution is called the “Knee”. The knee works because in most cases the brightest parts of a scene contain little detail and are generally ignored by our brains. We humans tend to focus on mid-tone faces, animals and plants rather than the bright sky. Because of this you can compress the highlights (bright parts) of the picture quite heavily without it looking hugely un-natural (most of the time at least). What the knee does is take a standard gamma curve and, up near its top, bend it over. This has the effect of compressing the brighter parts of the image, squashing a broad range of highlights (clouds for example) into a narrow range of brightness. While this works fairly well, it does tend to look rather “electronic” as the picture is either natural (below the knee) or compressed (above the knee).
The answer to this electronic video look is to replace the hard knee with a gentle bend to the gamma curve. This bend starts some way down the gamma curve, very gentle at first but getting harder and harder as you go up the gamma curve. This has the effect of compressing the image gently at first, with the compression getting stronger and stronger as you go up the curve. This looks a lot more natural than a hard knee and is far closer to the way film handles highlights. The downside is that because the compression starts earlier a wider tonal range is compressed. This makes the pictures look flat and uninteresting. You have to watch exposure on faces as these can creep into the compressed part of the curve. The plus point is that it’s possible to squeeze large amounts of latitude into the 100% video range. This video can then be worked on in post production by the editor or colorist, who can pull out the tonal range that best suits the production.
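As a rough sketch of the hard knee idea (not the camera’s actual processing), here it is as a Python function. The straight-line section above the knee and the 200% input ceiling are my assumptions for the example, and the camera’s own slope numbering is not modelled.

```python
def apply_knee(level, knee_point=90.0, input_max=200.0, output_max=100.0):
    """Toy hard knee. All levels are in percent.

    Levels up to `knee_point` pass through unchanged. Everything from the
    knee up to `input_max` is squeezed into the room left below `output_max`.
    """
    if level <= knee_point:
        return level
    slope = (output_max - knee_point) / (input_max - knee_point)
    return min(output_max, knee_point + (level - knee_point) * slope)

for level in (50, 90, 120, 160, 200):
    print(f"{level:>3}% in -> {apply_knee(level):6.2f}% out")
```

Lowering `knee_point` leaves more room for highlights but starts the compression closer to skin tones, which is the trade-off with knee settings in general.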
These compressed gamma curves are given different names on different products. Panasonic call them “Film Rec”, on the EX1 they are “Cinegammas” and on the PMW-350 they are “Hypergammas”. The 350 has four Hypergammas. The first is 3250. This takes a brightness range equivalent to 325% and compresses it down to 100%. HG 4600 takes 460% and squeezes that down to 100%. Both of these Hypergammas are “broadcast safe” and the recordings made with them can be broadcast straight from the camera without any issues. The next Hypergamma is 3259. This takes a 325% range and squeezes it down to a 109% range; likewise 4609 takes 460% down to 109%. But why 109%? Well, the extra 9% gives you almost 10% more data to work with in post production compared to broadcast safe 100%. It also gives you the peak white level you need for display on the internet. Of course if you are doing a broadcast show you will need to ensure that the video levels in the finished programme don’t exceed 100%.
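The four names follow the pattern described above, so they can be decoded with a small helper. This is a hypothetical function of mine based only on those four examples; the “extra stops” figure is simply log2 of the input range over 100% and is my own rough way of comparing them.

```python
import math

def decode_hypergamma(name):
    """Split a four-digit Hypergamma name such as "4609" into
    (input range %, output peak %), following the four examples above."""
    input_range = int(name[:3])
    output_peak = 109 if name[3] == "9" else 100
    return input_range, output_peak

for name in ("3250", "4600", "3259", "4609"):
    in_pct, out_pct = decode_hypergamma(name)
    extra_stops = math.log2(in_pct / 100)
    print(f"HG {name}: {in_pct}% squeezed into {out_pct}% "
          f"(about {extra_stops:.1f} stops above 100%)")
```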
My preferred gamma is Hypergamma 4 (4609) as this gives the maximum dynamic range and gives a natural look, however the pictures can look a little flat so if I’m going direct from the camera to finished video without grading I use either a standard gamma or use the Black Gamma function to modify the curve. I’ll explain the Black Gamma in my next post.
There are 6 standard gammas to choose from. I like to stick with gamma 5, which is the ITU-709 HD standard gamma. To increase the dynamic range I use the Knee. The default knee point setting is 90. This is a reasonable setting, but if you’re shooting with clipping set to 100% you are not getting all the camera’s latitude (the Knee at 90 works very well with clipping at 108%). Lowering the knee down to 83 gives you almost another stop of latitude, but you have to be careful as skin tones and faces can creep up towards 83%. It’s very noticeable if skin becomes compressed, so you need to watch your exposure. This is also true of the Hypergammas, and with them you may need to underexpose faces very slightly. The other option is to set the knee point to 88 and then also adjust the knee slope. The slope is the compression amount. A positive value is more compressed, negative less compressed. With the knee at 88 and slope set to +20 you get good latitude, albeit with quite highly compressed highlights.
If you want to play with the gammas and knee and see how they work, one method is to use a paint package on your PC (such as Photoshop) to create a full screen left to right graduated image going from black to white. Then shoot this with the camera (slightly out of focus) while making adjustments to the curves or knee, and record the results along with a vocal description of each setting. Import the clips into your favorite editing package and, using the waveform monitor or scopes, you should be able to see a reasonable representation of the shape of the gamma curve and knee.
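If you don’t have a paint package to hand, a test gradient can also be generated with a short Python script. This is just one way to do it: it writes a plain greyscale PGM file, which most image tools can open or convert, and the 1920x1080 size and the function names are my own choices.

```python
def gradient_pgm(width=1920, height=1080):
    """Bytes of a binary PGM image: black on the left, white on the right."""
    row = bytes(round(255 * x / (width - 1)) for x in range(width))
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + row * height

def save_gradient(path, width=1920, height=1080):
    """Write the gradient to `path` so it can be shown full screen and filmed."""
    with open(path, "wb") as f:
        f.write(gradient_pgm(width, height))
```

Calling `save_gradient("gradient.pgm")` writes the file to the current folder. Being 8 bit, the ramp has 256 grey levels, which is plenty for seeing the shape of a curve on a waveform monitor.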
So my Gamma Choices are:
For material that will be post produced: Hypergamma 4609 (HG4)
For material that will be used straight from the camera: Standard Gamma 5 Knee at 90 with clip at 108% for non broadcast or Knee at 88 with slope +20 with white clip at 100% for direct to broadcast.