The subject of color encoding was touched on in the log Color spaces but no satisfying method was found. This is because existing colorspaces were not really designed for lossless compression, or are too influenced by the lossy TV methods.
The early explorations only considered plain R/G/B coding (it's simple, easy and resilient) but the feeling remains that a lot of redundancy is left over, particularly when colors are saturated.
Here, I design a colorspace that better exploits the 3R block compaction algorithms, where the lowest amplitudes are favored.
Most colorspaces use a luminance/chrominance approach, with one signal that is the sum of all the components and two signals that are differences with that sum. These transforms are designed to maximise lossy compression, where chroma is degraded more than luma, which explains why the components are weighted (typically G>R>B).
The 3RGB colorspace is subtly different because it tries to use encoding tricks borrowed from 3R, instead of psychovisual tricks: it encodes the raw sum, without weighting, as well as two components (like R and B) almost directly.
G + B => Cgb
Cgb + R => Luma
Send Luma, R, B
Decoding is as easy as this:
Get Luma
Get R
Luma - R => Cgb
Get B
Cgb - B => G
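To make this concrete, here is a minimal C sketch of both directions (the pix3rgb_t, rgb_to_3rgb and pix3rgb_to_rgb names are mine, for illustration only, not the project's):

#include <stdint.h>

typedef struct { uint16_t luma; uint8_t r, b; } pix3rgb_t;

/* forward: pack one RGB pixel into the three coded signals */
static pix3rgb_t rgb_to_3rgb(uint8_t r, uint8_t g, uint8_t b)
{
    uint16_t cgb  = (uint16_t)g + b;   /* G + B   => Cgb  (0..510) */
    uint16_t luma = cgb + r;           /* Cgb + R => Luma (0..765) */
    pix3rgb_t p = { luma, r, b };      /* send Luma, R and B       */
    return p;
}

/* inverse: recover G from Luma, R and B */
static void pix3rgb_to_rgb(pix3rgb_t p, uint8_t *r, uint8_t *g, uint8_t *b)
{
    uint16_t cgb = p.luma - p.r;       /* Luma - R => Cgb */
    *r = p.r;
    *b = p.b;
    *g = (uint8_t)(cgb - p.b);         /* Cgb - B  => G   */
}

Since G is rebuilt exactly from Luma, R and B, the round trip is lossless by construction.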
From this point of view, this is probably the dumbest colorspace ever, so (except for its simplicity) why care?
The point is to reduce the number of bits used by saturated colors and exploit ancillary data, in particular the maximal value that each component can take, then mount a 3R-like algorithm so that only the signals with the least amplitude are sent. Most colorspaces decode the two other components directly from the Luma, whereas here we use a longer chain, to reduce the range as much as possible. This is slower but provides better compaction.
Let's take 8-bit components: each pixel uses one byte for R, G and B, so their sum spans the range [0..765]. Look at the range [511..765] for example: it reduces the possible range of each component, because each one is limited to 255 so none can be lower than Luma-510. This is also where saturated colors reside (usually one component is off while the two others are full on).
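To make that range reduction concrete, here is a small helper (my own naming, assuming 8-bit components as above) that computes the interval a single component can occupy once Luma is known; for instance Luma = 600 confines each component to [90..255], i.e. 166 possible values instead of 256:

#include <stdint.h>

/* Given Luma = R+G+B with 8-bit components, any single component lies in
   [max(0, Luma-510) .. min(255, Luma)], because the two other components
   contribute at least 0 and at most 510.  Illustrative helper only. */
static void component_bounds(uint16_t luma, uint16_t *lo, uint16_t *hi)
{
    *lo = (luma > 510) ? (uint16_t)(luma - 510) : 0;
    *hi = (luma < 255) ? luma : 255;
}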
There are two particular cases of interest:
- Luma = 0: the zerotree doesn't grow, all other components are off. End of story.
- Luma = 765: all the components are on, no more information needed. End of story.
This case can be extended to Luma = 1:
- If R=1 then Cgb=0, no need to read G or B
- If R=0 then Cgb=1, read B (G=Cgb-B)
Same with Luma = 764: there is no need to encode R over the full [0..255] range, because the only possibility is that one of R, G or B is 254 while the other two are 255. R can therefore be encoded with a single bit:
- If the bit says R=254 (mirroring R=1 above) then Cgb=510, no need to read G or B (both are implied to be 255)
- If the bit says R=255 then Cgb=509, so read one bit for B to select whether G or B is 254 (the other is 255)
The cases are obviously mirrored between Luma=0 and Luma=765, with complementary values (something that the classic 3R algorithm doesn't exploit).
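One possible way to exploit that mirror symmetry (a sketch with hypothetical helper names, not the project's code): since Luma itself is transmitted, encoder and decoder can agree to complement the pixel whenever Luma falls in the upper half, so only the low-Luma special cases need dedicated handling:

#include <stdint.h>

/* When Luma is above the midpoint, both sides work on the complemented
   components (255-x) and on the complemented sum (765-Luma), so the
   Luma=765 and Luma=764 cases reduce to the Luma=0 and Luma=1 cases.
   Applying the fold a second time restores the original values. */
static int luma_is_folded(uint16_t luma)
{
    return luma > 382;                       /* midpoint of [0..765] */
}

static uint16_t effective_luma(uint16_t luma)
{
    return luma_is_folded(luma) ? (uint16_t)(765 - luma) : luma;
}

static uint8_t fold_component(uint16_t luma, uint8_t c)
{
    return luma_is_folded(luma) ? (uint8_t)(255 - c) : c;
}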
The fact that each component has a limit reduces the coding space, so there is hope of encoding less than 24 bits per pixel in many favorable cases. The worst case can't add more than 2 bits (Luma alone spans [0..765] and needs 10 bits, so a naive encoding is 10+8+8 = 26 bits) and even that is not easy to construct, because as the sum approaches 765 the ranges of the other components shrink. However, the pigeonhole principle forbids that all 2^24 combinations be coded in less than 24 bits each. We'll examine this whole situation later.
The same ideas apply to other parts as well, such as encoding the minimum and maximum pixel values of a block: while the minimum uses a plain byte, the maximum (coded relative to the minimum) lies in a variable range and can be encoded with fewer bits, using the knowledge of the preceding number, because their sum can't exceed 255.
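As a small illustration (the helper name is mine): once the block minimum has been sent as a full byte, the maximum, sent as max-min, lies in [0..255-min] and only needs enough bits for that reduced range:

/* bits needed to code a value known to lie in [0..n]  (0 bits when n == 0) */
static unsigned bits_for_range(unsigned n)
{
    unsigned bits = 0;
    while (n) { bits++; n >>= 1; }
    return bits;
}

/* example: a block minimum of 200 leaves max-min in [0..55],
   so the maximum costs bits_for_range(55) = 6 bits instead of 8 */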
So far we have considered only the case where R, G and B span the maximum possible range, but very often this is not the case. Each component can have its own range within a block, so the formula must be extended a bit, using the ancillary range data:
- MinR, MinG, MinB,
- MaxR, MaxG, MaxB (relative to MinR, MinG, MinB, respectively)
Then Luma is in the range [0..MaxR+MaxG+MaxB] and all the pixel values are offset by their respective MinX.
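Putting this ancillary data to work, here is a hedged sketch of the range-reduced forward transform (the block_ranges_t struct and function name are illustrative; the Min/Max fields follow the list above and pix3rgb_t is reused from the earlier sketch):

typedef struct {
    uint8_t MinR, MinG, MinB;   /* block minima, sent once per block    */
    uint8_t MaxR, MaxG, MaxB;   /* block maxima, relative to the minima */
} block_ranges_t;

/* offset each component by its block minimum before summing, so that
   Luma spans [0 .. MaxR+MaxG+MaxB] instead of the full [0..765] */
static pix3rgb_t rgb_to_3rgb_block(const block_ranges_t *rg,
                                   uint8_t r, uint8_t g, uint8_t b)
{
    uint8_t  r2   = (uint8_t)(r - rg->MinR);   /* now in [0..MaxR] */
    uint8_t  g2   = (uint8_t)(g - rg->MinG);   /* now in [0..MaxG] */
    uint8_t  b2   = (uint8_t)(b - rg->MinB);   /* now in [0..MaxB] */
    uint16_t cgb  = (uint16_t)g2 + b2;         /* Cgb  in [0 .. MaxG+MaxB]      */
    uint16_t luma = cgb + r2;                  /* Luma in [0 .. MaxR+MaxG+MaxB] */
    pix3rgb_t p = { luma, r2, b2 };
    return p;
}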
The σα transform (as it is now called) is explained in log 13. Breakthrough, for the special case of a pair of identical ranges, which applies directly to the (implicit) sum Cgb and the coded Blue component.
It gets more complicated when encoding R because R and Cgb don't have the same ranges. This will be covered in the new dedicated project #σα code (in the log "Extension to unequal boundaries").
Continued in the log Colorspace decomposition (episode 2: the big swap)
Discussions
Couldn't you safely ignore the slight differences between the image planes that carry 254 vs 255 values? You would have to keep it consistent: you wouldn't notice this on a single image, but if it changed randomly or periodically you would definitely notice.
I think 3 x 8 bits + A could well be enough for color encoding if (at least) luminance was logarithmically encoded.
It doesn't look like you consider subsampling, but it is possible to use full res luminance and half res chroma and expand to make an image that is 100% visually lossless. We did this on the film Pleasantville and no one ever suspected a thing :) This hack is no longer eligible to be called lossless, of course.
I should probably ask if this ever went anywhere....
csw
Bonjour "Dude" :-)
Concerning 254 vs 255: it's irrelevant. You can use the encoding for about any bit depth. I use 8 bits per component as a "relatable example", but the point is that it reduces the compression effort, and the best way is indeed to reduce the bit depth locally and adaptively.
Logarithmic encoding is out of the scope, it's a higher-level decision.
Chroma subsampling is not considered either, as it is "lossy" and would be used on more aggressive compression, downstream.
Did this go anywhere? Not yet. I have gotten multi-meta-sidequested (I am currently simultaneously working on a transceiver, an operating system and another "private" project) and I must first work on several underlying techniques. But I still intend to write in-depth articles and a reference implementation. Keep commenting!