More about how light is processed

The path of light to the eye is shaped on the way by a series of Müller cells (which function more or less like selective fibre-optic tubes) that guide the light to the back of the retina.  These cells optimise the flow of red, green and blue light to the cones, which handle color.  However, not all of the light is funnelled; varying amounts spill over the edges and reach the retina independently.  This effect has been described by Labin, Safuri, Ribak & Perlman (Nature Communications, 8 July 2014).

The effect also reduces the light detected by the rods by about 20%.  This reduction is not uniform; in particular there is a dip in detected intensity (~40%) around 560 nm (yellow-green).  This adjustment for the rods should then be taken into account when estimating the overall detected brightness or darkness of a picture as seen from a human perspective.
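
If one wanted to fold this into a brightness calculation, it could be modelled as a wavelength-dependent attenuation with a notch around 560 nm.  The Gaussian shape and the width of the dip below are my own guesses, not values from the paper:

        // Crude model of the rod attenuation described above: a 20% baseline
        // loss plus a notch deepening to ~40% total loss around 560 nm.  The
        // Gaussian shape and sigma are assumptions, not values from the paper.
        private double RodAttenuationFactor(double wavelengthNm)
        {
            const double sigma = 30.0; // assumed width of the dip, in nm
            double dip = 0.20 * Math.Exp(-((wavelengthNm - 560.0) * (wavelengthNm - 560.0)) / (2.0 * sigma * sigma));
            return 0.80 - dip; // 80% detected at baseline, ~60% at 560 nm
        }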

The concentration of light provided by the Müller cells also skews the detection of red and green relative to that of blue.  While this goes quite some way towards explaining the apparent “correctness” of the various color models investigated so far, it also suggests a possibly simpler calculation for determining what a given scene looks like to a human just prior to neural processing.

CIECAM02

In order to have a solid basis for processing an image with results similar to the way humans do, we need to “see” the image in the same way.  Unfortunately we don’t really have a model for converting an RGB bitmap into an image as it would be seen by the human eye.  The closest is the CIECAM02 model published by the CIE.  Unfortunately many of the articles discussing it are hidden behind paywalls, so the field advances very slowly.  Still, Billy Biggs, with a little help from Nathan Moroney of HP, has published an implementation of CIECAM02 in C.  So here is an adaptation for C#.

There are a number of paywalled articles which discuss flaws in the CIECAM02 algorithm, and a number of possible future corrections are being examined by the CIE.  Perhaps the element most relevant to image processing, however, is that we gain greater depth perception with two eyes rather than one.  So logically the image as perceived by each eye should be slightly different.  This is not currently represented in the algorithm, but it is something to be considered.  The choice of illumination should not really matter (we are able to understand the same image in a variety of settings).  There should however be sufficient contrast with the white point to be able to resolve the differences in color.  It seems to me that the selection of the white point should also be somehow derivable from the RGB image, since we are able to analyse an image without being told the lighting conditions under which it was captured.
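
As a simple illustration of deriving a white point from the image itself, the classic grey-world assumption treats the average of all pixels as the illuminant.  The sketch below is my own and is not part of CIECAM02:

        // Grey-world estimate of the scene illuminant: assume the average
        // reflectance of the scene is achromatic, so the channel averages
        // approximate the color of the light source.  Illustrative only.
        private Color EstimateWhitePoint(Bitmap bmap)
        {
            long r = 0, g = 0, b = 0;
            long count = (long)bmap.Width * bmap.Height;
            for (int y = 0; y < bmap.Height; y++)
                for (int x = 0; x < bmap.Width; x++)
                {
                    Color px = bmap.GetPixel(x, y); // slow, but fine for a sketch
                    r += px.R; g += px.G; b += px.B;
                }
            return Color.FromArgb((int)(r / count), (int)(g / count), (int)(b / count));
        }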

    /* this code is adapted from the work of Billy Biggs and Nathan Moroney
     * http://scanline.ca/ciecam02/
     * 
     * The following is an implementation of the forward and inverse
     * functions for XYZ to JCh values from CIECAM02, as well as a function
     * to return all of J, C, h, Q, M and s.  It has been tested against the
     * spreadsheet of example calculations posted by Mark D.  Fairchild on
     * his website (http://www.cis.rit.edu/fairchild/).
     *
     * This code should be used with XYZ values in 0 to 100, not 0 to 1 like
     * most of my other code uses.  For input from sRGB, I recommend that
     * you use these values: D65 whitepoint, 20% gray, La value of 4 cd/m^2
     * to correspond to an ambient illumination of 64 lux:
     *
     *  la = 4.0;
     *  yb = 20.0;
     *  xw = 95.05;
     *  yw = 100.00;
     *  zw = 108.88;
     *
     *  The f, c and nc parameters control the surround.  CIECAM02 uses
     *  these values for average (relative luminance > 20% of scene white),
     *  dim (between 0% and 20%), and dark (0%).  In general, use average
     *  for both input and output.
     *
     *  // Average
     *  f = 1.0; c = 0.690; nc = 1.0;
     *  // Dim
     *  f = 0.9; c = 0.590; nc = 0.95;
     *  // Dark
     *  f = 0.8; c = 0.525; nc = 0.8;
     *
     * J is the lightness.
     * C is the chroma.
     * h is the hue angle in 0-360.
     * Q is the brightness.
     * M is the colourfulness.
     * s is the saturation.
     */
    public struct CIECAM02Color
    {
        public double x, y, z;
        public double Lightness; // J
        public double Chroma; // C
        public double HueAngle; // h
        public double HueComposition; // H
        public double Brightness; // Q
        public double Colorfulness; // M
        public double Saturation; // s
        public double ac, bc;   // Cartesian chroma components
        public double asl, bs;  // Cartesian saturation components
        public double am, bm;   // Cartesian colourfulness components
    }

    public class CIECAM02
    {
        double xw, yw, zw, aw; // reference white point
        double LuminanceAdaptingField; // La
        double LuminanceFactorBackground; // Yb
        double BackgroundLuminousFactor, z;
        double IncompleteAdaptationFactor; // F
        double ImpactOfSurround; // c
        double ChromaticInductionFactor; // Nc
        double adaptation; // d
        double nbb, ncb, fl; // induction factors (Nbb, Ncb) and luminance-level adaptation factor (FL)

        /// <summary>
        /// Sets up the initial viewing conditions
        /// </summary>
        public CIECAM02(double xw, double yw, double zw, double la, double yb, int surround)
        {
            this.xw = xw;
            this.yw = yw;
            this.zw = zw;
            this.LuminanceAdaptingField = la;
            this.LuminanceFactorBackground = yb;

            // Xw, Yw, Zw, La, Yb and surround
            // surrounds are 1 - average, 2 - dim, and 3 - dark.
            if (surround == 1)
            {
                /**
                 * Average
                 */
                IncompleteAdaptationFactor = 1.00;
                ImpactOfSurround = 0.69;
                ChromaticInductionFactor = 1.00;
            }
            else if (surround == 2)
            {
                /**
                 * Dim
                 */
                IncompleteAdaptationFactor = 0.90;
                ImpactOfSurround = 0.59;
                ChromaticInductionFactor = 0.90;
            }
            else if (surround == 3)
            {
                /**
                 * Dark
                 */
                IncompleteAdaptationFactor = 0.800;
                ImpactOfSurround = 0.525;
                ChromaticInductionFactor = 0.800;
            }


            /**
             * Read in and compute the parameters associated with the viewing conditions.
             */

            BackgroundLuminousFactor = yb / yw;
            z = 1.48 + Math.Pow(BackgroundLuminousFactor, 0.5);
            fl = Compute_fl(la);
            nbb = 0.725 * Math.Pow((1.0 / BackgroundLuminousFactor), 0.2);
            ncb = nbb;
            adaptation = IncompleteAdaptationFactor * (1.0 - ((1.0 / 3.6) * Math.Exp((-la - 42.0) / 92.0)));
            aw = achromatic_response_to_white();
        }

        private double achromatic_response_to_white()
        {
            double r, g, b;
            double rc, gc, bc;
            double rp, gp, bp;
            double rpa, gpa, bpa;

            xyz_to_cat02(out r, out g, out b, xw, yw, zw);

            rc = r * (((yw * adaptation) / r) + (1.0 - adaptation));
            gc = g * (((yw * adaptation) / g) + (1.0 - adaptation));
            bc = b * (((yw * adaptation) / b) + (1.0 - adaptation));

            cat02_to_hpe(out rp, out gp, out bp, rc, gc, bc);

            rpa = nonlinear_adaptation(rp, fl);
            gpa = nonlinear_adaptation(gp, fl);
            bpa = nonlinear_adaptation(bp, fl);

            return ((2.0 * rpa) + gpa + ((1.0 / 20.0) * bpa) - 0.305) * nbb;
        }

        static void xyz_to_cat02(out double r, out double g, out double b,
                          double x, double y, double z)
        {
            r = (0.7328 * x) + (0.4296 * y) - (0.1624 * z);
            g = (-0.7036 * x) + (1.6975 * y) + (0.0061 * z);
            b = (0.0030 * x) + (0.0136 * y) + (0.9834 * z);
        }

        void cat02_to_hpe(out double rh, out double gh, out double bh,
                          double r, double g, double b)
        {
            rh = (0.7409792 * r) + (0.2180250 * g) + (0.0410058 * b);
            gh = (0.2853532 * r) + (0.6242014 * g) + (0.0904454 * b);
            bh = (-0.0096280 * r) - (0.0056980 * g) + (1.0153260 * b);
        }

        static double nonlinear_adaptation(double c, double fl)
        {
            double p = Math.Pow((fl * c) / 100.0, 0.42);
            return ((400.0 * p) / (27.13 + p)) + 0.1;
        }

        private static double Compute_fl(double la)
        {
            double k = 1.0 / ((5.0 * la) + 1.0);
            return 0.2 * Math.Pow(k, 4.0) * (5.0 * la) + 0.1 *
                 (Math.Pow((1.0 - Math.Pow(k, 4.0)), 2.0)) *
                 (Math.Pow((5.0 * la), (1.0 / 3.0)));
        }

        public CIECAM02Color CreateColorFromXYZ(double X, double Y, double Z)
        {
            CIECAM02Color color = new CIECAM02Color();
            color.x = X;
            color.y = Y;
            color.z = Z;
            color = forwardCIECAM02(color);
            return color;
        }

        public CIECAM02Color CreateColorFromJCh(double J, double C, double h)
        {
            CIECAM02Color color = new CIECAM02Color();
            color.Lightness = J;
            color.Chroma = C;
            color.HueAngle = h;
            color = inverseCIECAM02(color);
            return color;
        }

        private CIECAM02Color inverseCIECAM02(CIECAM02Color theColor)
        {
            double r, g, b;
            double rw, gw, bw;
            double rc, gc, bc;
            double rp, gp, bp;
            double rpa, gpa, bpa;
            double a, ca, cb;
            double et, t;
            double p1, p2, p3, p4, p5, hr;
            double tx, ty, tz;

            xyz_to_cat02(out rw, out gw, out bw, this.xw, this.yw, this.zw);

            t = Math.Pow(theColor.Chroma / (Math.Sqrt(theColor.Lightness / 100.0) * Math.Pow(1.64 - Math.Pow(0.29, this.BackgroundLuminousFactor), 0.73)), (1.0 / 0.9));
            et = (1.0 / 4.0) * (Math.Cos(((theColor.HueAngle * Math.PI) / 180.0) + 2.0) + 3.8);

            a = Math.Pow(theColor.Lightness / 100.0, 1.0 / (this.ImpactOfSurround * this.z)) * this.aw;

            p1 = ((50000.0 / 13.0) * this.ChromaticInductionFactor * this.ncb) * et / t;
            p2 = (a / this.nbb) + 0.305;
            p3 = 21.0 / 20.0;

            hr = (theColor.HueAngle * Math.PI) / 180.0;

            if (Math.Abs(Math.Sin(hr)) >= Math.Abs(Math.Cos(hr)))
            {
                p4 = p1 / Math.Sin(hr);
                cb = (p2 * (2.0 + p3) * (460.0 / 1403.0)) /
                 (p4 + (2.0 + p3) * (220.0 / 1403.0) *
                 (Math.Cos(hr) / Math.Sin(hr)) - (27.0 / 1403.0) +
                 p3 * (6300.0 / 1403.0));
                ca = cb * (Math.Cos(hr) / Math.Sin(hr));
            }
            else
            {
                p5 = p1 / Math.Cos(hr);
                ca = (p2 * (2.0 + p3) * (460.0 / 1403.0)) /
                     (p5 + (2.0 + p3) * (220.0 / 1403.0) -
                     ((27.0 / 1403.0) - p3 * (6300.0 / 1403.0)) *
                     (Math.Sin(hr) / Math.Cos(hr)));
                cb = ca * (Math.Sin(hr) / Math.Cos(hr));
            }

            Aab_to_rgb(out rpa, out gpa, out bpa, a, ca, cb, this.nbb);

            rp = inverse_nonlinear_adaptation(rpa, this.fl);
            gp = inverse_nonlinear_adaptation(gpa, this.fl);
            bp = inverse_nonlinear_adaptation(bpa, this.fl);

            hpe_to_xyz(out tx, out ty, out tz, rp, gp, bp);
            xyz_to_cat02(out rc, out gc, out bc, tx, ty, tz);

            r = rc / (((this.yw * this.adaptation) / rw) + (1.0 - this.adaptation));
            g = gc / (((this.yw * this.adaptation) / gw) + (1.0 - this.adaptation));
            b = bc / (((this.yw * this.adaptation) / bw) + (1.0 - this.adaptation));

            cat02_to_xyz(out theColor.x, out theColor.y, out theColor.z, r, g, b);

            return (theColor);

        }

        static double inverse_nonlinear_adaptation(double c, double fl)
        {
            return (100.0 / fl) * Math.Pow((27.13 * Math.Abs(c - 0.1)) / (400.0 - Math.Abs(c - 0.1)), 1.0 / 0.42);
        }

        static void hpe_to_xyz(out double x, out double y, out double z,
                        double r, double g, double b)
        {
            x = (1.910197 * r) - (1.112124 * g) + (0.201908 * b);
            y = (0.370950 * r) + (0.629054 * g) - (0.000008 * b);
            z = b;
        }

        static void cat02_to_xyz(out double x, out double y, out double z,
                          double r, double g, double b)
        {
            x = (1.096124 * r) - (0.278869 * g) + (0.182745 * b);
            y = (0.454369 * r) + (0.473533 * g) + (0.072098 * b);
            z = (-0.009628 * r) - (0.005698 * g) + (1.015326 * b);
        }

        static void Aab_to_rgb(out double r, out double g, out double b, double A, double aa,
                        double bb, double nbb)
        {
            double x = (A / nbb) + 0.305;

            /*       c1              c2               c3       */
            r = (0.32787 * x) + (0.32145 * aa) + (0.20527 * bb);
            /*       c1              c4               c5       */
            g = (0.32787 * x) - (0.63507 * aa) - (0.18603 * bb);
            /*       c1              c6               c7       */
            b = (0.32787 * x) - (0.15681 * aa) - (4.49038 * bb);
        }

        CIECAM02Color forwardCIECAM02(CIECAM02Color theColor)
        {
            double r, g, b;
            double rw, gw, bw;
            double rc, gc, bc;
            double rp, gp, bp;
            double rpa, gpa, bpa;
            double a, ca, cb;
            double et, t, temp;

            xyz_to_cat02(out r, out g, out b, theColor.x, theColor.y, theColor.z);
            xyz_to_cat02(out rw, out gw, out bw, xw, yw, zw);

            rc = r * (((this.yw * this.adaptation) / rw) + (1.0 - this.adaptation));
            gc = g * (((this.yw * this.adaptation) / gw) + (1.0 - this.adaptation));
            bc = b * (((this.yw * this.adaptation) / bw) + (1.0 - this.adaptation));

            cat02_to_hpe(out rp, out gp, out bp, rc, gc, bc);

            rpa = nonlinear_adaptation(rp, this.fl);
            gpa = nonlinear_adaptation(gp, this.fl);
            bpa = nonlinear_adaptation(bp, this.fl);

            ca = rpa - ((12.0 * gpa) / 11.0) + (bpa / 11.0);
            cb = (1.0 / 9.0) * (rpa + gpa - (2.0 * bpa));

            theColor.HueAngle = (180.0 / Math.PI) * Math.Atan2(cb, ca);
            if (theColor.HueAngle < 0.0) theColor.HueAngle += 360.0;

            if (theColor.HueAngle < 20.14)
            {
                temp = ((theColor.HueAngle + 122.47) / 1.2) + ((20.14 - theColor.HueAngle) / 0.8);
                theColor.HueComposition = 300 + (100 * ((theColor.HueAngle + 122.47) / 1.2)) / temp;
            }
            else if (theColor.HueAngle < 90.0)
            {
                temp = ((theColor.HueAngle - 20.14) / 0.8) + ((90.00 - theColor.HueAngle) / 0.7);
                theColor.HueComposition = (100 * ((theColor.HueAngle - 20.14) / 0.8)) / temp;
            }
            else if (theColor.HueAngle < 164.25)
            {
                temp = ((theColor.HueAngle - 90.00) / 0.7) + ((164.25 - theColor.HueAngle) / 1.0);
                theColor.HueComposition = 100 + ((100 * ((theColor.HueAngle - 90.00) / 0.7)) / temp);
            }
            else if (theColor.HueAngle < 237.53)
            {
                temp = ((theColor.HueAngle - 164.25) / 1.0) + ((237.53 - theColor.HueAngle) / 1.2);
                theColor.HueComposition = 200 + ((100 * ((theColor.HueAngle - 164.25) / 1.0)) / temp);
            }
            else
            {
                temp = ((theColor.HueAngle - 237.53) / 1.2) + ((360 - theColor.HueAngle + 20.14) / 0.8);
                theColor.HueComposition = 300 + ((100 * ((theColor.HueAngle - 237.53) / 1.2)) / temp);
            }

            a = ((2.0 * rpa) + gpa + ((1.0 / 20.0) * bpa) - 0.305) * this.nbb;

            theColor.Lightness = 100.0 * Math.Pow(a / this.aw, this.ImpactOfSurround * this.z);

            et = (1.0 / 4.0) * (Math.Cos(((theColor.HueAngle * Math.PI) / 180.0) + 2.0) + 3.8);
            t = ((50000.0 / 13.0) * this.ChromaticInductionFactor * this.ncb * et * Math.Sqrt((ca * ca) + (cb * cb))) /
                 (rpa + gpa + (21.0 / 20.0) * bpa);

            theColor.Chroma = Math.Pow(t, 0.9) * Math.Sqrt(theColor.Lightness / 100.0)
                                 * Math.Pow(1.64 - Math.Pow(0.29, this.BackgroundLuminousFactor), 0.73);

            theColor.Brightness = (4.0 / this.ImpactOfSurround) * Math.Sqrt(theColor.Lightness / 100.0) *
                  (this.aw + 4.0) * Math.Pow(this.fl, 0.25);

            theColor.Colorfulness = theColor.Chroma * Math.Pow(this.fl, 0.25);

            theColor.Saturation = 100.0 * Math.Sqrt(theColor.Colorfulness / theColor.Brightness);

            theColor.ac = theColor.Chroma * Math.Cos((theColor.HueAngle * Math.PI) / 180.0);
            theColor.bc = theColor.Chroma * Math.Sin((theColor.HueAngle * Math.PI) / 180.0);

            theColor.am = theColor.Colorfulness * Math.Cos((theColor.HueAngle * Math.PI) / 180.0);
            theColor.bm = theColor.Colorfulness * Math.Sin((theColor.HueAngle * Math.PI) / 180.0);

            theColor.asl = theColor.Saturation * Math.Cos((theColor.HueAngle * Math.PI) / 180.0);
            theColor.bs = theColor.Saturation * Math.Sin((theColor.HueAngle * Math.PI) / 180.0);

            return (theColor);
        }

    }
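
Here is a quick usage sketch with the viewing conditions recommended in the header comment.  The SrgbToXyz helper is my own addition for the example, using the standard sRGB linearisation and D65 matrix, since the model expects XYZ values in 0 to 100:

        // SrgbToXyz is a helper written for this example, not part of the port.
        static void SrgbToXyz(byte r8, byte g8, byte b8,
                              out double x, out double y, out double z)
        {
            // Undo the sRGB transfer function to get linear components.
            Func<byte, double> lin = v =>
            {
                double c = v / 255.0;
                return c <= 0.04045 ? c / 12.92 : Math.Pow((c + 0.055) / 1.055, 2.4);
            };
            double r = lin(r8), g = lin(g8), b = lin(b8);
            // Standard linear-sRGB to XYZ (D65) matrix, scaled to 0-100.
            x = 100.0 * ((0.4124 * r) + (0.3576 * g) + (0.1805 * b));
            y = 100.0 * ((0.2126 * r) + (0.7152 * g) + (0.0722 * b));
            z = 100.0 * ((0.0193 * r) + (0.1192 * g) + (0.9505 * b));
        }

        static void Example()
        {
            // D65 white point, La = 4 cd/m^2, Yb = 20%, average surround.
            CIECAM02 model = new CIECAM02(95.05, 100.00, 108.88, 4.0, 20.0, 1);
            double x, y, z;
            SrgbToXyz(255, 0, 0, out x, out y, out z);
            CIECAM02Color red = model.CreateColorFromXYZ(x, y, z);
            Console.WriteLine("J=" + red.Lightness + " C=" + red.Chroma + " h=" + red.HueAngle);
        }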

Brightness

Starting from an RGB value there are various ways of determining the brightness of that pixel.  The simplest is to compute the average of the R, G and B channels.

        private double GetBrightness(Color clr)
        {
            return (clr.R + clr.G + clr.B) / 3.0; 
        }

Another is to use the HSV hexcone model, which takes the maximum of the three channels.

        private double GetBrightnessHSV(Color clr)
        {
            return Math.Max(Math.Max(clr.R, clr.G), clr.B);
        } 

Another is to use the HSL method, which averages the maximum and minimum of the channels.

        private double GetBrightnessHSL(Color clr)
        {
            return 0.5 * Math.Max(Math.Max(clr.R, clr.G), clr.B) + 0.5 * Math.Min(Math.Min(clr.R, clr.G), clr.B);
        }

There is also luma, which is a weighted average of the gamma-corrected RGB values.  Naturally there are different coefficients depending upon the source: Rec. 601 refers to NTSC sources, Rec. 709 to sRGB, and Rec. 2020 to UHDTV.  Note that in the .NET Framework each color is represented with 8 bits per channel, while Rec. 2020 calls for 10 or 12 bits per sample to expand the range of representable colors, so it requires a different “color” object to represent the pixel’s value.
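
A minimal sketch of what such a type might look like (UHDColor is assumed here; there is no such type in the framework):

        // Hypothetical wide-gamut pixel: ushort leaves headroom for the
        // 10- or 12-bit samples Rec. 2020 calls for.
        public struct UHDColor
        {
            public ushort R;
            public ushort G;
            public ushort B;
        }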

        private double GetBrightnessLumaRec2020(UHDColor clr)
        {
            return 0.2627 * clr.R + 0.6780 * clr.G + 0.0593 * clr.B; // rec 2020 Luma coefficients
        }

        private double GetBrightnessLumaRec601(Color clr)
        {
            return 0.299 * clr.R + 0.587 * clr.G + 0.114 * clr.B; // Rec. 601 luma coefficients
        }

        private double GetBrightnessLumaRec709(Color clr)
        {
            return 0.2126 * clr.R + 0.7152 * clr.G + 0.0722 * clr.B; // Rec. 709 luma coefficients
        }
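
Note that luma weights the gamma-encoded values directly; relative luminance applies the same Rec. 709 weights to linearised values instead.  A sketch assuming sRGB-encoded input:

        // Relative luminance: undo the sRGB transfer function first, then
        // apply the Rec. 709 weights to the linear components.
        private double GetRelativeLuminance(Color clr)
        {
            Func<byte, double> lin = v =>
            {
                double c = v / 255.0;
                return c <= 0.04045 ? c / 12.92 : Math.Pow((c + 0.055) / 1.055, 2.4);
            };
            return 0.2126 * lin(clr.R) + 0.7152 * lin(clr.G) + 0.0722 * lin(clr.B);
        }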

All of these approximations have flaws; the most accurate representation we have appears to be the latest standard from the International Commission on Illumination (yes, it really exists!) called CIECAM02.  This appears to be implemented in Windows from Vista onwards but is not yet available in .NET.

Working with an Image

Our budding DevOps engineer will need to be able to look at diagrams and understand them.  In order to read a diagram we will need to be able to pick out objects from the background and capture the text associated with each object.  The most general case is for this to be an image (Visio plugins could help pull out the underlying object structure, but not necessarily the text that goes with each object, nor the associations between objects, which might be joining or non-joining lines, or based upon underlays or overlaps of other images).

If we think about it, although the image itself is flat, we humans are able to see the difference between the objects in the picture and determine where the boundaries are.  If such a picture is initially held as a byte array, and we are able to understand the displayed picture as a layered image, potentially with depth and shadow, then all the information necessary to determine this is in the byte array we started with.

So first let’s convert our bitmap into a byte array so we can do something fancier with it than we can with a bitmap.  This also has the side effect of being much faster to operate on than manipulating the bitmap directly with .NET.  Perhaps it will be more useful as a two-dimensional byte array.

        public byte[,] GetImage(string filename)
        {
            // Requires System.Drawing, System.Drawing.Imaging and
            // System.Runtime.InteropServices.
            Bitmap bmap = new Bitmap(filename);
            int sizex = bmap.Width;
            int sizey = bmap.Height;

            Rectangle rect = new Rectangle(0, 0, sizex, sizey);
            var bitmapData = bmap.LockBits(rect, ImageLockMode.ReadOnly,
                  bmap.PixelFormat);
            IntPtr Iptr = bitmapData.Scan0;

            // Rows are padded to 4-byte boundaries, so size the buffer from
            // the stride rather than from width * bytes-per-pixel.
            int stride = Math.Abs(bitmapData.Stride);
            byte[] pixels = new byte[stride * sizey];

            // Copy data from the pointer to the array
            Marshal.Copy(Iptr, pixels, 0, pixels.Length);
            bmap.UnlockBits(bitmapData);
            byte[,] pixelgrid = pixels.ToSquare2D(stride);
            return pixelgrid;
        }

The ToSquare2D extension method I used was originally posted by ZenLulz here.  I have kept the extension method but replaced its insides with Buffer.BlockCopy, which appears to be faster:

Buffer.BlockCopy(array, 0, buffer, 0, array.Length);
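
Put together, the extension might look something like this (the signature is my reconstruction of ZenLulz’s method, with BlockCopy inside):

        public static class ArrayExtensions
        {
            // Reshape a flat byte array into rows of rowLength bytes.
            // Buffer.BlockCopy copies raw bytes, so it can fill the
            // rectangular array straight from the flat one (assumes
            // array.Length is an exact multiple of rowLength).
            public static byte[,] ToSquare2D(this byte[] array, int rowLength)
            {
                byte[,] buffer = new byte[array.Length / rowLength, rowLength];
                Buffer.BlockCopy(array, 0, buffer, 0, array.Length);
                return buffer;
            }
        }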

One of the first things we notice if we open up a picture is that nearly every single pixel is different.  Even areas which look visually identical can have small variations: RGB(255,0,0) looks extremely similar to RGB(254,1,2), if it is not perceptually identical.  And if two pixels look identical to me, and I can still read the picture, then the difference between them is more information than I need.  Of course, those subtle differences might be what helps us determine orientation and depth in an otherwise flat image.
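
One cheap way to collapse such imperceptible differences is to quantise each channel, for example by masking off the low bits.  This is a naive sketch of my own, and as noted it may throw away the very gradients that encode depth:

        // Quantise each channel to 32 levels by masking off the low three
        // bits, so RGB(255,0,0) and RGB(254,1,2) fall into the same bucket.
        // Naive: this can also flatten the subtle gradients that hint at
        // orientation and depth.
        private Color Quantize(Color clr)
        {
            return Color.FromArgb(clr.R & 0xF8, clr.G & 0xF8, clr.B & 0xF8);
        }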