More about how light is processed

The path of light to the eye is shaped on the way by a series of Müller cells (which function more or less like selective fibre-optic tubes) that guide the light to the back of the retina.  These cells optimise the flow of red, green and blue light to the cones, which handle color.  However, not all of the light is funnelled; varying amounts spill over the edges and reach the retina independently.  This effect has been described by Labin, Safuri, Ribak & Perlman (Nature Communications, 8 July 2014).

The effect also reduces the light detected by the rods by about 20%.  This reduction is not uniform; in particular there is a dip in detected intensity (~40%) around 560 nm (yellow-green).  This adjustment for the rods should then be taken into account when estimating the overall detected brightness or darkness of a picture as seen from a human perspective.
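
If one wanted to fold this into a brightness calculation, it could be modelled as a wavelength-dependent attenuation with a notch around 560 nm.  The Gaussian shape and the width of the dip below are my own guesses, not values from the paper:

        // Crude model of the rod attenuation described above: a 20% baseline
        // loss plus a notch deepening to ~40% total loss around 560 nm.  The
        // Gaussian shape and sigma are assumptions, not values from the paper.
        private double RodAttenuationFactor(double wavelengthNm)
        {
            const double sigma = 30.0; // assumed width of the dip, in nm
            double dip = 0.20 * Math.Exp(-((wavelengthNm - 560.0) * (wavelengthNm - 560.0)) / (2.0 * sigma * sigma));
            return 0.80 - dip; // 80% detected at baseline, ~60% at 560 nm
        }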

The concentration of light provided by the Müller cells also skews the detection of red and green relative to that of blue.  While this goes quite some way towards explaining the apparent “correctness” of the various color models investigated so far, it also suggests a possibly simpler calculation for determining what a given scene looks like to a human just prior to neural processing.

CIECAM02

In order to have a solid basis for processing an image with results similar to the way humans do, we need to “see” the image in the same way.  Unfortunately we don’t really have a model for converting an RGB bitmap into an image as it would be seen by the human eye.  The closest is the CIECAM02 model published by the CIE.  Unfortunately many of the articles discussing it are hidden behind paywalls, so the field advances very slowly.  Still, Billy Biggs, with a little help from Nathan Moroney of HP, has published an implementation of CIECAM02 in C.  So here is an adaptation for C#.

There are a number of paywalled articles which discuss flaws in the CIECAM02 algorithm, and a number of possible future corrections are being examined by the CIE.  Perhaps the element most relevant to image processing, however, is that we gain greater depth perception with two eyes rather than one.  So logically the image as perceived by each eye should be slightly different.  This is not currently represented in the algorithm, but it is something to be considered.  The choice of illumination should not really matter (we are able to understand the same image in a variety of settings).  There should however be sufficient contrast with the white point to be able to resolve the differences in color.  It seems to me that the selection of the white point should also be somehow derivable from the RGB image, since we are able to analyse an image without being told the lighting conditions under which it was captured.
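
As a simple illustration of deriving a white point from the image itself, the classic grey-world assumption treats the average of all pixels as the illuminant.  The sketch below is my own and is not part of CIECAM02:

        // Grey-world estimate of the scene illuminant: assume the average
        // reflectance of the scene is achromatic, so the channel averages
        // approximate the color of the light source.  Illustrative only.
        private Color EstimateWhitePoint(Bitmap bmap)
        {
            long r = 0, g = 0, b = 0;
            long count = (long)bmap.Width * bmap.Height;
            for (int y = 0; y < bmap.Height; y++)
                for (int x = 0; x < bmap.Width; x++)
                {
                    Color px = bmap.GetPixel(x, y); // slow, but fine for a sketch
                    r += px.R; g += px.G; b += px.B;
                }
            return Color.FromArgb((int)(r / count), (int)(g / count), (int)(b / count));
        }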

    /* this code is adapted from the work of Billy Biggs and Nathan Moroney
     * http://scanline.ca/ciecam02/
     * 
     * The following is an implementation of the forward and inverse
     * functions for XYZ to JCh values from CIECAM02, as well as a function
     * to return all of J, C, h, Q, M and s.  It has been tested against the
     * spreadsheet of example calculations posted by Mark D.  Fairchild on
     * his website (http://www.cis.rit.edu/fairchild/).
     *
     * This code should be used with XYZ values in 0 to 100, not 0 to 1 like
     * most of my other code uses.  For input from sRGB, I recommend that
     * you use these values: D65 whitepoint, 20% gray, La value of 4 cd/m^2
     * to correspond to an ambient illumination of 64 lux:
     *
     *  la = 4.0;
     *  yb = 20.0;
     *  xw = 95.05;
     *  yw = 100.00;
     *  zw = 108.88;
     *
     *  The f, c and nc parameters control the surround.  CIECAM02 uses
     *  these values for average (relative luminance > 20% of scene white),
     *  dim (between 0% and 20%), and dark (0%).  In general, use average
     *  for both input and output.
     *
     *  // Average
     *  f = 1.0; c = 0.690; nc = 1.0;
     *  // Dim
     *  f = 0.9; c = 0.590; nc = 0.95;
     *  // Dark
     *  f = 0.8; c = 0.525; nc = 0.8;
     *
     * J is the lightness.
     * C is the chroma.
     * h is the hue angle in 0-360.
     * Q is the brightness.
     * M is the colourfulness.
     * s is the saturation.
     */
    public struct CIECAM02Color
    {
        public double x, y, z;
        public double Lightness; // J
        public double Chroma; // C
        public double HueAngle; // h
        public double HueComposition; // H
        public double Brightness; // Q
        public double Colorfulness; // M
        public double Saturation; // s
        public double ac, bc;   // Cartesian chroma components
        public double asl, bs;  // Cartesian saturation components
        public double am, bm;   // Cartesian colourfulness components
    }

    public class CIECAM02
    {
        double xw, yw, zw, aw; // reference white point
        double LuminanceAdaptingField; // La
        double LuminanceFactorBackground; // Yb
        double BackgroundLuminousFactor, z;
        double IncompleteAdaptationFactor; // F
        double ImpactOfSurround; // c
        double ChromaticInductionFactor; // Nc
        double adaptation; // d
        double nbb, ncb, fl; // induction factors (Nbb, Ncb) and luminance-level adaptation factor (FL)

        /// <summary>
        /// Sets up the initial viewing conditions
        /// </summary>
        public CIECAM02(double xw, double yw, double zw, double la, double yb, int surround)
        {
            this.xw = xw;
            this.yw = yw;
            this.zw = zw;
            this.LuminanceAdaptingField = la;
            this.LuminanceFactorBackground = yb;

            // Xw, Yw, Zw, La, Yb and surround
            // surrounds are 1 - average, 2 - dim, and 3 - dark.
            if (surround == 1)
            {
                /**
                 * Average
                 */
                IncompleteAdaptationFactor = 1.00;
                ImpactOfSurround = 0.69;
                ChromaticInductionFactor = 1.00;
            }
            else if (surround == 2)
            {
                /**
                 * Dim
                 */
                IncompleteAdaptationFactor = 0.90;
                ImpactOfSurround = 0.59;
                ChromaticInductionFactor = 0.90;
            }
            else if (surround == 3)
            {
                /**
                 * Dark
                 */
                IncompleteAdaptationFactor = 0.800;
                ImpactOfSurround = 0.525;
                ChromaticInductionFactor = 0.800;
            }


            /**
             * Read in and compute the parameters associated with the viewing conditions.
             */

            BackgroundLuminousFactor = yb / yw;
            z = 1.48 + Math.Pow(BackgroundLuminousFactor, 0.5);
            fl = Compute_fl(la);
            nbb = 0.725 * Math.Pow((1.0 / BackgroundLuminousFactor), 0.2);
            ncb = nbb;
            adaptation = IncompleteAdaptationFactor * (1.0 - ((1.0 / 3.6) * Math.Exp((-la - 42.0) / 92.0)));
            aw = achromatic_response_to_white();
        }

        private double achromatic_response_to_white()
        {
            double r, g, b;
            double rc, gc, bc;
            double rp, gp, bp;
            double rpa, gpa, bpa;

            xyz_to_cat02(out r, out g, out b, xw, yw, zw);

            rc = r * (((yw * adaptation) / r) + (1.0 - adaptation));
            gc = g * (((yw * adaptation) / g) + (1.0 - adaptation));
            bc = b * (((yw * adaptation) / b) + (1.0 - adaptation));

            cat02_to_hpe(out rp, out gp, out bp, rc, gc, bc);

            rpa = nonlinear_adaptation(rp, fl);
            gpa = nonlinear_adaptation(gp, fl);
            bpa = nonlinear_adaptation(bp, fl);

            return ((2.0 * rpa) + gpa + ((1.0 / 20.0) * bpa) - 0.305) * nbb;
        }

        static void xyz_to_cat02(out double r, out double g, out double b,
                          double x, double y, double z)
        {
            r = (0.7328 * x) + (0.4296 * y) - (0.1624 * z);
            g = (-0.7036 * x) + (1.6975 * y) + (0.0061 * z);
            b = (0.0030 * x) + (0.0136 * y) + (0.9834 * z);
        }

        void cat02_to_hpe(out double rh, out double gh, out double bh,
                          double r, double g, double b)
        {
            rh = (0.7409792 * r) + (0.2180250 * g) + (0.0410058 * b);
            gh = (0.2853532 * r) + (0.6242014 * g) + (0.0904454 * b);
            bh = (-0.0096280 * r) - (0.0056980 * g) + (1.0153260 * b);
        }

        static double nonlinear_adaptation(double c, double fl)
        {
            double p = Math.Pow((fl * c) / 100.0, 0.42);
            return ((400.0 * p) / (27.13 + p)) + 0.1;
        }

        private static double Compute_fl(double la)
        {
            double k = 1.0 / ((5.0 * la) + 1.0);
            return 0.2 * Math.Pow(k, 4.0) * (5.0 * la) + 0.1 *
                 (Math.Pow((1.0 - Math.Pow(k, 4.0)), 2.0)) *
                 (Math.Pow((5.0 * la), (1.0 / 3.0)));
        }

        public CIECAM02Color CreateColorFromXYZ(double X, double Y, double Z)
        {
            CIECAM02Color color = new CIECAM02Color();
            color.x = X;
            color.y = Y;
            color.z = Z;
            color = forwardCIECAM02(color);
            return color;
        }

        public CIECAM02Color CreateColorFromJCh(double J, double C, double h)
        {
            CIECAM02Color color = new CIECAM02Color();
            color.Lightness = J;
            color.Chroma = C;
            color.HueAngle = h;
            color = inverseCIECAM02(color);
            return color;
        }

        private CIECAM02Color inverseCIECAM02(CIECAM02Color theColor)
        {
            double r, g, b;
            double rw, gw, bw;
            double rc, gc, bc;
            double rp, gp, bp;
            double rpa, gpa, bpa;
            double a, ca, cb;
            double et, t;
            double p1, p2, p3, p4, p5, hr;
            double tx, ty, tz;

            xyz_to_cat02(out rw, out gw, out bw, this.xw, this.yw, this.zw);

            t = Math.Pow(theColor.Chroma / (Math.Sqrt(theColor.Lightness / 100.0) * Math.Pow(1.64 - Math.Pow(0.29, this.BackgroundLuminousFactor), 0.73)), (1.0 / 0.9));
            et = (1.0 / 4.0) * (Math.Cos(((theColor.HueAngle * Math.PI) / 180.0) + 2.0) + 3.8);

            a = Math.Pow(theColor.Lightness / 100.0, 1.0 / (this.ImpactOfSurround * this.z)) * this.aw;

            p1 = ((50000.0 / 13.0) * this.ChromaticInductionFactor * this.ncb) * et / t;
            p2 = (a / this.nbb) + 0.305;
            p3 = 21.0 / 20.0;

            hr = (theColor.HueAngle * Math.PI) / 180.0;

            if (Math.Abs(Math.Sin(hr)) >= Math.Abs(Math.Cos(hr)))
            {
                p4 = p1 / Math.Sin(hr);
                cb = (p2 * (2.0 + p3) * (460.0 / 1403.0)) /
                 (p4 + (2.0 + p3) * (220.0 / 1403.0) *
                 (Math.Cos(hr) / Math.Sin(hr)) - (27.0 / 1403.0) +
                 p3 * (6300.0 / 1403.0));
                ca = cb * (Math.Cos(hr) / Math.Sin(hr));
            }
            else
            {
                p5 = p1 / Math.Cos(hr);
                ca = (p2 * (2.0 + p3) * (460.0 / 1403.0)) /
                     (p5 + (2.0 + p3) * (220.0 / 1403.0) -
                     ((27.0 / 1403.0) - p3 * (6300.0 / 1403.0)) *
                     (Math.Sin(hr) / Math.Cos(hr)));
                cb = ca * (Math.Sin(hr) / Math.Cos(hr));
            }

            Aab_to_rgb(out rpa, out gpa, out bpa, a, ca, cb, this.nbb);

            rp = inverse_nonlinear_adaptation(rpa, this.fl);
            gp = inverse_nonlinear_adaptation(gpa, this.fl);
            bp = inverse_nonlinear_adaptation(bpa, this.fl);

            hpe_to_xyz(out tx, out ty, out tz, rp, gp, bp);
            xyz_to_cat02(out rc, out gc, out bc, tx, ty, tz);

            r = rc / (((this.yw * this.adaptation) / rw) + (1.0 - this.adaptation));
            g = gc / (((this.yw * this.adaptation) / gw) + (1.0 - this.adaptation));
            b = bc / (((this.yw * this.adaptation) / bw) + (1.0 - this.adaptation));

            cat02_to_xyz(out theColor.x, out theColor.y, out theColor.z, r, g, b);

            return (theColor);

        }

        static double inverse_nonlinear_adaptation(double c, double fl)
        {
            return (100.0 / fl) * Math.Pow((27.13 * Math.Abs(c - 0.1)) / (400.0 - Math.Abs(c - 0.1)), 1.0 / 0.42);
        }

        static void hpe_to_xyz(out double x, out double y, out double z,
                        double r, double g, double b)
        {
            x = (1.910197 * r) - (1.112124 * g) + (0.201908 * b);
            y = (0.370950 * r) + (0.629054 * g) - (0.000008 * b);
            z = b;
        }

        static void cat02_to_xyz(out double x, out double y, out double z,
                          double r, double g, double b)
        {
            x = (1.096124 * r) - (0.278869 * g) + (0.182745 * b);
            y = (0.454369 * r) + (0.473533 * g) + (0.072098 * b);
            z = (-0.009628 * r) - (0.005698 * g) + (1.015326 * b);
        }

        static void Aab_to_rgb(out double r, out double g, out double b, double A, double aa,
                        double bb, double nbb)
        {
            double x = (A / nbb) + 0.305;

            /*       c1              c2               c3       */
            r = (0.32787 * x) + (0.32145 * aa) + (0.20527 * bb);
            /*       c1              c4               c5       */
            g = (0.32787 * x) - (0.63507 * aa) - (0.18603 * bb);
            /*       c1              c6               c7       */
            b = (0.32787 * x) - (0.15681 * aa) - (4.49038 * bb);
        }

        CIECAM02Color forwardCIECAM02(CIECAM02Color theColor)
        {
            double r, g, b;
            double rw, gw, bw;
            double rc, gc, bc;
            double rp, gp, bp;
            double rpa, gpa, bpa;
            double a, ca, cb;
            double et, t, temp;

            xyz_to_cat02(out r, out g, out b, theColor.x, theColor.y, theColor.z);
            xyz_to_cat02(out rw, out gw, out bw, xw, yw, zw);

            rc = r * (((this.yw * this.adaptation) / rw) + (1.0 - this.adaptation));
            gc = g * (((this.yw * this.adaptation) / gw) + (1.0 - this.adaptation));
            bc = b * (((this.yw * this.adaptation) / bw) + (1.0 - this.adaptation));

            cat02_to_hpe(out rp, out gp, out bp, rc, gc, bc);

            rpa = nonlinear_adaptation(rp, this.fl);
            gpa = nonlinear_adaptation(gp, this.fl);
            bpa = nonlinear_adaptation(bp, this.fl);

            ca = rpa - ((12.0 * gpa) / 11.0) + (bpa / 11.0);
            cb = (1.0 / 9.0) * (rpa + gpa - (2.0 * bpa));

            theColor.HueAngle = (180.0 / Math.PI) * Math.Atan2(cb, ca);
            if (theColor.HueAngle < 0.0) theColor.HueAngle += 360.0;

            if (theColor.HueAngle < 20.14)
            {
                temp = ((theColor.HueAngle + 122.47) / 1.2) + ((20.14 - theColor.HueAngle) / 0.8);
                theColor.HueComposition = 300 + (100 * ((theColor.HueAngle + 122.47) / 1.2)) / temp;
            }
            else if (theColor.HueAngle < 90.0)
            {
                temp = ((theColor.HueAngle - 20.14) / 0.8) + ((90.00 - theColor.HueAngle) / 0.7);
                theColor.HueComposition = (100 * ((theColor.HueAngle - 20.14) / 0.8)) / temp;
            }
            else if (theColor.HueAngle < 164.25)
            {
                temp = ((theColor.HueAngle - 90.00) / 0.7) + ((164.25 - theColor.HueAngle) / 1.0);
                theColor.HueComposition = 100 + ((100 * ((theColor.HueAngle - 90.00) / 0.7)) / temp);
            }
            else if (theColor.HueAngle < 237.53)
            {
                temp = ((theColor.HueAngle - 164.25) / 1.0) + ((237.53 - theColor.HueAngle) / 1.2);
                theColor.HueComposition = 200 + ((100 * ((theColor.HueAngle - 164.25) / 1.0)) / temp);
            }
            else
            {
                temp = ((theColor.HueAngle - 237.53) / 1.2) + ((360 - theColor.HueAngle + 20.14) / 0.8);
                theColor.HueComposition = 300 + ((100 * ((theColor.HueAngle - 237.53) / 1.2)) / temp);
            }

            a = ((2.0 * rpa) + gpa + ((1.0 / 20.0) * bpa) - 0.305) * this.nbb;

            theColor.Lightness = 100.0 * Math.Pow(a / this.aw, this.ImpactOfSurround * this.z);

            et = (1.0 / 4.0) * (Math.Cos(((theColor.HueAngle * Math.PI) / 180.0) + 2.0) + 3.8);
            t = ((50000.0 / 13.0) * this.ChromaticInductionFactor * this.ncb * et * Math.Sqrt((ca * ca) + (cb * cb))) /
                 (rpa + gpa + (21.0 / 20.0) * bpa);

            theColor.Chroma = Math.Pow(t, 0.9) * Math.Sqrt(theColor.Lightness / 100.0)
                                 * Math.Pow(1.64 - Math.Pow(0.29, this.BackgroundLuminousFactor), 0.73);

            theColor.Brightness = (4.0 / this.ImpactOfSurround) * Math.Sqrt(theColor.Lightness / 100.0) *
                  (this.aw + 4.0) * Math.Pow(this.fl, 0.25);

            theColor.Colorfulness = theColor.Chroma * Math.Pow(this.fl, 0.25);

            theColor.Saturation = 100.0 * Math.Sqrt(theColor.Colorfulness / theColor.Brightness);

            theColor.ac = theColor.Chroma * Math.Cos((theColor.HueAngle * Math.PI) / 180.0);
            theColor.bc = theColor.Chroma * Math.Sin((theColor.HueAngle * Math.PI) / 180.0);

            theColor.am = theColor.Colorfulness * Math.Cos((theColor.HueAngle * Math.PI) / 180.0);
            theColor.bm = theColor.Colorfulness * Math.Sin((theColor.HueAngle * Math.PI) / 180.0);

            theColor.asl = theColor.Saturation * Math.Cos((theColor.HueAngle * Math.PI) / 180.0);
            theColor.bs = theColor.Saturation * Math.Sin((theColor.HueAngle * Math.PI) / 180.0);

            return (theColor);
        }

    }
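
Here is a quick usage sketch with the viewing conditions recommended in the header comment.  The SrgbToXyz helper is my own addition for the example, using the standard sRGB linearisation and D65 matrix, since the model expects XYZ values in 0 to 100:

        // SrgbToXyz is a helper written for this example, not part of the port.
        static void SrgbToXyz(byte r8, byte g8, byte b8,
                              out double x, out double y, out double z)
        {
            // Undo the sRGB transfer function to get linear components.
            Func<byte, double> lin = v =>
            {
                double c = v / 255.0;
                return c <= 0.04045 ? c / 12.92 : Math.Pow((c + 0.055) / 1.055, 2.4);
            };
            double r = lin(r8), g = lin(g8), b = lin(b8);
            // Standard linear-sRGB to XYZ (D65) matrix, scaled to 0-100.
            x = 100.0 * ((0.4124 * r) + (0.3576 * g) + (0.1805 * b));
            y = 100.0 * ((0.2126 * r) + (0.7152 * g) + (0.0722 * b));
            z = 100.0 * ((0.0193 * r) + (0.1192 * g) + (0.9505 * b));
        }

        static void Example()
        {
            // D65 white point, La = 4 cd/m^2, Yb = 20%, average surround.
            CIECAM02 model = new CIECAM02(95.05, 100.00, 108.88, 4.0, 20.0, 1);
            double x, y, z;
            SrgbToXyz(255, 0, 0, out x, out y, out z);
            CIECAM02Color red = model.CreateColorFromXYZ(x, y, z);
            Console.WriteLine("J=" + red.Lightness + " C=" + red.Chroma + " h=" + red.HueAngle);
        }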

Brightness

Starting from an RGB value there are various ways of determining the brightness of that pixel.  The simplest is to compute the average of the R, G and B channels.

        private double GetBrightness(Color clr)
        {
            return (clr.R + clr.G + clr.B) / 3.0; 
        }

Another is to use the HSV hexcone model, which takes the maximum of the three channels.

        private double GetBrightnessHSV(Color clr)
        {
            return Math.Max(Math.Max(clr.R, clr.G), clr.B);
        } 

Another is to use the HSL method, which averages the maximum and minimum of the channels.

        private double GetBrightnessHSL(Color clr)
        {
            return 0.5 * Math.Max(Math.Max(clr.R, clr.G), clr.B) + 0.5 * Math.Min(Math.Min(clr.R, clr.G), clr.B);
        }

There is also luma, which is a weighted average of the gamma-corrected RGB values.  Naturally there are different coefficients depending upon the source: Rec. 601 refers to NTSC sources, Rec. 709 to sRGB, and Rec. 2020 to UHDTV.  Note that in the .NET Framework each color is represented with 8 bits per channel, while Rec. 2020 calls for 10 or 12 bits per sample to expand the range of representable colors, so it requires a different “color” object to represent the pixel’s value.
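
A minimal sketch of what such a type might look like (UHDColor is assumed here; there is no such type in the framework):

        // Hypothetical wide-gamut pixel: ushort leaves headroom for the
        // 10- or 12-bit samples Rec. 2020 calls for.
        public struct UHDColor
        {
            public ushort R;
            public ushort G;
            public ushort B;
        }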

        private double GetBrightnessLumaRec2020(UHDColor clr)
        {
            return 0.2627 * clr.R + 0.6780 * clr.G + 0.0593 * clr.B; // rec 2020 Luma coefficients
        }

        private double GetBrightnessLumaRec601(Color clr)
        {
            return 0.299 * clr.R + 0.587 * clr.G + 0.114 * clr.B; // Rec. 601 luma coefficients
        }

        private double GetBrightnessLumaRec709(Color clr)
        {
            return 0.2126 * clr.R + 0.7152 * clr.G + 0.0722 * clr.B; // Rec. 709 luma coefficients
        }
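
Note that luma weights the gamma-encoded values directly; relative luminance applies the same Rec. 709 weights to linearised values instead.  A sketch assuming sRGB-encoded input:

        // Relative luminance: undo the sRGB transfer function first, then
        // apply the Rec. 709 weights to the linear components.
        private double GetRelativeLuminance(Color clr)
        {
            Func<byte, double> lin = v =>
            {
                double c = v / 255.0;
                return c <= 0.04045 ? c / 12.92 : Math.Pow((c + 0.055) / 1.055, 2.4);
            };
            return 0.2126 * lin(clr.R) + 0.7152 * lin(clr.G) + 0.0722 * lin(clr.B);
        }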

All of these approximations have flaws; the most accurate representation we have appears to be the latest standard from the International Commission on Illumination (yes, it really exists!) called CIECAM02.  This appears to be implemented in Windows from Vista onwards but is not yet available in .NET.

Working with an Image

Our budding DevOps engineer will need to be able to look at diagrams and understand them.  In order to read a diagram we will need to be able to pick out objects from the background and capture the text associated with each object.  The most general case is for this to be an image (Visio plugins could help pull out the underlying object structure, but not necessarily the text that goes with each object, nor the associations between objects, which might be joining or non-joining lines, or based upon underlays or overlaps of other images).

If we think about it, although the image itself is flat, we humans are able to see the difference between the objects in the picture and determine where the boundaries are.  If such a picture is initially held as a byte array, and we are able to understand the displayed picture as a layered image, potentially with depth and shadow, then all the information necessary to determine this is in the byte array we started with.

So first let’s convert our bitmap into a byte array so we can do something fancier with it than we can with a bitmap.  This also has the side effect of being much faster to operate on than manipulating the bitmap directly with .NET.  Perhaps it will be more useful as a two-dimensional byte array.

        public byte[,] GetImage(string filename)
        {
            // Requires System.Drawing, System.Drawing.Imaging and
            // System.Runtime.InteropServices.
            Bitmap bmap = new Bitmap(filename);
            int sizex = bmap.Width;
            int sizey = bmap.Height;

            Rectangle rect = new Rectangle(0, 0, sizex, sizey);
            var bitmapData = bmap.LockBits(rect, ImageLockMode.ReadOnly,
                  bmap.PixelFormat);
            IntPtr Iptr = bitmapData.Scan0;

            // Rows are padded to 4-byte boundaries, so size the buffer from
            // the stride rather than from width * bytes-per-pixel.
            int stride = Math.Abs(bitmapData.Stride);
            byte[] pixels = new byte[stride * sizey];

            // Copy data from the pointer to the array
            Marshal.Copy(Iptr, pixels, 0, pixels.Length);
            bmap.UnlockBits(bitmapData);
            byte[,] pixelgrid = pixels.ToSquare2D(stride);
            return pixelgrid;
        }

The ToSquare2D extension method I used was originally posted by ZenLulz here.  I have kept the extension method but replaced its insides with Buffer.BlockCopy, which appears to be faster:

Buffer.BlockCopy(array, 0, buffer, 0, array.Length);
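
Put together, the extension might look something like this (the signature is my reconstruction of ZenLulz’s method, with BlockCopy inside):

        public static class ArrayExtensions
        {
            // Reshape a flat byte array into rows of rowLength bytes.
            // Buffer.BlockCopy copies raw bytes, so it can fill the
            // rectangular array straight from the flat one (assumes
            // array.Length is an exact multiple of rowLength).
            public static byte[,] ToSquare2D(this byte[] array, int rowLength)
            {
                byte[,] buffer = new byte[array.Length / rowLength, rowLength];
                Buffer.BlockCopy(array, 0, buffer, 0, array.Length);
                return buffer;
            }
        }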

One of the first things we notice if we open up a picture is that nearly every single pixel is different.  Even areas which look visually identical can have small variations: RGB(255,0,0) looks extremely similar to RGB(254,1,2), if it is not perceptually identical.  And if two pixels look identical to me, and I can still read the picture, then the difference between them is more information than I need.  Of course, those subtle differences might be what helps us determine orientation and depth in an otherwise flat image.
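
One cheap way to collapse such imperceptible differences is to quantise each channel, for example by masking off the low bits.  This is a naive sketch of my own, and as noted it may throw away the very gradients that encode depth:

        // Quantise each channel to 32 levels by masking off the low three
        // bits, so RGB(255,0,0) and RGB(254,1,2) fall into the same bucket.
        // Naive: this can also flatten the subtle gradients that hint at
        // orientation and depth.
        private Color Quantize(Color clr)
        {
            return Color.FromArgb(clr.R & 0xF8, clr.G & 0xF8, clr.B & 0xF8);
        }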