Working with an Image

Our budding DevOps engineer will need to be able to look at diagrams and understand them.  To read a diagram we need to be able to pick objects out from the background and capture the text associated with each object.  The most general case is for the diagram to be an image.  (Visio plugins could help pull out the underlying object structure, but not necessarily the text that goes with each object, nor the associations between objects, which might be shown by joining or non-joining lines, or by underlays or overlaps of other images.)

If we think about it, although the image itself is flat, we humans are able to tell the objects in the picture apart and determine where their boundaries are.  If such a picture starts out as a byte array, and we can still read the displayed picture as a layered image, potentially with depth and shadow, then all the information needed to do so must already be in that byte array.

So first let's convert our bitmap into a byte array so that we can do fancier things with it than we can with the bitmap itself.  This also has the side effect of being much faster to work with than manipulating the bitmap directly through .NET (for example, pixel by pixel with GetPixel).  Perhaps it will be more useful as a two-dimensional byte array.

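        // Requires: using System; using System.Drawing; using System.Drawing.Imaging;
        //           using System.Runtime.InteropServices;
        public byte[,] GetImage(string filename)
        {
            using (Bitmap bmap = new Bitmap(filename))
            {
                // Assumes a pixel format with whole bytes per pixel (8, 24 or 32 bpp).
                int colorDepth = Bitmap.GetPixelFormatSize(bmap.PixelFormat);
                int sizex = bmap.Width;
                int sizey = bmap.Height;
                int bytesPerPixel = colorDepth / 8;
                int rowBytes = sizex * bytesPerPixel;
                byte[] pixels = new byte[rowBytes * sizey];

                Rectangle rect = new Rectangle(0, 0, sizex, sizey);
                // We only read the pixels, so a read-only lock is enough.
                var bitmapData = bmap.LockBits(rect, ImageLockMode.ReadOnly,
                      bmap.PixelFormat);
                try
                {
                    // Copy one row at a time: each row in memory is Stride bytes
                    // long, which can include padding beyond rowBytes.
                    for (int y = 0; y < sizey; y++)
                    {
                        IntPtr row = IntPtr.Add(bitmapData.Scan0, y * bitmapData.Stride);
                        Marshal.Copy(row, pixels, y * rowBytes, rowBytes);
                    }
                }
                finally
                {
                    bmap.UnlockBits(bitmapData);
                }
                return pixels.ToSquare2D(rowBytes);
            }
        }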
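Once we have the grid, a pixel is a short run of bytes inside a row rather than a single cell. As a minimal sketch of reading one back, assuming a 24 or 32 bits per pixel bitmap (where GDI+ stores each pixel as blue, green, red and then optionally alpha), something like the following could be used. The PixelGrid class and GetRgb method are names made up for this illustration.

```csharp
using System;

public static class PixelGrid
{
    // Reads the colour of pixel (x, y) from a grid laid out as [row, byte in row].
    // Assumption: 3 or 4 bytes per pixel, stored in blue, green, red(, alpha) order.
    public static (byte R, byte G, byte B) GetRgb(byte[,] grid, int x, int y, int bytesPerPixel)
    {
        if (bytesPerPixel < 3)
            throw new ArgumentException("Expected at least 3 bytes per pixel.", nameof(bytesPerPixel));

        int offset = x * bytesPerPixel;   // first byte of the pixel within its row
        return (grid[y, offset + 2], grid[y, offset + 1], grid[y, offset]);
    }
}
```

Note that the row index comes first, so the pixel at column x of row y starts at grid[y, x * bytesPerPixel].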

The ToSquare2D extension method I used was originally posted by ZenLulz here.  I have kept the extension method but replaced its insides with Buffer.BlockCopy, which appears to be faster:

        Buffer.BlockCopy(array, 0, buffer, 0, array.Length);
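For context, here is a rough sketch of what the whole extension method might look like around that line. This is a reconstruction rather than the original posted code, and it assumes the method only needs to handle byte arrays, since Buffer.BlockCopy counts in bytes and array.Length is only the right count when each element is one byte. The ArrayExtensions class name is made up for the example.

```csharp
using System;

public static class ArrayExtensions
{
    // Hypothetical reconstruction: reshapes a flat byte array into rows of `size` bytes.
    public static byte[,] ToSquare2D(this byte[] array, int size)
    {
        if (size <= 0)
            throw new ArgumentOutOfRangeException(nameof(size));

        // Round up so a short final row is kept (its unused cells stay zero).
        int rows = (array.Length + size - 1) / size;
        var buffer = new byte[rows, size];

        // A rectangular array is stored row by row, so one block copy fills it in order.
        Buffer.BlockCopy(array, 0, buffer, 0, array.Length);
        return buffer;
    }
}
```

Called as pixels.ToSquare2D(rowLength), this gives one row of the result per row of the image, as long as rowLength is the number of bytes in an image row.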

One of the first things we notice when we open up a picture is that nearly every single pixel is different.  Even areas that look visually identical can have small variations: RGB(255,0,0) looks extremely similar to RGB(254,1,2), if not identical.  If two values look identical to me, and I can still read the picture, then the difference between them is more information than we need.  Of course, those subtle differences might be exactly what helps us determine orientation and depth in an otherwise flat image.
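To make "too much information" a little more concrete, here are two simple ways we might treat near-identical values as the same. This is only a sketch, not a settled approach: the ColorTolerance class, the method names and the tolerance and step values are all assumptions picked for illustration.

```csharp
using System;

public static class ColorTolerance
{
    // True when every channel of a and b differs by no more than `tolerance`.
    public static bool IsSimilar((byte R, byte G, byte B) a, (byte R, byte G, byte B) b, int tolerance)
    {
        return Math.Abs(a.R - b.R) <= tolerance
            && Math.Abs(a.G - b.G) <= tolerance
            && Math.Abs(a.B - b.B) <= tolerance;
    }

    // Rounds a channel value down to a multiple of `step`, discarding the small variations.
    public static byte Quantize(byte value, int step)
    {
        return (byte)(value / step * step);
    }
}
```

With a tolerance of 2, RGB(255,0,0) and RGB(254,1,2) count as similar, and with a step of 8 both quantize to RGB(248,0,0).  Quantizing is the cruder of the two, since values on either side of a step boundary (7 and 8, say) still end up different, and either approach throws away exactly the subtle variation that might later help with depth.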
