Images in Cocoa Touch, represented by the UIImage class, are a very important subject. Apple’s iOS platform prides itself on visual appeal, with Retina Displays, custom UI in many top apps, and a focus on photos with apps like Instagram. To that end, it behooves you as an iOS programmer to know a bit about working with images. This post won’t discuss everything you need to know about using the UIImage class, as that’s more appropriate for a book than a blog post—though maybe a series of blog posts would do—but instead will focus on one advanced topic: working with pixel data. You can find the basic stuff in the UIImage documentation, anyway.
Turning an Image Into Data
One of the first things you might want to do with a UIImage object is to save it to disk. To do that, you’ll need to save it to an image file. There are built-in functions to get properly formatted data from an image, in both PNG and JPEG formats:
- UIImagePNGRepresentation(), which returns an NSData object formatted as a PNG image, taking a pointer to a UIImage object as its sole parameter.
- UIImageJPEGRepresentation(), which returns an NSData object formatted as a JPEG image. Like the previous function, its first parameter is a pointer to a UIImage object, but it has a second argument: a CGFloat value representing the compression quality to use, with 0.0 representing the lowest-quality, highest-compression JPEG image possible, and 1.0 representing the highest-quality, lowest-compression image possible.
Once you have the image data represented by an NSData object, you can then save it to disk with various NSData methods, such as -writeToFile:atomically:.
Getting Raw Pixel Data
While the above functions are great for saving images, they aren’t so great for image analysis. Sometimes you need to analyze the data for a given pixel, down to the values of its red, green, blue, and alpha components. To get that kind of granularity in an image, we’ll be using a lot of CoreGraphics functions. If you haven’t used CoreGraphics before, know before going in that it’s a C-based API à la CoreFoundation, so you won’t be using the Objective-C objects you know and are used to. Instead, there are opaque types (represented by CFTypeRef, which is analogous to Objective-C’s id) representing objects grafted onto C, complete with manual memory management (no ARC for you). That’s neither here nor there, however; let’s talk about pixel data.
The color space of an image defines what the color components of each pixel are. Represented by the CGColorSpace type, you’ll typically use either an RGB color space or a Gray color space, which have red, green, and blue components or a white component, respectively. For this example, we’ll be using the RGB color space. We can create an instance of it with the CGColorSpaceCreateDeviceRGB() function, which returns a CGColorSpaceRef type—think of it as a pointer to a CGColorSpace object.
What does using this color space get us? We now know that the pixels of our image will have three color components, and in what order. This will come in handy later on when we need to query the data.
A graphics context, represented by the CGContext type, is analogous to a painter’s canvas: it’s what you draw into. For the purposes of drawing an image, you’ll create a CGBitmapContext, the ideal type of context for this data. You create a context with the CGBitmapContextCreate() function, which returns a CGContextRef type. Let’s look at the declaration of that function (from CGBitmapContext.h):
CGContextRef CGBitmapContextCreate ( void *data, size_t width, size_t height, size_t bitsPerComponent, size_t bytesPerRow, CGColorSpaceRef colorspace, CGBitmapInfo bitmapInfo );
So, that’s pretty simple, right? It’s actually fairly straightforward, despite its appearance. Let’s break it down into more easily-digestible components. It’ll make more sense if we don’t go top-to-bottom, so we’ll go in the order I think makes the most sense.
First is the bitmapInfo parameter. The CGBitmapInfo type is a bitmask that represents two options: the alpha component, which contains transparency information, and the byte order of the data. We’ll talk about the alpha component here; byte order is another topic altogether. On iOS, only some pixel formats are supported. Looking at this chart in the documentation, we can see that, for all supported pixel formats on iOS in the RGB color space, these are the CGBitmapInfo constants we can use:
- kCGImageAlphaNoneSkipFirst
- kCGImageAlphaNoneSkipLast
- kCGImageAlphaPremultipliedFirst
- kCGImageAlphaPremultipliedLast
We can do two things with the alpha component: skip it, or use it in a premultiplied format. The premultiplied flag tells the system to multiply the individual red, green, and blue components by the alpha value when storing it. So, instead of RGBA values of 1, 1, 1, and 0.5, it’s stored as 0.5, 0.5, 0.5, and 0.5. This is a performance-saving measure on iOS devices, and is done automatically to all of your PNG images by Xcode when you build for a device.
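To make the premultiplication arithmetic concrete, here is a minimal sketch in plain C using 8-bit components. The RGBA8 struct and both function names are hypothetical, and the round-to-nearest step is an assumption; the exact rounding CoreGraphics applies internally may differ.

```c
#include <stdint.h>

/* Hypothetical 8-bit RGBA pixel, used only for this illustration. */
typedef struct {
    uint8_t r, g, b, a;
} RGBA8;

/* Scale one color component by alpha, where 255 means fully opaque.
 * Adding 127 before dividing rounds to the nearest integer. */
static uint8_t premultiply_component(uint8_t c, uint8_t a) {
    return (uint8_t)((c * a + 127) / 255);
}

/* Return the premultiplied form of a straight (non-premultiplied) pixel.
 * The alpha component itself is left unchanged. */
RGBA8 premultiply(RGBA8 p) {
    RGBA8 out;
    out.r = premultiply_component(p.r, p.a);
    out.g = premultiply_component(p.g, p.a);
    out.b = premultiply_component(p.b, p.a);
    out.a = p.a;
    return out;
}
```

With 8-bit components, the 1, 1, 1, 0.5 pixel from the text corresponds roughly to 255, 255, 255, 128, which this sketch stores as 128, 128, 128, 128.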
So, for the bitmapInfo parameter, I generally pass kCGImageAlphaPremultipliedLast.
The penultimate parameter, colorspace, is a CGColorSpaceRef pointing to a color space you’ve created. This informs the context about the number of color components. Keep in mind that there’s one extra component for the alpha information if you’re not skipping it, so an RGB color space uses 4 components including alpha.
The width and height parameters are pretty simple: the number of pixels wide and high to make the context. Keep in mind that these are pixels, not points, so for Retina displays you may need to double the point dimensions you’re used to. You can use the scale property of the main UIScreen object as a quick “am I on a Retina device?” check.
Next, let’s talk about the first parameter: data. Here you have two options: pass in a pointer to a region of memory you’ve allocated for the image data, or pass NULL and have the graphics subsystem allocate it for you. If you’re trying to access pixel data, however, it’ll help to have a pointer to the data, so here you’d pass in memory you’ve allocated. How do you know how much is enough? Let’s look at the bitsPerComponent parameter. I usually use 8-bit components (again, see the chart linked above for valid options), so I would pass 8 for bitsPerComponent. Once you know that, along with the number of components per pixel (4 for RGB plus alpha), you can determine bytesPerRow easily:
size_t bytesPerRow = (bitsPerComponent * numberOfComponents * width) / 8;
And then, finally, we can determine how much data to use. I use the uint8_t data type to represent this, as it’s an unsigned 8-bit integer, perfect for our needs.
uint8_t *rawData = calloc((width * height) * numberOfComponents, sizeof(uint8_t));
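As a worked example of the sizing arithmetic, here is a small sketch in plain C. The helper names bytes_per_row and alloc_bitmap are hypothetical, and the sketch assumes rows are tightly packed with no padding, so each row holds every component of every pixel and nothing else.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Bytes needed for one tightly packed row of pixels. */
size_t bytes_per_row(size_t width, size_t bitsPerComponent,
                     size_t numberOfComponents) {
    return (bitsPerComponent * numberOfComponents * width) / 8;
}

/* Allocate a zero-filled buffer large enough for the whole bitmap.
 * Stores the row size in *outBytesPerRow when that pointer is non-NULL.
 * Returns NULL on allocation failure; the caller must free() the result. */
uint8_t *alloc_bitmap(size_t width, size_t height, size_t bitsPerComponent,
                      size_t numberOfComponents, size_t *outBytesPerRow) {
    size_t rowBytes = bytes_per_row(width, bitsPerComponent, numberOfComponents);
    if (outBytesPerRow != NULL) {
        *outBytesPerRow = rowBytes;
    }
    return calloc(height, rowBytes);
}
```

For a 640 × 480 image with 8-bit components and 4 components per pixel, each row takes 2,560 bytes and the whole buffer takes 1,228,800 bytes.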
The entire stack might look like this:
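What follows is a minimal sketch of those steps as a plain C function against CoreGraphics, so it only builds on Apple platforms. The function name CopyRGBAPixels and its out-parameter are hypothetical, and it assumes 8-bit RGBA with premultiplied alpha and tightly packed rows. The image dimensions come from CGImageGetWidth() and CGImageGetHeight().

```c
#include <CoreGraphics/CoreGraphics.h>
#include <stdint.h>
#include <stdlib.h>

/* Draws `image` into a freshly allocated RGBA buffer and returns it.
 * Returns NULL on failure; the caller must free() the result. */
uint8_t *CopyRGBAPixels(CGImageRef image, size_t *outBytesPerRow) {
    size_t width = CGImageGetWidth(image);
    size_t height = CGImageGetHeight(image);
    size_t bitsPerComponent = 8;
    size_t bytesPerPixel = 4; /* red, green, blue, alpha */
    size_t bytesPerRow = bytesPerPixel * width;

    uint8_t *rawData = calloc(height * bytesPerRow, sizeof(uint8_t));
    if (rawData == NULL) {
        return NULL;
    }

    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
    CGContextRef context = CGBitmapContextCreate(
        rawData, width, height, bitsPerComponent, bytesPerRow, colorSpace,
        (CGBitmapInfo)kCGImageAlphaPremultipliedLast);
    CGColorSpaceRelease(colorSpace); /* the context keeps its own reference */
    if (context == NULL) {
        free(rawData);
        return NULL;
    }

    /* Drawing the image is what fills rawData with pixel values. */
    CGContextDrawImage(context, CGRectMake(0, 0, width, height), image);
    CGContextRelease(context);

    if (outBytesPerRow != NULL) {
        *outBytesPerRow = bytesPerRow;
    }
    return rawData;
}
```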
The only thing in this code that we haven’t gone over so far is the call to CGContextDrawImage, which (surprisingly) draws the image. It takes three parameters: the context to draw into, a CGRect defining where to draw, and a CGImageRef for the image. You can obtain a CGImageRef from a UIImage using its -CGImage method.
Now that the image is drawn in our context, the rawData array will be filled with real, live image data! You can access it like so (modify the values of x and y as suits your needs):
int x = 0;
int y = 0;
int byteIndex = (bytesPerRow * y) + (x * bytesPerPixel);
uint8_t red = rawData[byteIndex];
uint8_t green = rawData[byteIndex + 1];
uint8_t blue = rawData[byteIndex + 2];
uint8_t alpha = rawData[byteIndex + 3];
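If you plan to read many pixels, it may help to wrap that indexing in a bounds-checked helper. The sketch below is plain C; the Pixel struct and read_pixel function are hypothetical names, and it assumes 4 bytes per pixel in red, green, blue, alpha order, as produced by kCGImageAlphaPremultipliedLast.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for one pixel's four components. */
typedef struct {
    uint8_t red, green, blue, alpha;
} Pixel;

/* Reads the pixel at (x, y) from an RGBA buffer with 4 bytes per pixel.
 * Returns 0 on success, or -1 if the coordinates fall outside the image. */
int read_pixel(const uint8_t *rawData, size_t width, size_t height,
               size_t bytesPerRow, size_t x, size_t y, Pixel *out) {
    const size_t bytesPerPixel = 4;
    if (rawData == NULL || out == NULL || x >= width || y >= height) {
        return -1;
    }
    size_t byteIndex = (bytesPerRow * y) + (x * bytesPerPixel);
    out->red = rawData[byteIndex];
    out->green = rawData[byteIndex + 1];
    out->blue = rawData[byteIndex + 2];
    out->alpha = rawData[byteIndex + 3];
    return 0;
}
```

Using size_t for the index avoids overflow on large images, and the bounds check keeps a bad coordinate from reading past the end of the buffer.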
And there you have it! Now that you’ve gotten the data out of your image, do whatever you want with it. Just remember the blog authors you read along the way when Facebook buys you for a billion dollars.
Note: The venerable Mike Ash published a similar article while this one was half-done in my drafts folder. I thought about scrapping it altogether, but since mine is iOS-specific, and with some prodding from a co-worker, I decided to press on. Go read Mike’s blog, too. It’s awesome.