Binary files: Exif image info

Images taken by digital cameras are usually stored in JPEG format. Besides the main image, such files also contain a thumbnail and various information about how the image was taken, in a format called Exif (Exchangeable image file format).

Write a program that takes a JPEG file named on the command line, and if it has an Exif segment, prints out the date the photo was taken and writes the thumbnail to a file named name_thumb.jpg, where name is the original name without extension.
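As a small illustration of the naming rule above, here is one possible way to derive the thumbnail file name. This is only a sketch: the function name thumbnail_name is made up for this example, and it assumes that "without extension" means dropping the last dot-suffix only.

```python
import os

def thumbnail_name(path):
    """Build name_thumb.jpg from an input file called name.ext."""
    # splitext removes only the last extension, e.g. ".jpg" or ".jpeg"
    base, _ext = os.path.splitext(path)
    return base + "_thumb.jpg"

if __name__ == "__main__":
    print(thumbnail_name("holiday.jpg"))
```

Any directory part of the path is kept, so the thumbnail lands next to the original file.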

A JPEG file starts with bytes 0xFF 0xD8, and then contains a variable number of segments. Each segment starts with a two-byte marker: 0xFF followed by a second byte identifying the segment type. Exif information is stored in the first segment, which must have the marker 0xFF 0xE1 (APP1, application data).
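The check described above could be sketched as follows. The function name has_exif_segment is hypothetical, and the content of the 6-byte Exif header (the ASCII letters "Exif" followed by two zero bytes) is taken from the Exif standard rather than from this text.

```python
def has_exif_segment(data):
    """Return True if data starts like a JPEG file whose first segment is Exif."""
    return (
        len(data) >= 12
        and data[0:2] == b"\xff\xd8"          # JPEG start marker
        and data[2:4] == b"\xff\xe1"          # APP1 marker of the first segment
        and data[6:12] == b"Exif\x00\x00"     # Exif header (assumed from the standard)
    )

if __name__ == "__main__":
    sample = b"\xff\xd8\xff\xe1\x00\x10Exif\x00\x00"
    print(has_exif_segment(sample))
```

Bytes 4 and 5 hold the segment size and are skipped here, since the check does not need them.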

All 2-byte and 4-byte integer values mentioned in the following are unsigned values stored in binary, in little-endian byte order.
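Reading such values could look like the sketch below. The helper names read_u16 and read_u32 are made up for this example; they simply follow the little-endian assumption stated above.

```python
def read_u16(data, pos):
    """Unsigned 2-byte little-endian integer starting at byte position pos."""
    return int.from_bytes(data[pos:pos + 2], "little")

def read_u32(data, pos):
    """Unsigned 4-byte little-endian integer starting at byte position pos."""
    return int.from_bytes(data[pos:pos + 4], "little")

if __name__ == "__main__":
    raw = bytes([0x03, 0x90, 0x2A, 0x01, 0x00, 0x00])
    # The least significant byte comes first: 03 90 is 0x9003
    print(hex(read_u16(raw, 0)), read_u32(raw, 2))
```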

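The APP1 segment has the following structure: the 2-byte marker 0xFF 0xE1, a 2-byte segment size, the 6-byte Exif header, and then the TIFF header, from which all offsets are measured.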

An Image File Directory has the following structure:

All offsets are measured from the start of the TIFF header, which is 12 bytes from the start of the file (2 x 2 bytes for the two markers, 2 bytes for the size, 6 bytes for the Exif header). Thus, adding 12 to an offset gives the position of the data from the start of the file.

In the Exif IFD, we look for a directory entry with tag 0x9003. This is the tag for the date and time when the original image was taken. The 4-byte offset in this entry points to a 20-byte null-terminated string holding the date in YYYY:MM:DD HH:MM:SS format, which the program should print.
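The date lookup could be sketched as below. Several things here are assumptions not spelled out in this text: the directory layout is the usual TIFF one (a 2-byte entry count followed by 12-byte entries, each holding a 2-byte tag, a 2-byte type, a 4-byte count and a 4-byte value or offset), the offset of the Exif IFD is already known and passed in, and the names find_entry_value and read_date_taken are made up for illustration.

```python
TIFF_START = 12        # position of the TIFF header from file start
TAG_DATE_TAKEN = 0x9003

def u16(data, pos):
    return int.from_bytes(data[pos:pos + 2], "little")

def u32(data, pos):
    return int.from_bytes(data[pos:pos + 4], "little")

def find_entry_value(data, ifd_offset, tag):
    """Return the 4-byte value/offset of the entry with this tag, or None."""
    pos = TIFF_START + ifd_offset
    count = u16(data, pos)                 # number of directory entries
    for i in range(count):
        entry = pos + 2 + 12 * i           # assumed 12-byte entries
        if u16(data, entry) == tag:
            return u32(data, entry + 8)    # skip tag, type and count fields
    return None

def read_date_taken(data, exif_ifd_offset):
    """Return the YYYY:MM:DD HH:MM:SS string, or None if the tag is absent."""
    offset = find_entry_value(data, exif_ifd_offset, TAG_DATE_TAKEN)
    if offset is None:
        return None
    start = TIFF_START + offset            # add 12 to get the file position
    raw = data[start:start + 20]
    return raw.split(b"\x00", 1)[0].decode("ascii")
```

In the Exif standard, the offset of the Exif IFD is itself stored in an entry of the first directory (tag 0x8769), so the same find_entry_value helper could be used to obtain it.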

In IFD1, we look for two directory entries: one entry with tag 0x201, whose 4-byte offset points to the JPEG thumbnail data; and another entry with tag 0x202, whose 4-byte value represents the thumbnail image size. Thus, we read data from the given offset, of the indicated amount, and write it to the thumbnail file.
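The thumbnail step could be sketched in the same style. As before, the 12-byte directory entry layout comes from the TIFF format rather than from this text, the offset of IFD1 is assumed to be already known, and the function names are hypothetical.

```python
TIFF_START = 12            # position of the TIFF header from file start
TAG_THUMB_OFFSET = 0x0201
TAG_THUMB_SIZE = 0x0202

def u16(data, pos):
    return int.from_bytes(data[pos:pos + 2], "little")

def u32(data, pos):
    return int.from_bytes(data[pos:pos + 4], "little")

def find_entry_value(data, ifd_offset, tag):
    """Return the 4-byte value/offset of the entry with this tag, or None."""
    pos = TIFF_START + ifd_offset
    for i in range(u16(data, pos)):        # 2-byte entry count comes first
        entry = pos + 2 + 12 * i           # assumed 12-byte entries
        if u16(data, entry) == tag:
            return u32(data, entry + 8)
    return None

def extract_thumbnail(data, ifd1_offset):
    """Return the thumbnail bytes, or None if a tag is missing or data is short."""
    offset = find_entry_value(data, ifd1_offset, TAG_THUMB_OFFSET)
    size = find_entry_value(data, ifd1_offset, TAG_THUMB_SIZE)
    if offset is None or size is None:
        return None
    start = TIFF_START + offset
    thumb = data[start:start + size]
    return thumb if len(thumb) == size else None

def write_thumbnail(data, ifd1_offset, out_path):
    """Write the thumbnail to out_path; return False if there is none."""
    thumb = extract_thumbnail(data, ifd1_offset)
    if thumb is None:
        return False
    with open(out_path, "wb") as f:        # binary mode: raw JPEG bytes
        f.write(thumb)
    return True
```

Returning the bytes separately from writing them keeps the parsing part easy to check on small hand-built inputs.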

For more information see this page at MIT, some other explanations with figures or the standard.

Marius Minea
Last modified: Fri Mar 21 4:45:00 EET 2013