Image File Formats - which to use?

(From: http://www.scantips.com/basics09.html)

Copyright © 1997-2005 by Wayne Fulton - All rights are reserved.

 

Briefly, the three most common image file formats, the most important for general purposes today, are TIF, JPG and GIF. I propose we also consider the new PNG format.

Best file types for these general purposes:

Properties
    Photographic images: Continuous tones, 24 bit color or 8 bit Gray, no text, few lines and edges
    Graphics, including logos or line art: Solid colors, up to 256 colors, with text or lines and sharp edges

Best Quality for Archived Master
    Photographic images: TIF or PNG (no JPG artifacts)
    Graphics, including logos or line art: PNG or GIF or TIF (no JPG artifacts)

Smallest File Size
    Photographic images: JPG with a higher Quality factor can be decent (JPG is questionable quality for archiving master copies)
    Graphics, including logos or line art: TIF LZW or GIF or PNG (graphics/logos usually permit reducing to 2 to 16 colors for smallest file size)

Maximum Compatibility (PC, Mac, Unix)
    Photographic images: TIF or JPG (the simplest programs may not read TIF LZW)
    Graphics, including logos or line art: TIF without LZW, or GIF

Worst Choice
    Photographic images: 256 color GIF is very limited color, and is a larger file than 24 bit JPG
    Graphics, including logos or line art: JPG compression adds artifacts, smears text and lines and edges

These are not the only choices, but they are good and reasonable choices.

TIF file format is the undisputed leader when best quality is required. TIF is very commonly used in commercial printing or professional environments.

Web pages require JPG or GIF or PNG image types, because that is all that browsers can show. On the web, JPG is the best choice (smallest file) for photo images, and GIF is most common for graphic images.

A common question is "How do I make my image files smaller?".

The JPG section following attempts to explain why the wonderfully small JPG files are NOT the best choice to be the master copy of your important image. However JPG cannot be beat for emailing photographs to friends, and for web page use. The JPG file format is the smallest by far, and a JPG copy should be used for such purposes (when file size is all important). For line art and graphic files (as opposed to photographic images), GIF files have historically been best, both for smallest size and for best quality.

But note that lowering scan resolution to reasonable values for the purpose is often the best file size improvement you can make.


PNG - Portable Network Graphics

(.PNG file extension, the pronunciation 'Ping' is specifically mentioned in the PNG Specification). PNG needs to be mentioned. PNG is not the number one file format, but you will want to know about it. PNG is not so popular yet, but its appeal is growing as people discover what it can do.

PNG was designed recently, with the experience advantage of knowing all that went before. The original purpose of PNG was to be a royalty-free GIF and LZW replacement (see LZW in the TIFF section below). However PNG supports a large set of technical features, including superior lossless compression from LZ77. Compression in PNG is called the ZIP method, and is like the 'deflate' method in PKZIP (and is royalty free).

But the big deal is that PNG incorporates special preprocessing filters that can greatly improve the lossless compression efficiency, especially for typical gradient data found in 24 bit photographic images. This filter preprocessing causes PNG to be a little slower than other formats when reading or writing the file (but all types of compression require processing time).
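To make the idea of filter preprocessing concrete, here is a minimal sketch in Python of one PNG filter type, Sub, which stores each byte as its difference from the byte to its left. It is only an illustration: the gradient row is made-up data, the function names are hypothetical, and a real PNG encoder chooses among several filter types per scanline and prefixes each row with a filter-type byte.

```python
import zlib

def sub_filter(row: bytes, bpp: int = 1) -> bytes:
    """PNG 'Sub' filter: each byte minus the byte bpp positions to its left (mod 256)."""
    out = bytearray(len(row))
    for i, value in enumerate(row):
        left = row[i - bpp] if i >= bpp else 0
        out[i] = (value - left) % 256
    return bytes(out)

def sub_unfilter(filtered: bytes, bpp: int = 1) -> bytes:
    """Reverse the Sub filter, recovering the original row exactly."""
    out = bytearray(len(filtered))
    for i, diff in enumerate(filtered):
        left = out[i - bpp] if i >= bpp else 0
        out[i] = (diff + left) % 256
    return bytes(out)

# Made-up 8-bit grayscale gradient: brightness rises one step every 4 pixels.
gradient_row = bytes(i // 4 for i in range(1024))
filtered_row = sub_filter(gradient_row)

raw_size = len(zlib.compress(gradient_row, 9))
filtered_size = len(zlib.compress(filtered_row, 9))
print("deflate on raw gradient:     ", raw_size, "bytes")
print("deflate on filtered gradient:", filtered_size, "bytes")
print("filter is reversible:", sub_unfilter(filtered_row) == gradient_row)
```

The gradient has 256 different values, which gives deflate little to match, but after filtering it becomes a short repeating pattern of small differences, which compresses far better. The filter itself loses nothing, since the unfilter step restores the row exactly.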

Photoshop 7 and Elements 2.0 correct this now, but earlier Adobe versions did not store or read the ppi number to scale print size in PNG files (Adobe previously treated PNG like GIF in this respect, indicating 72 ppi regardless). The ppi number never matters on the video screen or web, but it was a serious usability flaw for printing purposes. Without that stored ppi number, we must scale the image again every time we print it. If we understand this, it should be no big deal, and at home, we probably automatically do that anyway (digital cameras do the same thing with their JPG files). But sending a potentially unsized image to a commercial printer is a mistake, and so TIF files should be used in that regard.

Most other programs do store and use the correct scaled resolution value in PNG files. PNG stores resolution internally as pixels per meter, so when calculating back to pixels per inch, some programs may show excessive decimal digits, perhaps 299.999 ppi instead of 300 ppi (no big deal).
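The 299.999 figure falls out of simple arithmetic. The sketch below assumes the resolution is stored as a whole number of pixels per meter, which is why the round trip back to inches is slightly off; the function names are made up for illustration.

```python
METERS_PER_INCH = 0.0254

def ppi_to_pixels_per_meter(ppi: float) -> int:
    """Convert pixels per inch to the nearest whole number of pixels per meter."""
    return round(ppi / METERS_PER_INCH)

def pixels_per_meter_to_ppi(pixels_per_meter: int) -> float:
    """Convert a stored pixels-per-meter value back to pixels per inch."""
    return pixels_per_meter * METERS_PER_INCH

stored = ppi_to_pixels_per_meter(300)      # 300 / 0.0254 = 11811.02..., stored as 11811
read_back = pixels_per_meter_to_ppi(stored)
print(stored, "pixels per meter reads back as", round(read_back, 4), "ppi")
```

A program that rounds the result for display shows 300 ppi, and one that does not shows the extra decimals.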

PNG has additional unique features, like an Alpha channel for a variable transparency mask (any RGB or Grayscale pixel can be say 79% transparent and other pixels may individually have other transparency values). If indexed color, palette values may have similar variable transparency values. PNG files may also contain an embedded Gamma value so the image brightness can be viewed properly on both Windows and Macintosh screens. These should be wonderful features, but in many cases these extra features are not implemented properly (if at all) in many programs, and so these unique features must be ignored for web pages. However, this does not interfere with using the standard features, specifically for the effective and lossless compression.

Netscape 4.04 and MS IE 4.0 browsers added support for PNG files on web pages, not to replace JPG, but to replace GIF for graphics. For non-web and non-graphic use, PNG would compete with TIF. Most image programs support PNG, so basic compatibility is not an issue. You may really like PNG.

PNG may be of great interest, because its lossless compression is well suited for master copy data, and because PNG is a noticeably smaller file than LZW TIF. Perhaps about 25% smaller than TIF LZW for 24 bit files, and perhaps about 10% to 30% smaller than GIF files for indexed data.

Different images will have varying compression sizes, but PNG is an excellent replacement for GIF and 24 bit TIFF LZW files. PNG does define 48 bit files, but I don't know of any programs that support 48 bit PNG (not too many support 48 bit in any form).

Here are some representative file sizes for a 9.9 megabyte 1943x1702 24-bit RGB color image:

File type    File size
TIFF         9.9 megs
TIFF LZW     8.4 megs
PNG          6.5 megs
JPG          1.0 megs   (1.0 / 9.9 is 10% file size)
BMP          9.9 megs

Seems to me that PNG is an excellent replacement for TIFF too.
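The numbers in the file size table can be checked with a few lines of Python. This sketch assumes "megs" means millions of bytes, which is what makes 1943x1702 pixels at 3 bytes each come out to 9.9; the sizes are simply copied from the table.

```python
width, height, bytes_per_pixel = 1943, 1702, 3
uncompressed = width * height * bytes_per_pixel
print(f"uncompressed 24-bit data: {uncompressed:,} bytes ({uncompressed / 1e6:.1f} megs)")

# File sizes in megs, as listed in the table.
sizes_megs = {"TIFF": 9.9, "TIFF LZW": 8.4, "PNG": 6.5, "JPG": 1.0, "BMP": 9.9}
for name, megs in sizes_megs.items():
    print(f"{name:9s} {megs:4.1f} megs  {megs / sizes_megs['TIFF']:.0%} of uncompressed")

png_vs_lzw = 1 - sizes_megs["PNG"] / sizes_megs["TIFF LZW"]
print(f"PNG is {png_vs_lzw:.0%} smaller than TIFF LZW for this image")
```

For this one image the PNG comes out a little under a quarter smaller than TIFF LZW, in line with the rough 25% figure mentioned earlier.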

More PNG info at www.libpng.org/pub/png.

TIFF - Tag Image File Format

(.TIF file extension, pronounced Tiff) TIFF is the format of choice for archiving important images. TIFF is THE leading commercial and professional image standard. TIFF is the most universal and most widely supported format across all platforms, Mac, Windows, Unix. Data up to 48 bits is supported.

TIFF supports most color spaces, RGB, CMYK, YCbCr, etc. TIFF is a flexible format with many options. The data contains tags to declare what type of data follows. New types are easy to invent, and this versatility can cause incompatibility, but about any program anywhere will handle the standard TIFF types that we might encounter. TIFF can store data with bytes in either PC or Mac order (Intel or Motorola CPU chips differ in this way). This choice improves efficiency (speed), but all major programs today can read TIFF either way, and TIFF files can be exchanged without problem.
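As an illustration of how a reader copes with either byte order, here is a minimal Python sketch that inspects the 8-byte TIFF header: two bytes "II" (Intel, little-endian) or "MM" (Motorola, big-endian), then the number 42, then the offset of the first tag directory. The function name and the two sample headers are made up for the example, and it parses nothing beyond the header.

```python
import struct

def read_tiff_header(header: bytes):
    """Return (byte order description, offset of first tag directory)."""
    order_mark = header[:2]
    if order_mark == b"II":
        endian = "<"   # little-endian, Intel order
    elif order_mark == b"MM":
        endian = ">"   # big-endian, Motorola order
    else:
        raise ValueError("not a TIFF header: unknown byte order mark")
    magic, first_directory = struct.unpack(endian + "HI", header[2:8])
    if magic != 42:
        raise ValueError("not a TIFF header: magic number is not 42")
    order = "little-endian (Intel)" if endian == "<" else "big-endian (Motorola)"
    return order, first_directory

# Two hand-built headers carrying the same information in opposite byte orders.
intel_header = b"II" + struct.pack("<HI", 42, 8)
motorola_header = b"MM" + struct.pack(">HI", 42, 8)

print(read_tiff_header(intel_header))
print(read_tiff_header(motorola_header))
```

Because the file declares its own byte order in the first two bytes, a reader on either kind of machine can decode every later number correctly.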

Several compression formats are used with TIF. TIF with G3 compression is the universal standard for fax and multi-page line art documents.

TIFF image files optionally use LZW lossless compression. Lossless means there is no quality loss due to compression. Lossless guarantees that you can always read back exactly what you thought you saved, bit-for-bit identical, without data corruption. This is a critical factor for archiving master copies of important images. Most image compression formats are lossless, with JPG and Kodak PhotoCD PCD files being the main exceptions.

Compression works by recognizing repeated identical strings in the data, and replacing the many instances with one instance, in a way that allows unambiguous decoding without loss. This is fairly intensive work, and any compression method makes files slower to save or open.

LZW is most effective when compressing solid indexed colors (graphics), and is less effective for 24 bit continuous photo images. Featureless areas compress better than detailed areas. LZW is more effective for grayscale images than color. It is often hardly effective at all for 48 bit images (VueScan 48 bit TIF LZW is an exception to this, using an efficient data type that not all others use).
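The following is a bare-bones sketch of the LZW idea in Python, to show why featureless data compresses so much better than detailed data. It is a textbook simplification, not the exact variant used in TIF or GIF files (those add variable-width codes, clear codes and table size limits), and the test data is made up.

```python
import random
from typing import List

def lzw_compress(data: bytes) -> List[int]:
    """Replace repeated byte strings with codes from a growing string table."""
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate            # keep extending the known string
        else:
            codes.append(table[current])   # emit the longest known string
            table[candidate] = next_code   # remember the new, longer string
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes

def lzw_decompress(codes: List[int]) -> bytes:
    """Rebuild the same table while decoding, so no table needs to be stored."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = bytearray(previous)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:            # string defined by the step in progress
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out += entry
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return bytes(out)

flat = bytes([200]) * 1000                                  # a featureless area
rng = random.Random(0)
detailed = bytes(rng.randrange(256) for _ in range(1000))   # noisy detail

print("flat:    ", len(lzw_compress(flat)), "codes for 1000 bytes")
print("detailed:", len(lzw_compress(detailed)), "codes for 1000 bytes")
```

The flat data collapses to a handful of codes because each code stands for a longer run than the last, while the noisy data rarely repeats a string and so needs a code for almost every byte.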

LZW is Lempel-Ziv-Welch, named for Israeli researchers Abraham Lempel and Jacob Ziv, who published IEEE papers in 1977 and 1978 (now called LZ77 and LZ78) which were the basis for most later work in compression. Terry Welch built on this, and published and patented a compression technique that is called LZW now. This is the 1984 Sperry patent (Sperry is now Unisys) involved in TIF LZW and GIF (and V.42bis for modems). There was much controversy about a royalty for LZW for GIF, but royalty was always paid for LZW for TIF files and for V.42bis modems. International patents expired in mid-2004.

Image programs of any stature will provide LZW, but simple or free programs often do not pay LZW patent royalty to provide LZW, and then its absence can cause an incompatibility for compressed files.

It is not necessary to say much about TIF. It works, it's important, it's great, it's practical, it's the standard universal format for high quality images, it simply does the best job the best way. Give TIF very major consideration, both for photos and documents, especially for archiving anything where quality is important.

But TIF files for photo images are generally pretty large. Uncompressed TIFF files are about the same size in bytes as the image size in memory. Regardless of the novice view, this size is a plus, not a disadvantage. Large means lots of detail, and it's a good thing. 24 bit RGB image data is 3 bytes per pixel. That is simply how large the image data is, and TIF LZW stores it with recoverable full quality in a lossless format (and again, that's a good thing). $200 today buys BOTH a 60 GB 7200 RPM disk and 512 MB of memory so it is quite easy to plan for and deal with the size.

There are situations for less serious purposes when the full quality may not always be important or necessary. JPEG files are much smaller, and are suitable for non-archival purposes, like photos for read-only email and web page use, when small file size may be more important than maximum quality. JPG has its important uses, but be aware of the large price in quality that you must pay for the small size of JPG. It is not without cost.

 

JPEG - Joint Photographic Experts Group

(.JPG file extension, pronounced Jay Peg). This is the right format for those photo images which must be very small files, for example, for web sites or for email. JPG is often used on digital camera memory cards, but RAW or TIF format may be offered too, to avoid it. The JPG file is wonderfully small, often compressed to perhaps only 1/10 of the size of the original data, which is a good thing when modems are involved. However, this fantastic compression efficiency comes with a high price. JPG uses lossy compression (lossy meaning "with losses to quality"). Lossy means that some image quality is lost when the JPG data is compressed and saved, and this quality can never be recovered.

File compression methods for most other file formats are lossless, and lossless means "fully recoverable". Lossless compression always returns the original data, bit-for-bit identical without any question about differences (losses). We are used to saving data to a file, and getting it all back when we next open that file. Our Word and Excel documents, our Quicken data, any data at all, we cannot imagine NOT getting back exactly the original data. TIF, PNG, GIF, BMP and most other image file formats are lossless too. This integrity requirement does limit efficiency, limiting compression of photo image data to maybe only 10% to 40% reduction in practice (graphics can be smaller). But most compression methods have full lossless recoverability as the first requirement.

JPG files don't work that way. JPG is a big exception. JPG compression is not lossless. JPG compression is lossy. Lossy means "with losses" to image quality. JPG compression has very high efficiency (relatively tiny files) because it is intentionally designed to be lossy, designed to give very small files without the requirement for full recoverability. JPG modifies the image pixel data (color values) to be more convenient for its compression method. Tiny detail that doesn't compress well (minor color changes) can be ignored (not retained). This allows amazing size reductions on the remainder, but when we open the file and expand the data to access it again, it is no longer the same data as before. This lost data is like lost purity or integrity. It can vary in degree, it can be fairly good, but it is always unrecoverable corruption of the data. This makes JPG be quite different from all the other usual file format choices. This will sound preachy, but if your use is critical, you need a really good reason to use JPG.
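The difference between lossless and lossy can be demonstrated in a few lines of Python. This sketch uses zlib (the deflate method) for the lossless round trip, and a deliberately crude stand-in for lossy compression that simply throws away the low bits of each byte. That stand-in is NOT how JPG works; it only shows what "cannot be recovered" means. The data is made up.

```python
import zlib

# Made-up stand-in for image data: 64 rows of 192 bytes with gentle variation.
original = bytes((x + 3 * y) % 256 for y in range(64) for x in range(192))

# Lossless: compress, decompress, and get every bit back.
packed = zlib.compress(original)
restored = zlib.decompress(packed)

# Crude lossy stand-in (not JPG's method): discard the low 4 bits of each byte.
coarse = bytes(value & 0xF0 for value in original)
coarse_packed = zlib.compress(coarse)
coarse_restored = zlib.decompress(coarse_packed)

print("lossless round trip identical to original:", restored == original)
print("lossy copy identical to original:         ", coarse_restored == original)
print("sizes in bytes:", len(original), "original,",
      len(packed), "lossless,", len(coarse_packed), "lossy stand-in")
```

The discarded bits are gone for good. No later step can tell what they were, which is the sense in which lossy data is unrecoverable.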

There are times and places this compromise is an advantage. Web pages and email files need to be very small, to be fast through the modem, and some uses may not need maximum quality. In some cases, we are willing to compromise quality for size, sacrificing for the greater good. And this is the purpose of JPG. There is no magic answer providing both high compression and high quality. We don't get something for nothing, and the small size has a cost in quality. Still, mild quality losses may sometimes be acceptable for less critical purposes. The sample JPG images on the next page show the kind of problem to expect from excessive compression.

 

Even worse, more quality is lost every time the JPG file is compressed and saved again, so editing a JPG image and saving it again is always a questionable decision. You should instead just discard the old JPG file and start over from your archived lossless TIF master, saving that change as the new JPG copy you need. JPG compression can be selected to be better quality in a larger file, or to be lesser quality in a smaller file. When you save a JPG file, your FILE - SAVE AS dialog box should have an option for the degree of file compression.

Many programs (Photoshop, Elements, PhotoImpact, PhotoDeluxe) call this setting JPG Quality. Other programs (Paint Shop Pro and Corel) call it JPG Compression, which is the same thing, except Quality runs numerically the opposite direction from Compression. High Quality corresponds to Low Compression. Typical values might be 85 Quality, or 15 Compression. These numbers are relative and have no absolute meaning. Compression in one program will vary from another even at the same number. The number is also not a percentage of anything, and Quality 100 does NOT mean no compression, it is just an arbitrary starting point. JPG will always compress, and Quality 90 is not so different from Quality 100 in practice. There's very little improvement over 95.

Digital cameras offer JPG quality choices too. Large image files do fill memory cards fast. You can buy more and larger cards, or you can compromise by sacrificing image quality for small file size (but I hope you won't go overboard with this). The camera menu will have two kinds of resizing choices. One size choice actually creates a smaller image size (pixels), resampled smaller from the original standard size of the CCD chip, for example perhaps to half size in pixel dimensions. The correct image size in pixels is related to your goal for using the image. For example you may need enough pixels to print 8x10 inches on paper (6 megapixels), or you may only want a small image for video screen viewing (1 megapixel).

Regardless of that selected image size in pixels, the camera menu will also offer a smaller file size choice in bytes, related to quality, via JPG file compression. This menu will offer a best quality setting which is the largest file, and maybe intermediate sizes, and a smallest but worst quality choice. My Nikon D70 offers three JPG file size choices of Fine (about 1/4 size in bytes), Norm (about 1/8 size in bytes), or Basic (about 1/16 size in bytes), comparing compressed file size to the uncompressed size. The best (largest) JPG file size will still contain JPG artifacts, but very mild, essentially undetectable, vastly better than the smallest file choice. Even better, some cameras also offer a RAW or TIF format to bypass JPG problems altogether. These images may be large, but memory cards are becoming less expensive ($100 for 1 GB), and larger or multiple cards are by far the best quality solution.

With either scanner or camera images, individual image JPG file sizes will vary a little, because detail in the individual image greatly affects compressibility. Large featureless areas (skies, walls, etc.) compress much better (smaller) than images containing much tiny detail all over (a tree full of leaves). Therefore images of the same size in pixels and using the same JPG quality setting, but with differing image content, will vary a little in JPG file size, with extremes perhaps over a 2 to 1 range around the average size.

Since each image varies a little, the file size is only a crude indicator of JPG quality, however it is a rough guide. For ordinary color images (24 bit RGB), the uncompressed image size when opened in memory is always 3 bytes per pixel. For example, an image size of 3000x2000 pixels is 6 megapixels, and therefore by definition, when uncompressed (when opened), this memory size is 3X that in bytes, or 18 MB. That is simply how large the 24 bit data is. The compressed JPG file size will be smaller (same pixels, but fewer bytes). A High quality JPG file size might be compressed to 50% to 25% of the uncompressed size (bytes). A JPG file size only 10% of that image's size in memory would be the general ballpark for a fair tradeoff of quality vs. file size for color images of web page quality (but not best quality).
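The arithmetic above is easy to put into a small Python sketch. The fractions are only the rough guidelines quoted in the text (including the 1/4, 1/8 and 1/16 camera settings mentioned earlier), not properties of any particular program, and the function name is made up.

```python
def uncompressed_bytes(width: int, height: int, bytes_per_pixel: int = 3) -> int:
    """Size of the image data in memory: 3 bytes per pixel for 24 bit RGB."""
    return width * height * bytes_per_pixel

size = uncompressed_bytes(3000, 2000)
print(f"3000x2000 pixels, 24 bit RGB: {size / 1e6:.0f} MB in memory")

# Rough fractions of the uncompressed size, taken from the guidelines in the text.
rough_fractions = [
    ("high quality, larger file", 0.50),
    ("high quality, smaller file (camera Fine, about 1/4)", 0.25),
    ("camera Norm, about 1/8", 1 / 8),
    ("web page ballpark, 10%", 0.10),
    ("camera Basic, about 1/16", 1 / 16),
]
for label, fraction in rough_fractions:
    print(f"{label}: about {size * fraction / 1e6:.1f} MB")
```

Real files will land on either side of these estimates, since the detail in each individual image changes how well it compresses.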

The 10% figure is not very precise, and of course refers only roughly to the average image, since each individual image varies a little. Color compresses better than grayscale files, so grayscale doesn't decrease as much. These are very rough guidelines; your image, your photo program, your purpose, and your personal criteria or tolerance will all be a little different.

It is difficult to describe the JPG quality losses, except by seeing an example image (next page). JPG does not discard pixels. Instead it changes the color detail of some pixels in an abstract mathematical way. JPG is mathematically complex and requires considerable CPU processing power to decompress an image. JPG also allows several parameters, and programs don't all use the same JPG rules. Programs vary, some programs take shortcuts to load JPG faster but with less quality (browsers for example), and other programs load JPG slower with better quality. Final image quality can depend on the image details, on the degree of compression, on the method used by the compressing JPG program, and on the method used by the viewing JPG program.

GIF - Graphics Interchange Format

(.GIF file extension) There have been raging debates about the pronunciation. The designers of GIF say it is correctly pronounced to sound like Jiff. But that seems counter-intuitive, and up in my hills, we say it sounding like Gift (without the t).

GIF was developed by CompuServe to show images online (in 1987 for 8 bit video boards, before JPG and 24 bit color were in use). GIF uses indexed color, which is limited to a palette of only 256 colors (next page). GIF was a great match for the old 8 bit 256 color video boards, but is inappropriate for today's 24 bit photo images.

GIF files do NOT store the image's scaled resolution ppi number, so scaling is necessary every time one is printed. This is of no importance for screen or web images. GIF file format was designed for CompuServe screens, and screens don't use ppi for any purpose. Our printers didn't print images in 1987, so it was useless information, and CompuServe simply didn't bother to store the printing resolution in GIF files.

GIF is still an excellent format for graphics, and this is its purpose today, especially on the web. Graphic images (like logos or dialog boxes) use few colors. Being limited to 256 colors is not important for a 3 color logo. A 16 color GIF is a very small file, much smaller, and more clear than any JPG, and ideal for graphics on the web.

Graphics generally use solid colors instead of graduated shades, which limits their color count drastically, which is ideal for GIF's indexed color. GIF uses lossless LZW compression for relatively small file size, as compared to uncompressed data. GIF files offer optimum compression (smallest files) for solid color graphics, because objects of one exact color compress very efficiently in LZW. The LZW compression is lossless, but of course the conversion to only 256 colors may be a great loss. JPG is much better for 24 bit photographic images on the web. For those continuous tone images, the JPG file is also very much smaller (although lossy). But for graphics, GIF files will be smaller, and better quality, and (assuming no dithering) pure and clear without JPG artifacts.

If GIF is used for continuous tone photo images, the limited color can be poor, and the 256 color file is quite large as compared to JPG compression, even though it is 8 bit data instead of 24 bits. Photos might typically contain 100,000 different color values, so the image quality of photos is normally rather poor when limited to 256 colors. 24 bit JPG is a much better choice today. The GIF format may not even be offered as a save choice until you have reduced the image to 256 colors or less.

So for graphic art or screen captures or line art, GIF is the format of choice for graphic images on the web. Images like a company logo or screen shots of a dialog box should be reduced to 16 colors if possible and saved as a GIF for smallest size on the web. A complex graphics image that may look bad at 16 colors might look very good at say 48 colors (or it may require 256 colors if photo-like). But often 16 colors is fine for graphics, with the significance that the fewer number of colors, the smaller the file, which is extremely important for web pages.

GIF optionally offers transparent backgrounds, where one palette color is declared transparent, so that the background can show through it. The GIF File - Save As dialog box usually has an Option Button to specify which one GIF palette index color is to be transparent.

Interlacing is an option that quickly shows the entire image in low quality, and the quality sharpens as the file download completes. Good for web images, but it makes the file slightly larger.

GIF files use a palette of indexed colors, and if you thought 24 bit RGB color was kinda complicated, then you ain't seen nuthin' yet.

For GIF files, a 24 bit RGB image requires conversion to indexed color. More specifically, this means conversion to 256 colors, or less. Indexed Color can only have 256 colors maximum. There are however selections of different ways to convert to 256 colors.
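As a minimal sketch of what indexed color means, the Python below stores each distinct color once in a palette and replaces every pixel with a one-byte index into it. It only handles images that already have 256 colors or fewer; choosing a palette for an image with more colors (the different conversion methods mentioned above) is not shown. The function name and the tiny "logo" are made up.

```python
def to_indexed(pixels):
    """Convert a list of (R, G, B) tuples to (palette, one-byte indices)."""
    palette = []          # each distinct color, stored once
    lookup = {}           # color -> its position in the palette
    indices = bytearray() # one byte per pixel
    for color in pixels:
        if color not in lookup:
            if len(palette) == 256:
                raise ValueError("more than 256 colors; reduce the colors first")
            lookup[color] = len(palette)
            palette.append(color)
        indices.append(lookup[color])
    return palette, bytes(indices)

# A made-up 100 pixel "logo" with three solid colors.
logo = [(255, 255, 255)] * 90 + [(200, 0, 0)] * 8 + [(0, 0, 0)] * 2
palette, indices = to_indexed(logo)
restored = [palette[i] for i in indices]

print("colors in palette:", len(palette))
print("24 bit data:", len(logo) * 3, "bytes")
print("indexed data:", len(indices), "bytes plus", len(palette) * 3, "palette bytes")
print("restored exactly:", restored == logo)
```

For a graphic that really has only a few colors, nothing is lost by indexing, and the pixel data shrinks to a third before any compression is applied.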

 


Department of Communication, Seton Hall University