Unlock the secrets of scanner technology

By Faithe Wempen

Scanners are mysterious "black boxes" to most people: they put a picture into the scanner, the picture appears on the computer screen, and they aren't sure how it happened. Often, even IT professionals are at a loss to explain how scanners work and which features are best. To help you understand how scanners work, I have written a detailed guide to their parts, functions, and features, including a synopsis of important scanner terminology.

How a scanner scans

For this discussion, let’s assume we’re talking about a flatbed scanner—the kind that looks like a small photocopier. You place your original on the glass, close the lid, and then either press a button on the scanner or issue a command in an application to start the scan.

Inside the scanner is a fixed linear array called a charge-coupled device (CCD). It's composed of clusters of photosensitive cells that convert light to electrical charge, rather like the compound eye of an insect. A light bar moves across the object being scanned, and a mirror or system of mirrors reflects the light to a lens and then into the CCD. Some scanners have two mirrors; others have three. The mirrors are slightly curved so that they compact the image as they reflect it.

Each of the photosensitive cells produces an electrical signal proportional to the strength of the reflected light that hits it, and that signal is converted to a binary number, which is then sent to the computer. Dark areas have lower numbers; light ones have higher numbers.
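
As a simplified sketch of that conversion step, here's what the quantization might look like in code, assuming a signal normalized to the 0.0-1.0 range and an eight-bit output (both illustrative choices, not any particular scanner's design):

    # Simplified sketch: quantize a cell's signal (normalized 0.0-1.0)
    # to an n-bit binary value. Dark areas -> low numbers, light -> high.
    def quantize(signal: float, bits: int = 8) -> int:
        levels = 2 ** bits
        return min(int(signal * levels), levels - 1)

    print(quantize(0.1))  # dark area  -> 25
    print(quantize(0.9))  # light area -> 230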

The CCD receives data for one line of the image at a time, sends it on to the computer, and then tells a stepper motor to advance the lamp to the next line. Therefore, the CCD does not have to have a number of cells equivalent to the number of pixels in the entire page, only the number of pixels in a single row. On a scanner that can accept an 8.5-inch sheet at 300 dots per inch (dpi), that equates to about 2,600 cells. Scanners with a higher dpi have more CCD cells; for example, there are about 10,400 cells in a 1,200-dpi model. This is the scanner’s horizontal dpi or horizontal resolution. It is also called the x-direction sampling rate.
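
That row arithmetic is easy to verify; the exact products come out a little lower than the rounded figures above:

    # CCD cells needed for one row = scan width (inches) x horizontal dpi
    def cells_per_row(width_inches: float, horizontal_dpi: int) -> int:
        return round(width_inches * horizontal_dpi)

    print(cells_per_row(8.5, 300))   # 2550  (the "about 2,600" above)
    print(cells_per_row(8.5, 1200))  # 10200 (the "about 10,400" above)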

Some inexpensive scanners do not have a CCD and mirror/lamp/lens system but instead have a contact image sensor (CIS) consisting of rows of red, green, and blue LEDs. The image sensor mechanism, consisting of 300 to 600 sensors spanning the width of the scan area, sits very close to the glass. When the image is scanned, the LEDs combine to provide white light, and the illuminated image is then captured by the row of sensors. CIS scanners are cheaper, lighter, and thinner but do not provide the same level of quality and resolution found in most CCD scanners.

Many scanners also report a vertical dpi or vertical resolution in their specifications, also called the y-direction sampling rate. This is the number of separate lines per inch recorded as the light moves down the page. It's technologically easier for a manufacturer to make the stepper motor advance the light bar in finer increments than to build a CCD with more cells, so you will often see scanners advertised with a higher vertical than horizontal resolution, such as 600 x 1,200 dpi. When you see two numbers like that, the first one is always the horizontal.

Some scanners' specifications report very high resolutions, such as 4,800 x 4,800 dpi. When you see a scanner that purports to offer such a high resolution, be skeptical and look closer. The reported figure may not be the actual hardware resolution of the scanner itself; it probably refers to a software-enhanced resolution instead. One method of software enhancement is called interpolation, which inserts extra pixels between the actual scanned ones and uses a mathematical formula to determine their values.

For example…

If one pixel has a value of 10 and the next one 20, interpolation would insert a pixel between them with the value of 15. I’m using regular base 10 numbers here for the sake of discussion, but in a real scanning situation, the numbers would be binary values.
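
In code, the idea looks something like this minimal sketch, which doubles the resolution of one scan line by averaging each pair of neighboring samples (plain linear interpolation; real drivers may use more elaborate formulas):

    # Insert one interpolated pixel between each pair of scanned pixels.
    def interpolate_line(samples):
        out = []
        for a, b in zip(samples, samples[1:]):
            out.append(a)
            out.append((a + b) // 2)  # new pixel: average of its neighbors
        out.append(samples[-1])
        return out

    print(interpolate_line([10, 20, 40]))  # [10, 15, 20, 30, 40]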

Scanner light types

When desktop scanners were first introduced, many manufacturers used fluorescent bulbs as light sources. However, fluorescent bulbs cannot emit consistent white light, and they produce heat that can distort the other optical components. Therefore, most manufacturers moved to cold-cathode bulbs as soon as was practical. Cold-cathode bulbs resemble fluorescent bulbs, but they have no filament; therefore, they operate at much lower temperatures and are more reliable. More recently, most scanner manufacturers have moved to xenon bulbs, which produce a very stable, full-spectrum light source, although they use slightly more power than fluorescent and cold-cathode bulbs.

How color scanners are different

In black-and-white or grayscale scanners, a single light bar and mirror system transmits grayscale data to the CCD. In a color scanner, however, there must be three separate evaluations of each pixel in the image: the amounts of red, green, and blue. There are several methods of gathering this data during the scanning process.

Early color scanners made three passes across the image, gathering each color's data separately. This worked well with the limited technology available but was very slow. A slightly newer method uses three colored lights, all of which move down the page together. Each light has its own mirror system, or a single mirror system with three filters, so each color's data can be captured separately.

A note on registration

On a system that separately gathers data for each color, there is the potential for misalignment errors. The accuracy of a scanner’s color alignment is known as registration.
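
Here's a toy sketch of what a registration error looks like in the data; the numbers are invented purely for illustration:

    # Toy illustration of a registration error: the red channel is shifted
    # one scan line relative to green and blue.
    green = [0, 0, 255, 255, 0, 0]   # one column of a scanned edge
    blue  = [0, 0, 255, 255, 0, 0]
    red   = [0, 255, 255, 0, 0, 0]   # misaligned by one line
    for r, g, b in zip(red, green, blue):
        print((r, g, b))  # mismatched lines appear as red/cyan fringes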

The most modern color technology employs three separate CCDs (or a single CCD with three stripes), each one collecting data about a single color. This is the most common method in scanners sold today. Yet scanners that use separate CCD areas for each color can be even further differentiated. Some use a beam splitter, in which the single image coming from the mirror is split into the three colors, each of which is read by a different CCD. Others coat each CCD with a film so that it can read only one of the colors from an unsplit beam. Beam splitting often produces a better scan result but is more expensive.


Understanding color depth

The original scanners were one-bit devices. They were black and white only, with no grayscale and no color, and transmitted a single bit of data for each cell in the CCD. The number of bits is the number of binary digits used to represent each pixel's value. In a one-bit system, each pixel is either zero or one: off or on. Then came four-bit scanners (16 unique values) and eight-bit scanners (256 unique values).

Mathematics refresher

To determine the number of unique values in a certain number of bits, take 2 to the nth power, where n is the number of bits. For example, four-bit is 2^4, or 16, and eight-bit is 2^8, or 256.
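
Or, as a quick check in code:

    # Unique values representable in n bits: 2 ** n
    for bits in (1, 4, 8, 24):
        print(bits, "bits ->", 2 ** bits, "values")
    # 24 bits -> 16777216, the "over 16 million colors" of true color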

Today, all scanners support at least 24-bit scanning. This is known as true color, and it uses 24 binary digits to represent each pixel. That color depth provides over 16 million colors, which is more than the human eye can distinguish. So in theory, 24-bit color depth is the most you would ever need in a scanner. However, newer scanners advertise 30-bit or even 36-bit support. Why would they do that, when the best that most monitors can display and most printers can print is 24-bit?

The answer is a bit complicated (no pun intended). A 24-bit scanner offers an eight-bit range (256 levels) for each primary color (8+8+8=24), but a few of the least significant bits are lost in noise, and any post-scanning tonal corrections further reduce the range. Therefore, you want the scanner driver to make any brightness and color corrections before delivering the final scan. A scanner that starts with a 30-bit depth has a wider range to work with, so it can make tonal corrections and still end up with a solid 24 bits. Through the scanner driver in your OS, you can control which 24 of those 30 bits are kept and which are discarded by changing the gamma curve setting.
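
As an illustration, here's a sketch of that reduction step: mapping a ten-bit channel sample (a 30-bit scanner captures ten bits per primary color) down to eight bits through a gamma curve. The gamma value of 2.2 is a placeholder, not any particular driver's default.

    # Map a 10-bit sample (0-1023) to 8 bits (0-255) via a gamma curve.
    # Changing gamma changes which of the input levels survive the cut.
    def gamma_map_10_to_8(sample: int, gamma: float = 2.2) -> int:
        normalized = sample / 1023            # 0.0 .. 1.0
        corrected = normalized ** (1 / gamma)
        return round(corrected * 255)

    print(gamma_map_10_to_8(100))   # shadow detail gets stretched apart
    print(gamma_map_10_to_8(900))   # highlights get squeezed together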

Dynamic range

Dynamic range is a measure of the scanner’s ability to distinguish light and dark. The scale runs from zero to four, and most inexpensive desktop scanners rate about 2.4 or so. Higher-end scanners might have a rating of 2.8 to 3.2. Professional-quality scanners can approach 3.8 in their dynamic range.
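
These ratings are usually optical densities on a base-10 logarithmic scale (an assumption here; spec sheets rarely spell it out), so each step of 1.0 represents a tenfold increase in the light-to-dark ratio the scanner can distinguish:

    # Assuming the 0-4 ratings are log10 optical densities (a common
    # convention, though not stated in every spec sheet):
    for rating in (2.4, 3.2, 3.8):
        print(rating, "->", round(10 ** rating), ": 1 contrast ratio")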

Scanning speed

Speed refers to the amount of time that the scanner takes to scan the image. The actual speed varies depending on what you are scanning (a full page vs. a smaller item), at what resolution you are scanning (150 dpi, 300 dpi, etc.), whether you are scanning in grayscale or color (most color scanners can do either), and whether you are using optical character recognition (OCR) software to import scanned text.

An extremely quick speed rating, such as 9 to 20 seconds, likely refers to the quickest time the scanning head can take to move from the top to the bottom of the glass. A slower speed rating, such as 45 to 60 seconds, likely refers to the time for a typical scan. This number is not very meaningful, however, unless the specification also states the exact resolution and image size. For example, a scanner might take 60 seconds to scan a 4" x 6" color photo or a full-page black-and-white drawing, 90 seconds to scan an OCR page of text into a word processor, and 150 seconds to scan an 8.5" x 11" color photo.
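
Resolution, color depth, and image size also determine how much raw data the scanner must move to the computer, which is another reason scan times vary so widely and why the interface (discussed next) matters. A rough back-of-the-envelope calculation:

    # Uncompressed size of a scan: pixels x bytes per pixel.
    def raw_scan_bytes(width_in, height_in, dpi, bits_per_pixel=24):
        pixels = int(width_in * dpi) * int(height_in * dpi)
        return pixels * bits_per_pixel // 8

    size = raw_scan_bytes(8.5, 11, 300)   # full page, 300 dpi, 24-bit
    print(f"{size / 1_000_000:.0f} MB")   # roughly 25 MB of raw data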


Scanner interface

Speed is also affected by the interface used to connect the scanner to the computer. In the past, low-end scanners usually used a parallel interface. Since most PCs have only one parallel port, parallel scanners typically come with some sort of pass-through that allows the scanner and printer to share a single parallel port. However, a parallel pass-through does not always work very well. In particular, ink-jet printers seem to have a difficult time sharing a parallel port gracefully. You can sometimes adjust system settings in the BIOS to work out a compromise between the two devices, or you can unhook the scanner and hook up the printer every time you want to print, but neither makes for an ideal arrangement. Parallel is also the slowest scanning interface, so a parallel scanner's overall performance lags behind the alternatives.

In contrast, most high-end scanners have traditionally used a SCSI interface. Most computers do not already have a SCSI interface, so it’s an extra expense to add a SCSI card. SCSI has been around for a long time and has many advantages, such as high speed and the ability to daisy chain several devices together to use a single SCSI port. SCSI scanners are not common in local computer and office supply stores these days, so you will likely need to order a SCSI scanner if you want one.

Today, USB has become the interface of choice. It's fast and, like SCSI, it lets you connect several devices through a single port (via hubs rather than a daisy chain). Also, most computers already have a USB port. The main drawback to USB is that it's relatively new, so a PC that's more than a few years old might not have a USB port. However, you can buy add-in cards that give an older PC USB capability, much as you can with SCSI. USB requires Windows 95C or higher, so older PCs might need a double upgrade: both an add-on USB port and a new OS version.

Things to remember

Here's a recap of some important points to remember about scanners:

- A moving light bar and mirror system reflects the image onto a CCD, which converts the light to binary values one row at a time; cheaper CIS scanners use LEDs and a row of contact sensors instead.
- In a resolution rating such as 600 x 1,200 dpi, the first number is the horizontal (CCD) resolution and the second is the vertical (stepper motor) resolution.
- Be skeptical of very high advertised resolutions; they often describe software interpolation rather than the hardware's actual capability.
- 24-bit scanning is "true color"; 30-bit and 36-bit scanners provide extra headroom for tonal corrections.
- Dynamic range, on a scale of zero to four, measures how well a scanner distinguishes light from dark.
- Speed ratings mean little unless the specification also states the resolution and image size.
- The interface matters: parallel is the slowest, SCSI is fast but usually requires an add-on card, and USB is fast and convenient on newer PCs.

Conclusion

Though some scanners come packed with features and do many more things than just scan, they all rely on essentially the same technology to produce the image. Whether you're comparing scanners for the office or troubleshooting them for your users, a solid understanding of these features and the underlying mechanisms will come in handy.