Cameras 101

This article is not intended to promote any one vendor over another. The goal is to educate the machine vision customer to make better choices and ultimately purchase a successful vision solution. Although specific camera companies are used as examples in this article, the same principles apply to others in the industry (e.g. AVT, Point Grey, Dalsa, Baumer, JAI). If you have any questions about the material presented, feel free to get in touch at 303-832-1111.

Camera Model:

For our purposes, a camera is a system that measures electromagnetic radiation (light, heat, etc.). Most cameras measure visible-spectrum radiation (light) with an array of imaging elements (pixels). This lets us accurately measure how much light is present and where it is coming from. Eyes function similarly, which is why visible-spectrum cameras make images that look like what the eyes see: both are photosensitive in the visible spectrum, so both devices create roughly the same signal. Infrared (IR) cameras do the same with heat, which is why the images they produce look so different from what the eyes see.

Basic Principles of Operation:

All camera sensors are photosensitive, meaning that when radiation hits them they convert it to charge via the photoelectric effect. When I shine light of the right wavelength at certain materials, it excites and frees some electrons, and as these free electrons accumulate, a charge develops. A camera is thus essentially a small solar panel, but instead of simply converting light to electricity, a camera seeks to measure that electricity and determine how much light hit the sensor. If we had a large array of solar panels and measured the output of each one, we would have an enormous camera sensor. We would still need a lens to form an image on the array, but the analogy helps explain the principle.

Visible-spectrum cameras have silicon-based sensors, and silicon is photosensitive in the visible light spectrum. When light hits a silicon sensor, some of that light energy is absorbed, freeing electrons and building an electric charge. Similarly, Mercury Cadmium Telluride (MCT) is sensitive to electromagnetic radiation in the 3um to 12um range: when radiated heat hits MCT, charge builds. MCT-based sensors can therefore be used to build certain types of IR cameras.

A camera sensor has many independent pixels, which are basically "buckets" that accumulate charge over the course of the exposure time. When the exposure time concludes, circuitry inside the camera drains the charge, amplifies it, and converts it to a numeric measurement (a digital signal), which can be stored or sent to another component. Ample light means many photons hit the sensor, the "buckets" fill with charge, and high pixel values are recorded. In 8-bit black and white, we measure from 0 (empty: no light, no charge) to 255 (full: ample light, full of charge). If ample light was present, a measurement of 255 (full) is recorded and the camera sends out 11111111, the binary representation of 255, as 8 voltage peaks. If a medium amount of charge is recorded and the bucket is only 1/3 full, we may measure 85 and transmit 01010101, 85 in binary. If no light hits the sensor and thus no charge accumulates, we record a zero and transmit 00000000: no voltage peaks.
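To make that mapping concrete, here is a toy sketch in Python. The full-well capacity and the perfectly linear response are assumptions made up for illustration, not the specs of any real sensor:

```python
def digitize(charge_electrons, full_well=25_500, bits=8):
    """Map accumulated charge to a digital number, assuming a perfectly
    linear response and a made-up full-well capacity."""
    max_dn = 2**bits - 1                               # 255 for 8-bit
    fraction = min(charge_electrons / full_well, 1.0)  # clip at saturation
    return round(fraction * max_dn)

# A full bucket reads 255 (11111111), a 1/3-full bucket 85 (01010101),
# and an empty bucket 0 (00000000), matching the text above.
for charge in (25_500, 8_500, 0):
    dn = digitize(charge)
    print(f"{charge:6d} e-  ->  {dn:3d}  ({dn:08b})")
```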

Real World Implementations:

Cameras are fiendishly difficult to build. It's all well and good to describe a light-measuring device with 5 million pixels, but building one presents enormous engineering challenges. What distinguishes camera vendor A from camera vendor B is how well they meet these challenges and eliminate error in the measurements. A crucial part of building any imaging or vision system is getting good data in; poor (or uninformed) camera selection impedes this and will doom any project. The remainder of this article seeks to explain the important features that distinguish cameras. The goal is that after reading it, you know what questions to ask and what you're getting from a $3,000 camera compared to a $2,000 camera. Sometimes the money will be worth it and sometimes it won't; the hope is that after reading below you'll know what you're paying for and have a few yardsticks you can use to compare cameras.

Sensor Approaches:

Roughly speaking, there are two types of camera sensor: CCD and CMOS. The distinction lies in how each type drains the charge it accumulates, and each method is best suited to different applications. These types are presented first because the choice of sensor type plays into the measures of effectiveness and imaging errors we'll explore later on.

CMOS:

CMOS sensors drain charge from each pixel individually. This means that each pixel (or bucket) has its own drain circuitry on it. The charges are piped to ADCs (analog-to-digital converters) and read that way. There are a few implications of this design.

High Speed:

  • Charge can be drained from the chip quickly. Most high-speed cameras are CMOS.

Lower Sensitivity:

  • The circuitry on each pixel takes up space, making a CMOS camera less photosensitive.

CCD:

CCD sensors drain charge from one or more taps at the corners of the sensor. They work like a "bucket brigade," passing charge down and over to the readout tap (a toy sketch of this readout follows the trait list below). As a result, CCDs generally show the following traits.

Slower Readout Speeds:

  • CCDs have trouble generating fast readouts, even in multi-tap (two or more drainage points) designs.

Higher Photosensitivity:

  • Because CCD pixels do not have circuitry on board, the entire surface can be photosensitive.

Blooming:

  • Because CCD pixels cannot drain individually, when pixel A fills with charge, that charge will spill over into other pixels (unless more advanced anti-blooming technology is employed). Thus, bright spots in an image can wash out a large area of the image.
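To illustrate the bucket-brigade idea (and why a single tap serializes readout), here is a toy Python sketch; real CCD clocking is analog circuitry and far more involved:

```python
# Toy "bucket brigade": each clock tick the tap reads the nearest pixel
# and every remaining charge shifts one step toward the tap. An N-pixel
# row therefore needs N ticks through a single tap, which is why extra
# taps (multi-tap designs) speed up CCD readout.
def read_out_row(row):
    readings = []
    while row:
        readings.append(row[0])  # tap reads the pixel beside it
        row = row[1:]            # everything else shifts over one place
    return readings

print(read_out_row([120, 85, 240, 3]))  # four ticks -> [120, 85, 240, 3]
```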

Imperfections in Imaging:

Quantum Efficiency:

In an ideal world there would be a material that converted 100% of the light at selected wavelengths to electrical charge. Unfortunately, no such material exists. Furthermore, no material with a high QE (quantum efficiency) has the same spectral response as the human eye. While humans perceive light from 400nm (violet) to 700nm (red), a silicon sensor is sensitive to light from about 300nm to 1000nm. In short, the camera sees a wider spectrum.

In practice, the glass in the lens cuts the UV component (<400nm), and an IR-cut filter can cut the near-infrared component (>700nm) before it hits the sensor if needed to better replicate the human eye. Typical silicon sensors have QEs of anywhere between 25% and 60%. A typical QE response curve at different wavelengths is shown below for a mid-market CCD; manufacturers should be able to provide you with the QE response curve for a camera upon request. Sometimes QEs are also quoted at peak or at 545nm (roughly the middle of the visible spectrum: green light). For instance, you might read that camera X has a QE of 53% at 545nm or 57% at peak.

A sample Quantum Efficiency Curve

Full Well Capacity, Saturation Capacity, Pixel Size and SNR (Signal to Noise Ratio):

Statistically speaking, there will always be variance in how much charge builds on a given pixel during a given exposure. That is, if we take two pictures of the exact same scene with the exact same lighting, we won't read the same charge accumulation each time. First, photons (units of light) can be thought of like rain: even if it's raining at a given rate, it's not necessarily true that the same number of drops hits a specific cobblestone every second. Similarly, a given pixel gets a different number of photons every exposure even if conditions are exactly the same. Second, even if 10,000 photons hit pixel X in both exposures, quantum efficiency is probabilistic for each photon. If I have a QE of 50% (a coin-flip probability of freeing an electron), the first exposure may generate 4,900 electrons and the second 5,100. These two factors combine to limit the signal-to-noise ratio (SNR): before we've even talked about error, there is simply a precision beyond which we can't measure. To tilt the field in our favor, we want to take measurements in a large enough electron bucket that randomness doesn't play a large role. We want a large signal and little noise.
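Both effects are easy to simulate. In the sketch below, photon arrival is modeled as Poisson and each photon frees an electron with coin-flip probability; the photon rate, QE, and exposure count are arbitrary assumptions. The measured SNR lands almost exactly on the square root of the mean signal:

```python
import numpy as np

rng = np.random.default_rng(0)

mean_photons = 10_000  # average photons hitting one pixel per exposure (assumed)
qe = 0.5               # each photon frees an electron with 50% probability
exposures = 100_000

photons = rng.poisson(mean_photons, exposures)   # "raindrop" arrival
electrons = rng.binomial(photons, qe)            # coin flip per photon

signal, noise = electrons.mean(), electrons.std()
print(f"mean signal {signal:.0f} e-, noise {noise:.1f} e-")
print(f"measured SNR {signal / noise:.1f} vs sqrt(signal) {signal ** 0.5:.1f}")
```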

Full well capacities on typical machine vision cameras range from about 5,000 e- to 50,000 e-. However, reaching the full well capacity is difficult because when the pixel nears its maximum capacity, electrons begin to repel each other and some blooming can occur (spillover into adjacent CCD pixels). Therefore, we typically consider a saturation capacity, which is the capacity we can realistically reach without introducing these problems. The theoretical maximum SNR we can achieve is equal to the square root of the saturation capacity. For a typical CCD like the Sony ICX285, the numbers shape up as follows.

SNR is often expressed in bits or decibels. To get bits, we simply take log2(SNR). To get decibels, we take 20 * log10(SNR). One bit equals 6.02 dB, and sometimes you'll see both measures. Bear in mind that because both bits and dB are logarithmic scales, 1 additional bit (or 6.02 dB) means a camera's SNR is 2 times better, and 20 dB means 10 times better.
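The conversions are one-liners; a quick sketch that also shows the doubling rule:

```python
import math

def snr_bits(snr):
    return math.log2(snr)        # bits = log2(SNR)

def snr_db(snr):
    return 20 * math.log10(snr)  # dB = 20 * log10(SNR)

# Doubling the SNR adds exactly 1 bit, or 6.02 dB:
for snr in (100, 200):
    print(f"SNR {snr}: {snr_bits(snr):.2f} bits, {snr_db(snr):.2f} dB")
# SNR 100: 6.64 bits, 40.00 dB
# SNR 200: 7.64 bits, 46.02 dB
```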

Sony ICX285

Saturation Capacity: 18,000 electrons
Dark Noise: 9 electrons
Quantum Efficiency @ 545nm: 56%
Max SNR: sqrt(18,000) ≈ 134 = 7.1 bits = 43 dB

The ICX285 is found in Basler's Scout 1400 models, at price points between $2,999 and $3,469.

Sony ICX274

Saturation Capacity: 8,000 electrons
Dark Noise: 8 electrons
Quantum Efficiency @ 545nm: 50%
Max SNR: sqrt(8,000) ≈ 89 = 6.5 bits = 39 dB

The ICX274 is found in the Scout 1600 models at price points between $1,969 and $2,279.

But isn't higher resolution always better?

The scA1600s have a higher resolution (1628x1236) than the scA1400s (1392x1040), so how can the lower-resolution camera possibly be more expensive? The answer lies in a larger sensor, meaning larger pixels, deeper wells, and thus better SNR per the math described above. The Basler cameras are merely presented as a good example; we work with Basler because we think they make good cameras, but the same calculations would apply for any other vendor you chose to consider.
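Re-running the max-SNR arithmetic for both sensors, using the saturation capacities quoted above, makes the trade-off explicit:

```python
import math

# Saturation capacities quoted above, in electrons.
sensors = {"Sony ICX285 (scA1400)": 18_000, "Sony ICX274 (scA1600)": 8_000}

for name, sat in sensors.items():
    snr = math.sqrt(sat)  # max SNR = sqrt(saturation capacity)
    print(f"{name}: max SNR {snr:.0f} = "
          f"{math.log2(snr):.1f} bits = {20 * math.log10(snr):.0f} dB")
# Sony ICX285 (scA1400): max SNR 134 = 7.1 bits = 43 dB
# Sony ICX274 (scA1600): max SNR 89 = 6.5 bits = 39 dB
```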

Maximum SNR and Measured SNR:

In the real world we can never actually achieve the theoretical maximum SNR. The plot below measures how closely a mid-grade CCD approaches the theoretical SNR limit at different levels of lighting. The X axis shows photons absorbed; the graph has therefore already excluded considerations of quantum efficiency, because we're only considering the photons that were absorbed and thus freed electrons. The Y axis shows SNR. The diagonal represents the maximum theoretical SNR, since SNR = sqrt(photons absorbed) all along that line. Many different units of this mid-grade CCD were tested and are shown as different green lines. The red line is the theoretical curve that fits the observed data; it represents the theoretical average unit for this camera model. A perfect camera would show up as a straight line along the diagonal, always achieving the best possible SNR (no such camera exists at any price). A poorly performing camera would not get very close to its theoretical SNR and would sit far to the right of and far below the diagonal. Note how SNR increases with illumination: well-lit images enable us to take the best measurements.

A sample Measured SNR curve

Dynamic Range and Dark Noise:

The above discussion of well capacities presents only part of the picture. Unfortunately, the "wells" or "buckets" are never actually empty. Each pixel in a sensor has a certain number of background electrons that will be read even if no photons hit it. In short, electrons move around according to the laws of physics, and it's impossible for us to completely empty a pixel well, hold it empty, or prevent random electron interference. On a CCD, dark noise is typically 8 to 25 electrons; on a CMOS, we typically see 15 to 110. CMOS sensors have higher values here because the on-pixel circuitry allows electrons to "slosh" onto and off of a given pixel more.

Dynamic range is simply full / empty: the ratio of the maximum value to the minimum value. A higher dynamic range means a camera can measure a greater range of illumination in a given image. For instance, if my camera has a saturation capacity of 10,000 e- and a dark noise of 100 e-, it has a dynamic range of 100 because 10,000 / 100 = 100. This means that in one image my camera can only effectively distinguish between items whose brightnesses are within a factor of 100 of each other. If this camera has a QE of 50%, anything delivering under 200 photons per exposure is effectively indistinguishable from dark noise, and anything delivering more than 20,000 photons per exposure will overwhelm the well capacity. Therefore, in one exposure I can only effectively measure light levels between 200 and 20,000 photons.
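Here is that arithmetic as a small sketch; treating dark noise as a hard floor is a simplification for illustration:

```python
def measurable_photon_range(saturation_e, dark_noise_e, qe):
    """Rough range of photon counts per exposure a camera can distinguish,
    treating dark noise as a hard floor (a simplification)."""
    dynamic_range = saturation_e / dark_noise_e
    min_photons = dark_noise_e / qe  # below this, buried in dark noise
    max_photons = saturation_e / qe  # above this, the well saturates
    return dynamic_range, min_photons, max_photons

# The example above: 10,000 e- saturation, 100 e- dark noise, 50% QE.
dr, lo, hi = measurable_photon_range(10_000, 100, 0.5)
print(f"dynamic range {dr:.0f}; measurable range {lo:.0f} to {hi:.0f} photons")
# -> dynamic range 100; measurable range 200 to 20000 photons
```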

However, exposure time is a variable, so we can move where this range lies by changing exposure, even though the relative distance between max and min is a fixed parameter of the camera we have. For instance, if we are reading mostly white pixels (the image is washing out), we can cut the exposure time. An item that emits 200,000 photons per second will be white in our image if we use a 1 second exposure: 200,000 photons yield 100,000 e- at 50% QE, which is much higher than our well capacity. If we use a 10ms (1/100th of a second) exposure, the same item delivers 2,000 photons per exposure, yielding about 1,000 electrons, and we'll see an accurate gray level. However, at this short exposure time, an item emitting 20,000 photons per second becomes more or less black: 20,000 per second means 200 photons per exposure, which yields about 100 electrons and is indistinguishable from dark noise.
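A sketch of that exposure math, using the figures from the text:

```python
def electrons_per_exposure(photons_per_second, exposure_s, qe=0.5):
    # Expected charge collected in one exposure at the given QE.
    return photons_per_second * exposure_s * qe

# The two items from the text, imaged at 1 s and at 10 ms.
for exposure in (1.0, 0.01):
    bright = electrons_per_exposure(200_000, exposure)
    dim = electrons_per_exposure(20_000, exposure)
    print(f"{exposure * 1000:6.0f} ms: bright {bright:9,.0f} e-, dim {dim:7,.0f} e-")
#  1000 ms: bright 100,000 e- (saturates the well), dim 10,000 e-
#    10 ms: bright 1,000 e- (good gray level), dim 100 e- (lost in dark noise)
```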

ICX285 specs:

Sony ICX285

Saturation Capacity: 18,000 electrons
Dark Noise: 9 electrons
Quantum Efficiency @ 545nm: 56%
Max SNR: sqrt(18,000) ≈ 134 = 7.1 bits = 43 dB
Dynamic Range: saturation / dark noise = 18,000 / 9 = 2,000 = 11 bits = 66 dB

ICX274 specs:

Sony ICX274

Saturation Capacity: 8,000 electrons
Dark Noise: 8 electrons
Quantum Efficiency @ 545nm: 50%
Max SNR: sqrt(8,000) ≈ 89 = 6.5 bits = 39 dB
Dynamic Range: saturation / dark noise = 8,000 / 8 = 1,000 = 10 bits = 60 dB

Other sources of noise:

ADC (Analog to Digital Conversion) Noise:

Not all ADCs are exactly alike, and because the ADCs are effectively the yardstick inside the camera, it's very important that they be accurate and consistent. Manufacturers will typically calibrate ADCs under uniform lighting conditions to make sure that the ADC on one side of the sensor is not giving lower values than the ADC on the other.
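As a rough illustration of the idea (not any manufacturer's actual procedure), one could balance two readout taps by scaling each toward the global mean of a flat, uniformly lit frame:

```python
import numpy as np

# Hypothetical 2x4 frame captured under perfectly uniform light; the left
# two columns come from tap A, the right two from tap B.
flat = np.array([[100.0, 101.0, 98.0, 96.0],
                 [102.0, 99.0, 97.0, 95.0]])
tap_a, tap_b = flat[:, :2], flat[:, 2:]

target = flat.mean()            # what every pixel "should" read
gain_a = target / tap_a.mean()  # per-tap correction factors
gain_b = target / tap_b.mean()
print(f"gain A {gain_a:.3f}, gain B {gain_b:.3f}")  # ~0.980 and ~1.021
```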

Photo Response Non-Uniformity (PRNU):

Not all pixels are created equal, and not all pixels respond identically (or perfectly linearly) to increasing light. PRNU measures this pixel-to-pixel variation in response; as we move to higher-quality sensors, this number falls, and 1% is very good. You'll sometimes see this number quoted, so just bear it in mind.

Defect Pixels:

Some pixels will simply not work. In the process of fabricating over 1 million identical pixel wells on a 1 megapixel sensor, there is bound to be an error or two; even at six-sigma precision we should expect 3.4 defective pixels per million. As a camera ages, pixels can also fail. This is one reason that, for machine vision applications, we generally recommend overshooting slightly on resolution: any system that intends to rely on the value of a single pixel is setting itself up to fail.

Thermal Noise (a component of Dark Noise):

Besides light, heat can also excite silicon atoms, causing them to release electrons. This factor is one component of dark noise (discussed above) and is fairly constant. At room temperature it's generally a few electrons per pixel per second, so it's not big enough to justify a cooled camera. However, when we image very dark scenes over very long exposure windows (e.g. astronomy), this becomes a big problem. It can also be a problem in exceedingly hot environments like a steel plant. For machine vision applications in these situations, we typically design cooling enclosures to minimize thermal noise as much as possible.