
Training Sites; Mixed Pixels

The primary reason for setting up training sites in situ is to determine and define the Land Cover/Use categories to be classified (mapped) from space observations, assisted by other sources of information. Sites must be chosen carefully so that their number, size and shape, variety, homogeneity, and distribution are sufficient to maximize the accuracy of classification in the imagery. The categories fall into three broad surface classes: vegetated (arbitrarily, more than 40% cover); non-vegetated (rock, soil, water, works of man); and topographic types. The minimum size of a mappable class may be limited by sensor resolution. The total number of sites depends in part on the number of classes, their diversity, and the areal dimensions of the scene being interpreted. Usually, a minimum of three sites per class is sufficient, but more are needed if the class has notable variability. Even for an entire Landsat scene in which perhaps 15 to 25 classes may be sought, the 30 or more training sites needed will commonly occupy only about 1/20th of the image's area. In most instances, the location of sites rests mainly on convenience of access and on how readily they can be pinpointed in the imagery. Sites associated with linear features (roads) or possessing recognizable interfaces with other features (e.g., fields) are usually most effective.

The alternative to depending on training sites for classification is to apply the concept of signature extension. This term refers to the assumption that a single, more or less constant, spectral signature may be defined as characteristic of any class, and that this signature has broad (universal) applicability to any scene in a region, or even worldwide. As a specific example, the signature for winter wheat at its maturation should be essentially the same for fields in the U.S. Great Plains, Argentina, the Ukraine, and Australia, provided such variables as differing air masses, Sun position, soil types, soil moisture, etc. are compensated for. If that proves true, then an unknown feature or class in a given scene anywhere should be classifiable by comparing its spectral properties (for a Landsat pixel, its multiband digital number [DN] values) to a "data bank" containing standard values for each of many classes. The class in the bank whose DN values most closely fit those of the unknown is assumed to identify it.
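The "closest fit" comparison against a data bank can be sketched as a nearest-signature lookup. This is a minimal illustration only: the bank contents, the four-band DN values, the function name `classify_pixel`, and the use of Euclidean distance as the measure of fit are all assumptions made for the example, not values or methods stated in this tutorial.

```python
import math

# Hypothetical "data bank": one standard DN value per band for each class.
# The numbers are invented for illustration, not measured signatures.
SIGNATURE_BANK = {
    "clear water": (20, 15, 10, 5),
    "winter wheat": (30, 25, 90, 110),
    "desert sand": (95, 105, 115, 120),
}

def classify_pixel(pixel_dn, bank=SIGNATURE_BANK):
    """Return (class_name, distance) for the bank signature closest to pixel_dn."""
    best_name, best_dist = None, math.inf
    for name, signature in bank.items():
        # Euclidean distance in band space is one possible measure of "fit".
        dist = math.dist(pixel_dn, signature)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name, best_dist

# An unknown pixel's multiband DN values (also invented).
unknown = (28, 27, 85, 104)
label, distance = classify_pixel(unknown)
print(label, round(distance, 1))
```

Note that a lookup of this kind always returns some class, even when the nearest signature is a poor match, so in practice a distance threshold for "unclassified" would probably be needed.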

This approach usually works well (i.e., will achieve an acceptable accuracy) for a few common features such as clear water bodies, clouds, snow, desert sand, several common rock types, certain forest types, and perhaps central urban areas. However, in most cases, signature extension can be unreliable for such reasons as 1) the natural variability of most classes, 2) the "mixed pixel" problem (see below), 3) the often artificial or arbitrary way in which many spectral classes are set up (e.g., rocks are classified [named] by mineral content and texture, which may bear no direct or simple relationship to gross spectral properties), 4) the influence of (usually undetermined) differences in atmospheric conditions from place to place and on different dates, 5) the seasonal variability of vegetation, and 6) the inability to account for and correct other variables.

A "mixed pixel" arises when individual areas consisting of different features or classes are smaller than the resolution of the sensor. Consider this hypothetical "map" of a rural setting:

In this instance, each category is treated as though more or less homogeneous. Suppose the scene is imaged by a sensor whose instantaneous field of view (IFOV), controlled by optics and sampling rates, yields a pixel size represented by the smaller rectangles. If an individual pixel happens to lie completely within, or fortuitously coincides with, the boundaries of a given class, then the DNs for that pixel will be determined by the multiband spectral properties of the dominant material(s) making up that class. It is more likely, however, that the pixel will straddle or cut across several class or feature boundaries. The resulting spectral content is then a composite or weighted average of the spectral responses from each internal class. Recognition of each feature or class becomes difficult, since there are two primary unknowns to account for: the identity of each class and its relative proportion in the mix. Mathematical methods are available to solve for these unknowns, but some statistical uncertainty always remains. One improvement is to reduce pixel size (increase resolution), as is done in the central rectangle, so that more pixels fall within the space occupied by a single class/feature and fewer cross boundaries. Going in the other direction, note the effect of enlarging the pixel (say, to the size of the outer boundary of the cluster of 9). The key rule in optimizing classification is to seek a resolution that approximates the sizes of the smallest specific classes whose identities are sought.
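The weighted-average idea can be illustrated with a small sketch. It assumes a simple linear mixing model in which each class contributes to the pixel's DN in proportion to the area it covers, and it solves for only one of the two unknowns: the proportion, with the identities of the two classes taken as already known. The class signatures, band count, and function names (`mix`, `unmix_two`) are hypothetical.

```python
# Hypothetical pure-class DN values in four bands (invented for illustration).
FOREST = (25.0, 20.0, 80.0, 95.0)
BARE_SOIL = (70.0, 80.0, 90.0, 100.0)

def mix(signatures, fractions):
    """Forward model: area-weighted average of class signatures, band by band."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    bands = len(signatures[0])
    return tuple(
        sum(f * sig[b] for sig, f in zip(signatures, fractions))
        for b in range(bands)
    )

def unmix_two(pixel, sig_a, sig_b):
    """Least-squares estimate of the fraction of class A in a two-class pixel.

    Model: pixel = f * sig_a + (1 - f) * sig_b,
    so pixel - sig_b = f * (sig_a - sig_b), solved for f across all bands.
    """
    diff = [a - b for a, b in zip(sig_a, sig_b)]
    resid = [p - b for p, b in zip(pixel, sig_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("signatures are identical; fraction is undetermined")
    f = sum(r * d for r, d in zip(resid, diff)) / denom
    return min(1.0, max(0.0, f))  # clamp to the physically meaningful range

# A pixel straddling a boundary: 30% forest, 70% bare soil.
mixed = mix([FOREST, BARE_SOIL], [0.3, 0.7])
estimated_forest = unmix_two(mixed, FOREST, BARE_SOIL)
print(mixed, round(estimated_forest, 3))
```

With noise-free inputs the estimate recovers the original fraction. Real DN values include sensor noise and within-class variability, which is one source of the residual statistical uncertainty mentioned above.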




Code 935, Goddard Space Flight Center, NASA
Written by: Nicholas M. Short, Sr. email: nmshort@epix.net
and
Jon Robinson email: Jon.W.Robinson.1@gsfc.nasa.gov
Webmaster: Bill Dickinson Jr. email: rstwebmaster@gsti.com
Web Production: Christiane Robinson, Terri Ho and Nannette Fekete
Updated: 1999.03.15.