Classification Methods for Remotely Sensed Data, Second Edition


One au is equal to the average Earth-to-sun distance. About January 3, at perihelion, the Earth-to-sun distance is approximately 0.983 au. The distance multiplicative factor d is used to compensate for the variation in solar irradiance E caused by the change in distance between the sun and the Earth, and the introduction of the factor d into Equation 1. corrects for this seasonal variation.
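To make the role of d concrete, the sketch below applies the widely used conversion from at-sensor radiance to top-of-atmosphere reflectance, in which the squared Earth-to-sun distance scales the solar irradiance term. This is a generic illustration rather than a reproduction of the book's own equation; the function name and the sample values are ours.

```python
import math

def toa_reflectance(radiance, esun, sun_elevation_deg, earth_sun_dist_au):
    """Convert at-sensor radiance to top-of-atmosphere reflectance.

    radiance          -- at-sensor spectral radiance, W m^-2 sr^-1 um^-1
    esun              -- mean exoatmospheric solar irradiance for the band
    sun_elevation_deg -- solar elevation angle in degrees (90 - zenith angle)
    earth_sun_dist_au -- Earth-to-sun distance d in astronomical units
    """
    solar_zenith = math.radians(90.0 - sun_elevation_deg)
    # The d^2 term compensates for the seasonal variation in solar irradiance.
    return (math.pi * radiance * earth_sun_dist_au ** 2) / (esun * math.cos(solar_zenith))

# Example: perihelion (d ~ 0.983 au) versus aphelion (d ~ 1.017 au)
print(toa_reflectance(95.0, 1839.0, 42.0, 0.983))
print(toa_reflectance(95.0, 1839.0, 42.0, 1.017))
```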

Under this assumption, the magnitude of the radiance detected by the sensor is affected only by variations in the solar zenith angle, the wavelength, and the atmospheric interaction. The atmospheric correction model introduced in Section 1. does not, however, account for the variation in radiance caused by the orientation of the terrain surface itself; this is known as the topographic effect. More specifically, the topographic effect can be defined as the variation in radiance exhibited by an inclined surface, compared with the radiance from a horizontal surface, as a function of the orientation of the surface relative to the radiation source.

Moreover, if we assume non-Lambertian reflectance for the surface being measured, the sensor position is another important variable that should be considered. Calibration for topographic effects is intended to normalize the sensor-detected signal difference caused by the topographic variation.

Various techniques have been proposed for reducing the topographic effect, band ratioing (Colby; Mather) being the most commonly used. The rationale is that the illumination term is approximately common to the bands being ratioed and therefore largely cancels; in other words, the same vegetation cover should have a similar ratio value irrespective of location, so the differences between sample sites are assumed to be caused by topographic effects.
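A minimal numerical sketch of this idea follows (the band names and pixel values are hypothetical): because direct illumination scales both bands by roughly the same factor, the ratio for a given cover type is similar on sunlit and shaded slopes.

```python
import numpy as np

def band_ratio(nir, red, eps=1e-6):
    """Ratio of two bands; the illumination factor common to both bands
    largely cancels, which is why ratio images suppress topographic shading."""
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    return nir / (red + eps)   # eps avoids division by zero

# Toy example: the same cover type on a sunlit and on a shaded slope.
sunlit = band_ratio(np.array([120.0]), np.array([40.0]))   # ratio = 3.0
shaded = band_ratio(np.array([60.0]),  np.array([20.0]))   # ratio = 3.0
print(sunlit, shaded)
```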

In Colby's study, the variance of the spectral response between sample sites in the ratio image was lower than that obtained from the original TM band 4 and band 5 images, and Colby concludes that band ratioing does partially compensate for topographic effects. Smith et al. propose two correction functions: the first is based on a Lambertian reflectance assumption, while the second assumes that reflectance is non-Lambertian. Although this is a relatively old model, it generates quite robust calibration results (Colby; Law and Nichol). The Lambertian model assumes that the surface reflects the incident radiation uniformly in all directions.

If we treat wavelength as a constant and ignore atmospheric interactions, the variation in radiance detected by the sensor is mainly caused by the local incidence angle θi, the angle between the incident ray and the normal to the surface (Figure 1., modified from Smith, Lin, and Ranson, Photogrammetric Engineering and Remote Sensing, 46). Slope is defined as a plane tangent to the surface and contains two components: gradient, which specifies the rate of change in elevation, and aspect, which measures the direction of the gradient.

The values of several parameters are required to solve these equations, namely the solar zenith angle θs, the solar azimuth angle φs, the surface slope θn, and the aspect φn. Both θs and φs can be obtained from the image header file, while slope and aspect can be derived by co-registering the image with a digital elevation model (DEM). A variety of approaches can be employed to calculate slope and aspect from a DEM (Skidmore; Jones). In the case of the non-Lambertian reflectance assumption, Smith et al. introduce the Minnaert constant k to describe the departure from Lambertian behavior; a Lambertian surface is defined by a k value of 1, and solving the correction equation requires an estimate of k.

To estimate the Minnaert constant k, the correction equation is linearized by taking logarithms and k is obtained by regression. Calibrating for the topographic effect under a non-Lambertian assumption is therefore more complicated than calibration based on a Lambertian reflectance assumption. As far as computational cost and calibration accuracy are concerned, Smith et al. suggest that, provided the slope and the effective incidence angle remain moderate, the simpler Lambertian correction is adequate; if either of these conditions is not satisfied, the non-Lambertian form should be used.
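The two corrections discussed above can be sketched as follows, assuming the commonly cited forms of the cosine (Lambertian) and Minnaert (non-Lambertian) models; the exact expressions in the book's equations may differ in detail, and the function names, clipping values, and variable names below are ours.

```python
import numpy as np

def cos_incidence(solar_zenith, solar_azimuth, slope, aspect):
    """cos(i) = cos(theta_s)cos(theta_n) + sin(theta_s)sin(theta_n)cos(phi_s - phi_n);
    all angles in radians."""
    return (np.cos(solar_zenith) * np.cos(slope)
            + np.sin(solar_zenith) * np.sin(slope) * np.cos(solar_azimuth - aspect))

def cosine_correction(radiance, cos_i, solar_zenith):
    """Lambertian (cosine) correction: L_h = L_t * cos(theta_s) / cos(i)."""
    return radiance * np.cos(solar_zenith) / np.clip(cos_i, 1e-3, None)

def minnaert_correction(radiance, cos_i, cos_e, k):
    """Non-Lambertian (Minnaert) correction: L_h = L_t * cos(e) / (cos(i)^k * cos(e)^k),
    where e is the slope (exitance) angle."""
    return radiance * cos_e / (np.clip(cos_i, 1e-3, None) ** k * cos_e ** k)

def estimate_minnaert_k(radiance, cos_i, cos_e):
    """Estimate k by regression on the linearized form
    log(L_t * cos e) = log(L_h) + k * log(cos i * cos e)."""
    y = np.log(radiance * cos_e)
    x = np.log(cos_i * cos_e)
    k, _intercept = np.polyfit(x, y, 1)
    return k
```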

It should also be appreciated that surface topographic variation causes distortions in the geometry of images as well as in their radiometry. The map to which the image is referenced represents the relationship between features reduced to some datum, such as sea level, while the image shows the actual terrain surface.

Imaging radar instruments transmit a signal in the wavelength range of approximately 3 cm to 1 m and receive the reflected signal (backscatter) from the target.

The level of backscatter for each pixel over the imaged area is recorded, and the set of pixels forms the radar image. Remote sensing in the microwave region differs from optical remote sensing in a number of ways, the most important of which are:
1. Radar backscatter is related to the roughness and electrical conductivity of the target. This information is complementary to that acquired by optical and thermal sensors.

2. Energy in the microwave region can penetrate clouds.
3. Microwave imaging radars are active, not passive, instruments and thus can operate independently of solar illumination.
An increasing number of space-borne radar systems is now in orbit, including the recently launched German TerraSAR-X and the Italian COSMO-SkyMed, and it is probable that radar imagery will play an increasingly important role in supporting our understanding and monitoring of our environment.

Optical sensors passively detect reflected solar radiation, while passive microwave sensors, which are not considered in this chapter, detect microwave radiation emitted by the target. This section introduces radar remote sensing, beginning with the way a radar image is formed and the geometric and radiometric characteristics of radar imagery. One of the main problems associated with the interpretation of radar imagery is the presence of noise, or radar speckle; the use of filters to reduce this noise is described later in the section. When a transmitted energy pulse reaches the surface it is likely to scatter in all directions, and part of the backscattered energy is reflected back to the radar antenna and recorded for later processing.

Normally, each energy pulse has a duration of between 10 and 50 μs and utilizes a small range of microwave wavelengths. A waveform can be characterized in terms of its wavelength and amplitude. Wavelength is defined as the distance between two adjacent crests (or troughs) of the wave (Figure 1.). Amplitude measures the strength of an electromagnetic wave in terms of the maximum displacement of the waveform relative to its mean position (shown by the horizontal line in Figure 1.).

The amplitude may also form part of a complex signal comprising both magnitude and phase; this point is discussed further below. Frequency, rather than wavelength, can also be used to describe wavebands: frequency is the number of oscillations per unit time, or the number of wavelengths that pass a fixed point per unit time. Airborne imaging radars build up an image using a real antenna of fixed length; space-borne imaging radars operate on a similar basis, but use a synthetic rather than a real antenna.


They are known as synthetic aperture radars (SAR). The area imaged by the sensor is thus a strip of ground parallel to the flight track, known as the azimuth direction. Pulse duration affects the resolution in the range (cross-track) direction, while antenna beam width controls the azimuth (along-track) resolution. The ground range resolution is given by c·τ / (2 cos β), where c is the speed of light, τ is the pulse duration, and β is the depression angle; the azimuth resolution of a real-aperture system is proportional to the slant range multiplied by the ratio of the wavelength to the antenna length.

Range resolution can also be analyzed in terms of incidence angle or look angle Figure 1. Incidence angle is defined as the angle between the radar beam and a line perpendicular to the illuminated surface. The look angle is complementary to the depression angle.

If the illuminated surface is assumed to be flat, then one can also regard the incidence angle as the complement of the depression angle. It can be inferred from the ground range resolution expression that resolution depends on the depression angle, which varies across an image: its value in the near range is relatively larger than that in the far range (Figure 1.).

Thus, the ground range resolution also varies with the depression angle: because the depression angle is larger in the near range, the ground resolution cell there is larger, and the near range is therefore imaged at a coarser ground resolution than the far range (Figure 1.). The azimuth resolution expression indicates that finer along-track resolution can be obtained by shortening the wavelength or lengthening the antenna. However, shorter wavelengths are more likely to be affected by the atmosphere and, furthermore, antenna length is constrained by physical limitations. Clearly, if finer azimuth resolution is sought by increasing the antenna length, then serious practical difficulties will be encountered.
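A small sketch of the two resolution relationships discussed above (a generic formulation, not a reproduction of the book's equations): ground range resolution degrades as the depression angle increases toward the near range, while real-aperture azimuth resolution degrades with slant range, wavelength, and decreasing antenna length.

```python
import math

C = 3.0e8  # speed of light, m/s

def ground_range_resolution(pulse_duration_s, depression_angle_deg):
    """Ground range resolution c*tau / (2*cos(beta)); coarser in the near
    range, where the depression angle beta is larger."""
    beta = math.radians(depression_angle_deg)
    return C * pulse_duration_s / (2.0 * math.cos(beta))

def real_aperture_azimuth_resolution(slant_range_m, wavelength_m, antenna_length_m):
    """Real-aperture azimuth resolution ~ slant range * wavelength / antenna length."""
    return slant_range_m * wavelength_m / antenna_length_m

# Near range (steep depression angle) versus far range (shallow depression angle)
print(ground_range_resolution(0.1e-6, 60.0))   # ~30 m
print(ground_range_resolution(0.1e-6, 30.0))   # ~17 m
```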

An alternative strategy is to use a synthetic aperture radar SAR , where the term aperture means the opening used to collect the reflected energy that is used to generate an image. In the case of radar, this opening is the antenna, while in the case of a camera, the opening is the shutter opening. SAR increases the antenna length not in physical terms but by synthesizing a long antenna using the forward motion of a short antenna, a process that requires more complicated and expensive technology.

SAR uses the Doppler principle in order to synthesize a longer antenna. The Doppler effect is the change in wave frequency as a function of the relative velocities of transmitter and reflector. A radar sensor can image a given target repeatedly from successive locations, as illustrated in Figure 1. Here, the frequency of the waveform reflected by the target will increase from location a to b because the distance between the sensor and the object is reducing. As the platform moves away from the target, from b to c, the frequency of the returned signal decreases. SAR uses the Doppler information to compute frequency shifts and thus determine the location and scattering properties of the target.

The time delay between the transmission of a pulse and the reception of its echo is dependent on the distance between the radar antenna and the target. This distance is the slant range (Figure 1.). Therefore, every target located on the terrain being observed by the radar will be mapped onto the slant range domain. Because of this slant range mapping, radar imagery is likely to be affected by geometric distortions, the most common of which are layover, foreshortening, and shadow. The layover effect results when the top of an illuminated target is seen by the radar as the bottom, and the bottom of the target is recorded by the radar as the top. This phenomenon occurs when the time for the microwave energy to travel from the antenna to the top of an object is less than the time needed to travel to the bottom of the same object.

The microwave energy transmitted by the radar will reach the tops of both objects (points a and b in Figure 1.) before it reaches their bases (points c and d). The antenna will therefore first receive the reflected energy from a and b and then, some time later, the energy reflected from c and d. After projection onto the slant range domain, the result is called the layover effect. It might be inferred from the preceding discussion that the higher the isolated target, the greater the layover effect. However, layover is also controlled by another important factor: the angle between the front slope of the target and the energy path (Figure 1.).

Foreshortening, like layover, results from the fact that radar is a side-looking sensor. For the object labeled B in Figure 1., the front slope distance ab appears to be less than the back slope distance bc when projected onto slant range. Since the front slope also tends to reflect microwave energy more strongly than does the back slope, it will appear brighter, steeper, and shorter, while the back slope appears shallower and darker.

The darker back slope demonstrates another radar image geometry effect, that of shadow, as illustrated in the following text. Radar shadow arises because the energy returned from targets is affected by the nature of the terrain. A radar image is effectively a representation of returned energy levels plotted against the time taken for the energy to travel to and from the target. It follows that if, during a certain period, the antenna receives no reflection, then the image area corresponding to this time period will contain zero (dark) values. The effect of radar shadow is controlled by the target height and by the back-slope angle (Figure 1.).

Because the back-slope angle of object A is smaller than the corresponding depression angle, the back slope of object A is illuminated by the microwave energy. However, since the back-slope angle of object B is larger than the corresponding depression angle, the radar antenna will not receive any reflection from the back slope of object B, and this period of zero reflection is likely to continue until point a is reached. The resulting radar shadow after projection onto slant range is also illustrated in Figure 1.

The description of radar image distortions given above is a simplification. Hence, in order to compensate for these effects, one has to make careful case-by-case analyses (Kropatsch and Strobl; Goyal et al.): if one knows what effects are occurring at a given pixel, then one can use suitable algorithms to carry out calibration. As in the case of the topographic calibration of optical imagery described above, geometric and radiometric correction of radar images requires co-registration to a DEM, although the calibration procedures are generally more complicated; an example is described by Kwok et al. In a radar image, the stronger the backscatter, the brighter the corresponding image element.

The level of backscatter is determined by terrain conditions (such as roughness and electrical characteristics) and also by the parameters of the radar system. Understanding the factors affecting radar backscatter helps in analyzing landscape properties more knowledgeably. The apparent roughness of a surface depends on both the wavelength of the incident energy and the angle of incidence. Rough surfaces act as Lambertian reflectors (Section 1.), scattering the incident energy in all directions and returning a relatively strong signal to the sensor. Smooth surfaces are specular, in that they act like a mirror and reflect the incident energy away from the sensor, resulting in extremely weak backscatter (Figure 1.).

Likewise, as the wavelength increases, the surface tends to appear smoother. The strength of backscatter is also affected by the incidence angle. For a given wavelength, as the incidence angle increases, backscatter becomes weaker, and so the illuminated surface appears smoother. Some ground objects can behave like corner reflectors, which can reflect high energy back to the antenna and cause bright spots on the radar image.
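The dependence of apparent roughness on wavelength and incidence angle described above is often summarized by the Rayleigh criterion, sketched below; the λ/(8 cos θ) threshold is a common rule of thumb rather than a definition taken from this chapter, and the sample values are invented.

```python
import math

def appears_smooth(rms_height_m, wavelength_m, incidence_angle_deg):
    """Rayleigh criterion: a surface acts as smooth (specular) when its
    rms height variation is below wavelength / (8 * cos(incidence angle))."""
    theta = math.radians(incidence_angle_deg)
    return rms_height_m < wavelength_m / (8.0 * math.cos(theta))

# A surface with 1 cm rms height: rough at X-band (3 cm), smooth at L-band (23 cm).
print(appears_smooth(0.01, 0.03, 30))   # False -> appears rough
print(appears_smooth(0.01, 0.23, 30))   # True  -> appears smooth
```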

Corner-reflector effects often occur in urban areas because energy can be returned by means of a double bounce from the corners of buildings (Figure 1.). Water and metal are good conductors. As a result, radar backscatter is sensitive to metal objects and to the presence of moisture in the illuminated target area, even though the amount of moisture may be small.

In a radar image, metal objects such as railway tracks and metal bridges generally result in bright spots. Moisture also affects the depth of microwave energy penetration of the soil surface. If soil contains a large amount of moisture, the signal does not penetrate the soil surface and is reflected back to the radar antenna.


If the soil is dry, then the radar signal can penetrate more deeply into the soil surface layer. Wavelength is also another control on the depth of penetration. Lakes and other water bodies might be expected to exhibit high backscatter, but in fact the surfaces of rivers and lakes are generally smooth relative to radar wavelengths and act as specular reflectors. The ocean surface is generally rougher, and therefore the magnitude of backscatter depends on sea state as well as on wavelength and depression angle.

All of the parameters in the radar equation except the backscattering coefficient σ0 are properties of the radar system; only σ0 is related to the properties of the illuminated surface. Thus, the quantized pixel values in a radar image are sometimes converted to σ0 before being interpreted. The received power Pr in the radar equation can also be expressed in terms of the scattering matrix, and a discussion of the scattering matrix is presented in the next section. Knowledge of radar polarimetry enables us to use a variety of features, such as complex format data, the elements of the scattering matrix, and the coefficient of variation of the polarization signature, to perform image interpretation.

Some basic concepts are described first. An electromagnetic wave, besides being described in terms of wavelength and amplitude, can also be characterized using complex number format (a complex number consists of two components, termed the real and imaginary parts). When the signal is plotted on the real and imaginary axes (Figure 1.), its real part is the in-phase (I) component, its imaginary part is the quadrature (Q) component, and its magnitude m and phase follow directly from these two components. Radar phase represents the degree of coincidence in time between a repetitive radar signal and a reference signal having the same frequency.

The radar image can thus be formed by using any of the m, I, or Q components, and this kind of radar image is said to be represented in complex format. Complex format radar imagery can be used to generate interferometric information (Massonnet and Rabaute; Zebker et al.), and the potential of SAR interferometry, in the form of coherence maps, for land cover classification is the subject of current investigations.
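A minimal sketch of how the magnitude (m), in-phase (I), and quadrature (Q) components of a complex-format sample are related; the sample values below are invented.

```python
import numpy as np

# A pair of complex SAR samples: I (in-phase, real) and Q (quadrature, imaginary).
i_component = np.array([3.0, -1.0])
q_component = np.array([4.0,  2.0])

signal = i_component + 1j * q_component
magnitude = np.abs(signal)      # m = sqrt(I^2 + Q^2)
phase = np.angle(signal)        # atan2(Q, I), in radians
intensity = magnitude ** 2      # power image

print(magnitude, phase, intensity)
```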



Most radar systems polarize the microwaves in such a way that the transmitted and received waves are restricted to a single plane perpendicular to the direction of wave propagation (Figure 1.). The polarized wave is therefore transmitted and received in either the horizontal (H) or the vertical (V) plane. Consequently, there are four combinations of transmission and reception for the polarized waves: HH, HV, VH, and VV, where the first letter denotes the transmitted and the second the received polarization.

Radar imagery generated in terms of HH or VV is called co- or like-polarized imagery, while imagery resulting from HV or VH polarization is called cross-polarized imagery. Cross-polarization detects multiple scattering from the target and thus generally results in weaker backscatter than that measured by a co-polarization configuration.

The coordinate system shown in Figure 1. describes the wave in terms of its horizontal (h) and vertical (v) components, and the terms ah and av denote the positive amplitudes in the h and v directions (Figure 1.). In the most general case, the electric field vector of a plane monochromatic wave rotates in a plane perpendicular to the direction of microwave energy propagation and, in doing so, traces out an ellipse, as shown in Figure 1.; the wave is said to be elliptically polarized. The relative amplitude and phase relationships of the components of a given wave define its elliptic polarization state.

(Figure 1.: see text for explanation; modified from Evans, D., Farr, T., van Zyl, J., and Zebker, H.) A polarimetric imaging radar measures the backscatter from a target as a vector quantity, in such a way that the complex backscattered characteristics of any transmitting and receiving polarization configuration can be determined. Each element of the scattering matrix is measured using one combination of transmitted and received polarization, and the remaining elements are obtained in a similar fashion. The scattering matrix thus describes how the ground scatterer transforms the illuminating electric field.

Given the scattering matrix, the backscatter for any combination of transmitting and receiving polarizations can be computed; this process is called polarization synthesis. Polarization synthesis can also be expressed in terms of either the Stokes matrix or the covariance matrix, both of which consist of linear combinations of the cross-products of the four basic elements of the scattering matrix.
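The idea of polarization synthesis can be sketched as follows: given a (hypothetical) scattering matrix for one pixel, the backscattered power for any transmit/receive pair is obtained from the corresponding polarization vectors. This is a simplified illustration; the book's own formulation in terms of the Stokes or covariance matrix is equivalent but not reproduced here.

```python
import numpy as np

def synthesized_power(scattering_matrix, transmit, receive):
    """Backscattered power for a given transmit/receive polarization pair,
    computed as |receive^T . S . transmit|^2 (unit polarization vectors assumed)."""
    voltage = receive @ scattering_matrix @ transmit
    return float(np.abs(voltage) ** 2)

# Hypothetical scattering matrix [[Shh, Shv], [Svh, Svv]] for one pixel.
S = np.array([[0.80 + 0.10j, 0.05 + 0.02j],
              [0.05 + 0.02j, 0.60 - 0.30j]])

h = np.array([1.0, 0.0])   # horizontal polarization state
v = np.array([0.0, 1.0])   # vertical polarization state

print(synthesized_power(S, h, h))   # HH (co-polarized)
print(synthesized_power(S, h, v))   # HV (cross-polarized)
```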

The entries of the Stokes and covariance matrices can also provide a variety of features for classification purposes. A polarization signature is a three-dimensional representation consisting of a plot of synthesized scattering power as a function of the ellipticity and orientation angles that define the polarization state. Normally, analyses are based on only two types of polarization signatures, namely co-polarization and cross-polarization.

The coefficient of variation (CoV) of the polarization signature is defined as the ratio of the minimum to the maximum synthesized power, Pr min / Pr max. Since CoV relates to surface roughness, it can be used as a discriminating feature in classification: as the value of CoV increases, the measured surface tends to be rougher. The concept of CoV is based on the following observations. The polarization signature for each resolution element represents the sum of the polarization signatures of many individual scatterers. If the surface being measured is smooth, the scattering mechanisms of the group of scatterers should be identical; therefore, the maxima (minima) of one scattering mechanism should coincide with the maxima (minima) of the other scattering mechanisms.

When the composite polarization signature is derived, it will therefore show a large difference in magnitude between the maximal and minimal backscatter, and the polarization signature will exhibit pronounced peaks and valleys; as a result, the value of the CoV will be small (i.e., close to zero). Conversely, if the measured ground surface is rough, several different scattering mechanisms may be present, the backscatter maxima and minima of the individual scatterers will not coincide, and a relatively flat polarization signature will be produced (equivalently, CoV will be large, i.e., close to one).

Speckle has the characteristics of a random multiplicative noise defined below in the sense that as the average grey level of a local area increases, the noise level increases. SAR can achieve high resolution in the azimuth direction independent of range, but the presence of speckle decreases the interpretability of the SAR imagery. If such imagery is to be used in classification, then some form of preprocessing to reduce or suppress speckle is necessary. There are two approaches to the suppression of radar image speckle. The first method is known as the multilook process, while the second method uses filtering techniques to suppress the speckle noise.

Averaging several independent observations (looks) of the same area reduces the variance of the speckle; this procedure is called multilook processing. As the radar sensor moves past the target pixel, it obtains multiple looks (i.e., repeated observations) of that pixel, and if these looks are spaced sufficiently far apart they can be considered to represent individual observations. The relationship among the radar aperture length L, the resolution R, and the number of independent samples Ns is given by Ulaby et al.

Although the averaging of independent looks can reduce the noise variance, it also causes degradation in image resolution.


Adaptive filters use weights that depend on the degree of speckle in the image, whereas nonadaptive filters use the same set of weights over the entire image. Adaptive filters are therefore more likely to preserve details such as edges or high-texture areas. The best-known nonadaptive filters are those based on the use of the mean or the median. The mean filter uses the same set of smoothing weights for the whole image without regard for differences in image texture, contrast, and so on.

The median filter does not use a weighting procedure, but is based on the ranking of image pixel values within a specified rectangular window Mather, Both of these filters have a speckle-suppression capability, but they also smooth away other high-frequency information. The median is more effective than the mean in eliminating spike noise while retaining sharp edges. Both filters are easily implemented and require less computation than adaptive filters. In comparison with nonadaptive speckle filters, adaptive speckle filters are more successful in preserving subtle image information.

A number of adaptive speckle filters have been proposed, the best known being the Lee filter (Lee) and the Kuan filter (Kuan et al.). The effectiveness of these adaptive filters depends on the following three assumptions (Lee; Lopes et al.), and a sketch of the Lee filter is given after the list:
1. SAR speckle is modeled as multiplicative noise (the visual effect of multiplicative noise is that the noise level is proportional to the image gray level).


2. The noise and the signal are statistically independent.
3. The sample mean and variance of a pixel are equal to its local mean and local variance, computed within a window centered on the pixel of interest.
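Given these assumptions, a common form of the adaptive Lee filter can be sketched as follows; the window size, the noise coefficient of variation, and the variable names are our assumptions rather than values from the book.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def lee_filter(image, window=7, number_of_looks=1):
    """Adaptive Lee speckle filter (multiplicative noise model).

    Weight w = 1 - Cu^2 / Ci^2, where Cu is the speckle coefficient of
    variation (about 1/sqrt(looks) for intensity data) and Ci is the local
    coefficient of variation.  Homogeneous areas (Ci ~ Cu) are strongly
    smoothed, while edges and textured areas (Ci >> Cu) are largely preserved.
    """
    img = image.astype(np.float64)
    local_mean = uniform_filter(img, size=window)
    local_sq_mean = uniform_filter(img ** 2, size=window)
    local_var = np.maximum(local_sq_mean - local_mean ** 2, 0.0)

    cu2 = 1.0 / number_of_looks
    ci2 = local_var / np.maximum(local_mean ** 2, 1e-12)
    weight = np.clip(1.0 - cu2 / np.maximum(ci2, 1e-12), 0.0, 1.0)

    return local_mean + weight * (img - local_mean)
```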


All of the speckle filters described above rely strongly on a good estimate of the local statistics (e.g., the local mean and variance). If the window center is located close to the boundary of an image segment (such as a boundary between agricultural fields), the resulting local statistics are likely to be biased and will thus degrade the filtering result; approaches to this problem are discussed by Nezry et al. Readers are also referred to Sheng and Xia and to Liu.

In recent years, the wavelet transform has been used for radar image denoising (Donoho; Fukuda and Hirosawa; Achim et al.). Used together with a logarithmic transformation, the wavelet transform can isolate the (multiplicative) speckle noise and so simplify the speckle filtering process. Details of the wavelet transform are described in Chapter 7.

The procedure consists of three steps. First, the SAR image is transformed into the logarithmic domain and is then decomposed by means of a multiscale wavelet transform; transforming the raw SAR image into the logarithmic domain converts the multiplicative noise into additive noise. Second, the empirical wavelet coefficients are shrunk using a thresholding mechanism. Finally, the denoised signal is synthesized from the processed wavelet coefficients through the inverse wavelet transform and converted back from the logarithmic domain.
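A minimal sketch of this three-step procedure using the PyWavelets package; the wavelet, the decomposition level, and the fixed soft threshold are arbitrary illustrative choices rather than the estimators discussed in the text.

```python
import numpy as np
import pywt  # PyWavelets

def wavelet_despeckle(image, wavelet="db4", level=2, threshold=0.1):
    """Three-step wavelet despeckling sketch:
    1. log transform (multiplicative -> additive noise),
    2. soft-threshold the detail coefficients,
    3. inverse transform and exponentiate back."""
    log_img = np.log(image.astype(np.float64) + 1.0)

    coeffs = pywt.wavedec2(log_img, wavelet, level=level)
    denoised = [coeffs[0]]                       # keep the approximation band
    for details in coeffs[1:]:
        denoised.append(tuple(pywt.threshold(d, threshold, mode="soft")
                              for d in details))

    rec = pywt.waverec2(denoised, wavelet)
    rec = rec[:image.shape[0], :image.shape[1]]  # crop possible padding
    return np.exp(rec) - 1.0
```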

It should be noted that the quality of wavelet-based speckle suppression is closely related to the thresholding method used (Donoho). Useful estimators and modeling techniques have been proposed by Simoncelli, Pizurica et al., Gagnon and Jouan, and Achim et al. An image contaminated by speckle noise is displayed in Figure 1.

Further reading on the material covered in this chapter is provided by Elachi, Liang, Slater, and contributors to Asrar.

In pattern recognition, each object to be classified is represented by a pattern. This pattern is made up of measurements on a set of features, which can be thought of as the axes of a k-dimensional space, called the feature space. The aim of pattern recognition is to establish a relationship between a pattern and a class label.

The relationship between the object and the class label may be one-to-one (producing a hard classification) or one-to-many (producing a fuzzy classification). The features describing the object may be spectral reflectance or emittance values from optical or infrared imagery, radar backscatter values, secondary measurements derived from the image (such as texture), or geographical features such as terrain elevation, slope, and aspect.

    The object may be a single pixel or a set of adjacent pixels forming a geographical entity, such as an agricultural field. Finally, the class labels may be known or unknown in the sense that the investigator may, in the case of a known label set, be able to list all of the categories present in the area of study. In other cases, the investigator may wish to determine the number of separable categories and their location and extent.

These two methods of labeling are known as the supervised and unsupervised approaches, though some approaches to pattern recognition use a combination of both. For instance, if one decides to use the minimum distance between a pixel and the mean of each class as the classification criterion, one must first collect samples from which to estimate the class means, and the acceptability of the results will depend on how adequately these class means are estimated. The label set selected for supervised classification experiments identifies information classes.

    The investigator should have sufficient knowledge of the type and the number of information classes that are represented in the study area to allow him or her to collect training samples of pixels from the image that are representative of the information classes. In contrast, unsupervised pattern recognition methods are less dependent on user interaction.

    At each iteration, the previous set of estimates of the class means is refined until the process converges, usually when the means remain in the same place in feature space over successive iterations. The results output by unsupervised methods are called clusters or, sometimes, data classes. The pattern recognition process is complete when each cluster is identified, that is, linked to a specific information class by the user. Although the unsupervised approach appears to be more elegant and automatic than the supervised procedures, the accuracy of unsupervised methods is generally lower than that achieved by supervised methods.

    In complex classification experiments, information classes often overlap. In the spectral domain, this implies that the reflectance, emittence, or backscatter characteristics of different classes may be similar. In the spatial domain, the implication is that any one object a pixel or a field, for example may contain areas representative of more than one information class. This is the mixed pixel problem. Spectral and spatial overlap of classes is the main barrier to the achievement of high classification accuracy. Even so, some interesting unsupervised algorithms are worthy of investigation as they may reveal useful information concerning the structure of the data set.

Such methods can be thought of as exploratory data analyses, or even data mining (Witten and Frank). A further problem with pixel-based classifiers is that the radiance that apparently reaches the sensor from a given pixel actually includes contributions from neighboring pixels, due to atmospheric effects and the properties of the instrument optics (Chapter 1); Townshend et al. note that this effect will be unimportant only where the pixel size is small relative to the area of the land cover units. Until comparatively recently, pattern recognition methods applied to remotely sensed imagery were mainly based on conventional statistical techniques, such as the maximum likelihood or minimum distance procedures, using a pixel-based approach.

    Although these traditional approaches can perform well, their general ability for resolving interclass confusion is limited. As a result, in recent years, and following advances in computer technology, alternative strategies have been proposed, particularly the use of artificial neural networks, decision trees, the support vector machine, methods derived from fuzzy set theory, and the incorporation of secondary information such as texture, context, and terrain features.

    This chapter introduces the principles of pattern recognition, starting from the concept of feature space and its manipulation using feature selection and orthogonalizing techniques. Details of the statistical classifiers are then described. Algorithms based on artificial neural networks, decision trees, the support vector machine, the fuzzy rule base concept including the mixed pixel problem , the incorporation of secondary information, and change detection are discussed in later chapters.

The features used to characterize an object may be image pixel values in separate wavebands, context or texture measurements, or geographical attributes of the area represented by the pixel, such as mean elevation, slope angle, or slope azimuth. Feature space is multidimensional and as such cannot be visualized. Nevertheless, standard geometrical measures, such as the Euclidean distance (the shortest distance between two points), are still valid; Alt gives a nonmathematical description of hyperspace. The pattern recognition process involves the subdivision of feature space into homogeneous regions separated by decision boundaries (Figure 2.).

The various statistical, neural, and knowledge-based methods discussed in this book use different decision rules to define or specify these boundaries; in the fuzzy classification procedure (Chapter 5), decision boundaries can overlap. One aim in preparing data for classification is to reduce the dimensionality of, and the redundancy within, the feature set, and orthogonal transforms, which accomplish this aim to a greater or lesser extent, are described later in this section. In a survey of pattern recognition methodology in remote sensing, Landgrebe notes the need for accurate class statistics if supervised classifiers are to perform in a satisfactory way.

The minimum size of a training data set depends to a considerable extent on the number of features used to characterize the objects to be classified. However, ground data are difficult and expensive to procure. Landgrebe points out that the number of possible locations in feature space depends on the number of features and on the number of quantization levels used for each feature. Given a fixed training set size, an increase in the dimensionality of the feature space therefore means that the training samples are spread ever more thinly, with important consequences for the estimation of class statistics.

The maximum likelihood decision rule described in Section 2. requires the estimation of a mean vector and a covariance matrix for each class from the training data, the covariance matrix of k features containing k(k + 1)/2 distinct terms. As the value of k increases, the number of parameters to be estimated increases disproportionately. The fact that efficient estimation of statistical parameters requires a representative sample of sufficient size is well known; consequently, as the number of parameters increases then, for a fixed sample size, the efficiency of the estimation decreases, which implies that the confidence limits for each estimate become wider.

The effectiveness of the classifier will therefore begin to decrease once a certain number of dimensions is reached. This is known as the Hughes phenomenon (Hughes). It follows that if satisfactory results are to be obtained from the classification of remotely sensed data, the relationship between dimensionality and training sample size must be borne in mind.
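The disproportionate growth in the number of parameters can be illustrated with a quick count for a Gaussian maximum likelihood classifier (a standard calculation, not a table taken from the book):

```python
def gaussian_parameters_per_class(k):
    """Parameters estimated per class by a Gaussian maximum likelihood
    classifier: k mean values plus k(k + 1)/2 distinct covariance terms."""
    return k + k * (k + 1) // 2

for k in (4, 6, 30, 200):
    print(k, gaussian_parameters_per_class(k))
# 4 bands -> 14 parameters, 6 -> 27, 30 -> 495, 200 -> 20300
```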

The topic of sampling adequacy is considered further in Section 2. One way of mitigating the effects of high dimensionality is to determine a subset of the k-dimensional feature space that contains most of the pixels; this is the aim of orthogonal transforms. If the measurements made by two instruments are correlated, then there is a degree of redundancy present in the data. The features used in remote sensing image classification are generally correlated, and a high proportion of the information content of the data can be represented in terms of m dimensions, where m is less than k. (Another way of easing the estimation problem is to compute a single, pooled covariance matrix, so that all training data are used, rather than compute separate covariance matrices for each class.)

An alternative approach involves the use of orthogonal transforms, which provide a means of reducing feature space dimensionality (Nielsen; Park et al.). In essence, the aim of an orthogonal transform is to replace the feature set with a derived, or synthetic, feature set. The covariances of the synthetic features are zero; hence there is no redundancy.

The synthetic features are defined by linear combinations of the observed features, and much of the information content of the observed features is reproduced by m synthetic features, where m is less than k. A well-known example is the tasseled cap transformation, originally defined for the four bands of Landsat MSS imagery. The axes of this four-dimensional feature space are transformed into new four-dimensional coordinates defined by the concepts of brightness, greenness, yellowness, and nonesuch. The transformation involves rotation of the axes of feature space and translation of the origin of the coordinate system.

For example, one of the axes could be moved to a position such that pixels with a low value of the ratio of infrared to red reflectance take low (near-zero) values, while pixels that have a high infrared-to-red ratio take high values. The first transformed axis, brightness, is based on soil reflectance values. The second and third axes are based on pixels of green vegetation and senescent vegetation, respectively. The fourth axis, nonesuch, has been interpreted as being related to atmospheric conditions.

    To construct such a new coordinate system, the user selects at least two representative pixel values for each coordinate because the definition of a line requires at least two points. Each representative pixel value can be obtained by taking the average of pixels belonging to the same group. For instance, in the case of soil line construction, one can select several pixels belonging to wet soil class, then take the average as one end of the soil line.

The other end can be obtained from the average of the dry soil pixels. Coefficients for the tasseled cap transformation have been derived for a number of sensors (Crist and Cicone; Huang et al.); readers should refer to Mather for further elaboration. The advantages of the tasseled cap transform are:
1. The dimensionality of the feature space is reduced, making the classification problem less complex.

2. The axes of the feature space represent specific concepts (brightness, greenness, and wetness) that can be considered to be defined externally to the specific data set under study.
The principal disadvantages of the transform are:
1. The tasseled cap axes may not be well defined for a particular problem if the coefficients are not properly calculated (Jackson).
2. There can be no assurance that significant information is not omitted by the transformation of the six-band Landsat TM data set to a set of three tasseled cap axes.

The process of principal components analysis (PCA) can be divided into three steps:
1. Calculation of the variance–covariance (or correlation) matrix of the multiband images,
2. Extraction of the eigenvalues and eigenvectors of the matrix, and
3. Transformation of the feature space coordinates using these eigenvectors.
If the correlation matrix is used, each entry in the covariance matrix C is further divided by the product of the standard deviations of the features represented by the corresponding row and column.

For instance, let c12 denote the entry in the first row and second column of C (i.e., the covariance between features 1 and 2); the corresponding correlation is obtained by dividing c12 by the product of the standard deviations of features 1 and 2, and the correlations for the other entries can be computed in a similar manner. The new coordinate system is formed by the normalized eigenvectors of the variance–covariance (or correlation) matrix. PCA has the property that the first principal component image (PC1, derived from the first eigenvector) represents the maximum amount of the total variance of the data set, and the variances of the remaining principal component images decrease in order, as denoted by the magnitudes of the corresponding eigenvalues. Normally, the variance contained in the last few principal components is small.
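The three steps can be sketched in a few lines (a generic implementation; the array shapes and names are ours):

```python
import numpy as np

def principal_components(pixels, use_correlation=False):
    """PCA of an (n_pixels, n_bands) array following the three steps above:
    form the covariance (or correlation) matrix, extract its eigenvectors,
    and project the data onto them."""
    x = pixels - pixels.mean(axis=0)
    if use_correlation:
        x = x / x.std(axis=0)               # standardize -> correlation matrix
    c = np.cov(x, rowvar=False)

    eigenvalues, eigenvectors = np.linalg.eigh(c)
    order = np.argsort(eigenvalues)[::-1]   # sort by decreasing variance
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

    scores = x @ eigenvectors               # principal component images
    explained = eigenvalues / eigenvalues.sum()
    return scores, explained

# Toy example: three strongly correlated "bands".
rng = np.random.default_rng(0)
base = rng.normal(size=(1000, 1))
bands = np.hstack([base + 0.05 * rng.normal(size=(1000, 1)) for _ in range(3)])
_, explained = principal_components(bands)
print(explained)   # almost all of the variance is in the first component
```

In the toy example nearly all of the variance is captured by the first component, which is typical of correlated spectral bands.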

    Therefore, PCA is often used to condense the information in a multiband image set into fewer channels represented by the higher-order components , and input them, rather than the raw data, into a classifier, thus reducing the computational demands and possibly improving performance. It may also be of interest to note that the locations of the principal component axes in the feature space are fixed by reference to two constraints.

    First, the axes are orthogonal. Second, each axis accounts for the maximum variance in the data set given that the influence of any higher-order components has been removed. Relaxation of the second constraint allows the use of orthogonal rotations that might make the resulting components more interpretable. It is also possible to relax the first constraint, but that does not seem to be helpful in the present context.

Figure 2. illustrates the principle for a two-band example. Here, the original pixel values in the two images fall in the ranges [A, B] and [C, D], respectively. After the covariance (or correlation) matrix is formed, the resulting eigenvectors are orientated in the directions [E, F] and [G, H], respectively. The data range [G, H] of PC2 is much smaller, and the corresponding principal component image will show less variation. However, the information content of the two axes will differ, in that PC1 will represent the information that is shared between all or most of the original spectral bands (and may therefore be thought to represent average brightness), whereas the second and subsequent components will contain information that is statistically uncorrelated with brightness.

    For example, the second principal component of a multispectral image set of a vegetated area may be related to variations in the nature, vigor, and spatial cover of the vegetation. The resulting principal component images are shown in Figure 2. As noted earlier, PCA can be based either on the matrix of variances and covariances or the matrix of correlations among the spectral bands of an image set, which can be derived either from sample data or from all the pixels of the image set.

These matrices have different eigenvectors, and thus the resulting principal component images will also be different. If the images subjected to PCA are not measured on the same scale, the standardized method is a better choice than the unstandardized method, because the correlation matrix normalizes the data onto the same scale (i.e., each feature has unit variance). Users should be aware that the use of the correlation matrix implies the equalization of the within-band variance, and should also note that the technique does not differentiate between information and noise.

Switzer and Green propose a method of distinguishing between signal and noise, and suggest that their approach should be preferred to filtering because it does not blur the data. Their method is based on the principle that information, or signal, is spatially autocorrelated, in the sense that pixels in a neighborhood will tend to have similar values because they represent some geographical object such as a lake or a forest.

    The histograms show that the variance of the first principal component exceeds that of the second, which in turn is greater than the variance of the third component. The MAF procedure is based on the ordering of set of orthogonal functions in such a way that autocorrelation decreases from the low-order to the high-order functions, that is, autocorrelation is minimized rather than variance being maximized, as is the case with PCA. The lower-order functions may therefore be expected to contain mainly nonautocorrelated noise while the higher-order components would represent information.

    The procedure is scale-free in the sense that the same result is achieved irrespective of the scale of measurement used for each feature. This is because the procedure maximizes a ratio—that of signal to noise. A detailed account of the computational procedures involved in the MAF transform is provided by Nielsen and Nielsen et al.

Switzer and Green suggest that the original data set is converted to two difference images, one being shifted by one pixel horizontally and the other being shifted by the same amount vertically. A closely related transform is the minimum noise fraction (MNF) transform: the MNFs produce components that successively maximize the signal-to-noise ratio, just as PCA generates components that successively maximize the variance. The noise variance can be estimated in a number of ways.

For example, the differences between adjacent pixels can be used, as described in Section 2. Other methods of estimating the noise covariance matrix CN are considered at a later stage. Martin and Wilkinson show how this equation is solved by reduction to standard form, and the computed eigenvectors are used to derive component images, as described in Section 2. Clearly, the nub of the issue is the estimation of CN. Nielsen lists five methods, partly derived from Olsen, including simple differencing (the method of Section 2.) and taking the difference between the value of the pixel of interest and the local mean computed within a rectangular window.

Further methods replace the local mean by the local median, or compute the residual from a local quadratic surface fitted to the pixel values neighboring the pixel of interest. Nielsen also considers the problem, mentioned previously, of periodic noise, such as banding due to differences in sensor calibrations. Banding, which is seen as horizontal striping, is obviously autocorrelated, and so will be identified as part of the signal.

Nielsen suggests that frequency-domain filtering (Mather) be used prior to the calculation of the noise covariance; the use of wavelets to estimate the noise is another possible approach (Mather). Other pertinent references to the derivation and use of MNFs are Green et al. and Nielsen et al.

As the number of input features n increases, the computational cost of classification rises nonlinearly.

The use of large artificial neural networks also results in increased training times. Two approaches can be used to reduce the number of input features without sacrificing accuracy. One is to project the original feature space onto a subspace of lower dimensionality.


This can be done using an orthogonal transformation, as described in Section 2. The second method is to use separability measurements in the input feature space, and then to select the feature subset for which separability is a maximum; the aim is to reduce the feature space dimension without prejudicing classification accuracy. Two separability indices, the divergence index (Singh) and the B-distance, or Bhattacharyya distance (Haralick and Fu), are widely used.

The divergence index Dij between two classes i and j is derived from the likelihood ratio for the pair of classes. In some cases a transformation of Dij, called the transformed divergence TDij, is used. The greater the value of Dij or TDij, the greater is the class separability in the selected m-dimensional feature subset. The B-distance Bij is defined in a similar pairwise fashion, and the sum of Bij over every pair of distinct classes provides a measure of the overall separability.
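A sketch of the two separability measures for a pair of classes assumed to be Gaussian; the formulas below are the standard forms of the Bhattacharyya distance and the transformed divergence (the scaling constant of the latter is sometimes quoted as 2000 rather than 2), and the example class statistics are invented.

```python
import numpy as np

def bhattacharyya_distance(m1, c1, m2, c2):
    """B-distance between two classes assumed to be Gaussian."""
    cm = (c1 + c2) / 2.0
    dm = (m1 - m2).reshape(-1, 1)
    term1 = 0.125 * float(dm.T @ np.linalg.inv(cm) @ dm)
    term2 = 0.5 * np.log(np.linalg.det(cm)
                         / np.sqrt(np.linalg.det(c1) * np.linalg.det(c2)))
    return term1 + term2

def transformed_divergence(m1, c1, m2, c2):
    """Divergence D_ij and its bounded transform TD_ij = 2(1 - exp(-D/8))."""
    i1, i2 = np.linalg.inv(c1), np.linalg.inv(c2)
    dm = (m1 - m2).reshape(-1, 1)
    d = 0.5 * np.trace((c1 - c2) @ (i2 - i1)) \
        + 0.5 * np.trace((i1 + i2) @ dm @ dm.T)
    return 2.0 * (1.0 - np.exp(-d / 8.0))

# Two hypothetical classes in a two-band feature space.
m_a, c_a = np.array([40.0, 60.0]), np.array([[25.0, 5.0], [5.0, 16.0]])
m_b, c_b = np.array([55.0, 70.0]), np.array([[30.0, 4.0], [4.0, 20.0]])
print(bhattacharyya_distance(m_a, c_a, m_b, c_b))
print(transformed_divergence(m_a, c_a, m_b, c_b))
```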

Unsupervised classification, in contrast, does not require the user to provide training data for each class. Instead, the user specifies only the number of clusters to be generated, and the classifier automatically constructs the clusters by minimizing some predefined error function; sometimes the number of clusters can be detected automatically by the classifier (Cheung). In theory, users do not need to interact with the classifier, which operates independently and automatically. The best-known such procedure, the k-means algorithm, is implemented by recursively migrating a set of cluster means (centers) using a closest-distance-to-mean approach until the locations of the cluster means are unchanged, or until the change from one iteration to the next is less than some predefined threshold.

    Change may also be defined in terms of the number of pixels moving from one cluster to another between iterations, or by the value of a measure of cluster compactness, such as the sum of squares of the deviations of each pixel from the center of its cluster, summed for all classes.

To use the k-means algorithm, one must estimate the initial number of clusters, say n, present in the data. The algorithm initially determines the location of n cluster means within the feature space, either by generating random feature vectors, by selecting n pixels at random from the available data, or by using a predefined set of feature vectors. Each pixel is then associated with its nearest cluster center, where "nearest" is defined by either the Euclidean or the Mahalanobis distance measure.

At the next stage, the location of each cluster mean is recalculated on the basis of the set of pixels allocated to that center, and the process is repeated until the change between iterations becomes less than a user-specified threshold. The dimension of the pixel vector xi is equal to the number of bands being used as input, and the matrix Cj is the covariance matrix of cluster j (Equation 2.). The Mahalanobis distance takes into account the shape of the frequency distribution (assumed to be Gaussian) of a given cluster in feature space, resulting in ellipsoidal clusters, whereas the use of the Euclidean distance assumes equal variances and zero correlation between the features.
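A compact sketch of the procedure just described, using the Euclidean distance and random selection of the initial centers (the tolerance and iteration limit are arbitrary):

```python
import numpy as np

def k_means(pixels, n_clusters, max_iter=100, tol=1e-4, seed=0):
    """Plain k-means: assign each pixel to its nearest cluster mean,
    recompute the means, and repeat until the means stop moving."""
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), n_clusters, replace=False)]

    for _ in range(max_iter):
        # squared Euclidean distance of every pixel to every center
        d2 = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)

        new_centers = np.array([
            pixels[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(n_clusters)])

        if np.abs(new_centers - centers).max() < tol:
            centers = new_centers
            break
        centers = new_centers
    return labels, centers

# Toy example: two well-separated groups of pixel vectors.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(6, 1, (50, 2))])
labels, centers = k_means(data, 2)
print(centers)
```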

Where these assumptions do not hold, the decision to place a pixel in cluster a or cluster b on the basis of the Euclidean distance may be ambiguous. In addition to the above considerations, the initialization of the cluster centers may also play an important role in determining the clustering outcome, as examined by Pena et al. A fuzzy alternative to such hard clustering is provided by the fuzzy c-means (FCM) algorithm. In general, the performance of fuzzy clustering methods is superior to that of the corresponding hard versions, and they are less likely to become stuck in a local minimum (Bezdek). The FCM algorithm and its derivatives have been implemented successfully in many applications, such as pattern classification and image segmentation, especially those in which the final goal is to make a crisp decision.

The FCM algorithm uses the probabilistic constraint that the membership probabilities of a data point across the classes must sum to one. This constraint comes from generalizing a crisp c-partition of a data set, and is used to generate membership-update equations for an iterative algorithm based on the minimization of a least-squares criterion function. The constraint on membership used in FCM is intended to avoid the trivial solution of all membership probabilities being equal to zero. A more detailed description of FCM is provided in Chapter 5.
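A sketch of the two update equations that follow from this constrained least-squares criterion; the fuzziness exponent m = 2 is a common default, and the function names are ours.

```python
import numpy as np

def fcm_memberships(pixels, centers, m=2.0):
    """Fuzzy c-means membership update: u_ij = 1 / sum_k (d_ij / d_kj)^(2/(m-1)),
    so that the memberships of each pixel across the c clusters sum to one."""
    d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
    d = np.maximum(d, 1e-12)                     # avoid division by zero
    power = 2.0 / (m - 1.0)
    u = 1.0 / ((d[:, :, None] / d[:, None, :]) ** power).sum(axis=2)
    return u                                     # rows sum to 1

def fcm_centers(pixels, u, m=2.0):
    """Cluster centers as the membership-weighted means of the pixels."""
    w = u ** m
    return (w.T @ pixels) / w.sum(axis=0)[:, None]

# Tiny check: two pixels near one center, one near the other.
x = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
v = np.array([[0.0, 0.0], [5.0, 5.0]])
u = fcm_memberships(x, v)
print(u.sum(axis=1))   # each row sums to 1
```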

In supervised classification, performance is highly dependent on how well the user is able to model the target class distributions. Ideally, the training areas should be sites where homogeneous examples of known cover types are found (Townshend). A supervised statistical classification can be carried out in the following three steps:
1. Define the number and nature of the information classes, and collect sufficient and representative training data for each class;
2. Estimate the required statistical parameters from the training data; and
3. Allocate each pixel to a class by applying a decision rule based on the estimated parameters.
