Published in Soil Sci. Soc. Am. J. 67:1810-1822 (2003).
© 2003 Soil Science Society of America
677 S. Segoe Rd., Madison, WI 53711 USA
DIVISION S-5PEDOLOGY
Supervised Landform Classification to Enhance and Replace Photo-Interpretation in Semi-Detailed Soil Survey
Tomislav Hengl*,a and
David G. Rossiterb
a Dep. of Earth System Analysis, International Institute for Geo-Information Science & Earth Observation (ITC), P.O. Box 6, 7500 AA Enschede, The Netherlands
b International Institute for Geo-Information Science & Earth Observation (ITC), P.O. Box 6, 7500 AA Enschede, The Netherlands
* Corresponding author (hengl{at}itc.nl).
 |
ABSTRACT
|
|---|
A method to enhance manual landform delineation using photo-interpretation to map a larger area is described. Conventional aerial photo-interpretation (API) maps using a geo-pedological legend of 21 classes were prepared for six sample areas totaling 111 km2 in the Baranja region, eastern Croatia. Nine terrain parameters extracted from a digital elevation model (DEM) (ground water depth, slope, plan curvature, profile curvature, viewshed, accumulation flow, wetness index, sediment transport index, and the distance to nearest watercourse) were used to extrapolate photo-interpretation over the entire survey area (1062 km2). The classification accuracy was assessed using the error matrix, calculated by comparing both the whole API maps and point samples, with the results of classification. The first results, using a maximum-likelihood classifier, were 58.2% (hill land), 39.1% (plain), and 45.3% (entire area) reproducibility of the training set. Six classes in the plain were responsible for a large proportion of the misclassifications, due to an insufficiently detailed DEM and the complex nature of landforms (point bar complexes, levees, active channel banks), which cannot be explained with the terrain parameters only. Reproducibility for a simplified legend of 15 classes over the study area was improved to 65.8% (plain), 58.2% (hill land), and 63.4% (entire area) using the whole-API training set. After the simplification of legend (15) and with the iterative (3) selection of point-sample training set, classification was able to reproduce 97.6% (hill land), 86.7% (plain), and 90.2% (entire area) of the training set. The supervised classification showed fine details not achieved by photo-interpretation. The number of manual photo-interpretations that had to be prepared was reduced from 84 to 6. The methodology can be applied by soil survey teams to edit and update current maps and to enhance or replace API for new surveys.
Abbreviations: GIS, geographic information system DEM, digital elevation model API, aerial photo-interpretation ILWIS, Integrated Land and Water Information System GWD, ground water depth SLOPE, slope gradient PROFC, profile curvature TANGC, plan curvature VSHED, viewshed reflectance FLOW, accumulation flow CTI, Compound Topographic Index STI, Sediment Transport Index DISTW, distance to nearest watercourse PC, principal component
 |
INTRODUCTION
|
|---|
A PRODUCT OF THE SEMI-DETAILED soil survey is an entity-class or polygon map of soil types at a typical scale of 1:50000, with minimum legible delineations of 10 ha and optimal delineations of 40 ha (Forbes et al., 1982). This corresponds to
Order 3
to
Order 4
(Soil Survey Division Staff, 1993, Table 2-1), semi-detailed or medium intensity soil surveys (Avery, 1987, Table 2). These soil maps are intended for extensive land-use planning and to give a reasonably accurate picture of the distribution of soil types in an area at relatively low cost. The standard method of semi-detailed survey is to draw preliminary boundaries on aerial photos by means of stereoscopic landscape analysis, and then determine the soil types that occur in each map unit by field inspection of the soil at representative sites. A common inspection density is one observation per one to four map centimeters squared (Western, 1978, Table 3.3), which at this scale represents 25 to 100 ha. Often, the surveyors make sure that there is also at least one observation per each polygon. The observations are used to characterize the composition of photo-interpretation units, rather than to find or adjust every boundary.
One approach to semi-detailed survey is to study representative sample areas, typically covering about 10% of the survey area, more intensively to arrive at a better understanding of the soil-landscape relations and map unit composition. Field sampling is thus concentrated in comparison with the densities mentioned above, to an observation density of one per 2.5 to 10 ha in the sample area. This is at the cost of samples over the rest of the area, which is then mapped purely by photo-interpretation, extrapolating from the detailed understanding of the soil landscape built up in the sample areas. Because of the low inspection density, the only way that such maps can be reasonably accurate is if the surveyor is able to correctly understand the soil-landscape relations in the survey area, and then map these by surface features visible on the aerial photo (e.g., the landform as seen stereoscopically).
Several systematic approaches to soil-landscape photo-interpretation have been developed. In this study, we use the
geo-pedological
method of Zinck (Zinck, 1988; Zinck and Valenzuela, 1990), which explicitly relates landform elements to predictions of soil classes and properties. In many cases, however, including standard mapping procedures in the USA, surveyor's experience on soil-landscape relations is used without formalization (Soil Survey Division Staff, 1993, p. 219231). Jenny's conceptual equation (Jenny, 1980), that is, soil property or class = f(climate, organism, relief, parent material, time) is thus used subjectively (and sometimes subconsciously) as a concept to guide photo-interpretation. Depending on the survey area, some aspects of the equation may be more important than others; for example, on a typical hillside, the catena or topo-sequence concept may be uppermost in the surveyor's mind, whereas in areas of recent deposition, parent material and time may be more important. In many areas, landform classification or segmentation, usually by photo-interpretation, is an important step in building up the soil map, since the landform delineations are often associated directly with natural soil bodies (Buringh, 1960).
The process of quantitatively describing terrain is known as
digital terrain analysis
(Lane et al., 1998),
geomorphological analysis
(Odeh et al., 1994),
landform parameterization
or
landform morphometry
(Schmidt et al., 1998). A concise overview is given by Weibel and Heller (1991) and Moore et al. (1991). Wilson and Gallant (2000) have edited a recent overview, including an overview of applications in soil mapping. Shary et al. (2002) give an overview of morphometric terrain parameters and their characteristics. If other factors than just relief and time are considered in the predictive model, a more general term such as
soil predictive variables
(Odeh et al., 1994) or
soil environmental variables
(McKenzie et al., 1999) is used instead of soil-landscape parameters. Examples of such additional factors are temperature, rainfall, geology, ground water, land use, and vegetative cover. All these, popularly called
cheap-to-measure attributes
(Burrough and McDonnell, 1998), are inexpensive and available in continuous fashion and can be therefore used to replace part of the expensive field sampling of soils or make more efficient use of it.
For some time, there has been interest in replacing or supplementing the expert judgment of the surveyor with reproducible procedures, particularly by the use of digital terrain analysis to predict the distribution of soil properties (Moore et al., 1993; Gessler et al., 1995; Bell et al., 2000) and model depositional and erosional processes (Mitasova et al., 1996). Irvin et al. (1997) were among first to use terrain parameters to derive soil-landscape elements and provide more objective basis for production of soil maps. They compared automated classification of landforms with the manual delineations by API using a small study area. Other authors have attempted to directly derive soil classes from terrain parameters (Thompson et al., 1997; Thomas et al., 1999). Recent developments include use of automated methods to detect landform facets using unsupervised fuzzy-set classifications (Burrough et al., 2000). These are then applied even in the areas of lower relief to enhance crop production using site-specific management (MacMillan, 2000). A related effort by Zhu and collaborators (Zhu et al., 1996, 2001) attempts to infer soil classes identified by expert soil surveyors as being typical for each class directly from the terrain parameters, geological and remote sensing data.
With this background, we decided to extend the work of Irvin et al. (1997) and analyze to what extent supervised landform classification could be used to replace additional photo-interpretations at county level, that is, for areas on the order of 1000 km2. Here the basic idea was to use subjective photo-interpretation in small sample areas to train a classifier, and then apply this classifier in the extrapolation, which is analogous to the remote sensing-based land cover mapping. We then assessed the classification accuracy using the error matrix, calculated by comparing both the whole API maps and point samples, with the results of classification. In addition, we investigated whether visualization of terrain parameters in a false color composite could improve subjective photo-interpretation in the training areas. Finally, we wanted to see if the classifier would identify features or patterns that had been omitted or missed by the photo-interpreter, that is, details that are not visible on the photos.
 |
MATERIALS AND METHODS
|
|---|
Study Area
The study area of 1062 km2 corresponds to the Croatian portion of the historic region of Baranya. It is located in northeastern Croatia, in the triangle formed by the Danube River to the east, the Drava River to the southwest, and the Hungarian border to the north (centered at N 45° 42' 14'' N lat., 18° 40' 35'' E long.). It is the half of Osijek-Baranja County lying between the Drava and Hungary, and is part of the large Panonian plain, which stretches through all of Hungary and ends in the northeastern part of Croatia and in the Vojvodina region of Yugoslavia.
Soils are mostly formed in Pleistocene and Holocene sediments (Bognar, 1984). The major relief types can be seen in Fig. 1
. The study area consists of a number of different landforms and therefore was interesting to see how the classification will work for different types. The principal soil-forming factors differ between the essentially erosional hill land and depositional plain. The most extensive landscape is the fluvial plains of the Danube and Drava Rivers, with temporary and permanent swamps and a series of terraces. The higher terraces are covered with 20 to 50 m of Pleistocene loess overlying older fluvial sediments. The plains are fairly level, and cover about 85% of the total area. Elevations range from 80 to 250 m above sea level, and local relief is flat to gently undulating. About 20% of the area is in a separate landscape, Baranja Hill. This is a dissected asymmetrical horst ridge of basalt and andesite, mostly blanketed by Pleistocene loess, from a few meters on the summit to 30 m at the bottom of the glacis. In some vales and on the glacis, there is gravely colluvium eroded from bedrock.

View larger version (66K):
[in this window]
[in a new window]
|
Fig. 1. Location of the study area in Europe (upper left), main relief types as seen from digital elevation model (DEM) (upper right), and a cross-section sketch indicated by line (below). Note that the DEM is stretched to the 80- to 100-m range to emphasize the main relief types.
|
|
The average monthly temperatures vary from 0°C in January to 21°C in August and with an overall annual average of 11°C. Annual precipitation varies from 630 to 750 mm. The ground water depth varies from above the surface in flooded swamps to more then 10 m on the higher terraces and is mainly determined by the two rivers and thus varies annually and seasonally. Water tables reach their maximum in April and minimum in October. During spring, the active floodplain and the lower terraces are occasionally flooded.
Data Input and Photo-Interpretation
Topographic maps at 1:50000 covering the study area were scanned, imported to the GIS, geo-referenced to an accuracy of 2 to 8 m using overprinted grid intersections, and resampled to a 10-m cell size, all in the Integrated Land and Water Information System (ILWIS) (GIS; Unit Geo Software Development, 2001). The contour interval was 10 m, with supplementary 5-m contours and spot heights in areas of low relief. On-screen digitizing was used to create vector layers of contours and spot heights. We used the ANUDEM interpolation method with drainage enforcement (Hutchinson, 1989), as implemented in the TOPOGRID command of the ArcInfo 8 GIS (Environmental Systems Research Institute, 2001) to produce a DEM. This algorithm required about 20 iterations in areas of low relief and 100 iterations elsewhere, to reach a steady state, that is, when DEM does not change visually any more. On inspection of the results of the first efforts, artifacts such as slope breaks, cut-offs, or spurious sinks, were still clearly visible, especially in areas of low relief and on hilltops. Thus, the accuracy of terrain parameters is
less a function of absolute accuracy of elevation values than of how well and how smoothly the landscape features are modeled
(MacMillan, 2000). Thus, we decided to digitize supplementary contour lines and spot heights indicating small channels, hilltops, and ridges that were not indicated on the original topographic maps. Their elevation (±2 m) was estimated from nearby contours and field knowledge of relative local elevation differences.
From a total of 167 aerial photos covering the whole study area, we selected six training photos of 2116 ha (4.6 by 4.6 km) each, totaling 11079 ha. These were selected subjectively to provide a representative sample of major soil landscapes. Training Areas A and F covered sections of Baranja Hill, the abandoned course of the Drava, and the edge of the low terrace, while the others (B, C, D, and E) covered the terraces and the floodplain. The middle photos from triplets of photogrammetric vertical 23 by 23 cm aerial photos at approximately 1:20000 scale (see Fig. 3) were interpreted according to the geo-pedological method of Zinck (1988) and cross-checked in the field, resulting in a four-level hierarchical legend (Table 1). The minimum delineation size was 10 ha (0.4 cm2 on the map), and the minimum delineation width was 150 m (3 mm on the map), since the objective was to make a 1:50000 soil map. Twenty-one soil-landscape units (seven in the hill land, 14 in the plain) were identified, of which 13 (six in the hill land, seven in the plain) accounted for 95% of the training area.

View larger version (70K):
[in this window]
[in a new window]
|
Fig. 3. Visualization of relationship between terrain parameters and landform classes (Training Area A): (a) API delineations and main landforms and (b) boundaries overlaid over terrain parameters.
|
|
Both the photos and the interpretation overlays were scanned, imported into ILWIS, geo-referenced with an ortho-correction to a horizontal precision of 3 to 15 m, using five to eight tie-points per photo (Rossiter and Hengl, 2002).
Extraction of Terrain Parameters
The DEM was used, directly or as a component, in calculating eight terrain parameters (maps): ground water depth (GWD), slope gradient (SLOPE), profile curvature (PROFC), plan curvature (TANGC), viewshed reflectance (VSHED), accumulation flow (FLOW), Compound Topographic Index or wetness index (CTI), and sediment transport index (STI), each at 30 m grid resolution (Fig. 2)
. SLOPE, PROFC, TANGC, and VSHED were calculated directly from the DEM with 5 by 5 pixel filters in ILWIS, which implements the Zevenbergen and Thorne (1987) formulas. Independently, the distance to nearest watercourse (DISTW) was computed from a map of the drainage network.
Ground water depth was calculated by using the additional information from the topographic map and hydrological stations. The base elevation of the water table was estimated from four benchmarks at Danube and Drava River level on the edges of the study area. We used mean annual water table height measurements and second-order trend function to interpolate the water table surface for the whole area. The GWD was then calculated as the difference between the DEM and this surface. Thus, the GWD represents a slight adjustment of the DEM or relative elevation for the regional slope to the southeast.
The CTI reflects the tendency of water to accumulate at any point in the landscape, while STI reflects the erosive power of the overland flow:
 | [1] |
 | [2] |
where Af is the specific catchment or contributing area, which is the cumulative number of grid cells draining through the target cell (FLOW) and ß is the local slope angle related to that cell (Burrough and McDonnell, 1998). Both CTI and STI were calculated in ILWIS using the multiple flow direction method of Quinn et al. (1991). Since the algorithm was developed using the neighborhood operation in GIS, it needs a number of iterations as an input. Here, we used 50 iterations, that is, neighboring pixels to derive flow accumulation. This small number of iterations was sufficient, because remaining changes with further iteration were only in stream bottoms, which already had a high CTI relative to other landscape positions. A problem with the algorithm was that in pixels with zero slope, the CTI calculation fails due to division by zero. Also a zero accumulation flow is unrealistic and will produce an undefined pixel. In these positions, we approximated CTI by iteratively averaging the slope (ß) and accumulation flow maps (Af) from surrounding pixels until all zero values were replaced with small values. This has the effect of creating pools of high CTI in the plain, which in the study area is realistic due to lowest position of these areas.
The viewshed reflectance (VSHED) is a relative estimate of direct incoming radiation, that is, an estimate of the solar energy reaching the surface. It was computed using the formula estimated by Horn (1981) and described in Burrough and McDonnell (1998). Here we assumed the sun to be at an elevation of 45° and azimuth of due south. This variable was selected to present different expositions and environmental conditions.
During the creation of the predictors, it became clear that there were major differences in central values and spread of terrain parameters (as seen on histograms). Some predictors (SLOPE, VSHED, CTI) showed asymetrical, log-normal, while others (GWD, STI, and DISTW) inverse distributions. This asymmetry in histograms reflects with a low contrast in images due to the domination of the plain landforms. In addition, there was a significant inter-predictor correlation between the two major landscapes (hill land and plain) (Table 2). Especially PROFC and TANGC are inversely related, as are CTI and SLOPE, while the strongest correlation show STI and SLOPE. Similarly, higher elevations are associated with steeper slopes and lower CTI. Because of these correlations and the difference in spread among predictors, we assumed that the data reduction by factor analysis in a GIS (Eastman and Fulk, 1993) would be an effective transformation to reduce multicollinearity and improve the contrast in predictors.
View this table:
[in this window]
[in a new window]
|
Table 2. Correlation between predictive variables for whole-area, hill land, and plain only. The most significant values (>0.5) are italicized.
|
|
The principal component analysis (Table 3) shows that the first five components account for more than 80% of total variance. The most significant data reduction is in hill land, where first three components accounted for almost 75% of total variance. Still, the high proportion in higher components in whole area shows that the predictors had a fair degree of independence, which is often not the case with remote sensing images.
View this table:
[in this window]
[in a new window]
|
Table 3. Variance proportions explained by the principal component analysis for stretched principal components (PC1 to PC9).
|
|
Since the predictors are expressed in different units with widely different ranges and distributions, all the maps of terrain parameters were rescaled to a dynamic range of 0 to 255, which is the one byte per pixel structure typical for satellite images. In this case, we decided to use a linear stretch with 0.5% truncation in each tail. The principal components were normalized to the same range but without truncation. The landform maps were then ready to be used in an image processing software (ILWIS) as
synthetic bands,
that is, to classify the whole area as in the case of classification of remote sensing images (Janssen and Huurneman, 2001).
Training and Classification Stage
Two methods for selecting training samples were compared. In the first, the entire area of the interpreted photographs, that is, API maps (in further text whole-API training set). In the second, training samples were created by manual selection on-screen of about 100 pixels within each photo-interpretation unit in the sample areas (in further text point-sample training set). Here by central concept we consider locations, which were in our mental model typical representatives of landform classes when observed stereoscopically. In addition, the photo-interpretation units were displayed as boundaries over false-color composites of synthetic bands and then point-samples checked to ensure that they fall in relatively homogenous facets. The selection of variables for the three colors and their contrast was adjusted repeatedly to highlight the differences between API units. Thus the second method allows more precision, as the photo-interpretation units typically have inclusions within their minimum legible delineation that may confuse the classifier. This subjective process is an extension of subjective photo-interpretation: the analyst is asked to find locations within each landform class where representative soils should be found, according to landform. In photo-interpretation, a three-dimensional model is constructed in the analyst's visual perception by comparing adjacent photos of a stereo-pair, whereas in on-screen interpretation, a color composite is adjusted by the analyst until key geomorphic differences are evident by color alone.
To investigate whether different major landscapes should be classified separately, we also divided the two landscapes (hill land and plain) by a clearly visible master line. This line was manually delineated on-screen by visually interpreting a color composite of elevation and slope map. The two major landscapes were then classified separately and then merged to a single map.
The training samples were then used as input to maximum likelihood classifiers (Lillesand and Kiefer, 2000, Section 7.9), with no distance thresholds, so that all pixels were classified. Automated (classification) and manual (photo) API maps were compared over the entire training area with a confusion matrix, with two test criteria: (i) the proportion of agreement between the two classifications, and (ii) the kappa coefficient, which accounts for chance agreement (Congalton and Green, 1999). In addition, the nature and seriousness of the errors was evaluated subjectively, both from the confusion matrix and by a visual comparison of the maps. For the inclusive training set, the same samples were used for classification and accuracy assessment. Thus in this context
accuracy
is more properly termed
reproducibility,
that is, the degree to which the automated classifier could reproduce the subjective photo-interpretation and sample point selection.
 |
RESULTS
|
|---|
Different terrain parameters, when examined visually, have shown stronger relationship with the delineations in hill land and in plain. In hill land, the CTI, GWD, SLOPE, and PROFC showed strongest correspondence with the manual delineations (Fig. 3) . When evaluated in the feature space (scatter plots), landform classes in the hill land area showed different clustering, while in the case of landforms in plain, the clouds of points were narrow and adjacent to each other (Fig. 4)
. For example, to distinguish between a channel (Pl312) and terrace (Pl311), a small difference in GWD matters. Mapping these classes is therefore much more dependent of how well are the training pixels selected. The overlay of the photo-interpretation boundaries on a false-color composites (see http://www.itc.nl/library/Academic_output/2003/ [verified 6 June 2003]) clearly showed deficiencies in the photo-interpretation. First, some areas had not been correctly identified by photo-interpretation. For example, sharp bands of bright yellow indicate steep slopes at high elevations; these are either transitions (boundaries) between higher and lower soil-landscape units (e.g., summit Hi111 and shoulder/backslope Hi112) or, if wide enough (>150 m at this scale), units of the scarp (Hi211). Second, some areas, while correctly identified, should have their boundaries adjusted to increase their homogeneity. These adjustments are easily achieved with on-screen digitizing. In this sense, the color composite provides an objective visualization of the geomorphology to supplement the stereovision of the photo-interpreter.

View larger version (12K):
[in this window]
[in a new window]
|
Fig. 4. Scatter-plot of feature space formed using terrain parameters stretched to 0 to 255 range. CTI and SLOPE were shown to be good predictors in the hill land area (left), while DISTW and GWD show separation between the landform units in the plain (right).
|
|
Classification with principal components showed that the first three components were sufficient in the hill land (47.5% of overall accuracy), but that in the plain, substantial improvement in the classification continued through the eighth component (Table 4). This means that information in higher-level components is still useful for the classification of the landform classes and should not be discarded. Finally, because of the low data redundancy in this data set, there was little advantage in working further with principal components, so we concentrated on the original terrain parameters instead. In addition, the principal components are harder to interpret and therefore unpractical for visualization using false-colors or scatter plots.
Each class in the training set needed a non-zero estimate of the variance. Otherwise poorly conditioned or singular matrices will give unreliable results on inversion, and probability classifiers such as maximum likelihood will fail. This has happened several times with our classification when we selected points from terraces where all pixels either had the same SLOPE or GWD. So it is not possible to only select central concepts, some variability must be also included. We achieved this by iteratively inspecting the scatter plots of all band combinations to ensure separability and variability of point clusters (training set).
Reproducibility
Initial attempts to classify the entire landscape, never achieved better than about 50% overall accuracy. The classifications using nine predictors, either as original predictors or their principal components, and all API legend classes showed clear differences between methods but similar overall results. The maximum-likelihood classification gave 45.3% (Kappa = 42.6%) with whole-API training set (Table 5) and 36.8% with point-sample training set overall reproducibility. The corresponding figures for the classification of separate landscapes were 58.1 and 51.6% (hill land), and 39.1 and 34.4% (plain). The whole-API set was consistently superior to the point-sample set, and classification of separate landscapes was superior to a single classification, but in no case was the classification accuracy satisfactory (>80%).
Considering the major classification errors in the plain, both whole-API and point-samples gave similar results. When compared in the whole area, classes Pl115 (cut-off channels) and Pl121 (point bar complex in the floodplain) were grossly over-classified, mostly at the expense of Class Pl111 (floodplain) (Table 5). Class Pl221 (abandoned point-bar complexes) was also grossly over-classified, at the expense of five other classes (Pl111, Pl113, Pl211, Pl311, and Pl411). This shows that the complex landform facets cannot be easily distinguished by using the terrain parameters. The overall poor results can be attributed to poor separation in feature space, especially in plain region. Since these classes occupied a small proportion of the training areas, they were eliminated from the legend. Similar considerations applied to Pl112 (levee), Pl122 (active channel banks), and Pl213 (small elevations on the low terrace). Together these classes occupied only 3.5% of the training API in the study area (4.7% of the plain).
In the hill land, no single class contributed disproportionally to the poor reproducibility. Results were sensitive to sampling method. For the whole-API set, Hi112 (shoulder/backslope) was over-classified, mostly at the expense of Hi311 (vale slopes) and Hi212 (colluvial footslopes of escarpments). The latter may be due to uncertain placement of the photo-interpretation boundary between these two adjacent units. For point sampling, Hi312 (vale bottoms) was over-classified, mostly at the expense of Hi212 and Hi411 (glacis). In both cases there was substantial confusion between most classes, resulting in moderate overall accuracy. In the case of whole-API set, this is attributed to the heterogeneous nature of the landform predictors, even in homogeneous photo-interpretation units. This cannot be corrected, so the 58.1% reproducibility in the hill land is the best possible with this set of predictors.
Improving Reproducibility
To improve reproducibility, we reduced the original legend according to the results of the first classifications. This corresponded to a priori ideas about what differences might be difficult to detect by landform analysis alone. For example, recognition of the channel classes that are of high curvature and close to the water table is feasible. Distance to streams does not help, because some abandoned channels are quite close to active ones. A second group of features that were grouped in the reduced legend were morphologically compound classes such as point bar complexes that consist of an array of smaller channels, levees and smaller elevations. These are inherently hard to classify, which is similar to the problem of automatic classification of urban areas as a land cover class, often consisting of mixed features.
For the whole-API training set, we assigned the photo-interpretation areas for the eliminated classes to the geomorphologically most similar class of the reduced legend, which turned out to always be adjacent to the merged class in geographical space. We did not consider the confusion in the first classification, but rather reduced the legend on these geomorphological criteria. The reclassifications were: Pl112 (levee) to Pl113 (abandoned point bar complex on floodplain); Pl115 (cut-off channels), Pl121 (point bar complex on floodplain), and Pl122 (active channel banks) to Pl111 (floodplain); Pl213 (elevations on low terrace) to Pl211 (tread of low terrace); and Pl221 (abandoned point bar complex on low terrace) to Pl211. Thus, Pl113 in the reduced legend groups those units in the floodplain that have less active flooding than Pl111. Thus we gave up the attempt to map some details of the floodplain, namely levees, cut-off channels, active channel banks, and coarse-textured point bar complexes; and also for the low terrace, namely abandoned point bar complexes and small elevations. Some classes had to be merged with others of different lithology (here, dominant sediment size).
We then repeated the classification with the whole-API training set, resulting in overall accuracies relative to the API of 63.4% (whole area), 65.8% (plain), and 58.2% (hill land), that is, an improvement of 26.7% in the plain and 18.1% overall; the results for the hill land were not affected. These results show that whole-API set is unlikely to produce satisfactory results, due to the unavoidable heterogeneity within an API unit and consequent overlap in feature space. However, they provide the basis for manual improvement. The maximum-likelihood classifier with the point-sampling method was quite sensitive to the training set, so that the classification could be improved considerably by iterative selection, classification, and evaluation of results. This effect was most pronounced in the plain, because of the clustering of classes in feature space, as illustrated by Fig. 4. In both landscapes, the best results were achieved when the training sets were selected using the central concept and the landform classes used were defined as morphologically more or less homogenous units.
After three iterations we were able to achieve high reproducibility for the point sample itself: 90.2% (Kappa = 89.3%) (Table 6). Thus we were able to reproduce the classification of the central concept of each landform class. However, agreement with the whole-API set was only improved to 55.8% (hill land), 55.4% (plain), and 53.6% (entire area) using this iteratively selected point sample. This shows that the API polygons are indeed heterogeneous, and their internal variability is best represented by all pixels in the map unit. On the other hand, to identify fine detail in the landscape, point-samples are preferred. The overclassification of some classes (e.g., Pl313) was probably because of under-interpretation in the original API. These are well-defined elevations but difficult to see stereoscopically, because of the low relative elevation difference. In this sense, the automatic classification is more in accordance with reality than the reference API.
View this table:
[in this window]
[in a new window]
|
Table 6. Reproducibility of point training samples by the maximum-likelihood classification for the whole area, after legend simplification.
|
|
An interesting question with a hierarchical legend is to what degree are the higher levels operational. In this case, to what degree are the misclassifications at detailed level within the same higher-level category. At the highest level (landscape), the automatic classifier using selected points and a reduced legend was quite good. Almost all pixels of the training sample (98.6%) were classified in the correct landscape. The only significant errors were areas of Pl311 (tread of high terrace) and Pl411 (old floodplain) misclassified as Hi312 (vale bottom). At the second level (relief type), overall accuracy was 72.5%, which can be compared with 53.6% at the landform (detailed) level, as explained above. This shows that the hierarchical legend of landforms (Table 1) provides useful information as classes are grouped.
Extrapolation to the Entire Study Area
The final classification map produced by iteratively selected point samples, reduced legend and maximum-likelihood classification is shown in Fig. 5
. This gives a more mosaic-like pattern than the units produced manually through API, since these are already generalized and smooth because of cartographic considerations of scale and consequent minimum delineation size and width. Incongruous boundaries can be explained by the lack of detailed contours both in areas of low relief and in areas with complex relief over a short distance in the hill land, both of which can be recognized on the aerial photos. In the plain, the automatic classification found details in small channels and ridges that the photo-interpreter had generalized or missed due to the low impression of relief.

View larger version (53K):
[in this window]
[in a new window]
|
Fig. 5. Location of the training areas and all aerial photos (indicated with crosses) taken to cover the whole area (above); final map produced through supervised classification of landforms (below). The legend was reduced to 15 classes.
|
|
In general, the visual agreement with a conventional soil-landscape map is strong. Especially for relatively homogeneous landforms that cover relatively large areas, such as terrace treads, channels, vale slope and hill summit, the correspondence is high. In some areas, features recognized through the API were not detected. The separate classification of the hill land was markedly superior to the separate classification of the plain and to the whole-area classification. This is because landscape elements derived from a DEM are more striking and easier to both photo-interpret and automatically classify when there is strong relief.
 |
CONCLUSIONS AND DISCUSSION
|
|---|
The results show that the supervised classification is an objective method to supplement photo-interpretation, especially in surveys where the funds are limited and only a few soil observations are made outside the training areas. To survey all of Baranja with the conventional method of semi-detailed soil survey based on landscape analysis would have required the manual interpretation of the center photos of 84 photo-triplets. In the present study, we used only six photos (6.25% of the total) to map the whole area and therefore largely decreased cost and effort. Overall accuracy of the supervised classification of landforms improved to 63.4% of the training API and 90.2% of the point-sample, once the legend was simplified to eliminate small classes that caused large relative misclassifications. A further step would be to do field sampling of the soils in these units to determine if they are distinct soil-landscape units.
Supervised classification can be applied over entire survey areas, or separately in major landscapes through easily identified master lines, which tend to follow slope breaks or abrupt changes of landscape type. In the current study, the results improved only slightly. However, stratification has the conceptual advantage that the predictive equations correspond better to conventional understanding of soil formation in different environments. Stratification of the area also enables selection of different predictor sets for each landscape. On the other hand, it is more practical to develop a single data set and predictive map of the entire area at once.
Limitations and Ways to Overcome Them
Some photo-interpretation classes were poorly identified by the supervised classification, especially in areas of low relief. The likely reasons for poor performance in the plain are: (i) the limited vertical resolution of the DEM relative to the relief, (ii) large distances between known elevations (contour lines), both of these leading to artifacts in the landform parameterization, (iii) the absence of predictor variables specifically adapted to the plain, other than relative elevation, such as distance to local drainage, and (iv) the presence of landform complexes which occupy too much of feature space. We also discovered that many features in the plain differ on the topo-map (Year 1985) and aerial photo (Year 1998). This is due to the fact that the fluvial processes such as flooding and building of dams and canals change the detailed geomorphology of the area much faster than in the hill land.
Another serious issue is the software needed to create a good DEM. We first attempted to generate a DEM both in ILWIS, by using linear interpolation between contours and spot heights, and the SURFER package (Golden Software, 2001), by using the minimum curvature method (Fogg, 1984) on a point set derived from contours and spot heights. In both cases, clear artifacts were seen in the terrain parameters. Finally, a more realistic DEM was produced using the TOPOGRID command in ArcInfo 8. For the rest of the analysis, we used a low-cost commercial product (ILWIS). With the development of affordable methods, for example, airborne laser altimeters to construct precision elevation models, it will be possible to acquire accurate and timely information even in the areas of low relief (McKenzie and Austin, 1993).
Considering the classification accuracy, the supervised landform classification provides, in general, poorer results than the typical accuracy of land-cover classification. The use of the whole-API training set was only moderately successful, even with a reduced legend; however, it does not require a further step of point-sample selection. On the other hand, point-samples can be reselected iteratively and accuracy improved. The following three steps refinements may be applied, by preference in the order given: (i) refine the training set for classes that are misclassified, using scatter diagrams in feature space; (ii) simplify the legend or eliminate or merge classes; (iii) adjust the number of sample points for misclassified classes; and (iv) consider addition of different predictors or improvement of the quality of existing ones. This procedure is thus seen as a tool for the experienced soil mapper, not a replacement that could be applied by non-specialists. This is in contrarily to the unsupervised classification of landforms as described by Burrough et al. (2000), where the only input needed is the number of classes and fuzzy exponent. In the case of supervised classification, the analyst must still have a good knowledge of soil-landform relations, whether working with traditional or GIS-assisted methods. The analyst must indeed intervene after the initial attempts to classify, to discover which landform units cannot reliably be identified. This causes a new collaboration between the geographers, GIS experts and mappers that seems beneficial to all.
Reproducibility could be further improved with the addition of extra and more detailed information directly related to soil-forming factors, for example, on variation of the ground water table, satellite images showing flooding areas, vegetation indices, and soil moisture as estimated from radar data. For example, to differentiate between abandoned and actively flooded channels, remote sensing images at times of flooding, or images which can be correlated to soil moisture should be used. Also the use of fuzzy classification algorithms will offer better insight into the spatial confusion among classes, that is, uncertainty of classification.
Applicability of Landform Classification for Soil Survey
The supervised landform classification and visualization of terrain parameters as color-composites has two major advantages that can be used to enhance API for Soil Survey. First, the use of terrain parameters provides an objective and cost-effective basis for clustering of landscape facets. The classification results as shown in Fig. 5, prove that the procedure described can be used to extrapolate (generalized) photo-interpretation maps, with the only requirement, in addition to the aerial photos, being a good topographic map to show the contours, map of a drainage networks and geological survey map if possible. The classification also identified soil-landscape units smaller than the minimum legible delineation. If these are not just artifacts of the landform classification, this implies that semi-detailed survey could result in maps detailed enough for site-specific management. The soil mapper can manually interpret the remainder of the area simply by drawing boundaries on the automatic classification; the resulting lines are geometrically correct and geo-referenced. Second potential application of the proposed methods is to edit the soil boundaries of existing soil maps and improve their spatial accuracy within an existing GIS. This can be done by simply overlying the given boundaries over the false color-composites of terrain parameters or the results of a supervised classification and then modifying the boundaries to match changes in the terrain parameters.
In both cases, the mapper must take into account the usual considerations of legibility and delineation size, and also check the soil-landscape relations discovered by the classifier. This is faster than stereo-interpretation, since the automated classifier has already found the general location of the boundaries. The mapping project is thus the result of collaboration between an expert mapper and the GIS, able to find patterns with correct training. Far from eliminating soil surveyors, this process allows them to concentrate their efforts on their area of expertise, for example, identification of soil-landscape relations, and map large areas as efficiently as possible.
 |
ACKNOWLEDGMENTS
|
|---|
We thank Stephan Gruber (Department of Geography, University of Zurich) for developing an algorithm to calculate wetness index in ILWIS; Professors Matko Bogunovi
and Stjepan Husnjak (Soil Science Department, University of Zagreb) who participated in the field soil survey, and Dr. Wouther Siderius (Department of Earth System Analysis, ITC) who helped with the geo-pedological legend and photo-interpretation.
 |
NOTES
|
|---|
Web: http://www.itc.nl/personal/rossiter. Supplementary materials, datasets, additional results and color graphics available online at: http://www.itc.nl/library/Academic_output/2003/.
Received for publication May 18, 2002.
 |
REFERENCES
|
|---|
- Avery, B.W. 1987. Soil survey methods: A review. Soil Survey & Land Resource Centre, Silsoe, UK.
- Bell, J.C., D.F. Grigal, and P.C. Bates. 2000. A soil-terrain model for estimating spatial patterns of soil organic carbon. p. 295310. In J.P. Wilson and J. Gallant (ed.) Terrain analysis: Principles and applications. Wiley & Sons, New York.
- Bognar, A. 1984. An outline of geomorphological characteristics on Baranja. p. 6776. In Annales Universitas Scientarum de Rolando. Eotvos Nominate. Vol. 1819. Budapest.
- Buringh, P. 1960. The application of aerial photographs in soil surveys. Manual of Photographic Interpretation. American Society of Photogrammetry, Washington DC.
- Burrough, P.A., and R.A. McDonnell. 1998. Principles of geographical information systems. Oxford University Press, Oxford.
- Burrough, P.A., P.F.M. van Gaans, and R.A. MacMillan. 2000. High-resolution landform classification using fuzzy k-means. Fuzzy Sets Syst. 113:3752.
- Congalton, R.G., and K. Green. 1999. Assessing the accuracy of remotely sensed data: Principles and practices. Lewis, Boca Raton, FL.
- Eastman, J.R., and M. Fulk. 1993. Long sequence time series evaluation using standardized principal components. Photogramm. Eng. Remote Sens. 59:13071312.
- Environmental Systems Research Institute. 2001. ArcInfo. [Online] http://www.esri.com/software/arcgis/arcinfo/index.html (verified 6 June 2003).
- Fogg, D.A. 1984. Contour to rectangular grid conversion using minimum curvature. CVIP 28:8591.
- Forbes, T.R., D. Rossiter, and A. Van Wambeke. 1982. Guidelines for evaluating the adequacy of soil resource inventories. 1987 printing ed. Cornell University Department of Agronomy, Ithaca, NY.
- Gessler, P.E., I.D. Moore, N.J. McKenzie, and P.J. Ryan. 1995. Soil-landscape modeling and spatial prediction of soil attributes. Int. J. GIS 9:421432.
- Golden Software. 2001. Golden Software: Surfer. [Online] http://www.goldensoftware.com/products/surfer/surfer.shtml (verified 6 June 2003).
- Horn, B.K.P. 1981. Hill shading and the reflectance map. Proceedings IEEE 69:1447.
- Hutchinson, M.F. 1989. A new procedure for gridding elevation and stream line data with automatic removal of spurious pits. J. Hydrology (Amsterdam) 106:211232.
- Irvin, B.J., S.J. Ventura, and B.K. Slater. 1997. Fuzzy and isodata classification of landform elements from digital terrain data in Pleasant Valley, Wisconsin. Geoderma 77:134154.
- Janssen, L.L.F., and G.C. Huurneman. 2001. Principles of Remote Sensing. ITC, Enschede, The Netherlands.
- Jenny, H. 1980. The soil resource: Origin and behavior. Springer-Verlag, New York.
- Lane, S.N., K.S. Richards, and J.H. Chandler. 1998. Landform monitoring, modelling and analysis. John Wiley and Sons, New York.
- Lillesand, T.M., and R.W. Kiefer. 2000. Remote Sensing and Image Interpretation. John Wiley and Sons, New York.
- MacMillan, R.A. 2000. A protocol for preparing digital elevation (DEM) data for input and analysis using the landform segmentation model (LSM) programs, prepared for: the Soil Variability Analysis to Enhance Crop Production (SVAECP) Project. p. 42. LandMapper Environmental Solutions, Edmonton, AB, Canada.
- McKenzie, N.J., and M.P. Austin. 1993. A quantitative Australian approach to medium and small scale surveys based on soil stratigraphy and environmental correlation. Geoderma 57:329355.
- McKenzie, N.J., P.J. Ryan, and J.J. de Gruijter. 1999. Spatial prediction of soil properties using environmental correlation. Geoderma (Pedometrics '97) 89:6794.
- Mitasova, H., J. Hofierka, M. Zlocha, and L.R. Iverson. 1996. Modelling topographic potential for erosion and deposition using GIS. Int. J. GIS 10:629641.
- Moore, I.D., R.B. Grayson, and A.R. Ladson. 1991. Digital terrain modelling: A review of hydrological, geomorphological, and biological applications. Hydrological Processes 5:330.
- Moore, I.D., P.E. Gessler, G.A. Nielsen, and G.A. Peterson. 1993. Soil attribute prediction using terrain analysis. Soil Sci. Soc. Am. J. 57:443452.[Abstract/Free Full Text]
- Odeh, I.O.A., A.B. McBratney, and D.J. Chittleborough. 1994. Spatial prediction of soil properties from landform attributes derived from a digital elevation model. Geoderma 63:197214.[ISI]
- Quinn, P., K. Beven, P. Chevallier, and O. Planchon. 1991. The prediction of hillslope flow paths for distributed hydrological modelling using digital terrain models. Hydrological Processes 5:5979.[ISI]
- Rossiter, D.G., and T. Hengl. 2002. Technical note: Creating geometrically-correct photo-interpretations, photomosaics, and base maps for a project GIS. ITC, Enschede, the Netherlands, [Online] http://www.itc.nl/~rossiter/teach/lecnotes.html (verified 6 June 2003).
- Schmidt, J., K. Hennrich, and R. Dikau. 1998. Scales and similarities in runoff processes with respect to geomorphometry. Proceedings of the 3rd International Conference on GeoComputation, University of Bristol, UK.
- Shary, P.A., L.S. Sharaya, and A.V. Mitusov. 2002. Fundamental quantitative methods of land surface analysis. Geoderma 107:132.
- Soil Survey Division Staff. 1993. Soil survey manual. USDA, Washington, DC.
- Thomas, A.L., D. King, E. Dambrine, A. Couturier, and J. Roque. 1999. Predicting soil classes with parameters derived from relief and geologic materials in a sandstone region of the Vosges mountains (Northeastern France). Geoderma 90:291305.
- Thompson, J.A., J.C. Bell, and C.A. Butler. 1997. Quantitative soil-landscape modeling for estimating the areal extent of hydromorphic soils. Soil Sci. Soc. Am. J. 61:971980.[Abstract/Free Full Text]
- Unit Geo Software Development. 2001. ILWIS 3.0 academic user's guide. [Online] Available by ITC, http://www.itc.nl/ilwis/(verified 6 June 2003).
- Weibel, R., and M. Heller. 1991. Digital terrain modelling. p. 269297. In D.J. Maguire et al. (ed.) Geographical information systems: Principles and applications. Longman, London.
- Western, S. 1978. Soil survey contracts & quality control. Clarendon Press, Oxford.
- Wilson, J.P., and J. Gallant. (ed.) 2000. Terrain analysis: Principles and applications. Wiley & Sons, New York.
- Zevenbergen, L.W., and C.R. Thorne. 1987. Quantitative analysis of land surface topography. Earth Surf. Processes Landforms 12:4756.
- Zhu, A.X., B. Hudson, J. Burt, K. Lubich, and D. Simonson. 2001. Soil mapping using GIS, expert knowledge, and fuzzy logic. Soil Sci. Soc. Am. J. 65:14631472.[Abstract/Free Full Text]
- Zhu, A.-X., L.E. Band, B. Dutton, and T. Nimlos. 1996. Automated soil inference under fuzzy logic. Ecolog. Modell. 90:123145.
- Zinck, J.A. 1988. Physiography & Soils. ITC Lecture notes, Enschede, The Netherlands.
- Zinck, J.A., and C.R. Valenzuela. 1990. Soil geographic database: Structure and application examples. ITC Journal 990:270294.