a GIS Lab, 306 Founders Hall, Lincoln Univ., 820 Chestnut St., Jefferson City, MO 65102 USA
b Soil Genesis, School of Natural Resources, Univ. of Missouri-Columbia, 302 ABNR Building, Columbia, MO 65211 USA
hammerr{at}missouri.edu
| ABSTRACT |
|---|
|
|
|---|
| INTRODUCTION |
|---|
|
|
|---|
Soils are grouped in field mapping primarily by identifying landscape attributes which are believed to contain similar soils (Hudson, 1992). "Landscape position" can be considered to be a geographic approach to classification, in which combinations of surficial and stratigraphic attributes are used to identify populations of related soil individuals within a landscape (Ruhe, 1956). Thus, to the extent that taxonomic classes contain soil attributes that are correlated with landforms and/or other identifiable surficial attributes within a particular soil landscape, Soil Taxonomy is a useful mapping guide. The high taxonomic variability reported for many soil map units (e.g., Powell and Springer, 1965; Wilding et al., 1965; Edmonds et al., 1985; Young et al., 1998) indicates that some map units contain numerous landforms, or that taxonomic groups do not coincide with landform groups, or both. The relevant question then becomes, "Do taxonomic classes represent the relationships of soil attributes with terrain attributes across the areal extent of the soils?"
Cluster analysis is a general term for a family of statistical classification methods that group objects. The idea is statistically to minimize within-group variability while maximizing among-group variability in order to produce relatively homogeneous groups that are distinct from one another. Statistical groupings are unique to the data used (and to the data collection methods), and are statistically referenced to multivariate group centroids which have distinct group boundaries that are defined by means and variances in multidimensional space. This numerical approach would seem to be conceptually well suited to the methods and objectives of mapping soils in the field. Bidwell and Hole (1964a) were among the first to recognize the potential of numerical methods, including ordination, to refine soil classification. Rayner (1966) suggested that cluster analysis be used to "pick groups of profiles ... to act as standards with which to compare other profiles." Arkley (1971, 1976) advocated cluster analysis as an objective means of developing a soil classification, and illustrated his philosophy with a cluster analysis of a diverse soil dataset (Arkley, 1976). Virtually all applications of cluster analysis in soil science literature have been "hierarchical agglomerative" methods, as defined by Aldenderfer and Blashfield (1984). Webster and Oliver (1990) have illustrated cluster analysis techniques using soil data. Several authors (Moore and Russell, 1967; Campbell et al., 1970) have considered and compared various clustering methodologies, including different multivariate space distance measures.
Cluster analysis has not been used extensively to create new soil classification criteria, although it has been used to develop land-based ecological classification schemes. Lentz and Simonson (1987) used cluster analysis to identify six vegetation classes, then examined the soils for differences among classes. Radloff and Betters (1978) used soil variables as part of a cluster analysis to develop a wildland classification in which soils was one component. Russell and Moore (1967) developed a classification for deep, sandy soils in a region of southeastern Australia, and found differences among soils which were not previously apparent. Russell and Moore's report did not compare their numerical classification with a standard soil taxonomic classification. Bottner et al. (1975) used cluster analysis to develop a classification scheme along a "bioclimatic sequence," and related the results to the French soil taxonomy.
Cluster analysis has been used to develop conceptual schemes for grouping soils. Langohr et al. (1976) used the similarities among particle-size distributions to cluster soils, showing that the cluster classes approximated existing series. Cluster analysis and other multivariate techniques were used to determine that hydrology was the primary determinant of variability in Gloucestershire, England (Norris, 1972). Campbell et al. (1970) used cluster analysis to confirm that the textural profile of the investigated soil was an important diagnostic property.
When cluster analysis and numerical taxonomy have been used to evaluate previously defined soil taxonomic classes, the comparisons generally have been with higher taxonomic categories. Cipra et al. (1970) applied cluster analysis to soils in nine taxonomic orders, most of which coincided with cluster groups. Vertisols were not clustered. Arkley (1971) found generally good agreement between cluster groups and taxonomic classes in a "worldwide" soil dataset. Bidwell and Hole (1964b) compared soil groups from the 1938 soil classification with clustered classes of Kansas soils, with generally good agreement. Grigal and Arneman (1969) applied cluster analysis to Minnesota soils, noting that cluster classes and taxonomic classes were not exactly coincident because of differences in criteria. Agreement between numeric and non-numeric classification was enhanced when comparisons were by textural classes rather than the more subjective "diagnostic horizons." Adams et al. (1992) found that classes formed by cluster analysis were roughly equivalent to Soil Taxonomy classes in their abilities to differentiate soil properties important in surface water acidification. Webster and Burrough (1972) note that as the number of cluster classes increased, the soil map units became increasingly fractionated. Limitations exist with many of these investigations. Statistical methodologies have increased with improved computer technology. Statistical software programs have become more rigorous and powerful. Many of the early applications of numerical analyses were performed on data collected for other purposes or from a variety of sources and with varying sampling and laboratory techniques.
Edmonds et al. (1985) used cluster analysis of data from three soil map units in Virginia to show that taxonomic classes were not related to natural distributions of soil attributes. We believe this is a particularly important application. It suggests that at a finer sampling scale, and at the family and series taxonomic levels, the relevance of taxonomic classes to natural distributions is diminished.
The soil-landscape paradigm (Hudson, 1992) is a powerful guide for field map unit delineation within the NCSS. The paradigm's underlying hypothesis, that useful relationships exist among landscape attributes and soil classification, follows concepts articulated by Jenny (1941), and is illustrated for specific landscapes by block diagrams within most NCSS reports. The relevance of the factor concepts (Jenny, 1941) to site-specific information and to modern taxonomy is somewhat problematic because of scale-dependent variability. As Wilding (1994) observed, Jenny's model was functionally restricted by "geographic limits and analytical databases." The factor model was predicated upon data from Pleistocene landscapes and upon the "soil maturity" concept. Thus, the classic "factor" approach would seem to be more appropriate for regional-scale questions than for catena- or site-specific variance. Other than recent work by Gessler (1996), we are unfamiliar with any rigorous examination of soil-geomorphic relationships via statistical methods using large sample sizes from a small area. Relationships among soil classes by landscape position and soil classes formed by cluster analysis have not been examined.
We are interested in field-scale soil variability (over distances of approximately 5-1000 m), and how that variability can be partitioned into smaller, more homogeneous areas for land management purposes. We are concerned with the apparent limitations of Soil Taxonomy to achieve this, and are interested in statistical clustering methods as aids to examining and elucidating logical soil groupings. Our hypotheses are that, at a field scale, (i) classification by Soil Taxonomy creates classes that are not geographically distinct and are only partially related to landform, and (ii) numerical classification produces more homogeneous classes. Our objectives are to: (i) classify soils by Soil Taxonomy and cluster analysis, (ii) identify the degree to which numerically identified samples are distinct and are related to landforms, and (iii) compare the geographic distributions of soil classes identified by Soil Taxonomy and cluster analyses, including the relationships of classes to landforms.
| Methods |
|---|
|
|
|---|
|
Soils formed in loess and underlying Pre-Illinoian glacial till. Loess thickness varies from about 1 m on lower backslopes to more than 2 m on finger ridges. Hillslope sediments of variable thickness and composition are between the loess and the till. Most of the site has been mapped in the "in-progress" soil survey as Arisburg (fine, montmorillonitic, mesic Aquertic Argiudolls), with inclusions of Armstrong (fine, montmorillonitic, mesic Aquollic Hapludalfs).
Sampling
Intersecting transects were placed to traverse landforms both parallel and normal to slope gradients (Fig. 1). Transect placement and sampling intervals along transects were determined subjectively in an effort to capture the full range of soil variability within landforms (Young et al., 1991). Transects were straight lines, inflected where necessary to conform to landforms. Sampling intervals along transects were 15 m, except for ridges, which occupied relatively small proportions of the watershed area. Sampling density was increased to obtain sampling populations nearly equal to backslopes. Ridges were sampled at 7.5-m intervals from multiple, parallel transects 7.5 m apart. A total of 257 pedons was sampled from the entire watershed. All samples were assumed to be independent.
A 5-cm diam. tube hydraulic soil probe (Giddings Machine Company Inc., Ft. Collins, CO) was used to sample to depths of about 1.2 m, which is the length of the sampling tube.
Cores were subdivided for description and laboratory analysis as follows:
| Horizon | Depth increment |
|---|---|
| A-1 | variable; 20-cm maximum |
| A-2 | variable; the rest of the A horizon |
| B-1 | upper 15 cm of argillic horizon |
| B-2 | next 15-cm increment |
| B-3 | next 20-cm increment |
| B-4 | next 20-cm increment |
The A-1 horizon is roughly equivalent to the Ap horizon. The A-2 horizon variables were not used because approximately 60 pedons did not contain them. Color and structure were helpful, but were not diagnostic in separating the lower A from the upper B horizons. Clay films in upper B horizons distinguished them from lower A horizons. Uniform sampling increments were used in the B horizons to compare more accurately soil properties with depth among pedons. Changes within argillic horizons were gradual, and arbitrary boundaries did not mix dissimilar materials. All B horizons were either "Bt" or "Btg."
Analyses and Variables
All pedons were subjectively classified in the field as being either within a "ridge" or "backslope." Pedons on backslopes were further subjectively classified as being either convex, plane or concave both in plan and profile orientation, and as being on an upper, mid, lower or foot slope position.
Morphological observations were made with standard methods (Soil Survey Division Staff, 1993), except for the "iron depletion" and "iron-manganese stains and concretions" codes. A "none" category was added to each, and a "gray matrix" category was added to the "iron depletion" coding.
After morphological observations, all samples were air-dried and ground to pass a 2-mm sieve for the following analyses: (i) Particle size: clay, two silt fractions (coarse and fine), two sand categories (very fine sand, and fine sands and coarser); (ii) Organic carbon; (iii) NH4OAc-extractable bases (Ca, Mg, K, Na); (iv) KCl-extractable Al, on samples with pH < 5.6; (v) CEC by NH4OAc; and (vi) pH (water).
All analyses were by the University of Missouri Soil Characterization Laboratory except particle size. Soil pH was measured in a 1:1 soil solution suspension using an Orion Digital Ionanalyzer/501 pH meter (Orion Research, Inc., Beverly, MA) with a combination electrode. Total soil carbon was determined with a Leco CR 12 Carbon Analyzer (Leco Corp., St. Joseph, MI). Particle size was by the modified pipette method (Indorante et al., 1990).
Eight of the variables considered in this study are pedon specific (i.e., are applicable to the pedon as a whole), whereas others are horizon specific (Table 1). Color variables (in Munsell units), iron depletions and iron-manganese abundance were recorded for B horizons only. A total of 94 variables was available for use.
Statistical Analysis
The more familiar hierarchical agglomerative clustering methods develop a similarity matrix (e.g., a correlation matrix) for all cases, allowing for the creation of a dendrogram that shows the relationships among all cases (i.e., pedons). The resulting dendrogram can be viewed as an indicator of distance measures among centroids in multivariate space. If the number of cases is large (e.g., over 100), the similarity matrix becomes cumbersome and impractical for use in defining optimal clusters.
For this reason, an "iterative partitioning" method was used ("k-means" in SYSTAT [Wilkinson, 1992]), which Aldenderfer and Blashfield (1984) described as (p. 43):
1. Begin with an initial, random partition of the data set into some specified number of clusters; compute the centroids of these clusters.
2. Allocate each data point to the cluster that has the nearest centroid.
3. Compute the new centroids of the clusters; clusters are not updated until there has been a complete pass through the data.
4. Alternate Steps 2 and 3 until no data points change clusters.
Iterative partitioning does not develop a dendrogram because all pairwise relationships are not evaluated.
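The four quoted steps can be illustrated with a minimal sketch. It is not the SYSTAT routine used in this study: the function name `kmeans`, the squared Euclidean distance, the round-robin initial partition, the iteration cap, and the handling of emptied clusters are all assumptions made for illustration.

```python
import random


def kmeans(points, k, seed=0, max_iter=100):
    """Iterative partitioning (k-means) following the four quoted steps.

    points: list of equal-length lists of floats (one list per case).
    Returns (labels, centroids). Distance is squared Euclidean (an assumption).
    """
    rng = random.Random(seed)
    n = len(points)

    # Step 1: initial random partition. Shuffled cases are dealt round-robin
    # so that no cluster starts empty (assumes n >= k).
    order = list(range(n))
    rng.shuffle(order)
    labels = [0] * n
    for rank, idx in enumerate(order):
        labels[idx] = rank % k

    def centroids_of(current_labels, previous=None):
        cents = []
        for c in range(k):
            members = [points[i] for i in range(n) if current_labels[i] == c]
            if members:
                cents.append([sum(col) / len(members) for col in zip(*members)])
            else:
                # Sketch-level choice: an emptied cluster keeps its old centroid.
                cents.append(previous[c])
        return cents

    centroids = centroids_of(labels)

    for _ in range(max_iter):
        # Step 2: allocate each case to the cluster with the nearest centroid.
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Step 4: stop when a complete pass changes no memberships.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: centroids are recomputed only after the complete pass.
        centroids = centroids_of(labels, centroids)

    return labels, centroids
```

Because the starting partition is random, different seeds can yield different final groupings on real data, which is one reason to inspect the resulting groups rather than accept a single run.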
All variables were standardized to zero mean and unit variance in SYSTAT prior to cluster analysis, thus allowing comparison of variables measured on different numerical scales.
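A minimal sketch of this standardization follows. The function name `standardize` and the example values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption; the divisor used by SYSTAT is not stated here.

```python
from statistics import mean, stdev


def standardize(rows):
    """Rescale each column (variable) to zero mean and unit variance.

    rows: list of cases, each a list of numeric values in the same column order.
    Uses the sample standard deviation (n - 1 denominator), an assumption.
    """
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]
    return [
        # A constant column has zero spread; its z-scores are set to 0.0.
        [(x - m) / s if s else 0.0 for x, m, s in zip(row, means, sds)]
        for row in rows
    ]


# Hypothetical values: clay (%) and pH for three cases.
example = [[20.0, 5.0], [30.0, 6.0], [40.0, 7.0]]
z_scores = standardize(example)
```

After rescaling, a one-unit difference in any variable corresponds to one standard deviation of that variable, so clay percentage and pH contribute to centroid distances on a comparable footing.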
Cluster analysis will create groups with statistically significant differences, even from evenly distributed data (Radloff and Betters, 1978; Aldenderfer and Blashfield, 1984). Group separation and validity are best evaluated by visual reference to group "profiles" (Aldenderfer and Blashfield, 1984), which are line charts of the means for each variable, by cluster group.
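The numbers behind such group profiles can be sketched as follows, under the assumption that each group mean is expressed in standard deviations from the overall sample mean. The function name `group_profiles` and the example values are hypothetical, and plotting is left to whatever charting tool is at hand.

```python
from statistics import mean, stdev


def group_profiles(rows, labels):
    """Mean of every variable within each cluster group, expressed in
    standard deviations from the overall sample mean.

    rows: list of cases (lists of floats); labels: cluster label per case.
    Returns {label: [standardized group mean for each variable]}.
    """
    columns = list(zip(*rows))
    overall_mean = [mean(col) for col in columns]
    overall_sd = [stdev(col) for col in columns]  # sample SD, an assumption

    profiles = {}
    for group in sorted(set(labels)):
        members = [row for row, lab in zip(rows, labels) if lab == group]
        group_mean = [mean(col) for col in zip(*members)]
        profiles[group] = [
            (gm - m) / s
            for gm, m, s in zip(group_mean, overall_mean, overall_sd)
        ]
    return profiles


# Hypothetical values: clay (%) and pH for four cases in two groups.
rows = [[10.0, 7.0], [20.0, 6.0], [30.0, 5.0], [40.0, 4.0]]
profiles = group_profiles(rows, [1, 1, 2, 2])
```

Each list in `profiles` is one line on the profile chart; variables on which the lines lie far apart are the ones that separate the groups.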
Classes formed by cluster analysis were compared with classes formed by overall geomorphic position (ridge vs. backslope), by position within the backslope (position along the slope gradient, profile curvature, and plan curvature), and by Soil Taxonomy, by the chi-squared test of association in SYSTAT. Some cell frequencies were sparse, so the tests comparing taxonomic and cluster classes were suspect. Pairwise statistical comparisons were not made. Association between cluster groups and taxonomic classes must be very strong to be meaningful; i.e., unless a taxonomic class is consistently associated with a given cluster group, the validity of the relationship is suspect. Two-, three-, and four-group cluster classifications were evaluated by mapping group locations (as in Fig. 1) and by examining cluster group mean profiles (as in Fig. 2a and b), for each classification.
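The chi-squared test of association can be sketched for a generic contingency table as below. The function name `chi_squared` and the counts in the example are hypothetical and are not the frequencies observed in this study. The sketch returns the statistic, the degrees of freedom, and the expected frequencies, so that sparse cells can be flagged; the threshold of 5 is a common rule of thumb, not a criterion stated in this study.

```python
def chi_squared(table):
    """Pearson chi-squared statistic for a two-way contingency table.

    table: list of rows of observed counts (e.g., cluster group by landform).
    Returns (statistic, degrees_of_freedom, expected_counts).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    # Expected count under independence: row total * column total / grand total.
    expected = [[rt * ct / total for ct in col_totals] for rt in row_totals]

    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


# Hypothetical counts: two cluster groups (rows) by ridge and backslope (columns).
observed = [[30, 10], [10, 30]]
statistic, dof, expected = chi_squared(observed)

# Cells with small expected counts make the test unreliable.
sparse_cells = [e for row in expected for e in row if e < 5]
```

The statistic is compared with the chi-squared distribution for `dof` degrees of freedom; for one degree of freedom the 0.05 critical value is 3.841.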
| Results |
|---|
|
|
|---|
Group 1 is the largest group, and is distinguished primarily by the overall mean clay content, pH, and base cation content (Fig. 2a). Group 2 pedons have less clay in most horizons and are more acidic, with lower base saturation and fewer base cations, while Group 3 soils are more clayey, with higher pH and base saturation and more exchangeable base cations. Figure 2 shows group "profiles" for the three-group clustering. The means of each group on each soil property are plotted as standard deviations from the overall sample means.
Group 1 soils contain the highest concentrations of organic carbon and silt (Fig. 2b). Group 2 pedons contain the lowest organic carbon concentrations in the upper profile, have the brightest chromas in the argillic horizon, and have less total silt but more coarse silt and very fine sand. Group 3 soils have duller chromas and more iron depletions than the other groups. Other patterns are apparent in Fig. 2. The "color difference" variables in general do not provide good separation among groups. We suspect that the ineffectiveness of the color attributes was because the landscape is relatively young, and dissection has not developed sufficiently to have created pronounced soil hydrologic differences within the portion of the landscape we sampled.
Cluster Groups and Geomorphic Position
Pedons classified into Group 1 are more likely to occur on the ridge than might occur randomly (Fig. 3). Conversely, occurrences of Groups 2 and 3 are low on the ridge and high on backslopes. However, 38 of the 171 pedons in Group 1 were located on backslopes.
or plan surface curvature. Profile curvature classes and cluster groups are significantly associated. Cluster Group 1 pedons tend to be on plane surfaces rather than convex or concave surfaces (Fig. 4). Group 2 pedons most commonly are on convex surfaces, but some are on concave surfaces. Most Group 3 pedons are on concave rather than convex surfaces, but a few are on plane surfaces. Slope curvature is not a reliable predictor of specific soil attributes even though relationships among cluster groups and profile surface curvatures are statistically significant.
Cluster Groups and Taxonomic Classification
Only taxonomic classes with five or more pedons were compared with cluster groups. Statistically significant associations exist among cluster groups and taxonomic classes (Fig. 6). However, these relationships are not adequate for accurate prediction at specific locations. Cluster membership does not accurately predict taxonomic class, or vice versa.
| Discussion |
|---|
|
|
|---|
This result should not be used to imply that landform elements will not be strongly correlated with measurable soil physical and chemical attributes in other landscapes. As previously stated, relief was minimal in this watershed, and loess blanketed most of the watershed. More strongly dissected landscapes or regions with different parent materials and depositional environments should be examined to test the limits of the observations of this study.
The three-group clustering placed nearly all ridge pedons into a single group, but members of that group occur on backslopes as well. This is a young, moderately dissected landscape on which the upper parts of sola have formed in loess, a parent material which is texturally homogeneous over short distances (Daniels and Hammer, 1992). As previously mentioned, strong pedological distinctions have not developed among ridge and backslope soils. However, some locations on the backslope are statistically, but subtly, distinct. The differences generally are related to silt and clay concentrations, redoximorphic features and relative distributions of basic cations, and usually are associated with concave and lower positions. These differences are not readily apparent in the absence of statistical analyses, and illustrate the utility of objective statistical analyses of large systematically collected, multivariate data sets.
Cluster analysis revealed important pedological relationships that were not apparent when pedons were classified by landform alone. Three distinct sample clusters of backslope pedons were identified by cluster analysis, with the ridge population indistinguishable from one of the backslope populations. Young and Hammer (2000) examined these data for differences among ridge, shoulder and backslope positions for a number of soil properties. However, grouping all backslope pedons together combined unlike pedons, and diluted differences among groups.
When differences are detected among groups, the assumption (perhaps unwarranted) can be made that among-group variability exceeds within-group variability because homogeneity exists within groups. Cluster analysis is a useful method for investigating this assumption.
Pedologically, the relationships among groups and profile surface curvature are sensible. Concave sites are associated with Group 3 soils, in which lessivage and gleization (Table 2) are strongly expressed, ferrugination is weakly expressed, and melanization is strong in the A1 horizon but decreases rapidly with depth. These are areas in which water periodically concentrates. The increased periodicity of wetting and drying increases the argillic horizon clay content, enhances movement and accumulation of cations, and increases gleying above the argillic horizon. The relatively thin A horizon may be the result of post-settlement erosion, which was probably extensive on these sites. Lessivage and melanization have been weak in Group 2 soils on convex positions. This is consistent with a surface water and hillslope sediment pattern of "runoff" rather than "runon." Leaching is intense, indicating throughflow has effectively removed base cations.
Relationships among taxonomic and cluster classes were significant but not strong. The hierarchical methodology of Soil Taxonomy created more classes than are necessary to describe soil patterns on this landscape. Some taxonomic classes had no relationship to distributional patterns of soil attributes. Cluster classes generally are distinguishable, are geographically separable, are based on differences among many properties, and are bound-from-within to a multivariate centroid that was determined from the data distribution. Thus, the defining attributes are based upon the mapped soils and are not an artificial construct. The soil taxonomic classes into which these soils were classified are less geographically separable. Some classes are distinguished by slight differences in a single property. The taxonomic classes are "bound-from-without" by divisions inappropriate to existing soil patterns rather than "bound-from-within" by process-controlled soil attributes. The term "bound-from-without" is used to convey the idea that the divisions in Soil Taxonomy were not relevant to the distributions of soil properties in this landscape. Many of the taxonomic divisions were in the family control section, and most of those were in the particle size classes.
| Conclusions |
|---|
|
|
|---|
Indorante et al. (1996) proposed that digital elevation models (DEM) and allied technologies could be used to retain "expert knowledge" acquired while making a soil survey, but not captured in the published soil survey. They also discussed ways to interpret and use soil data acquired for purposes other than soil survey, such as precision agriculture and septic field evaluation. Cluster analysis and related statistical methodologies show much promise as methods to evaluate relationships among soil and landscape attributes for a variety of interpretive and classification uses. These numerical methodologies should be strongly considered as components in organized, systematic attempts to refine and modify soil maps, soil interpretations and humans' understanding of soil-forming processes after the "once over" soil survey has been completed.
Scale-dependent variability continues to challenge those who study, map and interpret the soil resource. Cluster analysis has revealed the limitations of both landform and Soil Taxonomy in detecting meaningful, mappable soil bodies in this study area. High-resolution digital elevation models may provide an additional method to help quantitatively detect geomorphic surfaces and hydrologic differences, and could improve the predictive power of the landform paradigm (McSweeney et al., 1994). However, DEM-based technologies also are scale-dependent (Hammer et al., 1995), and their applications will require recognition and accommodation of the scale-dependent limitations. We recognize that Soil Taxonomy is not intended to be a mapping template. Furthermore, we are not suggesting that statistical methods replace the soil-landform paradigm (Hudson, 1992). We do not suggest that a unique, statistically derived classification system be developed for every soil survey area. We propose that cluster analysis, and perhaps other numerical methodologies, can be a useful way to array technologies and methods to identify and quantify soil-landform relationships. Patterns of homogeneity and combinations of distinguishing soil attributes can be more objectively identified through mathematical analyses. Cluster analysis can help define the centroids of the "cartographic series" as first proposed by Knox (1987) and further discussed by Nettleton et al. (1991). As large digital datasets are developed by major land resource area (MLRA) soil survey updates, cluster analysis provides a useful data analysis method for project managers. We envision cluster analysis as one of many statistical methods used in future soil survey activities. Statistical analyses, combined with careful field observation and evaluation by trained and experienced soil scientists, can produce a new generation of more quantitative soil surveys.
| ACKNOWLEDGMENTS |
|---|
| NOTES |
|---|
|
|
|---|
Received for publication December 17, 1996.
| REFERENCES |
|---|
|
|
|---|