Soil Science Society of America Journal 66:1492-1500 (2002)
© 2002 Soil Science Society of America

DIVISION S-1—SOIL PHYSICS

Estimating Soil Properties from Thematic Soil Maps

The Bayesian Maximum Entropy Approach

Patrick Bogaert* and Dimitri D'Or

Dep. of Environmental Sciences and Land Use Planning–Environmetrics. Université catholique de Louvain. Place Croix du Sud 2, bte 16. 1348 Louvain-la-Neuve, Belgium

* Corresponding author (bogaert{at}enge.ucl.ac.be)


    ABSTRACT
Current soil process models require the most accurate values for each of their input parameters at the finest spatial scale. Traditionally, soil property values are obtained either from soil maps or from geostatistical methods using exact laboratory measurements. Both data types convey substantial information: soil maps provide exhaustive but soft (vague) information, whereas laboratory analyses provide hard (accurate) but scarce measurements. Ideally, they should be combined. This objective can be reached using a recently developed method, the Bayesian maximum entropy (BME) approach, which allows the user to incorporate hard and soft data in a spatial estimation context. In this work, both the regular BME algorithm and a new variant of it using a Monte Carlo procedure (BME/MC) are proposed for obtaining an estimated map of the textural (sand, silt, and clay) fractions from a limited number of accurate measurements and a spatially exhaustive soil map. Compared with popular geostatistical methods like ordinary kriging (OK), this approach has the advantage of using soft information on a sound theoretical basis. The entire probability distribution function can be estimated at each estimation location, allowing the computation of confidence intervals, probabilities of exceeding a threshold, etc. Using expectation properties in a Monte Carlo procedure, the BME/MC algorithm additionally takes into account the fundamental constraints on the textural fractions (they sum to one and belong to the [0, 1] interval). As illustrated with a real data set from Belgium, using BME results in much more accurate estimates of the textural fractions and more realistic maps than those obtained with regular geostatistical algorithms.

Abbreviations: BME, Bayesian maximum entropy • BME/MC, Monte Carlo procedure • cdf, cumulative distribution function • CER, classification error rate • IK, indicator kriging • ME, mean error • OK, ordinary kriging • pdf, probability density function • RM, regional mean • RMSE, root mean square error


    INTRODUCTION
In many countries, substantial effort has been devoted to the making of soil maps. The objective was to give easily interpretable and spatially exhaustive information (mainly for land use and land evaluation) about some soil properties such as the sand, silt, and clay contents. To obtain the desired accuracy at reasonable sampling costs, a compromise was to combine a few exact measurements with numerous but moderately accurate measurements, for example, by relying on the field expertise of the soil scientist. Soil maps thus contain a substantial amount of inherited information.

Recently, an increasing number of environmental models have been developed for various purposes such as the evaluation of the potential leaching of pollutants (Addiscott and Whitmore, 1991), soil erosion sensitivity assessment (Morgan et al., 1998), or water retention capacity evaluation (Rawls et al., 1991). Most of these models cannot easily process soft (vague) knowledge, but rather make use of hard (exact) values.

For the sake of legibility, the information contained in soil maps is frequently ordered in classes, and is thus rather imprecise. When soil maps are used for prediction purposes, a frequent approach is to quantify the legend (Morse and Thornburn, 1961; Webster and Beckett, 1970; Van Kuilenberg et al., 1982). Using soil maps in this way is hazardous, however, since only one representative value is often considered for each mapping unit (i.e., the same representative value is attributed to all the points in the same mapping unit and the intramapping unit variability is completely hidden). As a result, estimated maps exhibit abrupt changes from one mapping unit to another (Voltz and Webster, 1990). The major question is: can we use these moderately informative soil maps to get more accurate information about the underlying continuous variables? The challenge is to find a method combining various data types to get a unique estimate of a property at a given spatial location.

Beyond the traditional indicator approach (Oberthur et al., 1999), other research studies attempted to simultaneously use the soft information provided by the soil map and the exact measurements. Some of them suggest stratification before estimation (Stein et al., 1988), others proceed by heuristic weighted average (Heuvelink and Bierkens, 1992) or introduce properties of soil map delineations into OK (Boucneau et al., 1998). Besides geostatistics, Martin-Clouaire et al. (2000) propose to combine a raster geographical information system (GIS) with a constraint satisfaction solver. But all these methods suffer from at least one of the following three limitations: (i) degradation of the soft information provided by the soil map (e.g., intervals of values provided by the texture triangle are reduced to indicators or, most of the time, only one value is chosen to represent the texture class); (ii) computational problems because of the lack of a sufficient number of exact measurements (e.g., for variogram estimation in the case of stratification); and (iii) lack of a sound theoretical basis.

To overcome these issues and open new perspectives for the analysis of spatiotemporal processes, a new method called Bayesian maximum entropy (BME) was introduced by Christakos (Christakos, 1990, 1991, 1998). This approach allows the user to incorporate a wide variety of data sources of various quality on a sound theoretical basis and in a computationally efficient framework. These data sources may come in various forms, like intervals of values, probability density functions (pdf), or even physical laws (Christakos, 2000).

For soil texture mapping, several specific constraints have to be taken into account. If texture is specified by sand, silt, and clay fractions, these fractions must belong to the [0,1] interval and they must sum to 1. Moreover, these constraints induce interdependence between the texture fractions, so that each texture class is not simply defined by an independent set of intervals of values, but instead as a complex domain of values for the three fractions. To take these constraints into account explicitly, a variant of BME (BME/MC) was developed. It uses expectation properties for computing the posterior pdf at each estimation point using a Monte Carlo integration procedure.


    THEORY
The BME Formalism: Combining Hard Values and Interval Information
In this section, we briefly present the general principle of the BME formalism. More detailed discussions about the underlying theory, the advantages of the method and the processing of other data types can be found in Christakos (2000). As a convention, Latin lowercase letters ($x$) will denote random variables and Greek letters ($\chi$) their realizations. Bold characters will denote a vector of variables. The BME approach can be presented as a two-stage procedure, namely the prior and the posterior stages.

The Prior Stage: Incorporating General Knowledge
Consider the vector of variables $\mathbf{x}_{map} = (\mathbf{x}_{hard}, \mathbf{x}_{soft}, x_k)$, where $\mathbf{x}_{hard}$, $\mathbf{x}_{soft}$, and $x_k$ respectively denote the values at the hard data points, the values at the soft data points, and the unknown value at the estimation point.

The objective of this stage is to find the most general a priori probability function $f_G(\boldsymbol{\chi}_{map})$, given the general knowledge $G$. This general knowledge may consist of summary statistics, physical laws, scientific theories, logical principles, etc. These may either be computed from data collected on site or from similar situations encountered elsewhere.

The building of the most general prior distribution is achieved by maximizing the entropy (i.e., minimizing the amount of information added to the general knowledge) under constraint of respecting the prior characteristics coming from the general knowledge G.

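For continuous variables, the entropy $H$ is defined as (Shannon, 1948, p. 628; Papoulis, 1991, p. 559, Eq. [15]–[79]):

$$H(f_G) = -\int f_G(\boldsymbol{\chi}_{map}) \log f_G(\boldsymbol{\chi}_{map}) \, d\boldsymbol{\chi}_{map} \qquad [1]$$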

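Introducing the Lagrange multipliers $\mu_\alpha$, this is equivalent to maximizing the relation:

$$L[f_G] = -\int f_G(\boldsymbol{\chi}_{map}) \log f_G(\boldsymbol{\chi}_{map}) \, d\boldsymbol{\chi}_{map} + \sum_{\alpha=0}^{N} \mu_\alpha \left[ \int g_\alpha(\boldsymbol{\chi}_{map}) f_G(\boldsymbol{\chi}_{map}) \, d\boldsymbol{\chi}_{map} - E[g_\alpha(\mathbf{x}_{map})] \right] \qquad [2]$$
where $g_\alpha(\boldsymbol{\chi}_{map})$, $\alpha = 0, \ldots, N$, is a set of functions of $\boldsymbol{\chi}_{map}$ allowing the incorporation of the general knowledge $G$, with $g_0 = 1$ imposing normalization (for more details see Christakos, 2000, p. 74–76). Setting the partial derivatives to zero and solving the system of equations with respect to the $\mu_\alpha$ yields the maximum entropy solution for the prior pdf (Christakos, 2000):

$$f_G(\boldsymbol{\chi}_{map}) = Z^{-1} \exp\left[ \sum_{\alpha=1}^{N} \mu_\alpha g_\alpha(\boldsymbol{\chi}_{map}) \right] \qquad [3]$$
where $Z$ is the partition function

$$Z = \int \exp\left[ \sum_{\alpha=1}^{N} \mu_\alpha g_\alpha(\boldsymbol{\chi}_{map}) \right] d\boldsymbol{\chi}_{map} \qquad [4]$$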

The prior joint pdf $f_G(\boldsymbol{\chi}_{map}) = f_G(\boldsymbol{\chi}_{hard}, \boldsymbol{\chi}_{soft}, \chi_k)$ is the most general pdf which ensures that all the knowledge available prior to any measurement is taken into account, but no more than this knowledge.

In cases where the first- and second-order moments (mean and covariance or variogram function) constitute the general knowledge, it can be shown that Eq. [3] yields a multivariate Gaussian distribution.
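
The Gaussian result stated above can be sketched as follows. This is an illustrative outline under stated assumptions, not the derivation given by Christakos (2000): the symbols $\mathbf{m}$ (mean vector), $\mathbf{C}$ (covariance matrix), $n$ (dimension of $\mathbf{x}_{map}$), and the multipliers $\mu_i$, $\mu_{ij}$ are notation introduced here. When only means and covariances are imposed, the constraint functions are $\chi_i$ and $\chi_i \chi_j$, so the exponent of the maximum entropy solution is quadratic in $\boldsymbol{\chi}_{map}$:

```latex
f_G(\boldsymbol{\chi}_{map})
  = Z^{-1} \exp\Big[ \sum_{i} \mu_i \chi_i + \sum_{i,j} \mu_{ij} \chi_i \chi_j \Big]
% Matching the imposed mean m and covariance C fixes the multipliers, giving
f_G(\boldsymbol{\chi}_{map})
  = (2\pi)^{-n/2} \, |\mathbf{C}|^{-1/2}
    \exp\Big[ -\tfrac{1}{2} (\boldsymbol{\chi}_{map} - \mathbf{m})^{T}
              \mathbf{C}^{-1} (\boldsymbol{\chi}_{map} - \mathbf{m}) \Big]
```

Any density of the exponential-quadratic form with a finite normalizing constant is Gaussian, which is why no distributional assumption beyond the two moments is needed.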

The Posterior Stage: Incorporating Specific Knowledge
Specific knowledge consists of any measurement collected on site. These data can either be hard (accurate) or soft (vague) measurements. Examples of soft information are intervals of values, probability functions, empirical charts, expert knowledge, etc.

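Consider first that the soft data issued from the available specific knowledge are composed of probability density functions $f_S(\boldsymbol{\chi}_{soft})$ (we will speak of pdf-type soft data). The updated (posterior) pdf at estimation point $p_k$ can be written as (Serre, 1999):

$$f_K(\chi_k) = A^{-1} \int f_G(\boldsymbol{\chi}_{map}) \, f_S(\boldsymbol{\chi}_{soft}) \, d\boldsymbol{\chi}_{soft}, \qquad A = \int f_G(\boldsymbol{\chi}_{hard}, \boldsymbol{\chi}_{soft}) \, f_S(\boldsymbol{\chi}_{soft}) \, d\boldsymbol{\chi}_{soft} \qquad [5]$$
where $A$ is a normalization constant.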

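This is the general BME solution for probabilistic-type soft data. When soft data consist of interval-type information characterized by the interval bounds, the soft pdf can be considered as uniform between these bounds and Eq. [5] becomes the BME solution for interval-type soft data (Serre, 1999):

$$f_K(\chi_k) = A^{-1} \int_{\boldsymbol{\alpha}}^{\boldsymbol{\beta}} f_G(\boldsymbol{\chi}_{map}) \, d\boldsymbol{\chi}_{soft}, \qquad A = \int_{\boldsymbol{\alpha}}^{\boldsymbol{\beta}} f_G(\boldsymbol{\chi}_{hard}, \boldsymbol{\chi}_{soft}) \, d\boldsymbol{\chi}_{soft} \qquad [6]$$
where $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ are the bounds of the intervals.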
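
The interval-type solution can be illustrated on a deliberately small case. The sketch below is not the BMELIB implementation; it assumes a single soft datum known to lie in [a, b], no hard data, and a standard bivariate Gaussian prior with correlation `rho` between the soft location and the estimation point. All function names and parameter values are hypothetical.

```python
import math


def bivariate_normal_pdf(x, y, rho):
    """Assumed prior f_G: standard bivariate Gaussian density with correlation rho."""
    det = 1.0 - rho * rho
    q = (x * x - 2.0 * rho * x * y + y * y) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))


def normal_cdf(x):
    """Standard Gaussian cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def interval_posterior(chi_k, a, b, rho, n=400):
    """Posterior pdf at the estimation point for a soft datum lying in [a, b]."""
    h = (b - a) / n
    # Numerator: integrate the prior over the soft value (midpoint rule).
    num = h * sum(bivariate_normal_pdf(a + (j + 0.5) * h, chi_k, rho) for j in range(n))
    # Normalization: prior probability that the soft value falls in [a, b].
    norm = normal_cdf(b) - normal_cdf(a)
    return num / norm


def posterior_mean(a, b, rho, lo=-8.0, hi=8.0, n=400):
    """Mean of the posterior pdf, integrated numerically over [lo, hi]."""
    h = (hi - lo) / n
    grid = [lo + (j + 0.5) * h for j in range(n)]
    return h * sum(x * interval_posterior(x, a, b, rho) for x in grid)
```

Because the whole posterior pdf is available on a grid, quantities such as confidence intervals or the probability of exceeding a threshold follow from the same numerical integration as `posterior_mean`.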

Improving the Estimation with BME Using Constraints: The BME/MC Formalism
According to the definition of the texture classes on the texture triangle (Fig. 1), an interval can be derived for each fraction by simply taking the lowest and the highest possible values as interval bounds. But in doing so, we implicitly consider that the three fractions are independent. The variation domain would thus be a hyper-rectangle. However, texture fractions are not independent at all since they have to sum to 1. If all the possible combinations of values for the three fractions were represented in a three-dimensional space, they would instead be located on an equilateral triangle (Fig. 2). The situation is made even worse when the actual definitions of the texture classes are considered on this textural triangle, as these definitions may exhibit complex shapes for the domain and depend on the specific classification scheme that is used (Fig. 1). To circumvent these problems, a variant of the BME algorithm is developed.
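
The loss of information caused by per-fraction intervals can be seen on a small example. The sketch below assumes a convex class polygon given by its vertices as (sand, silt, clay) proportions; the vertex values are hypothetical and do not correspond to any class of the Belgian triangle.

```python
def fraction_intervals(vertices):
    """Per-fraction (min, max) bounds of a convex class polygon given by its vertices."""
    return [(min(v[i] for v in vertices), max(v[i] for v in vertices)) for i in range(3)]


# Hypothetical class polygon; every vertex sums to 1 (sand, silt, clay).
vertices = [
    (0.50, 0.40, 0.10),
    (0.70, 0.20, 0.10),
    (0.60, 0.20, 0.20),
    (0.50, 0.30, 0.20),
]

bounds = fraction_intervals(vertices)

# Sums at the lowest and highest corners of the hyper-rectangle built from the
# intervals: both differ from 1, so the rectangle contains triplets that are
# not valid textures.
corner_sums = (sum(lo for lo, _ in bounds), sum(hi for _, hi in bounds))
```

Here the three intervals are [0.5, 0.7], [0.2, 0.4], and [0.1, 0.2], and the corner sums are about 0.8 and 1.3, which shows that treating the intervals as independent admits combinations lying off the textural triangle.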



Fig. 1. Belgian textural triangle (Tavernier and Maréchal, 1958). Each point on the triangle refers to a specific composition of sand (50 µm–2 mm), silt (2–50 µm), and clay (<2 µm).

 


Fig. 2. Textural triangle for the sand, silt, and clay fractions expressed in proportion. Permissible joint values for the sand, silt, and clay fractions must lie on the triangle, so that sand + silt + clay = 1.

 
Assume that a prediction $\chi_{i,0}$ of the values is sought at location $p_0$, where $i \in \{1,2,3\}$ refers to the sand, silt, and clay contents, respectively. Suppose we know a set of hard data $\boldsymbol{\chi}_{hard} = \{\chi_{i,j}\}$ for variables $i = 1,2,3$ at locations $p_j$, $j = 1,\ldots,n_h$, as well as a joint soft pdf $f_S(\boldsymbol{\chi}_{soft})$ (specific knowledge), where $\boldsymbol{\chi}_{soft} = \{\chi_{i,0}\}$ is the soft data for variables $i = 1,2,3$ at location $p_0$. The BME theory provides Eq. [5] for computing the posterior pdf.

However, the analytical expression for the joint soft pdf $f_S(\boldsymbol{\chi}_{soft})$ is not easy to obtain, because of the complex shape of its domain of variation. This problem can be avoided if we use the definition of the expectation with respect to the soft pdf $f_S(\boldsymbol{\chi}_{soft})$:

$$\int f_G(\boldsymbol{\chi}_{map}) \, f_S(\boldsymbol{\chi}_{soft}) \, d\boldsymbol{\chi}_{soft} = E_S[f_G(\boldsymbol{\chi}_{map})] \qquad [7]$$

An estimate for this expectation is then:

$$\hat{E}_S[f_G(\boldsymbol{\chi}_{map})] = \frac{1}{L} \sum_{l=1}^{L} f_G(\boldsymbol{\chi}_{hard}, \boldsymbol{\chi}_{soft}^{[l]}, \chi_k) \qquad [8]$$
where $L$ is the number of random draws, which must be large enough to give stable results (in this paper, $L$ = 50 000). It is now possible to compute Eq. [5] by randomly drawing a large number of $\boldsymbol{\chi}_{soft}^{[l]}$ values from the $f_S(\boldsymbol{\chi}_{soft})$ pdf and computing the expectation using Eq. [8]. This provides an evaluation of the integral in the left-hand side of Eq. [7].
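
The Monte Carlo evaluation of such an expectation can be sketched in one dimension. This is a minimal illustration under stated assumptions, not the BME/MC code: the prior is taken as a standard Gaussian density, the soft pdf as uniform on a hypothetical interval [a, b], and the function names are introduced here. With these choices the exact value of the integral is known, which allows the estimate to be checked.

```python
import math
import random


def normal_pdf(x):
    """Standard Gaussian density, standing in for the prior pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Standard Gaussian cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mc_expectation(func, draw, n_draws, rng):
    """Average of func over n_draws values drawn from the soft pdf."""
    return sum(func(draw(rng)) for _ in range(n_draws)) / n_draws


rng = random.Random(42)   # fixed seed so the run is reproducible
a, b = 0.5, 1.5           # hypothetical bounds of the uniform soft pdf
n_draws = 50000           # same number of draws as reported in the text

estimate = mc_expectation(normal_pdf, lambda r: r.uniform(a, b), n_draws, rng)
# Exact expectation of the Gaussian density under the uniform soft pdf.
exact = (normal_cdf(b) - normal_cdf(a)) / (b - a)
```

The only requirement on the soft pdf is that it can be sampled; no closed-form expression for it is needed, which is the property the text relies on.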

Equation [8] makes it easy to incorporate implicitly a complex definition of the integration domain. For our study, the set of random variables $\{\chi_{1,0}^{[l]}, \chi_{2,0}^{[l]}, \chi_{3,0}^{[l]}\}$ corresponds to the three textural fractions and must respect the constraints

$$\sum_{i=1}^{3} \chi_{i,0}^{[l]} = 1$$

and

$$0 \le \chi_{i,0}^{[l]} \le 1, \qquad i = 1, 2, 3.$$

The variation domain is thus the textural triangle, over which triplets of values are a priori assumed to be uniformly distributed. The only remaining problem is the joint simulation of the $\boldsymbol{\chi}_{soft}^{[l]}$ values over this domain. This can be accomplished through a series of geometrical transformations. For conciseness, these developments are not presented in this paper.
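
Since the geometrical transformations are not given in the paper, the sketch below swaps in a different, standard technique: uniform sampling of the triangle from the spacings of two sorted uniform numbers, followed by rejection sampling to restrict the draws to a texture class. The class rule used here (sand at least 0.5, clay at most 0.2) is hypothetical and is not a class of the Belgian triangle.

```python
import random


def draw_simplex(rng):
    """Draw (sand, silt, clay) uniformly on the triangle sand + silt + clay = 1."""
    u, v = sorted((rng.random(), rng.random()))
    # The three spacings of two sorted uniforms are uniform on the simplex.
    return (u, v - u, 1.0 - v)


def draw_in_class(in_class, rng, max_tries=100000):
    """Rejection sampling: keep the first uniform draw that falls in the class domain."""
    for _ in range(max_tries):
        triplet = draw_simplex(rng)
        if in_class(triplet):
            return triplet
    raise RuntimeError("class domain too small for rejection sampling")


def sandy_class(triplet):
    """Hypothetical texture class used only for illustration."""
    sand, silt, clay = triplet
    return sand >= 0.5 and clay <= 0.2


rng = random.Random(0)
draws = [draw_in_class(sandy_class, rng) for _ in range(2000)]
```

Rejection sampling works for any class shape that can be tested point by point, at the cost of wasted draws when the class covers only a small part of the triangle.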


    MATERIALS AND METHODS
Soil texture is of crucial importance for the determination of soil sensitivity to environmental disasters like erosion or pollutant dispersion; for example, sand and clay contents are key parameters in a number of pedotransfer functions and soil process models. Texture composition has thus to be estimated with the greatest accuracy to minimize error propagation in such models. As an example, the sand, silt, and clay fractions are here estimated in the region of Tienen (Belgium) using 277 samples and an exhaustive soil map. Estimation is performed on a fine scale grid using the Regional Mean (RM) for each texture class, OK, BME for interval-type soft data (BME), and BME/MC. Estimation accuracy is assessed by cross-validation (Cressie, 1991, p. 101).


    Study Area and Data Source
The study zone is located east of the city of Leuven (Belgium) and occupies a 26 by 16 km area extending from Tienen (south) to Aarschot and Diest (north). The area can be subdivided into three pedogeomorphological regions characterized by a texture gradient: in the south, the loess belt (région limoneuse); in the north, the sandy area of Flanders (région sableuse); and in between, a transition region (région sablo-limoneuse). From south to north, the soil texture thus gradually changes from silt to sand, with accumulation of clay in the alluvial plain of the Demer valley. Most of the soils in the study area can be described as Alfisols. Northwards, soils become more spodic, with Spodosols in the sandy area. The Velp valley crosses the study area from west to east, about 5 km north of Tienen. The Demer valley spreads along the northern border of the study region. In these two valleys, soils can be classified as Fluvents. Topographically, small hills in the south (maximum elevation around 110 m) progressively flatten northward into the Flanders plain (average elevation: 10 m).

The Belgian soil survey took place between 1947 and 1971. About 13 000 soil profile descriptions with laboratory analyses were registered in the Aardewerk soil data base (Van Orshoven et al., 1988). Those soil profiles were not interpolated as such to derive the soil map, but rather used to characterize representative soil types in a region. Soil properties were also qualitatively registered on a 75 by 75 m grid all over the country using auger borings. The three recorded variables at these points are the soil texture class, the drainage class, and a class for the profile development. Those qualitative data were used to classify the locations according to their similarity with the representative soil profiles described in the same area. To delineate soil polygons, surveyors used the knowledge of the soil type at the auger boring positions and their perception of the landscape features. They summarized all that information using the soil factor equation described by Dokuchaiev (Glinka, 1927). This equation presents the observed soil type as the result of the influence of soil forming factors (topography, parent material, biological activity, climate, and time). The final soil map was designed at a scale of 1:20 000.

In this study, two data sets are used, namely a hard and a soft data set. The hard data set consists of 277 locations for which the percentages of sand, silt, and clay are registered. This is the subset of the Aardewerk database corresponding to the study area. The soft data set is directly built from the soil map. From the digital version of this map, the soil texture layer for the study area is extracted (Fig. 3). Intervals of values for the sand, silt, and clay fractions can be derived using the knowledge of the texture class and the texture triangle (Fig. 1). The interval-type data are then introduced in the BME framework for interval-type soft data. The texture information can also be used in the BME/MC framework explained above. Ordinary kriging, conversely, is not able to use that kind of information and relies only on the hard data set. The original polygon texture map is then discretized to provide the texture class at each estimation point. Finally, the three texture fractions are estimated on a 100 by 100 m grid, the size of the grid cells being a compromise between the requirement for the highest resolution and a reasonable computation time.



Fig. 3. (a) Texture map for the study zone and (b) location of the hard data points (for the meaning of the texture classes, see Fig. 2).

 
The spatial structure for the sand, silt, and clay contents can be modeled by a linear model of coregionalization consisting of a nugget effect [Nug(h)] and a spherical structure with a range of 4000 m (Sph[h,4000 m]). The covariance matrices for the nugget and the spherical structures are given hereafter:

[9]

The experimental cross-variograms and the fitted model are represented in Fig. 4.
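
The structure of such a linear model of coregionalization can be sketched as follows. The coefficient matrices of Eq. [9] are not reproduced in this text, so the matrices `B_NUG` and `B_SPH` below are hypothetical values chosen only to be symmetric and positive semidefinite with rows summing to zero (as expected for fractions that sum to a constant); only the nugget-plus-spherical form and the 4000 m range come from the text.

```python
def spherical(h, range_m):
    """Unit-sill spherical variogram with the given range (same units as h)."""
    if h <= 0.0:
        return 0.0
    if h >= range_m:
        return 1.0
    r = h / range_m
    return 1.5 * r - 0.5 * r ** 3


def nugget(h):
    """Unit-sill nugget effect: zero at h = 0, one for any positive distance."""
    return 0.0 if h == 0.0 else 1.0


def lmc_variogram(h, b_nug, b_sph, range_m=4000.0):
    """Matrix of direct and cross-variogram values at distance h."""
    n = len(b_nug)
    return [
        [b_nug[i][j] * nugget(h) + b_sph[i][j] * spherical(h, range_m) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical coefficient matrices (order: sand, silt, clay), arbitrary units.
B_NUG = [[2.0, -1.0, -1.0], [-1.0, 2.0, -1.0], [-1.0, -1.0, 2.0]]
B_SPH = [[4.0, -3.0, -1.0], [-3.0, 3.0, 0.0], [-1.0, 0.0, 1.0]]

gamma_2km = lmc_variogram(2000.0, B_NUG, B_SPH)
```

The diagonal entries of `gamma_2km` are the direct variograms and the off-diagonal entries the cross-variograms; beyond the range, every entry reaches its total sill, the sum of the two coefficients.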



Fig. 4. Experimental variograms and cross-variograms for the sand, silt, and clay contents (dots) and fitted model (plain line). The model is composed of a nugget and a spherical structure with a range of 4000 m.

 
Methods and Comparison Criteria
Estimation of the texture fractions is done on the 100 by 100 m estimation grid successively for sand, silt, and clay using OK, BME, and BME/MC. A texture map is then constructed using the estimates of the three fractions at each location by classification of each triplet of values according to the Belgian texture triangle (Fig. 1). For all computations and statistical analyses, we used the BMELIB toolbox (Christakos et al., 2002) written for MATLAB (1999).

The three methods are compared on the basis of their ability to yield the most accurate sand, silt, and clay maps, and additionally on their ability to reproduce the actual texture map. To evaluate which method is giving the most accurate estimation for the three texture fractions, three statistical criteria are considered:

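  1. Mean error (ME), as an indicator of bias:

    $$\mathrm{ME} = \frac{1}{n_h} \sum_{i=1}^{n_h} \left( \hat{\chi}_{i,hard} - \chi_{i,hard} \right) \qquad [10]$$
    with $\chi_{i,hard}$, the hard value at point $p_i$; $\hat{\chi}_{i,hard}$, the value at point $p_i$ estimated by cross-validation; $n_h$, the number of hard data points.

  2. Root mean square error (RMSE), as a precision measure:

    $$\mathrm{RMSE} = \sqrt{ \frac{1}{n_h} \sum_{i=1}^{n_h} \left( \hat{\chi}_{i,hard} - \chi_{i,hard} \right)^2 } \qquad [11]$$

  3. Classification error rate (CER), as a measure of the ability of the method to reproduce the actual soil map:

    $$\mathrm{CER} = \frac{1}{n_k} \sum_{i=1}^{n_k} i(\hat{\tau}_i; \tau_i) \qquad [12]$$
    with $n_k$, the number of estimation points; $i(\hat{\tau}_i; \tau_i) = 0$ if $\hat{\tau}_i = \tau_i$, and 1 otherwise. In this last expression, $\tau_i$ is the actual texture class at point $p_i$ according to the soil map, and $\hat{\tau}_i$ is the estimated texture class.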

The most appropriate method would be the one with ME, RMSE, and CER closest to zero. Criteria 1 and 2 are computed from a cross-validation procedure where each hard data point is discarded and then reestimated using the remaining hard and soft data (Isaaks and Srivastava, 1989). Criterion 3 is computed by performing a point-to-point comparison at each point of the estimation grid.
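
The three criteria can be computed as in the sketch below. The function names are introduced here, and the sign convention of the mean error (estimate minus observed value) is an assumption; the CER is returned as a proportion, to be multiplied by 100 for a percentage.

```python
import math


def mean_error(estimates, observed):
    """Mean error: average of (estimate - observed); the sign convention is assumed."""
    return sum(e - o for e, o in zip(estimates, observed)) / len(observed)


def root_mean_square_error(estimates, observed):
    """Root mean square error between cross-validation estimates and hard values."""
    return math.sqrt(sum((e - o) ** 2 for e, o in zip(estimates, observed)) / len(observed))


def classification_error_rate(estimated_classes, actual_classes):
    """Proportion of estimation points whose estimated class differs from the map."""
    wrong = sum(1 for e, a in zip(estimated_classes, actual_classes) if e != a)
    return wrong / len(actual_classes)


# Toy values, for illustration only (not data from the study).
me = mean_error([10.0, 20.0, 30.0], [12.0, 18.0, 33.0])
rmse = root_mean_square_error([10.0, 20.0, 30.0], [12.0, 18.0, 33.0])
cer = classification_error_rate(["A", "B", "B", "C"], ["A", "B", "C", "C"])
```

In the cross-validation setting described above, `estimates` would hold the values reestimated at each discarded hard data point, whereas the two class lists would cover every point of the estimation grid.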

A fourth method is also included: the RM for each texture class. This corresponds to a frequent way of quantifying the legend of soil maps (Finke, 2000; Voltz and Webster, 1990). Every point located in a given mapping unit is assigned the average value computed from all the hard data sharing the same texture class, whatever the spatial location of the mapping unit. This does not result in any estimation of the variable itself but rather in an estimation of the mean value.

From a theoretical viewpoint, RM gives the lowest possible ME and RMSE. This is because of the intrinsic properties of this "estimator": the average of a distribution is the value which ensures a zero mean error and a minimum standard deviation. Consequently, ME and RMSE computed from RM can be considered as target values to be approached by the other methods. The relative RMSE, defined as the ratio (expressed in percent) between the RMSE computed from any of the other estimation methods and the RMSE calculated from RM, should thus be as close to 100% as possible. RM thus clearly gives the best possible global precision, but delivers large zones with constant values, completely disregarding the intramapping unit spatial structure. This results in a very low local precision.
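
A minimal sketch of the RM estimator and of the relative RMSE follows. The function names, class labels, and numerical values are hypothetical and serve only to show the computation.

```python
from collections import defaultdict


def regional_means(values, classes):
    """Average of the hard values within each texture class of the soil map."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, cls in zip(values, classes):
        sums[cls] += value
        counts[cls] += 1
    return {cls: sums[cls] / counts[cls] for cls in sums}


def relative_rmse(rmse_method, rmse_rm):
    """RMSE of a method expressed as a percentage of the RMSE obtained with RM."""
    return 100.0 * rmse_method / rmse_rm


# Toy sand percentages and map classes (illustration only).
means = regional_means([10.0, 20.0, 60.0, 80.0], ["silt", "silt", "sand", "sand"])
# Every estimation point then receives the mean of its own class.
estimates = [means[cls] for cls in ["sand", "silt", "sand"]]
ratio = relative_rmse(9.2, 5.0)
```

A ratio of 100 means that the method matches the global precision of RM, and larger values measure how far it falls short of that target.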


    RESULTS AND DISCUSSION
Estimated Maps and Error Distributions
Figure 5 shows the sand maps computed from the RM, OK, BME, and BME/MC estimates, respectively. The silt maps (not shown) present a very similar pattern since silt is strongly negatively correlated with sand. The clay fraction has a nearly totally erratic structure because of the predominance of the nugget effect (see Fig. 4). While RM gives only a representation of what the mean value is expected to be, the OK map depicts the spatial variation of the sand values. Since OK is only using the 277 hard data points, it strongly smoothes the features of the map and only major structures are identified, like the high values in the north and the lowest contents in the south. With BME, the incorporation of the soft data allows the production of a more complex and realistic map, clearly identifying structures like the Velp valley. But much more complex structures, like those encountered in the North, remain with vague contours. The BME/MC approach succeeds in taking fully into account the information provided by the soil map while introducing an intramapping unit variability. The East part of the Demer Valley is now clearly depicted and abrupt transitions are faithfully represented.



Fig. 5. Maps of the estimates for the sand fraction (expressed in percentages). (a) Regional Mean, (b) ordinary kriging (OK), (c) Bayesian maximum entropy (BME), and (d) the Monte Carlo procedure (BME/MC).

 
These qualitative assertions are corroborated by the error distributions depicted in Fig. 6 and the ME and RMSE reported in Table 1. From theory, the four methods are expected to be unbiased. This is true, except for the BME/MC estimates. For this method, we explicitly assumed a uniform distribution of the points on the texture triangle. A projection of the hard data points on the triangle (not shown) shows that this is not the case here, some parts of the triangle being much more densely covered than others. To correct this artifact, one could use a Gaussian kernel to fit a density function over the triangle and use it as a prior to randomly draw the triplets of values. But this would considerably increase the computational load without any guarantee of a significant increase in accuracy. Indeed, extreme values, which should be the most important to preserve, would receive very little weight from the Gaussian kernel.



Fig. 6. Error distributions for (a) sand, (b) silt, and (c) clay estimates. Regional mean (RM) is displayed with dotted line, ordinary kriging (OK) with dash dotted line, Bayesian maximum entropy (BME) with dashed line and the Monte Carlo procedure (BME/MC) with plain line.

 

Table 1. Mean error (ME) and root mean square error (RMSE) for sand, silt, and clay computed from regional mean (RM), ordinary kriging (OK), Bayesian maximum entropy (BME), and Monte Carlo procedure (BME/MC), respectively.

 
Concerning the accuracy of the methods, OK shows the widest error distributions and the highest RMSEs. The BME/MC approach produces the most accurate estimates, as indicated by the narrower error distributions and the lowest RMSEs, except for clay. This behavior of clay is due to the nearly complete nugget effect that characterizes its spatial structure.

To highlight the importance of incorporating the soft information, the relative RMSE values are summarized in Fig. 7, the objective for each method being to come as close as possible to the RMSE values computed from RM. While the absolute RMSE values do not seem to be very different, using soft data adequately appears to increase the estimation accuracy significantly. For sand, for example, the relative RMSE value is reduced from 184% with OK to 109% with BME/MC. However, the clay fraction being mainly governed by a nugget structure, the gain from incorporating soft information is less substantial for this fraction.



Fig. 7. Relative root mean square error (RMSE) for sand (stars), silt (open squares), and clay (open circles) estimates.

 
In addition to this, it is important to keep in mind that, beside these global measures of efficiency, the most important aspect in this context is certainly the spatial pattern exhibited by the methods. As shown by the sand maps, BME appears to perform far better than OK. The BME maps have a much more realistic pattern than the highly smoothed maps obtained from OK.

Classification Error Rate
The last comparison criterion focuses on the ability of the methods to reproduce the actual texture map accurately. Actual and estimated texture maps are given in Fig. 8.



Fig. 8. (a) Reference texture map, (b) estimated texture maps with ordinary kriging (OK), (c) Bayesian maximum entropy (BME), and (d) Monte Carlo procedure (BME/MC).

 
As OK is only able to draw the main zones, with highly smoothed contours, the corresponding CER value is high (40%). While BME performs better than OK for each fraction separately, the texture map is not so different from the OK map and the CER remains very high (38.5%). However, BME/MC proves its aptitude to take fully into account the texture class information, and it reproduces faithfully the actual texture map. Even small structures, like the valleys, are revealed and the CER is highly reduced (4%).


    CONCLUSIONS
A challenging task for soil scientists consists in combining various sources of information to predict a variable's value at unsampled locations. The quality of the estimation will mainly depend on the ability of the method to use these available information sources in a rigorous way. As accurate measurements are scarce because of logistic and economic constraints, soil scientists should use any other relevant information. The soil map, when it exists, is certainly one such source.

In this paper, we have shown the ability of the BME approach to produce accurate soil texture estimates from a soil map and a limited number of additional samples. Estimates produced with BME are more accurate than those obtained from OK, and the maps show much more realistic spatial patterns. This reinforces results obtained from previous simulated case studies (D'Or et al., 2001; D'Or and Bogaert, 2001).

To take into account the constraints specific to texture estimation, we have developed a variant of BME that uses expectation properties in a Monte Carlo procedure (BME/MC). We have shown that BME/MC produced significantly better estimates than BME by increasing the estimation accuracy and substantially lowering the (relative) RMSE and the CER. The BME/MC approach proved to be an efficient tool for estimating random variables that are constrained in a complex way.

Although we used both hard and soft data in this study, it is worth noting that the BME algorithm, unlike classic geostatistical methods, can also be used when only soft information is available. For example, when only a soil texture map is available, BME is still able to provide estimates for the three textural fractions (D'Or and Bogaert, unpublished data).

Another way of reaching the same objective would be to rely on the indicator kriging (IK) formalism, as recommended by some authors (Oberthur et al., 1999). Unfortunately, this method is completely useless in situations such as our case study. As the soft information is exhaustively available over the area, a prior cumulative distribution function (cdf) can be built at each estimation location. The exactness property of kriging then imposes the restitution of this cdf as the estimate. All the points in the same mapping unit will consequently be attributed the same value, and hard data will be completely ignored if they are not located on the estimation grid points. Furthermore, IK is only able to take the soft information into account through approximations. The precision of the estimation is strongly dependent on the number of thresholds and on their values. The BME approach, conversely, does not require any degradation of the information into indicators; it is based on a strong theoretical background and produces at each estimation point a continuous posterior cdf, allowing the computation of confidence intervals and other elaborate statistics.

Finally, it is worth noting that soil properties which are not explicitly included in the soil map could be estimated using other soft data sources like, for example, landscape patterns (Moore et al., 1993) or digital elevation models (Chaplot et al., 2000). Remote sensing was already used for improving the objectivity and accuracy of soil patterns delineation (Ahn et al., 1999). With the profusion and diversification of data sources, BME thus appears to be a valuable method for combining advantageously the available sources of information.


    ACKNOWLEDGMENTS
 
The authors are grateful to Prof. B. Delvaux (Université catholique de Louvain) for his review comments on this manuscript, as well as to two anonymous referees for their helpful comments.

Received for publication April 18, 2001.






