a Faculty of Agriculture, Food, and Natural Resources, McMillan Building A05, The Univ. of Sydney, NSW 2006, Australia
b Hydrology Program, Dep. of Land, Air, and Water Resources, 123 Veihmeyer Hall, Univ. of California, Davis, CA 95616
c Dep. of Water Resources, Water Use Efficiency Office, 901 P Street, Third Floor, P.O. Box 942836, Sacramento, CA 94236-0001
d Dep. of Environmental Sci., 2217 Geology Building, Univ. of California, Riverside, CA 92521
* Corresponding author (jwhopmans{at}ucdavis.edu).
| ABSTRACT |
|---|
$\rho_b$), saturated water content, and saturated hydraulic conductivity. The prediction errors of water content were about 3 to 4% by volume. Unsaturated hydraulic conductivity predictions improved significantly when a so-called performance-based algorithm was used to minimize residuals of soil hydraulic data rather than of hydraulic parameters. The root mean square of residuals for predicted values of water content and unsaturated hydraulic conductivity was reduced by about 50% compared with hydraulic functions predicted by the published neural network program Rosetta. Results from a sensitivity analysis suggest that the hydraulic parameters are most sensitive to sand content and saturated water content.
Abbreviations:
$\rho_b$, bulk density; MR, mean residual; OM, organic matter; PTF, pedotransfer function; RMSR, root mean square of residuals.
| INTRODUCTION |
|---|
The soil hydraulic properties are usually represented by a parametric model. Although many of these have been developed (Kosugi et al., 2002), the most commonly used soil-water retention model is the one introduced by van Genuchten (1980):
$$\theta(h) = \theta_r + \frac{\theta_s - \theta_r}{\left[1 + (\alpha |h|)^n\right]^m} \qquad [1]$$

where $\theta(h)$ denotes the volumetric water content (L$^3$ L$^{-3}$) at the corresponding soil-water matric head h (L), $\theta_r$ and $\theta_s$ are the residual and saturated water content, $\alpha$ is a scaling parameter (L$^{-1}$), n is a curve shape factor (dimensionless), and m is an empirical constant that can be related to n by $m = 1 - 1/n$. When substituted into the unsaturated hydraulic conductivity (K) model of Mualem (1976), the unsaturated hydraulic conductivity is described by:
$$K(S_e) = K_o S_e^{\,l}\left[1 - \left(1 - S_e^{1/m}\right)^m\right]^2 \qquad [2]$$

where $S_e = (\theta - \theta_r)/(\theta_s - \theta_r)$ is the effective saturation, l is a pore geometry parameter, and $K_o$ is a matched saturated hydraulic conductivity (L T$^{-1}$), extrapolated from fitted unsaturated K values. This fitted value for $K_o$ is usually smaller than the true saturated conductivity, $K_s$, since the latter is largely controlled by soil structural elements, such as cracks and macropores. Therefore, $K_o$ is usually considered a fitting parameter (van Genuchten and Nielsen, 1985).

Soil hydraulic properties, typically measured for small core samples with a length scale of 10$^{-1}$ m or smaller, vary greatly in space (Nielsen et al., 1973). Consequently, a large number of samples is needed to characterize fields with length scales of 10$^{2}$ m or larger. However, measurement of soil hydraulic properties is generally difficult, time consuming, and expensive, so that few complete datasets are available. Therefore, indirect methods have been pursued that predict soil hydraulic properties from more easily measured soil parameters. An excellent review of indirect methods, including the application of neural networks, is presented by Leij et al. (2002). Wösten (1990) postulated that the use of these indirect methods is acceptable as long as it includes the uncertainty of the estimations.
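As an illustration of how Eq. [1] and [2] are evaluated, the following minimal Python sketch computes water content and unsaturated conductivity over a range of matric heads. The function names and the parameter values are hypothetical choices made for this example and are not taken from the study; the matric head is treated as a positive suction magnitude, and l defaults to 0.5.

```python
import math

def effective_saturation(h, alpha, n):
    """Se = [1 + (alpha*|h|)^n]^(-m), with m = 1 - 1/n."""
    m = 1.0 - 1.0 / n
    return (1.0 + (alpha * abs(h)) ** n) ** (-m)

def water_content(h, theta_r, theta_s, alpha, n):
    """Eq. [1]: volumetric water content at matric head h (cm)."""
    return theta_r + (theta_s - theta_r) * effective_saturation(h, alpha, n)

def conductivity(h, k_o, alpha, n, l=0.5):
    """Eq. [2]: unsaturated hydraulic conductivity at matric head h (cm)."""
    m = 1.0 - 1.0 / n
    se = effective_saturation(h, alpha, n)
    return k_o * se ** l * (1.0 - (1.0 - se ** (1.0 / m)) ** m) ** 2

# Hypothetical parameter values, for illustration only (not from the paper).
params = dict(theta_r=0.05, theta_s=0.40, alpha=0.02, n=1.5)
for h in (0, 40, 60, 80, 200, 400, 600):
    theta = water_content(h, **params)
    k = conductivity(h, k_o=1.0, alpha=params["alpha"], n=params["n"])
    print(f"h = {h:4d} cm  theta = {theta:.3f}  K = {k:.3e} cm/h")
```

At h = 0 the functions return the saturated water content and $K_o$, and both decrease monotonically as the suction increases.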
Specifically, the use of PTFs was introduced by Bouma (1989). These are predictive functions of certain soil properties estimated from other simpler and routinely-measured soil properties that are generally available (McBratney et al., 2002). The application of PTFs to predict soil-water retention using basic soil properties such as sand, silt, and clay content, and
$\rho_b$ is now common in soil science studies because predictions of either water content at a specific soil-water matric potential or water retention parameters (Schaap et al., 1998) have been quite successful (Romano and Palladino, 2002). The saturated hydraulic conductivity can also be predicted reasonably well from such basic soil properties. Models such as Mualem's Eq. [2] are subsequently used to predict the unsaturated hydraulic conductivity from water retention data. However, more accurate unsaturated conductivity functions are expected if these are predicted directly from measured K values, for example, by neural network analysis.
Only a few studies have applied PTFs to predict unsaturated hydraulic conductivity data from basic soil properties. Among those is Bloemen (1980) in the Netherlands, who correlated parameters of the Brooks-Corey function to soil texture and organic matter (OM) content. Similar techniques were applied by Gonçalves et al. (1997) in Portugal, Jaynes and Tyler (1984) for glacial till soil in the USA, and Vereecken (1995) in Belgium using alternative analytical expressions for the K(h) function. Among the first to use neural networks to predict K were Tamari et al. (1996), who included horizon designation, soil textural class, OM content, $\rho_b$, and water content at specific soil-water matric potential values as input variables. Wösten et al. (1999) extracted data from the European soil hydraulic database to derive van Genuchten function parameters using sand, silt, and clay content, soil $\rho_b$, and organic carbon content. Schaap and Leij (2000) predicted parameters l and $K_o$ from sand, silt, and clay content, and $\rho_b$ using neural network analysis. The latter study concluded that K-predictions were improved if soil-water retention function parameters were included as input parameters. The performance of different published PTFs in predicting K(h) for selected German soils was evaluated by Wagner et al. (2001).
Among the main issues that limit the accurate prediction of unsaturated hydraulic conductivity is the lack of a soil hydraulic database that includes measured unsaturated conductivity data. Public domain databases such as UNSODA (Nemes et al., 2001) do provide both hydraulic properties and basic soil physical data from different parts of the world for various soil types. However, the unsaturated hydraulic conductivity values that are included in these data sets are generally obtained by many different measurement techniques. Typically, these methods are limited by specific assumptions and apply to relatively narrow water content ranges, so that K-prediction results are expected to depend on measurement type (Mallants et al., 1997). Although the measurement type effect applies to prediction of soil-water retention data as well, its effect may not be as consequential, because the predicted
$\theta$ range of the retention curve is much smaller than the predicted unsaturated K range.
Thus, when PTFs are used to predict soil hydraulic data, uniformity of measurement methods is desirable. Specifically, Schaap and Leij (1998) showed that the PTF prediction depended on the training data set, whereas the accuracy was largely controlled by data quality. It is expected that improved prediction accuracies will be obtained when a training data set is used with soil hydraulic and related physical properties that are determined from similar measurement techniques. In this paper, we examine the simultaneous prediction of soil-water retention and unsaturated hydraulic conductivity from soil hydraulic data that were estimated with the multistep outflow method (Eching et al., 1994b), whereby the soil hydraulic parameters of Eq. [1] and [2] are estimated with an inverse modeling technique. Characteristically, this measurement technique provides estimated soil hydraulic parameters from the matching of experimental observations of transient water flow with numerical modeling results.
The presented measured data span 310 soil samples, largely from three different datasets, representing a variety of alluvial soils and soil textures across three different regions in the California San Joaquin Valley. The main objective of the presented analysis is to show that neural network prediction of both soil-water retention and unsaturated hydraulic conductivity will improve if all analyzed data are obtained by identical measurement methods.
| MATERIALS AND METHODS |
|---|
The outflow method originates from the one-step outflow experiment of Gardner (1956). Kool et al. (1985) formulated the inverse solution for the one-step pressure outflow experiment using a numerical solution of transient water flow, that is, Richards' equation, with the van Genuchten model of Eq. [1] and [2] representing the soil hydraulic properties. Starting with initial parameter estimates, a numerical model solution computes the theoretical drainage outflow rate of an initially-saturated soil sample. Parameters of the soil hydraulic functions are updated iteratively in an optimization routine, thereby continuously reducing the residuals until a predetermined convergence criterion (reduction in objective function value between two consecutive iterations) is achieved.
Kool et al. (1985) successfully applied the inverse method to estimate $\theta_r$, $\alpha$, and n from cumulative outflow measurements. Van Dam et al. (1994) proposed the multistep outflow method, in which the air pressure is increased in multiple smaller steps. Their results showed that the outflow data from a multistep experiment provided sufficient information to yield a unique solution. Alternatively, Eching et al. (1994b) demonstrated that unique solutions were obtained if the multistep outflow method was combined with automated soil-water matric head measurements of the draining soil core.
A comprehensive review of inverse modeling for estimation of soil hydraulic properties, including one-step and multistep methods was presented by Hopmans et al. (2002). Although relatively complex, inverse modeling can provide quick results. As an additional advantage, inverse modeling for soil hydraulic characterization allows the simultaneous estimation of both the soil-water retention and unsaturated hydraulic conductivity function from a single transient experiment. The inverse method mandates combination of experimentation with numerical modeling, thus requiring both accurate experimental procedures and advanced numerical modeling and optimization algorithms. Since the optimized hydraulic functions are mostly needed as input to numerical flow and transport models for prediction purposes, it has the added advantage that the hydraulic parameters are estimated by similar numerical models. In addition, the parameter optimization procedure provides a confidence interval of the optimized parameters, although their interpretation may be misleading. Some caution must be exercised when applying the multistep outflow method. First, laboratory measurements, although accurate, provide hydraulic information for a relatively small soil core, detached from its surroundings. Moreover, as is the case for any method, the parameter estimates are only valid for the range of the experimental conditions, and care must be exercised in their extrapolation. Finally, inverse problems for parameter estimation of soil hydraulic functions can be ill posed because of experimental design, measurement, and model errors.
Training Dataset
The presented 310 soil hydraulic data were collected from three different field projects. The first dataset consists of 144 undisturbed soil samples that were collected from seventy two 64- by 64-m plots at two depths (25 and 50 cm) in a 40-ha field (Tuli et al., 2001a). This Long Term Research on Agricultural Systems (LTRAS) project was conducted at the Russell Ranch of the University of California near Davis, CA, to study the long-term effects of irrigation and nitrate application to the sustainability of California agriculture. The field includes three different soil series: the Yolo (fine-silty, mixed, superactive, nonacid, thermic Mollic Xerofluvents), the Rincon (fine smectitic, thermic Mollic Haploxeralfs), and the Brentwood (fine, smectitic, thermic Typic Haploxerepts). Within each 64- by 64-m plot, 8.25-cm i.d. and 6-cm-long soil cores were collected with a soil core sampler. The range of values of the main soil physical properties as obtained from these soil cores were:
$\rho_b$, 1.22 to 1.66 g cm$^{-3}$; OM, 0.43 to 1.63%; saturated hydraulic conductivity, 0.0002 to 17.7900 cm h$^{-1}$; saturated water content, 0.32 to 0.50 cm$^3$ cm$^{-3}$; sand (50–2000 µm), 11 to 56%; silt (2–50 µm), 34 to 80%; and clay (<2 µm), 3 to 22%.
The second data set consists of 88 soil cores collected from a 32-ha furrow-irrigated field (Diener) on the west side of the San Joaquin Valley (Eching et al., 1994a), near Five Points, CA. The soil is of the Panoche series (fine-loamy, mixed, superactive, thermic Typic Haplocambids), having very deep and well-drained uniform profiles with a wide range of textures. Soil texture varied from a silty loam and sandy clay loam on the south east side of the field to a loamy sand and sandy loam with patches of silty clay, clay loam, and silty clay in the rest of the field. Undisturbed soil cores were taken from the 0.3- and 0.6-m soil depth at 44 locations, uniformly distributed within the irrigated field. The range of values of the main soil physical properties as obtained from these soil cores were:
$\rho_b$, 1.26 to 1.87 g cm$^{-3}$; OM, 0.03 to 0.18%; saturated water content, 0.32 to 0.54 cm$^3$ cm$^{-3}$; sand, 13 to 99%; silt, 1 to 76%; and clay, 1 to 15%. No saturated hydraulic conductivity data were available for the Diener dataset. Extracted core locations in the field were grouped into clays (6 locations), loams (12 locations), and sands (26 locations).
The third data set consists of 69 sediment cores. Of the three data sets, this is the only data set representing unsaturated sediments below the root zone. The core samples represent unsaturated sediments in their native, anthropogenically unaltered depositional environment. Continuous cores were extracted with a Geoprobe Systems (Salina, KS) direct push drill rig. A Geoprobe Macrocore sampler (5.2-cm o.d.) containing a PVC liner (3.8-cm i.d.) was driven in 1.2-m intervals through unsaturated sediments to a depth of >15 m. Sediment cores were obtained from 18 locations spaced 3 to 12 m apart within a 1-ha orchard at the Kearney Field Station (Parlier) in the San Joaquin Valley, CA. The location overlies the near-distal part of the Kings River alluvial fan emanating from the Kings River watershed at the foot of the generally granitic Sierra Nevada mountain range. The continuous cores were cut in 10-cm long core sections that were fitted within PVC and aluminum sleeves to fit a 5.1-cm i.d. Tempe pressure cell (Tuli et al., 2001b). The range of values of the main soil physical properties as obtained from these soil cores were:
$\rho_b$, 1.26 to 1.87 g cm$^{-3}$; OM, 0.01 to 0.20%; saturated water content, 0.22 to 0.47 cm$^3$ cm$^{-3}$; saturated hydraulic conductivity, 0.002 to 30.0 cm h$^{-1}$; sand, 13 to 98%; silt, 1 to 76%; and clay, 1 to 17%. Five major textural units were distinguished in the cores: sand, loamy sand, sandy loam, silt/silt loam/loam/silty clay loam, clay loam/clay, and variably thick hardpan at the 3- to 5-m depth. A former alluvial channel bed of limited width and consisting of clean medium sand was encountered at the 7- to 10-m depth. Nine additional soil data were included from Eching et al. (1994b) and Corwin et al. (2003).
For each core sample, soil properties and hydraulic functions were determined with the following procedure. Upon saturation, the soil cores were placed on a screen to measure the saturated hydraulic conductivity ($K_s$) with the constant head method (Klute and Dirksen, 1986). After completion of the saturated hydraulic conductivity measurement, the samples were assembled in Tempe pressure cells for estimation of the soil-water retention and unsaturated hydraulic conductivity functions using the multistep outflow method. The samples were resaturated with a 0.01 M CaCl$_2$ solution by wetting through a bottom porous membrane assembly. For Datasets 1 and 2, the bottom plate consisted of a 1-bar ceramic plate. For the third dataset, the bottom assembly included a thin porous nylon membrane with low hydraulic resistance (Hopmans et al., 2002). A positive air pressure was applied to the top of the cell, while cumulative water outflow was automatically recorded by a pressure transducer installed in the bottom of a burette. The soil-water matric head inside the draining soil core was simultaneously measured with a miniature tensiometer connected to a pressure transducer. The multistep pressure increments were determined by soil texture, but maximum air pressures generally did not exceed 600 cm. The air pressure was increased when the cumulative outflow curve approached a plateau value, indicating near-hydraulic equilibrium. With the transient soil-water matric head and cumulative drainage data, the parameters of the soil-water retention and unsaturated hydraulic conductivity functions were estimated from the inverse solution of the Richards' equation as presented in Eching et al. (1994b) and Hopmans et al. (2002). During the optimization, $\theta_s$ was fixed to its measured value whereas the soil tortuosity-connectivity parameter, l, was assumed to be 0.5. Therefore, the optimized parameters were $\theta_r$, $\alpha$, n, and $K_o$. The final dataset includes weight percentages of sand, silt, and clay content, and field dry $\rho_b$ as determined from standard methods (Klute and Dirksen, 1986), measured $\theta_s$ and $K_s$, and optimized van Genuchten parameters $\theta_r$, $\alpha$, n, and $K_o$.
The statistics of the complete dataset are given in Table 1, whereas the soil textural distribution is presented in Fig. 1. The textural range of the combined sample set is dominated by sands to silt loams. The multistep outflow method is typically not suitable for soil hydraulic property measurement of clayey soils because the maximum applied pressure step will generally not exceed 600 to 700 cm. The texture triangle shows the textural differences between the three datasets. The LTRAS samples are dominantly silty loam and loamy soils. The sampled Kearney soils are low in clay content, whereas the Diener samples consist mainly of loamy and sandy loam soils. For the neural network analysis, soil-water retention and unsaturated hydraulic conductivity data were extracted from the set of hydraulic functions (Fig. 2a,b) at discrete matric pressure head values: 0, 40, 60, 80, 200, 400, and 600 cm. The textural differences between the three datasets are readily apparent in Fig. 2. Specifically, the finer-textured soil materials of the LTRAS site have high soil-water retention, while the Kearney soils with their low clay content have the smallest water retention and highest hydraulic conductivity.
The feed-forward neural network that is applied in this study consists of a set of input units, x, representing the input variables, and a set of output units, y, representing the output variables, interconnected by hidden units, z. Each of the three types of units is arranged in a layer. The mathematical model consists of a set of operations or network that is presented in Eq. [3]. First, an input vector x is multiplied by weighting factors that are assembled in array W, resulting in the hidden unit vector z. In a second step, this vector z is passed to a layer containing the activation or transfer function, f, which produces r. Finally, in the third step, the target vector y is computed from a linear combination of r, with the weighting factors in array U, or:

$$\mathbf{z} = \mathbf{W}\mathbf{x}, \qquad \mathbf{r} = f(\mathbf{z}), \qquad \mathbf{y} = \mathbf{U}\mathbf{r} \qquad [3]$$
![]() | [4] |
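The three steps of Eq. [3] can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hyperbolic tangent is used as the transfer function f because it is a common choice, not because the text above confirms it, and the weights and inputs below are arbitrary hypothetical numbers.

```python
import math

def forward(x, W, U, f=math.tanh):
    """One pass through Eq. [3]: z = W x, r = f(z), y = U r.

    x is a list of inputs, W a list of hidden-unit weight rows,
    U a list of output-unit weight rows. f = tanh is an assumption.
    """
    z = [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]
    r = [f(z_i) for z_i in z]
    return [sum(u_ki * r_i for u_ki, r_i in zip(row, r)) for row in U]

# Hypothetical example: 3 inputs, 2 hidden units, 1 output.
x = [0.4, 0.5, 0.1]
W = [[0.2, -0.1, 0.3],
     [-0.4, 0.6, 0.1]]
U = [[1.0, -0.5]]
print(forward(x, W, U))
```

Training then amounts to adjusting the entries of W and U so that the outputs match the target values.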
Neural networks also have significant disadvantages that must be taken into consideration. First, their interpretation is often difficult and subjective, as the fitting with the transfer function is a black-box approach. Second, as is usually the case in optimization, the sets of optimized weighting factors are not necessarily mathematically unique because of the likelihood of convergence at local minima. Consequently, different initial weight values may yield different neural network results that deviate from the global minimum. To avoid nonuniqueness of the final solution, many network predictions can be obtained from multiple realizations of the input dataset by the bootstrap technique, also known as bagging (Breiman, 1996).
Neural Network Training of Soil Hydraulic Parameters
The objective of the presented study is to train the neural network so that the parameter vector p = [$\theta_r$, $\theta_s$, $\alpha$, n, $K_o$] can be predicted from a basic soil property vector x that includes soil texture, $\rho_b$, saturated water content, and saturated hydraulic conductivity. Conventionally, this is done by optimization of the neural network weights so that the objective function that includes the sum of squares of the residuals between the measured and predicted parameters is minimized, or

$$O = \sum_{i}\left[\mathbf{p}_i - \hat{\mathbf{p}}(\mathbf{x}_i)\right]^2 \qquad [5]$$

where $\hat{\mathbf{p}}(\mathbf{x})$ is the output vector with the predicted parameters, as determined from the input vector x.
Because the parameters in p are highly nonlinear and correlated, and thus possibly nonidentifiable, a fitted parameter set does not guarantee an equally good prediction of soil hydraulic data. Therefore, for estimation of the soil-water retention curves, Minasny and McBratney (2002a) proposed an alternative objective function for training neural networks to predict the van Genuchten parameters from the basic input data. Instead of minimizing the residuals of the hydraulic parameters, the objective function operates on the residuals of the measured and predicted soil-water retention data, that is, $\theta(h)$ data pairs. They called this the Neuro-m method, which was referred to by Romano and Palladino (2002) as having a performance-based objective function.
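The Neuro-m method is extended in this study to simultaneously predict water retention and unsaturated hydraulic conductivity, with the following objective function:

$$O = \sum_{i}\sum_{j}\left\{\frac{1}{\sigma_\theta^2}\left[\theta_{ij} - \hat{\theta}(\mathbf{x}_i, h_j, \mathbf{p})\right]^2 + \frac{1}{\sigma_{\log K}^2}\left[\log K_{ij} - \log\hat{K}(\mathbf{x}_i, h_j, \mathbf{p})\right]^2\right\} \qquad [6]$$

where $\hat{\theta}(\mathbf{x},h,\mathbf{p})$ and log $\hat{K}(\mathbf{x},h,\mathbf{p})$ denote the predicted water content and unsaturated hydraulic conductivity values at matric head h with the fitted parameter vector p, calculated via the neural network from the input vector x. Reciprocal values of the variance, $\sigma^2$, of the respective measurements are used as weights to account for differences in magnitude between $\theta$ and log K. Equation [6] was minimized with a modified version of the Levenberg-Marquardt method.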
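Values of K were log-transformed because K is generally found to be log-normally distributed (Schaap and Leij, 2000), so that they were computed from Eq. [2] to yield

$$\log K(S_e) = \log K_o + l \log S_e + 2\log\left[1 - \left(1 - S_e^{1/m}\right)^m\right] \qquad [7]$$

Log-transformation of the parameters, including $\alpha$ and $n - 1$, ensured that their back-transformed values were positive and that n > 1.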
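To make the performance-based objective concrete, the sketch below evaluates the residual term of Eq. [6] for a single soil sample, using Eq. [1] for $\theta$ and Eq. [7] for log K. It is a simplified illustration: the neural network that maps x to p is omitted, the logarithm is assumed to be base 10, population variances of the synthetic observations stand in for the measurement variances, and all names and parameter values are hypothetical.

```python
import math
import statistics

def vg_theta(h, theta_r, theta_s, alpha, n):
    """Eq. [1]: water content at matric head h."""
    m = 1.0 - 1.0 / n
    return theta_r + (theta_s - theta_r) * (1.0 + (alpha * abs(h)) ** n) ** (-m)

def vg_log_k(h, k_o, alpha, n, l=0.5):
    """Eq. [7]: log10 of the unsaturated conductivity at matric head h."""
    m = 1.0 - 1.0 / n
    se = (1.0 + (alpha * abs(h)) ** n) ** (-m)
    inner = 1.0 - (1.0 - se ** (1.0 / m)) ** m
    return math.log10(k_o) + l * math.log10(se) + 2.0 * math.log10(inner)

def performance_objective(p, heads, theta_obs, log_k_obs, var_theta, var_log_k):
    """Variance-weighted sum of squared theta and log K residuals (one sample)."""
    theta_r, theta_s, alpha, n, k_o = p
    total = 0.0
    for h, th, lk in zip(heads, theta_obs, log_k_obs):
        total += (th - vg_theta(h, theta_r, theta_s, alpha, n)) ** 2 / var_theta
        total += (lk - vg_log_k(h, k_o, alpha, n)) ** 2 / var_log_k
    return total

# Synthetic "measurements" generated from hypothetical true parameters.
heads = [0, 40, 60, 80, 200, 400, 600]
p_true = (0.05, 0.40, 0.02, 1.5, 1.0)  # theta_r, theta_s, alpha, n, K_o
theta_obs = [vg_theta(h, *p_true[:4]) for h in heads]
log_k_obs = [vg_log_k(h, p_true[4], p_true[2], p_true[3]) for h in heads]
var_theta = statistics.pvariance(theta_obs)
var_log_k = statistics.pvariance(log_k_obs)

p_trial = (0.08, 0.40, 0.03, 1.4, 0.5)
print(performance_objective(p_true, heads, theta_obs, log_k_obs, var_theta, var_log_k))
print(performance_objective(p_trial, heads, theta_obs, log_k_obs, var_theta, var_log_k))
```

The objective vanishes at the parameters that generated the data and grows as a trial parameter set moves away from them, which is the quantity a training algorithm would reduce.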
The neural network consisted of a single hidden layer. We conducted a trial to determine the appropriate number of hidden units by training the data with a range of hidden units (Minasny and McBratney, 2002a). From this trial, we concluded that predictions did not improve significantly with more than six hidden units. Four different neural network models were trained to test the ability of various input parameter combinations (Table 2) to predict p. These input combinations represent a hierarchical structure of data availability (Schaap et al., 1998). The least amount of input was needed for the first training set that included particle size distribution data only ($N_i$ = 3). The other three training sets also included input parameters: $\rho_b$ ($N_i$ = 4), $\rho_b$ and $\theta_s$ ($N_i$ = 5), and $\rho_b$ with $\theta_s$ and log $K_s$ ($N_i$ = 6).
Bagging
Recent empirical evidence suggests that combining different neural networks can enhance the prediction accuracy (Perrone and Cooper, 1993). With use of bootstrap aggregating or bagging (Breiman, 1996), one can generate many different data sets from a single original data set to fit different neural network models. These networks are then combined to form a single aggregated predictor.
The bootstrap method (Efron and Tibshirani, 1993) was developed to assess the accuracy of a prediction by generating different prediction models from different realizations of the training dataset. Dane et al. (1986) used bootstrapping to provide confidence intervals for the statistical distribution of soil $\rho_b$ and to determine the minimum sample size needed to estimate the mean with a specified degree of precision.
Bootstrapping assumes that the training data set is a representation of the population and that multiple realizations from the population can be simulated from this single dataset. This is done by repeated sampling with replacement from the original dataset, D, of size N to obtain B bootstrap data sets, each of size N. Therefore, each bootstrap data set contains different data. Since the neural net is trained for each realization, the bagging procedure produces B neural networks. Each bootstrap dataset $D_b$, b = 1, 2, ..., B, yields a prediction model, $\hat{\mathbf{y}}_b(\mathbf{x})$, where $\hat{\mathbf{y}}$ represents a vector of either predicted parameter (parameter-based) or $\theta$ and log(K) (performance-based) values. The bagging estimate is calculated from the mean of all B model predictions, or

$$\hat{\mathbf{y}}_{bag}(\mathbf{x}) = \frac{1}{B}\sum_{b=1}^{B}\hat{\mathbf{y}}_b(\mathbf{x}) \qquad [8]$$
Bagging is especially useful when analyzing highly variable data sets. The aggregated predictor averages the prediction across all bootstrap samples, thereby reducing the prediction variance. The prediction accuracy increases if the prediction method is unstable; that is, small changes in the training data of the bootstrap can result in large changes in the resulting predictor (Breiman, 1996).
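The resampling and averaging steps of Eq. [8] can be sketched as follows. To keep the example short, a least-squares slope through the origin stands in for the neural network; this stand-in model, the toy data, and the simple percentile interval are all assumptions made for illustration and are not the procedure used in the study.

```python
import random
import statistics

def bootstrap_sample(data, rng):
    """Draw N items with replacement from a dataset of size N."""
    return [rng.choice(data) for _ in data]

def fit_slope(data):
    """Stand-in model (not a neural network): slope a of y = a*x."""
    sxy = sum(x * y for x, y in data)
    sxx = sum(x * x for x, _ in data)
    return sxy / sxx

def bagged_prediction(data, x_new, n_boot=50, seed=0):
    """Mean of n_boot bootstrap predictions plus a rough 95% percentile interval."""
    rng = random.Random(seed)
    preds = sorted(fit_slope(bootstrap_sample(data, rng)) * x_new
                   for _ in range(n_boot))
    mean = statistics.fmean(preds)
    lo = preds[int(0.025 * (n_boot - 1))]
    hi = preds[int(0.975 * (n_boot - 1))]
    return mean, (lo, hi)

# Hypothetical (x, y) pairs, for illustration only.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
mean, (lo, hi) = bagged_prediction(data, x_new=2.0)
print(f"bagged estimate = {mean:.3f}, interval = ({lo:.3f}, {hi:.3f})")
```

Replacing `fit_slope` with a routine that trains a network on each bootstrap sample would give the aggregated predictor described above.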
The size B of the bagged or bootstrap aggregated predictor used was 50; that is, data were resampled 50 times, thus producing 50 neural networks. For each neural net, the van Genuchten parameters were predicted, and $\theta(h)$ and K(h) data pairs for each soil sample were calculated. The mean and 95% confidence interval of the predicted hydraulic parameters, water retention, and hydraulic conductivity data were computed from all B predicted data pairs, Eq. [8]. Finally, predicted soil hydraulic parameters were determined by fitting the Mualem-van Genuchten function (Eq. [1] and [2]) to the mean predicted water retention and hydraulic conductivity data. This algorithm was implemented in a program called Neuro Multistep, which is available upon request.
Performance Measure
The performance of the neural network was evaluated from values of the mean residual (MR) and root mean square of residuals (RMSR). The MR is a measure of prediction bias, with negative and positive MR-values indicating underprediction and overprediction, respectively. It is defined by

$$\mathrm{MR} = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right) \qquad [9]$$

where $\hat{y}$ and $y$ represent predicted and measured $\theta$ or log K values, respectively, at the seven different matric head values for each of the 310 soil samples (N = 7 × 310). The RMSR defines the expected magnitude of the prediction error, or

$$\mathrm{RMSR} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2} \qquad [10]$$
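Both performance measures are straightforward to compute. The sketch below assumes the residual is defined as predicted minus measured, consistent with negative MR indicating underprediction; the function names and the three data pairs are hypothetical.

```python
import math

def mean_residual(pred, obs):
    """Eq. [9]: mean of (predicted - measured); negative means underprediction."""
    return sum(p - o for p, o in zip(pred, obs)) / len(obs)

def rmsr(pred, obs):
    """Eq. [10]: root mean square of the residuals."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

# Hypothetical water content values, for illustration only.
pred = [0.30, 0.25, 0.20]
obs = [0.32, 0.25, 0.16]
print(f"MR = {mean_residual(pred, obs):.4f}, RMSR = {rmsr(pred, obs):.4f}")
```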
Our prediction results were compared with those obtained by the neural network program Rosetta of Schaap et al. (2001). Briefly, Rosetta estimates soil-water retention parameters $\theta_r$, $\theta_s$, $\alpha$, and n with a training data set of soils from the temperate and subtropical regions in Europe and the USA. Its unsaturated hydraulic conductivity parameters, $K_o$ and l, were predicted separately from the UNSODA database.
| RESULTS AND DISCUSSION |
|---|
par) and performance-based optimizations (per) for three different soil samples (silt loam from LTRAS, sand from Kearney, and sandy loam from Diener) of the training data set are highlighted in Table 4. While parameter-based optimization gives closer values of n and $K_o$ and similar RMSR values for water retention, the RMSR values for log(K) are generally much higher, as expected from using Eq. [10] as the performance criterion.
$\theta$ (retention data) and log(K) (unsaturated hydraulic conductivity). The RMSR for $\theta$ is approximately 4% moisture content, whereas the RMSR for log(K) is about one order of magnitude. Increasing the number of relevant input variables to include $\rho_b$ and $\theta_s$ improved the prediction, particularly of log(K).
The predicted soil-water content data are compared with their measured values in Fig. 3a. Whereas most $\theta$ data are concentrated near the 1:1 line across the whole water content range, some Kearney data (triangles) were underpredicted. The predicted unsaturated hydraulic conductivity values (Fig. 3b) matched their corresponding measured values except for a small number of low hydraulic conductivity values of the LTRAS samples (diamonds). As the results of Table 3 show, incorporating measured $K_s$ values as an additional input parameter improves the prediction of unsaturated hydraulic conductivity only slightly, while increasing the bias of both $\theta$ and log(K). This indicates that $K_s$ has little meaning for unsaturated K, which is better defined by $K_o$.
$\theta(h)$ and K(h) functions with 95% confidence intervals for an LTRAS sample with a sand and clay content of 21 and 18%, respectively, and $\rho_b$, $\theta_s$, and $K_s$ values of 1.5 g cm$^{-3}$, 0.41 cm$^3$ cm$^{-3}$, and 0.08 cm h$^{-1}$, respectively. For the second sample, a sand from Kearney in Fig. 4b, the sand and clay content, $\rho_b$, $\theta_s$, and $K_s$ values are 98 and 2%, 1.46 g cm$^{-3}$, 0.37 cm$^3$ cm$^{-3}$, and 60 cm h$^{-1}$, respectively. Finally, Fig. 4c presents the predicted curves and confidence intervals for a Diener sample with sand and clay content of 43 and 14%, and $\rho_b$ and $\theta_s$ values of 1.53 g cm$^{-3}$ and 0.4 cm$^3$ cm$^{-3}$, respectively. As expected, all predicted hydraulic data fall within the confidence bands. In this example, the prediction of the Kearney sand is quite uncertain, as reflected by the wide confidence interval and large RMSR values, which demonstrates the lower prediction capability for sand. Generally, all conductivity predictions show that the prediction uncertainty is larger as the unsaturated hydraulic conductivity decreases with decreasing matric potential values.
$\rho_b$ as input parameters. This value was reduced to 0.84 when K was predicted from the soil-water retention parameters $\theta_r$, $\theta_s$, $\alpha$
, and n. The average RMSR value of the study of Zhuang et al. (2001) was 1.24. The RMSR of log(K) of our training data set is of similar magnitude or lower than either reported study. In comparing the presented predictions with other studies, we must note that our study predicts the water retention and unsaturated conductivity simultaneously from the same input data, while most other training data sets apply neural networks to water retention and unsaturated hydraulic conductivity separately. To determine the influence of the training data set on the prediction, we applied the Rosetta neural networks of Schaap et al. (2001) to our training dataset. The RMSR values with Rosetta (Table 3) were about twice as large. The larger prediction error by Rosetta is a consequence of two factors. First, it demonstrates the value of the training data set, that is, our data set includes hydraulic data for Californian alluvial soils only, whereas the Rosetta training data set consists of a much wider range of soils across the globe. Second, we hypothesize that the higher prediction accuracy of Neuro Multistep is caused by the mere fact that all soil-water retention and unsaturated hydraulic conductivity data were determined by the multistep outflow method in the same laboratory. As also pointed out by Vereecken (2002), the evaluation of prediction methods for unsaturated hydraulic conductivity must consider the number and type of measurement methods that were used. Because of the limited number of soil types that were used in the training data set of Neuro Multistep, one must be careful in extrapolating our results to soils with larger values of clay content than included here.
To further investigate the usefulness of neural network predictions for specific textural groups, we separated the training data set into two main soil textural groups: sands (sand, loamy sand, and sandy loam) and loams (loam, silt loam, sandy clay loam, clay loam, and silty clay loam), each with about an equal number of soil samples. The distribution of RMSR for both textural groups as well as for all textures combined is presented in Fig. 5a (retention) and Fig. 5b (unsaturated conductivity). This is done for Neuro Multistep with all four input data sets of Table 3, as well as for Rosetta with percentages of sand, silt, and clay, and ρb. Each box plot presents the median (center line) and the 25th and 75th percentiles (bottom and top of the box) with the cross lines representing the median. Notably, the prediction error is larger for the sandy soil group than for the loamy group. We presume that the difference in prediction error is attributable to the high nonlinearity of the coarser soil group. However, opposite results were obtained with Rosetta, with lower RMSR values for sandy soils than for loamy soils. This may be because of the better representation of sandy soil materials in the training data set of Rosetta (Schaap et al., 2001).
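The grouping and box-plot statistics described above can be sketched as follows. The texture classes in each group are taken from the text. The function names, the choice of the "inclusive" quartile method, and the sample RMSR values are assumptions made for illustration only.

```python
import statistics

# Texture classes per group, as listed in the text.
SANDS = {"sand", "loamy sand", "sandy loam"}
LOAMS = {"loam", "silt loam", "sandy clay loam", "clay loam", "silty clay loam"}


def texture_group(texture_class):
    """Map a texture class to its group ('sands' or 'loams')."""
    name = texture_class.strip().lower()
    if name in SANDS:
        return "sands"
    if name in LOAMS:
        return "loams"
    raise ValueError(f"texture class not in the training range: {texture_class}")


def box_summary(values):
    """Return the 25th percentile, median, and 75th percentile of values."""
    q25, median, q75 = statistics.quantiles(values, n=4, method="inclusive")
    return {"q25": q25, "median": median, "q75": q75}


# Hypothetical per-sample RMSR values, for illustration only.
samples = [
    ("sand", 0.031), ("loamy sand", 0.044), ("sandy loam", 0.038),
    ("loam", 0.022), ("silt loam", 0.027), ("clay loam", 0.025),
]

by_group = {"sands": [], "loams": []}
for texture, error in samples:
    by_group[texture_group(texture)].append(error)

for group, errors in by_group.items():
    print(group, box_summary(errors))
```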
θr, θs, α, n, and log(K0), and water retention at a matric head value of 100 cm: θ100. The input and output parameters are normalized by their respective mean (µ) and standard deviation (σ) as follows:
![]() | [11] |
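A normalization by mean and standard deviation can be sketched as follows. This is a minimal illustration that assumes the usual standard-score form, (x − µ)/σ, and the sample standard deviation; both are assumptions, not confirmed details of the Neuro Multistep implementation. The function names and sample values are hypothetical.

```python
import statistics


def normalize(values):
    """Return standard scores plus the mean and standard deviation used."""
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)  # sample standard deviation (assumption)
    if sigma == 0:
        raise ValueError("standard deviation is zero; cannot normalize")
    return [(v - mu) / sigma for v in values], mu, sigma


def denormalize(scores, mu, sigma):
    """Map standard scores back to the original units."""
    return [z * sigma + mu for z in scores]


# Hypothetical sand contents (%), for illustration only.
sand = [10.0, 20.0, 30.0]
scores, mu, sigma = normalize(sand)
print(scores, mu, sigma)
print(denormalize(scores, mu, sigma))
```

With this form, a normalized value of 1 corresponds to an input one standard deviation above its mean, which makes sensitivities of inputs with different units comparable.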
and n, which is also because of log transformation in the prediction.
Sand content seems to influence most of the parameters. An increase in both sand and silt content leads to a decrease in the values of θr, θs, and α, and an increase in the values of n and K0. The influence of clay content on the prediction of water content at a matric head of 100 cm (θ100) appears to be relatively small compared with the other input variables, implying that a change in clay content will not affect that prediction. This may be the result of the low clay content of the soils used in the training data set. The value of θ100 is influenced by the combination of all input variables. Saturated water content, θs, appears to be another important input variable, having more influence on the parameter predictions than ρb. A change in the θs value affects all parameters. It should be noted that the plot only shows the sensitivity at specific values, whereas interactions between all input variables are expected. The use of neural networks enables incorporation of the nonlinearities and interactions between the input and output variables.
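The caveat about sensitivity at specific values can be illustrated with a one-at-a-time perturbation. This is a sketch under stated assumptions: `toy_model` is a hypothetical stand-in for a trained network, not the model of this paper, and the inputs are taken to be normalized so that 0 is the mean and a step of 1 is one standard deviation.

```python
def one_at_a_time(model, baseline, delta=1.0):
    """Central-difference sensitivity of model output to each input,
    varying one input at a time while the others stay at the baseline."""
    effects = {}
    for name in baseline:
        low = dict(baseline)
        high = dict(baseline)
        low[name] -= delta
        high[name] += delta
        effects[name] = (model(high) - model(low)) / (2.0 * delta)
    return effects


def toy_model(x):
    """Hypothetical response with a sand x theta_s interaction term."""
    return 0.5 * x["sand"] - 0.2 * x["clay"] + 0.1 * x["sand"] * x["theta_s"]


# All inputs at their mean (normalized value 0).
at_mean = {"sand": 0.0, "clay": 0.0, "theta_s": 0.0}
# Same soil, but sand content one standard deviation above the mean.
sandy = {"sand": 1.0, "clay": 0.0, "theta_s": 0.0}

print(one_at_a_time(toy_model, at_mean))
print(one_at_a_time(toy_model, sandy))
```

In this toy case the sensitivity to `theta_s` is zero at the mean baseline and nonzero at the sandy baseline, which is the kind of interaction a plot drawn at a single set of input values cannot show.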
| CONCLUSIONS |
|---|
|
|
|---|
With this unique dataset, we successfully developed PTFs that simultaneously predict water retention and hydraulic conductivity by neural network analysis. We note that the predictions in this paper can only be used for the range of soil textures that were included in the training data set (sands and loams). Moreover, the predicted hydraulic properties pertain to the experimental measurement range of soil-water matric heads between 0 and 600 cm only.
The neural networks model developed in this paper has not been validated on an independent data set. For now, we developed the model with all available data to maximize its predictive capabilities. Additional data for other soil types and geographic regions will have to be included in the training data set to provide a more generally applicable prediction. However, we doubt that a single PTF can be found that provides predictions for every soil and geographic region in the world that are as accurate as those presented here.
The neural networks analysis in this paper is implemented in a program called Neuro Multistep. The program can be obtained by contacting either Dr. Budiman Minasny (budiman{at}acss.usyd.edu.au) or Dr. Jan W. Hopmans (jwhopmans{at}ucdavis.edu), or can be downloaded from the University of Sydney website: http://www.usyd.edu.au/su/agric/acpa/software (verified 19 Nov. 2003).
Received for publication June 12, 2003.
| REFERENCES |
|---|
|
|
|---|