Published online 27 August 2007
Published in Soil Sci Soc Am J 71:1585-1592 (2007)
DOI: 10.2136/sssaj2006.0130
© 2007 Soil Science Society of America
677 S. Segoe Rd., Madison, WI 53711 USA
A correction to this article has been published.

SOIL PHYSICS

Optimum Allocation for Soil Contamination Investigations in Hsinchu, Taiwan, by Double Sampling

Ya-Chi Chang and Hund-Der Yeh*

National Chiao Tung Univ., 75 Po-Ai St., Hsinchu, 300 Taiwan

* Corresponding author (hdyeh{at}mail.nctu.edu.tw).


    ABSTRACT
The double sampling (DS) scheme is a cost-effective sampling method that combines an expensive measurement procedure with an inexpensive but less accurate one. Double sampling works when the true coefficient of determination ($\rho^2$) between the two techniques is known in advance, but that is hardly ever the case. By assuming $\rho^2 < 0.9$, a three-stage procedure (TSP) in DS selects a preliminary pool of samples and estimates $\rho^2$, from which the optimal allocation of samples is determined. There may be excessive and unnecessary sampling during a TSP when $\rho^2 > 0.9$. The main objective of this study was to extend the TSP and determine the optimum allocation of the samples under a fixed-budget condition for the case $\rho^2 > 0.9$. A soil from Hsinchu, Taiwan, contaminated with the heavy metals Zn, Cu, Pb, Ni, Cr, and Cd was sampled at depths of 0.15, 0.3, 0.45, and 0.6 m at 36 sites (144 samples). All samples were analyzed with field-portable x-ray fluorescence, and a subset of 40 samples was selected for analysis of Zn, Cu, and Pb with standard laboratory methods. Results from both measurements were linearly correlated, with estimated $\rho^2$ values of 0.96, 0.95, and 0.97 for Zn, Cu, and Pb, respectively. With an initial $\hat{\rho}^2 = 0.99$, the optimum subsample and sample sizes were 5 out of 167, 4 out of 173, and 5 out of 167 for Zn, Cu, and Pb, respectively. The extended TSP reduced the number of superfluous samples to only two or three, fewer than the 9 to 16 obtained with the TSP.

Abbreviations: DS, double sampling • FPXRF, field-portable x-ray fluorescence • HEPB, Hsinchu Environmental Protection Bureau • LAB, standard method in laboratory • NTD, Taiwan dollars • SDR, Studentized deleted residual • TSP, three-stage procedure


    INTRODUCTION
In a sampling program, two different techniques may be used to analyze samples. One is an expensive procedure that records accurate measurements of the variable of interest; the other is an inexpensive procedure that records an auxiliary variable and is subject to large measurement error or poor accuracy. In addition, the budget for site characterization is usually fixed, and the number of samples is crucial for obtaining accurate results under a fixed budget. In such a situation, the DS method is useful for estimating the mean and variance of a distribution. Double sampling is simple random sampling without replacement in which the variable of interest is available only for the subsample, and an auxiliary variable is available for the full sample (Cochran, 1977). The DS scheme has been studied for many years and is widely used in survey sampling (Bolfarine and Zacks, 1992). Buonaccorsi (1990) applied the DS scheme to multivariate measurement error problems and focused on inference for the regression parameters.

Tenenbein (1970) demonstrated that the gains of the DS scheme are quite substantial when the correlation coefficient and the relative cost of the two different measuring devices are high. The optimum sample size that minimizes the variance for a fixed budget in the DS scheme depends on the correlation coefficient ($\rho$) or the coefficient of determination ($\rho^2$), which, in many cases, is unknown before sampling. Thus, Tenenbein (1971) provided a TSP for determining the optimum sample size based on the DS scheme. In the first stage, $\rho^2$ is estimated from a preliminary sample of $m = 30$ accurate–inaccurate data pairs. An estimated sample size $n$ is determined and $\rho^2$ is recalculated in the second stage. The third stage uses this correlation to determine the optimum sample size $n$. Tenenbein (1974) further recommended a rule to select the value of $m$ for the TSP in a DS scheme for estimating means. Gilbert (1987) used a DS scheme and the TSP to determine the optimal allocation of the samples under fixed-budget and fixed-variance conditions in an experiment with nuclear weapon material in an isolated region of Nevada. Cochran (1977) compared DS with a regression estimate to simple random sampling with no regression adjustment and demonstrated that, under some conditions, the linear regression DS will yield a more precise estimate of the population mean than would be achieved by simple random sampling.

The problem with the TSP is that the preliminary sample pool it recommends assumes that the true $\rho^2$ is <0.9. Unnecessary sampling may occur when the true $\rho^2$ is >0.9. The objective of this study was to extend the TSP provided by Tenenbein (1974) to determine the optimum allocation of samples under a fixed-budget condition using the DS technique without the restriction that the true $\rho^2$ is <0.9. A set of field data of soil contamination in Hsinchu, Taiwan, was chosen to assess the performance of this extended TSP.


    THEORY
Outlier Detection
Studentized deleted residuals (SDR), also known as jackknife residuals, are used to assess the assumptions or goodness of fit of a model. Each residual is calculated from a model fitted to all observations except the one in question. When there are $n_p - 1$ predictor variables $X_1, \ldots, X_{n_p-1}$, the regression model $Y_i = \alpha_0 + \alpha_1 X_{i,1} + \alpha_2 X_{i,2} + \cdots + \alpha_{n_p-1} X_{i,n_p-1} + \varepsilon_i$ is termed a first-order model, with $\alpha_0, \ldots, \alpha_{n_p-1}$ being parameters and $\varepsilon_i$ being the error terms for $i = 1, \ldots, n$. The fitted values are expressed as $\hat{Y}_i = \hat{\alpha}_0 + \hat{\alpha}_1 X_{i,1} + \hat{\alpha}_2 X_{i,2} + \cdots + \hat{\alpha}_{n_p-1} X_{i,n_p-1}$, with $\hat{\alpha}_0, \ldots, \hat{\alpha}_{n_p-1}$ the least squares estimated regression coefficients, and the residuals are expressed as $e_i = Y_i - \hat{Y}_i$. Observations can be identified as outliers if their SDRs are large in absolute value. The SDR $t_i$ is defined as (Neter et al., 1996, p. 368–375)

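$$t_i = e_i\left[\frac{n - n_p - 1}{\mathrm{SSE}\,(1 - h_{ii}) - e_i^2}\right]^{1/2} \tag{1}$$
where SSE, the error sum of squares, is $\sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} e_i^2$ and $h_{ii}$, the $i$th diagonal element of the hat matrix, can be expressed as $h_{ii} = X_i'(X'X)^{-1}X_i$, with $X$ being an $n \times n_p$ matrix:

$$X = \begin{bmatrix} 1 & X_{1,1} & \cdots & X_{1,n_p-1} \\ 1 & X_{2,1} & \cdots & X_{2,n_p-1} \\ \vdots & \vdots & & \vdots \\ 1 & X_{n,1} & \cdots & X_{n,n_p-1} \end{bmatrix} \tag{2}$$
Because $n - 1$ cases are used to predict the $i$th observation, each SDR $t_i$ follows a $t$ distribution with $(n - 1) - n_p = n - n_p - 1$ degrees of freedom. In addition, one can conduct a formal test by means of a Bonferroni test procedure to determine whether the case with the largest absolute SDR is an outlier. The appropriate Bonferroni critical value is therefore $t(1 - \alpha/2n;\ n - n_p - 1)$ (Neter et al., 1996, p. 368–375), with $\alpha$ the significance level.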
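The SDR screening of Eq. [1] can be sketched in a few lines for the special case of one predictor ($n_p = 2$), where the leverage reduces to $h_{ii} = 1/n + (x_i - \bar{x})^2/\sum(x_j - \bar{x})^2$. This is a minimal illustration, not the Fortran code used in the study: the function names and the data are hypothetical, and the critical value is passed in by the caller (here $t(0.995; 7) = 3.499$ from a standard $t$ table, for $\alpha = 0.1$ and $n = 10$).

```python
import math

def studentized_deleted_residuals(x, y):
    """Return the SDR t_i of Eq. [1] for a simple linear regression (n_p = 2)."""
    n = len(x)
    n_p = 2  # intercept plus one slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(e * e for e in residuals)
    sdr = []
    for xi, e in zip(x, residuals):
        # Leverage h_ii for a single predictor
        h_ii = 1.0 / n + (xi - x_bar) ** 2 / sxx
        sdr.append(e * math.sqrt((n - n_p - 1) / (sse * (1.0 - h_ii) - e * e)))
    return sdr

def flag_outliers(x, y, critical_value):
    """Indices whose |SDR| exceeds the supplied Bonferroni critical value."""
    sdr = studentized_deleted_residuals(x, y)
    return [i for i, t in enumerate(sdr) if abs(t) > critical_value]

# Hypothetical data: y is close to 2x, except for the fifth point.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.1, 7.9, 15.1, 11.9, 14.1, 15.9, 18.1, 19.9]

# alpha = 0.1, n = 10: t(1 - 0.1/20; 10 - 2 - 1) = t(0.995; 7) = 3.499
print(flag_outliers(x, y, 3.499))
```

Only the fifth point (index 4) is flagged; the other residuals are inflated by the outlier but remain well below the critical value.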

Double Sampling
The auxiliary variate $x_i$, which is correlated with $y_i$, has been used in some applications of DS to make a regression estimate of the population mean $\bar{Y} = N^{-1}\sum^{N} y_i$ for a population of size $N$. In the first (large) sample of size $n'$, only $x_i$ is measured; in the second, a random subsample of size $n$, both $x_i$ and $y_i$ are measured. The linear regression estimate of the population mean $\bar{Y}$, denoted by $\bar{y}_{lr}$, is

$$\bar{y}_{lr} = \bar{y} + b(\bar{x}' - \bar{x}) \tag{3}$$
where $\bar{y} = n^{-1}\sum^{n} y_i$ is the mean of the variable of interest for the subsample, $\bar{x}' = n'^{-1}\sum^{n'} x_i$ and $\bar{x} = n^{-1}\sum^{n} x_i$ are the means of the $x_i$ sample and subsample, respectively, and $b = \sum^{n}(x_i - \bar{x})(y_i - \bar{y})/\sum^{n}(x_i - \bar{x})^2$ is the least squares regression coefficient of $y_i$ on $x_i$, computed from the subsample. If $b$ in $\bar{y}_{lr}$ is replaced by the finite population regression coefficient $B = S_{yx}/S_x^2$, with $S_x^2$ and $S_{yx}$ being the population variance of $x$ and the covariance of $x$ and $y$, then the error in the approximation is of the order $1/\sqrt{n}$ relative to $\bar{y}_{lr}$ (Cochran, 1977, Ch. 12). Therefore, one can examine the variance of the following approximation with the finite population regression coefficient $B$:

$$\bar{y}_{lr} \approx \bar{y} + B(\bar{x}' - \bar{x}) \tag{4}$$
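As a minimal sketch of the regression estimate in Eq. [3], the function below takes the fallible measurements for the full sample and the paired measurements for the subsample and returns $\bar{y}_{lr}$ together with the slope $b$. The function name and the numbers are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def ds_regression_estimate(x_full, x_sub, y_sub):
    """Linear regression double-sampling estimate of the population mean.

    x_full : fallible measurements on all n' units of the large sample
    x_sub  : fallible measurements on the n subsampled units
    y_sub  : accurate measurements on the same n subsampled units
    """
    n = len(x_sub)
    x_bar_full = sum(x_full) / len(x_full)   # x-bar prime
    x_bar = sum(x_sub) / n
    y_bar = sum(y_sub) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x_sub)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x_sub, y_sub))
    b = sxy / sxx                            # least squares slope from the subsample
    return y_bar + b * (x_bar_full - x_bar), b

# Hypothetical data: the subsample is the first four units of the large sample.
x_sub = [10.0, 20.0, 30.0, 40.0]
y_sub = [12.0, 22.0, 32.0, 42.0]
x_full = x_sub + [50.0, 60.0, 70.0, 80.0]

estimate, slope = ds_regression_estimate(x_full, x_sub, y_sub)
print(estimate, slope)
```

Here the subsample mean of $y$ is 27, but the subsample happens to under-represent large $x$ values ($\bar{x} = 25$ versus $\bar{x}' = 45$), so the estimate is adjusted upward by $b(\bar{x}' - \bar{x}) = 20$.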
Let $u_i = y_i - Bx_i$. In the second phase, the large sample is considered a finite population. The small sample is drawn at random from the large sample. The expected value and the variance of $\bar{y}_{lr}$ are, respectively,

$$E_2(\bar{y}_{lr}) = \bar{y}'$$
and

$$V_2(\bar{y}_{lr}) = \left(\frac{1}{n} - \frac{1}{n'}\right)s_u'^2 \tag{5}$$
where $\bar{y}' = n'^{-1}\sum^{n'} y_i$ is the mean of the variable of interest for the large sample and $s_u'^2$ is the variance of $u$ within the large sample and is an unbiased estimate of $S_y^2(1 - \rho^2)$. It follows that

$$V(\bar{y}_{lr}) = V_1(\bar{y}') + E_1\left[\left(\frac{1}{n} - \frac{1}{n'}\right)s_u'^2\right] = \left(\frac{1}{n'} - \frac{1}{N}\right)S_y^2 + \left(\frac{1}{n} - \frac{1}{n'}\right)S_y^2(1 - \rho^2) \tag{6}$$
where the subscript 1 in Eq. [6] and subscript 2 in Eq. [5] denote variations within the first and second phases of sampling, respectively. Thus, from design-based considerations, the variance of the linear regression estimator $\bar{y}_{lr}$ of the population mean is approximately (Cochran, 1977)

$$V(\bar{y}_{lr}) \approx \frac{S_y^2(1 - \rho^2)}{n} + \frac{\rho^2 S_y^2}{n'} - \frac{S_y^2}{N} \tag{7}$$
where $S_y^2 = \sum^{N}(y_i - \bar{Y})^2/(N - 1)$ is the population variance of $y$, and $\rho^2 = S_{xy}^2/(S_x^2 S_y^2)$ is the coefficient of determination, with $\rho$ the correlation coefficient. To find the optimal sample size that is economical and satisfies the minimum acceptable accuracy, based on the theory of DS, the Lagrange multiplier is used to obtain the general formula for $n$ and $n'$. Suppose the total cost for survey sampling can be expressed as

$$C = c_A n + c_F n' \tag{8}$$
where $C$ is the total amount of money available for measuring collected samples, and $c_A$ and $c_F$ are the costs per unit of making a measurement with an accurate (laboratory) technique and a fallible (field) technique, respectively. Note that $C$ does not include the cost of collecting samples. The variance shown in Eq. [7] is minimized subject to the cost constraint when

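$$n = \frac{C f_0}{c_A f_0 + c_F} \tag{9}$$
and

$$n' = \frac{C - c_A n}{c_F} \tag{10}$$
where

$$f_0 = \left[\frac{c_F(1 - \rho^2)}{c_A \rho^2}\right]^{1/2} \tag{11}$$
and $f_0$ is set equal to 1 if Eq. [11] gives $f_0 > 1$. For a specified cost $C$, if both Eq. [9] and [10] hold, the variance is minimized as follows: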

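$$V_{\min}(\bar{y}_{lr}) = \frac{S_y^2\left[\sqrt{c_A(1 - \rho^2)} + \sqrt{c_F \rho^2}\right]^2}{C} - \frac{S_y^2}{N} \tag{12}$$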
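The fixed-budget allocation of Eq. [9] to [11] can be sketched as follows. The function name is hypothetical, and the rounding convention (round $n$ up, then spend what remains of the budget on fallible measurements, rounded down) is an assumption of this sketch rather than something stated in the text; other conventions can shift $n'$ by one unit. The budget and unit costs are those reported for this study, while $\rho^2 = 0.9$ is only an illustrative value.

```python
import math

def optimum_allocation(rho2, budget, cost_accurate, cost_fallible):
    """Return (f0, n, n_prime) for the fixed-budget case, Eq. [9]-[11]."""
    # Eq. [11], with f0 capped at 1
    f0 = min(1.0, math.sqrt(cost_fallible * (1.0 - rho2) / (cost_accurate * rho2)))
    # Eq. [9]; rounding up is an assumption of this sketch
    n = math.ceil(budget * f0 / (cost_accurate * f0 + cost_fallible))
    # Eq. [10]; rounding down keeps the total within the budget
    n_prime = math.floor((budget - cost_accurate * n) / cost_fallible)
    return f0, n, n_prime

# Budget and unit costs (NTD) as reported for this study; rho2 is illustrative.
f0, n, n_prime = optimum_allocation(0.9, 170_000, 5625, 850)
print(round(f0, 4), n, n_prime)
print(5625 * n + 850 * n_prime)  # total cost of this allocation
```

With these inputs the sketch gives $n = 14$ accurate and $n' = 107$ fallible measurements, for a total cost of 169,700 NTD, just under the 170,000 NTD budget.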

If all resources are devoted instead to a single sample with no regression adjustment, this sample has size $C/c_A$ and the variance of its mean is

$$V(\bar{y}) = \frac{c_A S_y^2}{C} - \frac{S_y^2}{N} \tag{13}$$
Hence, optimum use of DS gives a smaller variance if

$$\rho^2 > \frac{4R}{(1 + R)^2}, \qquad R = \frac{c_A}{c_F} \tag{14}$$
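The break-even condition of Eq. [14] is easy to check numerically. The sketch below assumes the standard form $\rho^2 > 4R/(1+R)^2$ with $R = c_A/c_F$ and, ignoring the finite population term, also reports the ratio of the optimum DS variance to the simple random sampling variance, $[\sqrt{1-\rho^2} + \sqrt{\rho^2/R}]^2$. The function name is hypothetical; the unit costs are those reported for this study.

```python
import math

def ds_versus_srs(rho2, cost_accurate, cost_fallible):
    """Return (ds_is_better, threshold, variance_ratio).

    threshold      : right-hand side of Eq. [14]
    variance_ratio : optimum DS variance / SRS variance, finite population
                     correction ignored (an assumption of this sketch)
    """
    R = cost_accurate / cost_fallible
    threshold = 4.0 * R / (1.0 + R) ** 2
    variance_ratio = (math.sqrt(1.0 - rho2) + math.sqrt(rho2 / R)) ** 2
    return rho2 > threshold, threshold, variance_ratio

# Unit costs of 5625 NTD (laboratory) and 850 NTD (field), as in this study.
for rho2 in (0.30, 0.96):
    better, threshold, ratio = ds_versus_srs(rho2, 5625, 850)
    print(rho2, better, round(threshold, 3), round(ratio, 3))
```

For this cost ratio the threshold is about 0.456, so a weak relationship ($\rho^2 = 0.30$) makes DS less precise than spending the whole budget on laboratory analyses, whereas $\rho^2 = 0.96$ reduces the variance to roughly one-third.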

If the relationship between the values from the laboratory and field techniques is linear and the coefficient of determination satisfies Eq. [14], then linear regression DS will, on average, yield a more precise estimate of the population mean than would be achieved by the accurate method on $C/c_A$ units selected by simple random sampling (Gilbert, 1987). Tenenbein (1974) found that $V^{1/2}$ will not exceed its minimal value by more than about 15% if Eq. [9] and [10] are replaced by

Formula 15[15]
and

Formula 16[16]
when the true $\rho^2$ is <0.9 for the fixed-budget case. As noted above, since the true $\rho^2$ is never known in practice, it is necessary to conduct pilot studies or to use suitable data from prior studies to estimate $\rho^2$ and evaluate the linear regression hypothesis.

Three-Stage Double Sampling
If a pilot study is considered, the TSP recommended by Tenenbein (1974) may be used to estimate $\rho^2$ and then $n$ and $n'$. For the fixed-budget case, the first stage consists of randomly selecting $m$ units from the population and making both accurate and fallible measurements on each unit to estimate $\rho^2$. Using the definition of Tenenbein (1970), suppose a random sample of $n'$ units is taken from the population, a subsample of $n$ units is drawn from the main sample, and each of these $n$ units is classified by both measuring techniques. The remaining $n' - n$ units are then classified only by the fallible classifier. Let $n_{tf}$ denote the number of units whose true classification is $t$ and whose fallible classification is $f$ ($t, f = 0, 1$). Also let $n_x$ and $n_y$ denote the number of units whose fallible classifications are 1 and 0, respectively. The resulting data can be represented schematically as shown in Fig. 1. For each unit in the sample, the random variable $T_i$ ($i = 1, 2, \ldots, n$) is defined as

Formula 17[17]
Let $p = \Pr(T_i = 1)$ and $q = 1 - p$. The maximum likelihood estimate of $p$ and its asymptotic variance ($V$) can be expressed as (Tenenbein, 1971)

Formula 18[18]
and

Formula 19[19]


Fig. 1. Diagram for the classified units (Tenenbein, 1971), in which $n'$ is a random sample taken from the population and $n$ is the subsample drawn from $n'$. The classified units are: $n_{tf}$, the number of units whose true classification is $t$ and fallible classification is $f$ ($t, f = 0, 1$); and $n_x$ and $n_y$, the number of units whose fallible classifications are 1 and 0, respectively.

 
In the fixed-cost problem, the variance of the estimate is dependent on $\rho^2$, and an error in estimating $\rho^2$ will result in an increased variance of the estimate. If one uses the estimated $\hat{\rho}^2$ rather than the true $\rho^2$ to determine the sample sizes $n$ and $n'$, then the percentage increase in the variance of the estimate can be shown to be $100\lambda_v$ (Tenenbein, 1971), where

Formula 20[20]
where $\hat{f}_0$ is the same expression as $f_0$ in Eq. [11], but with $\rho^2$ replaced by $\hat{\rho}^2$. Table 1 shows the values of $100\lambda_v$ for various values of $R$ (the cost ratio $c_A/c_F$), $\rho^2$, and $\hat{\rho}^2$. The variable $r$ is the deviation from the true coefficient of determination $\rho^2$, that is, $\hat{\rho}^2 = \rho^2 \pm r$, where $r$ = 0.05, 0.10, 0.20, and 0.25. The variance increases with the deviation from the true coefficient. The increase in variance is slight, however; even when $|\hat{\rho}^2 - \rho^2| = 0.20$, the increase is <12% except when $\hat{\rho}^2 > 0.9$. Tenenbein (1974) found that a sample size of 21 satisfies the following equation for all values of $\rho^2$ and $R$:

Formula 21[21]
The preliminary sample size m can be decided by the following rule for the fixed-budget case:

Formula 22[22]
If Eq. [14] is satisfied, then $\hat{n}$ and $\hat{n}'$ can be obtained by Eq. [9] and [10], respectively. If $m < \hat{n}$, then $\hat{n} - m$ additional units should be randomly selected for measurement by both methods. This selection is the second stage of the sampling plan. At this point, $\rho^2$ can be estimated by using all $\hat{n}$ data, and Eq. [14] can be rechecked before proceeding. The third stage consists of using $\hat{n}$ for $n$ in Eq. [10] to re-estimate $n'$. Once the data are in hand, $V(\bar{y}_{lr})$ is computed by Eq. [7]. The flowchart of the TSP recommended by Tenenbein (1974) is shown in Fig. 2.
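The bookkeeping of the second and third stages can be sketched as below. This is an illustration only: the function names are hypothetical, the rounding convention (round $n$ up, $n'$ down) is an assumption, and the re-estimation of $\rho^2$ from the enlarged data set in the second stage is left to the caller because it requires the actual measurements. The budget and unit costs are those reported for this study; the stage-one estimates of $\rho^2$ are made-up values.

```python
import math

def optimum_sizes(rho2, budget, c_acc, c_fal):
    """Eq. [9]-[11]: return (n, n_prime) for a fixed budget."""
    f0 = min(1.0, math.sqrt(c_fal * (1.0 - rho2) / (c_acc * rho2)))
    n = math.ceil(budget * f0 / (c_acc * f0 + c_fal))        # rounding up: assumption
    n_prime = math.floor((budget - c_acc * n) / c_fal)       # stays within budget
    return n, n_prime

def three_stage_plan(m, rho2_hat, budget, c_acc, c_fal):
    """Given m pilot pairs and the stage-one estimate rho2_hat, report what
    the second and third stages call for."""
    n_hat, n_prime_hat = optimum_sizes(rho2_hat, budget, c_acc, c_fal)
    return {
        "n_hat": n_hat,
        # Stage 2: extra units to measure by both methods when m < n_hat
        "additional_pairs": max(0, n_hat - m),
        # Pilot pairs already measured beyond the optimum when m > n_hat
        "superfluous_pairs": max(0, m - n_hat),
        # Stage 3: size of the large (fallible-only) sample
        "n_prime_hat": n_prime_hat,
    }

print(three_stage_plan(21, 0.5, 170_000, 5625, 850))
print(three_stage_plan(21, 0.9, 170_000, 5625, 850))
```

With a weak relationship ($\hat{\rho}^2 = 0.5$) the pilot of 21 pairs is one short of $\hat{n} = 22$; with $\hat{\rho}^2 = 0.9$ the optimum is $\hat{n} = 14$, so seven of the 21 pilot pairs were not needed. The second case is the situation the extended procedure is meant to avoid.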


Table 1. Increase (Eq. [20]) in the variance resulting from sampling errors in estimating the coefficient of determination ($\rho^2$) with different ratios ($R = c_A/c_F$) of the cost per unit of making a measurement with an accurate technique ($c_A$) to the cost per unit of making a measurement with a fallible technique ($c_F$).

 

Fig. 2. Flowchart of the three-stage procedure, where $m$ is the preliminary sample size; $x_i$ is the auxiliary variate; $y_i$ is the variate of interest; $\rho$ is the correlation coefficient between $y_i$ and $x_i$; $\hat{n}$ and $\hat{n}'$ are the estimated sample sizes of subsample $n$ and main sample $n'$, respectively; $\bar{y}_{lr}$ is the linear regression estimate of the population mean; and $V(\bar{y}_{lr})$ is the variance of $\bar{y}_{lr}$.

 
Extended Three-Stage Double Sampling
The preliminary sample size $m$ recommended by Tenenbein (1974) is based on the assumption that $\rho^2 \le 0.9$. The increase in the variance could be very high when $\rho^2 > 0.9$ (Table 1). For example, the increase in the variance is 98.7% when $R = 50$ and the estimate $\hat{\rho}^2 = 0.74$ is used rather than the true $\rho^2 = 0.99$. The sampling plan also risks exceeding the budget when a sample size of $m = 21$ is used. Hence, the TSP is not adequate in such a situation. Table 2 indicates the increase in the variance if the initial $\hat{\rho}^2$ is 0.99. For every $R$, the increase in the variance becomes smaller as $\rho^2$ approaches $\hat{\rho}^2$, and it is <40% if the initial $\hat{\rho}^2$ is 0.99 (Table 2). We suggest using $\hat{\rho}^2 = 0.99$ to calculate the estimated value of $n$ when $\rho^2 > 0.9$. The $\rho^2$ can then be estimated from $n$ pairs of true–fallible measurement data. Once $\rho^2$ has been estimated, the optimum sample sizes $n$ and $n'$ can be determined.


Table 2. Increase (Eq. [20]) in the variance if the initial estimated coefficient of determination ($\hat{\rho}^2$) is 0.99 with different ratios ($R = c_A/c_F$) of the cost per unit of making a measurement with an accurate technique ($c_A$) to the cost per unit of making a measurement with a fallible technique ($c_F$).

 
A flowchart for the proposed approach with outlier detection and optimal allocation for soil sampling is illustrated in Fig. 3 . A Fortran code was developed and used to analyze the data.


Fig. 3. Flowchart for the proposed approach with outlier detection and optimal allocation for soil sampling, where $t_i$ is the Studentized deleted residual defined in Eq. [1]; $t(1 - \alpha/2n;\ n - n_p - 1)$ is the Bonferroni critical value; $\rho^2$ is the coefficient of determination; $\alpha$ is the significance level; $n_p - 1$ is the number of predictor variables; and $n$ is the number of data.

 

    MATERIALS AND METHODS
A soil contamination site with an area of $2.5 \times 10^5$ m$^2$ located west of Hsinchu is shown as a shaded area in Fig. 4. The Hsiang-Shan industrial park is one of the two major industrial parks in Hsinchu, and it previously supported some industrial activities such as metal processing, electroplating, the glass industry, and the chemical industry. It was suspected that the wastewater discharged from the Hsiang-Shan industrial park into local irrigation canals might be the source of contamination of a nearby agricultural field. Thus, in 1993, the Hsinchu Environmental Protection Bureau (HEPB) initiated an investigation project on soil contamination by the metals Zn, Cu, Pb, Ni, Cd, and Cr. A report published in 2000 by the HEPB revealed that there were 34 zones where the concentrations of heavy metals in the soil exceeded the soil pollution control standards regulated by the Taiwan Environmental Protection Agency (TEPA) (Fig. 4). Zones 2M and 2R were selected for detailed investigation and possible future treatment.


Fig. 4. Location of sampling sites in the Hsiang-Shan industrial park.

 
The area to be sampled (2M and 2R) was divided into 12 units. At each unit, three separate sampling points were chosen and samples were taken at depths of 0.15, 0.3, 0.45, and 0.6 m, giving 144 samples in total. Samples were analyzed with a field-portable x-ray fluorescence (FPXRF) analyzer, the NITON XL-722S (Thermo Electron Corporation, 2006). FPXRF is a site-screening procedure that uses a small, hand-held portable instrument and provides rapid turnaround (about 2 min/sample) and low-cost in situ analysis of inorganic contaminants. The concentrations of Ni, Cd, and Cr in all collected samples were too low to be detected by FPXRF. Of the 144 samples analyzed, however, 132, 40, and 116 samples were contaminated by Zn, Cu, and Pb, respectively. Forty samples with high concentrations of all three metals were selected from the total number of contaminated samples and submitted to the laboratory for analysis (LAB) following the standard methods required by TEPA. The LAB analyses have a greater accuracy than FPXRF, but are more expensive and take more time to complete. This procedure resulted in 40 pairs of LAB–FPXRF data for each metal.

In the LAB method, soil samples were digested using the National Institute of Environmental Analysis Method 321.60T. An approximately 3-g (dry weight) sample was weighed in the reactor. Concentrated HCl (21 mL) and 7 mL of concentrated HNO₃ were added and allowed to stand for 16 h at room temperature. The samples were then heated in a water bath to boiling for 2 h. After the extracts were quantified, the concentrations of these metals were analyzed by flame atomic absorption spectrometry.


    RESULTS AND DISCUSSION
The 40 pairs of LAB–FPXRF data showed a linear relationship, but for each metal one of the points clearly deviated from linearity (Fig. 5). The solid lines show the regressions between the two techniques fitted to all data points. The method of SDRs was used to test for outliers (Neter et al., 1996). The Bonferroni simultaneous test procedure was applied with a family significance level of $\alpha = 0.1$, for which the critical value $t(0.99875; 37)$ is 3.24. The points shown as outliers in Fig. 5 had absolute SDR values >3.24. The $\rho^2$ values recalculated without the outliers were higher than those obtained from the original data (Fig. 5).
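The arguments of the critical value quoted above follow from the Bonferroni expression $t(1 - \alpha/2n;\ n - n_p - 1)$ given in the Theory section. As a worked check, assuming a simple linear regression with a single predictor (so that $n_p = 2$) fitted to the $n = 40$ data pairs:

```latex
\begin{aligned}
1 - \frac{\alpha}{2n} &= 1 - \frac{0.1}{2(40)} = 1 - 0.00125 = 0.99875, \\
n - n_p - 1 &= 40 - 2 - 1 = 37 .
\end{aligned}
```

The critical value is therefore $t(0.99875; 37)$, which the text reports as 3.24.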


Fig. 5. Plots of the 40 data points measured with both standard laboratory methods (LAB) and field-portable x-ray fluorescence (FPXRF) techniques for (a) Zn, (b) Cu, and (c) Pb.

 
Estimation of the Sample Size by the Three-Stage Procedure
The DS technique may be applicable to this site for two reasons: (i) the measurement data of FPXRF and LAB have a linear relationship and the $\rho^2$ satisfies Eq. [14], and (ii) the measurement data obtained by FPXRF are substantially less expensive than those obtained by LAB. In practice, the true $\rho^2$ between LAB and FPXRF is seldom known for each heavy metal before taking the samples. The TSP recommended by Tenenbein (1974) was first implemented for this experiment. According to the HEPB report, the total budget for sampling and analysis was 170,000 Taiwan dollars (NTD), and the costs of each measurement using standard methods in LAB and FPXRF were 5625 and 850 NTD, respectively. Using the foregoing procedure for the fixed-budget case, we obtained with Eq. [15]

Formula 23[23]
Using the rule given by Eq. [22], we concluded that $m = 21$ samples should be collected and analyzed by both LAB and FPXRF methods in the first stage for these three heavy metals. Table 3 shows the statistics and Fig. 6 illustrates the regressions (solid lines) for 21 randomly selected LAB and FPXRF sample data for these three metals. Once $\hat{\rho}^2$ was calculated in the first stage, Eq. [9–11] were used in the second stage to obtain $f_0$, $n$, $n'$, and $\hat{\rho}^2$ (Table 4).


Table 3. Concentrations of Zn, Cu, and Pb determined by standard laboratory methods (LAB) and field-portable x-ray fluorescence (FPXRF) techniques and their statistics† for 21 soil samples.

 

Fig. 6. Plots of 21 and 7 randomly selected data points measured with both standard laboratory methods (LAB) and field-portable x-ray fluorescence (FPXRF) techniques for (a) Zn, (b) Cu, and (c) Pb.

 

Table 4. Optimum sample sizes and their statistics inferred from 21 pairs of measurements of Zn, Cu, and Pb using the three-stage procedure.

 
The optimum sample sizes for LAB and FPXRF for the contaminant Zn are 9 and 140, respectively. In fact, 40 samples for LAB and 132 samples for FPXRF of Zn had been analyzed in the HEPB project. The total cost was 5625(40) + 850(132) = 337,200 NTD, which substantially exceeds the budget of 170,000 NTD. The extra cost relative to the optimum allocation of $n = 9$ and $n' = 140$ is 337,200 – 169,625 = 167,575 NTD. As indicated above, $n = m = 21$ samples of both LAB and FPXRF had been taken for estimating the coefficient of determination $\hat{\rho}^2$ in the first stage. Thus, with the TSP, the actual sample sizes of LAB and FPXRF were $n = 21$ and $n' = 140$, and the total cost was 5625(21) + 850(140) = 237,125 NTD. This implies that 21 – 9 = 12 of the LAB samples collected were superfluous for Zn. The costs for collecting the samples for Cu and Pb also exceeded the budget, with 9 and 16 superfluous LAB samples collected for Cu and Pb, respectively.
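The cost arithmetic in this paragraph can be checked with the linear cost model $C = c_A n + c_F n'$. The short script below uses only the unit costs, budget, and sample sizes quoted in the text; the variable and function names are hypothetical.

```python
COST_LAB = 5625      # NTD per laboratory (accurate) measurement
COST_FPXRF = 850     # NTD per FPXRF (fallible) measurement
BUDGET = 170_000     # NTD

def total_cost(n_lab, n_fpxrf):
    """Linear cost model: laboratory cost plus field cost."""
    return COST_LAB * n_lab + COST_FPXRF * n_fpxrf

hepb_cost = total_cost(40, 132)     # samples actually analyzed for Zn
optimum_cost = total_cost(9, 140)   # optimum allocation for Zn
tsp_cost = total_cost(21, 140)      # TSP with m = 21 pilot pairs

print(hepb_cost, optimum_cost, tsp_cost)
print(hepb_cost - optimum_cost)     # extra cost relative to the optimum
print(tsp_cost - BUDGET)            # amount by which the TSP exceeds the budget
print(21 - 9)                       # pilot LAB samples beyond the optimum n
```

The optimum allocation costs 169,625 NTD and fits the budget, whereas the TSP allocation with 21 pilot pairs costs 237,125 NTD, which is 67,125 NTD over.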

These results show that unnecessary samples were taken when the TSP was used with a high coefficient of determination. The preliminary sample size $m$ recommended by Tenenbein (1974) is based on the assumption that $\rho^2 \le 0.9$. As shown in Table 1, the increase in the variance could be very high when $\rho^2 > 0.9$. The TSP is not adequate in the case of the Hsiang-Shan industrial park. We suggest using an estimated $\hat{\rho}^2$ of 0.99 to calculate the estimated value of $n$ when $\rho^2 > 0.9$.

Estimation of the Sample Size by the Extended Three-Stage Procedure
The sample size is $n = 7$ if $\hat{\rho}^2 = 0.99$ is used to determine the optimum sample size in the second stage for Zn, Cu, and Pb. The corresponding regression lines are shown as dotted lines in Fig. 6. The coefficients of determination obtained from the seven pairs of LAB–FPXRF measurements were 0.996, 0.998, and 0.983 for Zn, Cu, and Pb, respectively (Table 5). With $\hat{\rho}^2$ thus estimated, the optimum $n$ for Zn, Cu, and Pb were 5, 4, and 5, respectively, and the optimum $n'$ were 167, 173, and 167, respectively. This shows that only two to three superfluous LAB samples were collected for these three metals using the extended TSP.
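The laboratory sample sizes quoted above can be approximately reproduced from Eq. [9] and [11] with the budget and unit costs given earlier. The sketch below is an assumption-laden check rather than the authors' code: the function name is hypothetical, $n$ is rounded up, and only Zn and Cu are shown because the rounded $\rho^2$ quoted for Pb does not reproduce the reported allocation under these assumptions (the underlying Table 5 is not available here).

```python
import math

BUDGET, COST_LAB, COST_FPXRF = 170_000, 5625, 850  # NTD, as reported in the text

def lab_sample_size(rho2):
    """Eq. [9] with f0 from Eq. [11]; n is rounded up (assumption of this sketch)."""
    f0 = min(1.0, math.sqrt(COST_FPXRF * (1.0 - rho2) / (COST_LAB * rho2)))
    return math.ceil(BUDGET * f0 / (COST_LAB * f0 + COST_FPXRF))

pilot = lab_sample_size(0.99)          # initial guess suggested for rho^2 > 0.9
print("pilot pairs:", pilot)

results = {}
for metal, rho2 in {"Zn": 0.996, "Cu": 0.998}.items():
    n_opt = lab_sample_size(rho2)      # re-estimated from the pilot pairs
    results[metal] = (n_opt, pilot - n_opt)
    print(metal, "optimum n:", n_opt, "superfluous:", pilot - n_opt)
```

Under these assumptions the pilot requires seven pairs, the re-estimated optimum is five for Zn and four for Cu, and the superfluous LAB samples number two and three, respectively.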


Table 5. Optimum sample sizes and their statistics inferred from seven pairs of measurements of Zn, Cu, and Pb using the extended three-stage procedure.

 

    CONCLUSIONS
A sampling work plan is very important for remediation of environmental pollution. Site characterization is usually restricted by the budget and the accuracy of the data measurement. To avoid unnecessary expense, DS is an excellent method if there are two or more contrasting techniques for analyzing sample units. Data from the study area indicate that DS is useful for the HEPB site for two reasons: (i) the data from the LAB and FPXRF measurements exhibited a linear relationship with a high correlation; and (ii) the FPXRF analysis was substantially less expensive than the LAB analyses.

This study assessed a complete DS program that was slightly different from the TSP provided by Tenenbein (1974) for conditions of a fixed budget. The optimum allocation of the samples was determined under fixed-budget conditions based on the theory of DS. Since field data are time consuming and costly to obtain, this study provided a useful and valuable case study for designing a site characterization work plan to monitor environmental pollution, considering both cost and accuracy. In addition, the results from this study may be more generally useful in determining the optimum allocation of samples for reducing the cost of sampling.


    ACKNOWLEDGMENTS
 
This study was partly supported by the Taiwan National Science Council under Grant NSC 94–2211–E–009–012.


    NOTES
All rights reserved. No part of this periodical may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Permission for printing and for reprinting the material contained herein has been obtained by the publisher.

Received for publication March 27, 2006.

