Published in Soil Sci. Soc. Am. J. 67:1583-1593 (2003).
© 2003 Soil Science Society of America
677 S. Segoe Rd., Madison, WI 53711 USA

SYMPOSIUM

Detecting Change in Forest Floor Carbon

Ruth D. Yanai (a,*), Stephen V. Stehman (a), Mary A. Arthur (b), Cindy E. Prescott (c), Andrew J. Friedland (d), Thomas G. Siccama (e), and Dan Binkley (f)

a SUNY College of Environmental Science and Forestry, Syracuse, NY 13210
b Univ. of Kentucky, Lexington, KY 40546
c University of British Columbia, Vancouver, BC V6T 1Z4
d Dartmouth College, Hanover, NH 03755
e Yale Univ., New Haven, CT 06511
f Colorado State Univ., Fort Collins, CO 80523

* Corresponding author (rdyanai{at}mailbox.syr.edu).


    ABSTRACT
Changes over time in forest soils are important to global C balance and to local ecosystem function. Detecting change in C storage in the forest floor is hampered by high variability and the use of study designs that are not adequate to statistically detect change. Using estimates of variability from previous forest floor studies, mostly conducted in the northern USA and Canada, we conducted statistical power analyses to assess the ability of such studies to detect various magnitudes of change in forest floor C. The studies we surveyed were unable to detect statistically significant changes in forest floor C or mass smaller than 15 to 20%. Studies that remeasure plots or sites (i.e., paired designs) have greater statistical power to detect changes than those in which experimental units are independently located for the two sampling dates. The causal mechanisms of forest floor change influence the magnitude of the change, and accordingly our ability to detect such changes. The direct effects of climate change may be too small to be detectable by current designs, but larger changes in forest floor mass resulting from forest management, changes in tree species, changes in fire regime, or the introduction of earthworms are more likely to be detectable. With paired resampling and more efficient allocation of sampling effort, it should be possible for future studies to detect smaller changes.


    INTRODUCTION
THE FOREST FLOOR consists of litter (leaf, root, and fine woody material) and partially decomposed organic matter that accumulates above the mineral soil in many forested ecosystems. Some forest floors also contain substantial amounts of mineral particles that are mixed from below by animals or other agents. The wide diversity of structures, masses, and compositions of forest floors suggested that they might hold the key to understanding major features of forests, such as productivity and sustainability. Early research focused on how forest floors varied across landscapes in response to climate factors, plant species composition, and management practices.

In recent decades, interest has developed in understanding and quantifying rates of change in these horizons, not just the causes of variation from place to place. This interest has been spurred by the need to estimate rates of change in ecosystem C content, or net ecosystem production (Roy et al., 2001). For example, Richter and Markewitz (2001) documented that aggradation of the forest floor (about 1 Mg ha⁻¹ yr⁻¹) accounted for about 20% of the net ecosystem production of a loblolly pine (Pinus taeda L.) plantation in South Carolina. Assessing the rate of change in soil C is also fundamental for quantifying belowground production in forests using mass balance approaches. Belowground allocation of C can be calculated from measured inputs, outputs, and net change in storage in the soil (Raich and Nadelhoffer, 1989; Giardina and Ryan, 2002).

Detecting change in forest floors has proven to be very challenging. Some of the difficulty arises from spatial variation in forest floors within stands, which can be considerable because of tip-up pits and mounds, local variation in topography, and inputs of coarse woody debris that vary greatly over time and space. Detecting change in the forest floor over time is also limited by experimental designs that were not optimized for this purpose.

The objectives of this report were to synthesize information on forest floor masses and variability across a variety of forest types in North America, to assess the magnitude of change in forest floor C content that can be detected using various experimental designs, and to estimate the effort required to detect a specific change in forest floor C. Finally, we considered the ecological factors that cause changes in the forest floor, and the likelihood of detecting changes of this magnitude.


    STATISTICAL BACKGROUND
Detecting Change in Forest Floors
When monitoring change in forest floor C storage, we commonly measure the same system at two points in time and test whether the two populations are statistically different for the two dates. In the case of forest floor measurements, we often discover that they are not. When this happens, it is important to understand the performance characteristics of the study design and accompanying statistical tests. If the study design is inadequate, then reporting a failure to detect a statistical difference tells us little; the mass may have changed substantially, but the design was unable to detect it. On the other hand, if a well-designed study fails to detect change statistically, we are relatively confident that any real change that might have occurred was small.

Power Analysis
Power is a key performance characteristic of the statistical tests used to evaluate change over time. Power is defined as the probability of correctly rejecting a null hypothesis of no change when in reality a change has taken place (i.e., the probability of detecting a change, such as an increase or decrease in soil C, as statistically significant, if such a change has truly taken place). In designing studies to detect changes in the forest floor, power analysis allows us to choose an appropriate sampling scheme, and to judge whether the study is even worthwhile to conduct given the magnitude of change expected and the sampling resources available.

Power is determined by several factors, including characteristics of the design protocol such as the type of experimental unit (e.g., stand or plot), the experimental design (i.e., paired versus independent samples for the two sampling dates), and the number of samples; the significance level chosen for the test and whether the alternative hypothesis is one-sided (for a directional change) or two-sided (when the direction is not specified a priori); the variability of the attribute measured; and the size of the effect or magnitude of change to be detected. We will discuss the role of each of these factors in the power analyses.

Experimental Unit
When evaluating change in forest floor C, the experimental unit may be a stand, a plot within a stand (from which multiple samples are taken), or a point location. Evaluating change then involves statistical inference to extrapolate or generalize from a sample of experimental units (i.e., replicates) to a population of experimental units. Thus a study design in which the experimental unit is the stand permits inference to a population of stands, and a design in which the experimental unit is a plot within a stand permits inference only to that stand. More general inferences applicable to a larger spatial scale are clearly derived from studies in which the experimental unit is a stand. The study design typically includes subsampling within the experimental units. For example, if the experimental unit is a stand, several plots may be sampled within each stand, and these plots may themselves be subsampled. The influence of subsampling on variability is addressed in the Discussion section (below).

Experimental Design
The two experimental designs we address are independent and paired. The samples for the two sampling dates are regarded as independent if different experimental units are measured each time. In a paired design, the same experimental units are visited at both dates. Pairing does not require the exact locations on the forest floor to be revisited, but rather requires the same stand or same plot to be measured at both times.

The variability affecting power to detect change depends on the study design. For the independent design, the variability among the experimental units of the measured soil attribute for each sampling date is the relevant variation. We make the simplifying assumption that the variance among experimental units remains the same for the two time points. This assumption is required for a pooled t-test, an analysis conducive to simple power calculations. If variability is considerably different for the two time points, the actual analysis may be a separate variance t-test (Zar, 1996, p. 129), or a variance-stabilizing transformation could be applied (e.g., square root or logarithm) and the means compared on this transformed scale. Power will differ slightly depending on the analysis chosen, but the essential relationship of power with sample size, variability, and magnitude of change is captured by the power analyses derived for the pooled t-test. When evaluating independent samples, the time interval can be any length because the variability at the first sampling date is independent of the variability at the second sampling date.

For the paired design, power is determined by the variability among the differences calculated for each pair of experimental units. The variability of the differences is specific to the length of time between sampling dates. That is, if the paired data represent observations 10 yr apart, the measure of variability is specific to evaluating power for a change over a 10-yr period. Data observed 5 yr apart would not be expected to show the same variability of differences as data 10 yr apart, even when observing the same pairs of experimental units.
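The contrast between the two designs can be illustrated with a minimal sketch that computes the standard error of the mean change under each design. The forest floor masses below are made-up values for six hypothetical stands, not data from any study surveyed here, and the function names are ours.

```python
from math import sqrt
from statistics import stdev, variance

# Hypothetical forest floor masses for six stands at two dates
# (illustrative values only, not data from any surveyed study).
date1 = [4.0, 6.5, 5.2, 8.1, 7.3, 3.9]
date2 = [4.6, 6.9, 6.1, 8.4, 8.2, 4.1]

def se_paired(x, y):
    """Standard error of the mean difference when the same units are remeasured."""
    diffs = [b - a for a, b in zip(x, y)]
    return stdev(diffs) / sqrt(len(diffs))

def se_independent(x, y):
    """Standard error of the difference in means for equal-sized independent
    samples, using the pooled variance (equal variances assumed)."""
    n = len(x)
    pooled_var = (variance(x) + variance(y)) / 2
    return sqrt(2 * pooled_var / n)

print(f"paired SE:      {se_paired(date1, date2):.3f}")
print(f"independent SE: {se_independent(date1, date2):.3f}")
```

In this invented example the stands that are high at the first date are also high at the second, so the among-stand variation cancels in the differences and the paired standard error is much smaller. Real data need not show so strong a benefit.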

Statistical Analysis
The statistical analysis requires specification of the null and alternative hypotheses and significance level for the test. A one-sided alternative hypothesis is chosen if change is expected in a certain direction (such as an increase in forest floor C). Testing a one-sided alternative hypothesis is more powerful, but if ecological theory does not dictate an a priori direction for the change, a two-sided alternative hypothesis must be employed.

The significance level or α level of the test is the probability of committing a Type I error. A Type I error occurs when the null hypothesis is actually true, but it is incorrectly rejected by the statistical test. Power increases as the significance level α is increased. Consequently, reducing the chance of a Type I error competes with power, and a compromise choice must be sought. Commonly suggested values for minimal acceptable power range from 0.7 to 0.8 (Lenth, 2001), and the traditional significance level chosen is α = 0.05. A study designed to achieve power, for example, of 0.75 for α = 0.05 balances the need to avoid the undesirable effect of the statistical test not detecting a change when one has occurred (i.e., a Type II error) with the desire to avoid a Type I error. The significance level is more stringent than the power level because committing a Type I error is viewed as the more egregious mistake. That is, falsely claiming a change has occurred when in fact it has not is more problematic than failing to detect change when it has occurred. Excessive power represents inefficient use of experimental resources, with the resources expended to gain the extra power better re-allocated to address other research objectives.
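The trade-off between the significance level and power can be sketched numerically. The example below assumes a large sample, so that the normal distribution stands in for the t-distribution; the function name and the effect size of 2.5 standard errors are illustrative choices of ours.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def approx_power(effect_in_se, alpha):
    """Large-sample power of a two-sided test.

    effect_in_se is the true change divided by its standard error.
    Uses the normal distribution in place of the t-distribution
    (an assumption that is reasonable only for large samples).
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    upper = 1 - STD_NORMAL.cdf(z_crit - effect_in_se)  # reject in the direction of the change
    lower = STD_NORMAL.cdf(-z_crit - effect_in_se)     # reject in the opposite direction
    return upper + lower

for alpha in (0.01, 0.05, 0.10):
    print(f"alpha = {alpha:.2f}: power = {approx_power(2.5, alpha):.2f}")
```

For a fixed effect, the printed power rises as alpha is relaxed, which is the competition between Type I and Type II errors described above.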


    MATERIALS AND METHODS
Survey of Forest Floor Studies
We used previous studies of the forest floor, mostly conducted in the northern USA and Canada, to investigate power to detect changes in forest floor C. We used studies with which we were involved as investigators, studies published in the literature, and studies reported to us by other investigators. We found that published studies, including our own, often did not report all the information required for our analyses. For example, we commonly pair samples, conduct a paired t-test, and report its significance, but we rarely report the standard error of the paired differences.

A variety of experimental designs and sampling methods were employed in these studies. Important variations include the size of forest floor samples and the number of samples collected. Samples are sometimes combined (composited) before analysis to produce a more representative sample or to reduce the analytical load without reducing spatial coverage. Some studies were conducted at a single site, while others used multiple stands, sometimes of multiple forest types. For studies that included treatments, we have focused on the unmanipulated controls. The variability in these forest floors may reflect past disturbance and forest management but not the immediate post-treatment effects of fertilization, irrigation, or burning.

Methods of collecting forest floor samples vary, and the method can affect both the mean and the variance of C in the forest floor. The common sampling protocol for forest floor mass is destructive: a template is placed on the soil surface, and the O horizon is removed for analysis. As a result, soil measurements cannot be repeated exactly, unlike non-destructive measures such as tree diameter. The boundary between the forest floor and the underlying mineral horizon may be defined at 20% organic C (Soil Survey Staff, 1975), but this cutoff is difficult to determine in the field (Federer, 1982). Variation can be reduced by reporting the mass of the forest floor as mass of organic matter or C per unit of area, rather than as total mass. Because the organic matter concentration in the mineral soil is low, errors introduced in the identification of the boundary between the organic and mineral horizons are smaller for organic matter or C than for mass (Federer et al., 1993).

It can be helpful, in some forest floors, to take blocks of a size (commonly 15 cm on a side) that can be lifted, turned over, and mineral soil removed from the bottom up with a good view of horizons from the side. Top-down sampling tends to collect less material, as in the quantitative pit method (Huntington et al., 1988), which allows for quantification of coarse fragments. The other common method is the use of a frame, commonly 32 cm on a side, from which the forest floor is removed from the top down. Taking larger samples has the advantage of minimizing variation introduced by errors at the edge. When comparing measurements made at different times, it is important to be aware of the possible bias introduced by changes in sampling methods. Sampling depth is clearly important to the amount of forest floor collected. Other important differences between sampling methods include the treatment of rocks and the inclusion or exclusion of buried wood and fine roots.

Statistical Methods
We calculated the magnitude of change that could be detected given the actual sample size of the study and the predicted sample size required to detect a 20% change in the mean, both assuming power of 0.75 and the given variability and study design. Both analyses evaluate a null hypothesis of no change in mean forest floor C versus a two-sided alternative hypothesis of change in mean (either increase or decrease). The significance level (α) chosen was 0.05. The magnitude of change was expressed as percentage of change relative to the mean for the first sampling date.

For the paired designs, power was based on a stand as the experimental unit. For independent designs, either a stand or a plot served as the experimental unit. For the independent design calculations, variances were assumed equal for the two sampling times. For those cases in which a plot was considered the experimental unit, data for multiple stands of the same forest type were often available. Rather than evaluate every stand individually as a separate population, the within-stand variability was computed for each stand, and the average standard deviation for all stands of that type was used in the power analysis. That is, typical within-stand variability was used to characterize all stands of the same type in a study.

The power analysis calculations derive from the formula

$$\delta = \sqrt{\frac{s^2}{n}}\,\left(t_{\alpha,\nu} + t_{\beta,\nu}\right),$$

where $\delta$ is the difference detected, $s$ is the standard deviation of the paired differences, $n$ is the number of pairs, $t_{\alpha,\nu}$ is the $100(1 - \alpha/2)$th percentile of the t-distribution (in the case of a two-tailed test), $t_{\beta,\nu}$ is the $100 \times (\text{power})$th percentile of the t-distribution, $\alpha$ is the significance level of the test, $\nu$ is the degrees of freedom ($n - 1$), and $\beta$ is 1 minus the power (Zar, 1996, p. 109). This equation can be solved for $n$, the number of experimental units required to detect a given $\delta$. When analyzing power of an independent design, we replace $\sqrt{s^2/n}$ by $\sqrt{2s^2/n}$, $s$ is the standard deviation common to both sampling dates, and $\nu$ is $(2n - 2)$, where $n$ is now the number of experimental units for one sampling date (Zar, 1996, p. 135). Power calculations were conducted using version 13 of the MINITAB statistical program (MINITAB, Inc., 2000). Power results from MINITAB were confirmed by comparison with worked examples in Zar (1996).
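The calculation described in this paragraph can be sketched in a few lines. The sketch takes the standard form of the formula in Zar (1996), in which the sum of the two t-values is multiplied by sqrt(s²/n) for a paired design and by sqrt(2s²/n) for an independent design. It is not the MINITAB procedure used for the reported results: the function names are ours, and `t_quantile` is a series approximation to the t-distribution percentile (an assumption made to keep the example free of external libraries), so results may differ slightly from exact tables.

```python
from math import sqrt
from statistics import NormalDist

def t_quantile(p, df):
    """Approximate Student-t quantile (Cornish-Fisher expansion about the
    normal quantile; an approximation, least accurate for very small df)."""
    z = NormalDist().inv_cdf(p)
    g1 = (z**3 + z) / 4
    g2 = (5 * z**5 + 16 * z**3 + 3 * z) / 96
    g3 = (3 * z**7 + 19 * z**5 + 17 * z**3 - 15 * z) / 384
    return z + g1 / df + g2 / df**2 + g3 / df**3

def detectable_change(s, n, alpha=0.05, power=0.75, paired=True):
    """Smallest change detectable by a two-sided test.

    paired=True:  s is the SD of the paired differences, n the number of pairs.
    paired=False: s is the common SD, n the number of units per sampling date.
    """
    df = n - 1 if paired else 2 * n - 2
    spread = sqrt(s**2 / n) if paired else sqrt(2 * s**2 / n)
    return spread * (t_quantile(1 - alpha / 2, df) + t_quantile(power, df))

def required_n(s, delta, alpha=0.05, power=0.75, paired=True, n_max=100_000):
    """Smallest n whose detectable change is at most delta (simple search)."""
    for n in range(3, n_max + 1):
        if detectable_change(s, n, alpha, power, paired) <= delta:
            return n
    raise ValueError("delta not detectable with n <= n_max")

# s = 1.862 and n = 30 are the values quoted for the paired regional study
# in the Results; the answer is in the units of s, not a percentage.
print(round(detectable_change(1.862, 30), 3))
```

Dividing the result by the mean for the first sampling date would express it as a percentage of change. The search in `required_n` is used instead of an algebraic solution because the degrees of freedom, and hence the t-values, depend on n.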

Relaxing the significance level from 0.05 to 0.10 reduces the calculated sampling effort. Using α = 0.10 rather than α = 0.05 reduces the number of experimental units required to detect a 20% change (or any other percentage of change) by about one quarter. Similarly, if a one-sided test could be justified, the number of experimental units required to detect a given change would be reduced by about one quarter compared with the two-sided tests used here. Other combinations of power and significance can be computed using the formula described above.
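The size of this reduction can be checked with a short calculation. The sketch below assumes large samples, so that normal quantiles replace the t-values and the required sample size is proportional to the squared sum of the two quantiles; the function name and its parameters are ours.

```python
from statistics import NormalDist

def relative_sample_size(alpha_new, alpha_ref=0.05, power=0.75,
                         new_one_sided=False):
    """Ratio of required sample sizes (new / reference) for the same change
    and variability, using normal quantiles in place of t (large-sample
    approximation). The reference test is two-sided."""
    q = NormalDist().inv_cdf
    z_power = q(power)
    z_ref = q(1 - alpha_ref / 2)
    z_new = q(1 - alpha_new) if new_one_sided else q(1 - alpha_new / 2)
    return ((z_new + z_power) / (z_ref + z_power)) ** 2

print(f"alpha 0.10, two-sided: {relative_sample_size(0.10):.2f}")
print(f"alpha 0.05, one-sided: {relative_sample_size(0.05, new_one_sided=True):.2f}")
```

Both ratios come out near 0.78 under this approximation, consistent with a reduction of about one quarter. The two cases coincide because a one-sided test at 0.05 and a two-sided test at 0.10 use the same critical value.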


    RESULTS
Forest Floor Studies Surveyed
A total of 21 studies were included in our survey, with most in the north temperate zone, and a few from tropical and boreal forests (Tables 1 to 3). The forest floor organic mass and C content were highest in spruce and fir forests, generally higher in northeastern than in northwestern North America, and lower in the South and in the tropics.


Table 1. Change over time in paired observations of forest floors, with the magnitude of change required to be detectable at α = 0.05 with power = 0.75 and the sampling intensity required to detect a 20% change.

 

Table 3. Forest floor organic mass in studies of multiple stands, with the magnitude of change required to be detectable at α = 0.05 with power = 0.75, and the sampling intensity (number of stands) required to detect a 20% change, using the variation in the measurements at one point in time (i.e., without pairing).

 

Table 2. Forest floor organic mass within stands, with the magnitude of change required to be detectable at α = 0.05 with power = 0.75, and the sampling intensity (number of samples or plots) required to detect a 20% change, using the variation in the measurements at one point in time (i.e., without pairing). For studies of multiple stands, this table shows the average of stands; Table 3 shows the power analysis across stands.

 
The forest floor sampling designs varied considerably, and we restricted our survey to quantitative studies using small blocks or frames. The size of forest floor blocks collected ranged from 10 to 32 cm on a side; the area of a sample thus varied by an order of magnitude, from 0.01 to 0.10 m². The number of samples collected also varied widely. The smallest number of samples per experimental unit was three to five for studies that had large numbers of experimental units (three to 30 stands). When a single stand was studied, the number of samples collected ranged from 20 to 90.

Power Analyses: Case Studies
We first present two case studies to illustrate general features of power to detect change in forest floor C. The first case study employs a paired design to evaluate change at a regional scale (i.e., for a population of stands). The second case study is an unpaired design aimed at detecting change within a single stand. In the first study, the experimental unit is a stand, and in the second, the experimental unit is a plot.

In the first study, 30 stands were sampled across the northeastern USA in 1980 and 1990, to measure changes in Pb in the forest floor (Friedland et al., 1992). We applied a power analysis to the repeated measurement of forest floor mass in these stands. Since the same stands were measured at two points in time, the analysis applies to the paired differences. Figure 1 displays the characteristic increase in power with both increasing sample size and increasing percentage of detectable change. At one extreme, power to detect a change of 10% is low unless the sample size exceeds 80. To detect a 30% change, a sample of only 15 to 20 stands will yield high power. The specific properties of the Friedland et al. (1992) study design can be assessed from these curves. In this study, with 30 experimental units and a paired study design yielding the variability observed (s = 1.862), a change of 17% would be detectable with power = 0.75 for an α = 0.05 test. Alternatively, we may frame the question in terms of sample size required to detect a specified percentage of change at a given level of power. For example, holding power at 0.75, we would need approximately 28 experimental units (stands) in the Friedland et al. (1992) study to detect a change of 20%, but approximately 100 experimental units to detect a change of 10%.



Fig. 1. Power to detect different magnitudes of change in forest floor organic mass for various sample sizes, using the variance of paired differences measured in a regional study of 30 stands (Friedland et al., 1992).

 
The efficacy of pairing is illustrated by treating this same study as if the stands were not paired, but were located independently in the two sampling efforts. The among-stand variability used in the analysis was derived by pooling the variability for the two sampling dates. The number of experimental units required to detect a specified percentage of change is displayed for power = 0.75 and α = 0.05 for both a paired design and an independent sample design with stands serving as the experimental units (Fig. 2). To cover the potential range of variability for the paired design, power curves based on three values for the variability of the paired differences are shown: the actual variability observed, and variability derived from the upper and lower bounds of the 95% confidence interval for the standard deviation of the paired differences. Regardless of the level of variability assessed, the paired design outperforms the independent study design. For example, given a sample size of 30 experimental units, the detectable change with power of 0.75 increases to 33% for the independent design from 22.5% for the worst-case scenario of the paired design (s = 2.405). To detect a change of 20%, the paired design would require approximately 38 experimental units based on the worst-case variance scenario, whereas the independent design would require 80.
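The sample sizes quoted in this paragraph can be roughly cross-checked from the detectable percentages, because the detectable change is approximately proportional to the inverse square root of the sample size. The sketch below applies that first-order rule; it ignores the small shift in t-values as the degrees of freedom change, and the function name is ours.

```python
def rescaled_sample_size(n_known, change_known, change_target):
    """First-order estimate of the sample size needed for a different
    detectable change, holding variability, alpha, and power fixed.

    Ignores the small shift in t-values as degrees of freedom change,
    so it is a rough check rather than an exact power calculation.
    """
    return n_known * (change_known / change_target) ** 2

# Figures quoted in the text: with 30 stands, 22.5% is detectable under the
# worst-case paired scenario and 33% under the independent design.
print(round(rescaled_sample_size(30, 22.5, 20)))  # paired, target 20%
print(round(rescaled_sample_size(30, 33.0, 20)))  # independent, target 20%
```

The rule gives about 38 stands for the paired design and about 82 for the independent design, close to the 38 and 80 reported from the full power calculation.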



Fig. 2. The sample size required to detect a given percentage of change with power = 0.75. The three curves represent different levels of variability derived from a regional study of 30 stands (Friedland et al., 1992). The middle curve is based on the observed variance of the paired differences, and the upper and lower curves are based on the upper and lower bounds of the 95% confidence interval for the standard deviation of the differences.

 
Our second case study applies power analysis to the independent sample design for a single site (Fig. 3). In the reference watershed at Hubbard Brook, 60 to 80 samples were collected from random locations at each sampling date (Yanai et al., 1999). The site includes considerable spatial variation, ranging from northern hardwoods at low elevations to spruce-fir at high elevations. We used the variance at one point in time to predict the change that would be detectable if a future sample had the same variance and the same sampling intensity. The three power curves represent three levels of variability determined from the within-site standard deviations for three different sampling dates. To illustrate the importance of variance in the calculation, we note that the percentage of change detectable for a sample of size 70 is 27% for the low standard deviation (s = 4.03), 31% for the middle standard deviation (s = 4.66), and 42% for the high standard deviation (s = 6.27). Detecting a change of 20% would require 219 samples, on average, and detecting a change smaller than 10% would require more than 700 samples. Pairing individual samples, or locating samples randomly within plots that could be resampled over time, would presumably increase power, but the variation over time in paired differences was not available to make that calculation.
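The dependence of the detectable change on the standard deviation can be reproduced with a short sketch of the independent-design calculation. It assumes the standard Zar (1996) form, in which the sum of the two t-values is multiplied by sqrt(2s²/n) with 2n - 2 degrees of freedom. The function names are ours, and `t_quantile` is a series approximation used in place of a statistics package. The results are in the units of s; converting them to percentages would require the mean for each date, which is not quoted in the text.

```python
from math import sqrt
from statistics import NormalDist

def t_quantile(p, df):
    """Approximate Student-t quantile (Cornish-Fisher expansion about the
    normal quantile; an approximation, adequate for the large df used here)."""
    z = NormalDist().inv_cdf(p)
    g1 = (z**3 + z) / 4
    g2 = (5 * z**5 + 16 * z**3 + 3 * z) / 96
    g3 = (3 * z**7 + 19 * z**5 + 17 * z**3 - 15 * z) / 384
    return z + g1 / df + g2 / df**2 + g3 / df**3

def detectable_independent(s, n, alpha=0.05, power=0.75):
    """Detectable change for two independent samples of n units each,
    with common standard deviation s (two-sided test)."""
    df = 2 * n - 2
    return sqrt(2 * s**2 / n) * (t_quantile(1 - alpha / 2, df) + t_quantile(power, df))

# Standard deviations quoted in the text for the three sampling dates.
for label, s in (("low", 4.03), ("middle", 4.66), ("high", 6.27)):
    print(f"{label:>6}: s = {s}, detectable change = {detectable_independent(s, 70):.2f}")
```

With the sample size held at 70, the detectable change scales in direct proportion to s, so the ratio between the high and low cases (6.27/4.03, about 1.56) matches the ratio of the reported percentages (42/27).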



Fig. 3. Power analysis of independent samples of a watershed at Hubbard Brook (Yanai et al., 1999). High, middle, and low values of variance correspond to data collected in 1987, 1997, and 1982, respectively.

 
Power Analyses: General Survey
The results displayed in Tables 1 through 3 catalog power of an assortment of previous studies over a range of forest types, variability, and sampling schemes. The two columns representing the power results are the percentage of detectable change, which is linked to the specific sample size of the reported study, and the sample size required to detect a 20% change in mean forest floor C, which is dependent on the variability of the reported study. Both power-related calculations are derived for an α of 0.05 and power of 0.75. The magnitude of change that could be detected using current sampling schemes ranged from about 15% to well over 100% (Fig. 4). Table 1 shows results for a paired experimental design using a stand as the experimental unit. Table 2 results are based on a plot (within stand) as the experimental unit, with independent plots measured at the two sampling dates. Table 3 displays results for stands as the experimental units, but for the independent experimental design.



Fig. 4. Frequency distribution of detectable change in the 21 studies we surveyed. There are more than 21 observations because some studies were analyzed for both paired and independent designs, or with both plots and stands as experimental units.

 
Tables 1 through 3 are intended for use as case study information, and also to reinforce general relationships between power and characteristics of the study. As case study information, we could use the results to guide design of a forthcoming study. By focusing on the forest type, region, and sampling scheme, we could locate the study most similar to the one being designed, and approximate the sample size needed to detect a certain percentage of change (or conversely, the percentage change that would be detectable with the planned sample size). When reading the tables for general features, we recognize the same qualitative relationships available from Fig. 1 through 3. The percentage of change detectable becomes smaller as the number of replicates (experimental units) increases and as variability decreases. The column showing sample size required to detect a 20% change is the more general information because it depends only on the variability of the particular study. Especially for Table 2, the percentage of detectable change for a given study should be viewed with the recognition that we are reporting results for plots as experimental units, even though the study may have been designed to use stands as experimental units. These Table 2 results are still useful to characterize within-stand variation, but we caution that detectable change shown for some examples does not reflect the actual study implemented. A final caution is that the standard deviation is estimated poorly from small samples. In such cases, results of the power calculations should be regarded as rough approximations.

In almost all cases in which stands are used as the experimental units, we would expect the experimental design to be paired. A potential use of the Table 3 results would be for a meta-analysis type approach in which results from multiple stands measured in unrelated studies would be combined to provide inference to a population of stands. For example, suppose we were able to assemble observations on forest floor C for 10 stands in 1990. In 2000, we assemble similar data from 12 stands, none being the same as the 1990 stands. Assuming that the measurement protocols and stand characteristics permit comparison, we could use these data to test for a change in mean condition between 1990 and 2000 using the independent sample t-test. Because of the wealth of good stand-specific data, such analyses may be productive.
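The comparison described in this example can be sketched as a pooled-variance t-test on two samples of unequal size. The forest floor C values below are invented for illustration (10 stands in 1990 and 12 different stands in 2000) and are not data from any surveyed study; the function name is ours.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(sample1, sample2):
    """Pooled-variance (equal-variance) two-sample t statistic and its df."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(sample2) - mean(sample1)) / se, df

# Made-up forest floor C values for 10 stands in 1990 and 12 different stands
# in 2000 (illustrative only; not data from any surveyed study).
c_1990 = [2.1, 3.4, 2.8, 4.0, 3.1, 2.5, 3.7, 2.9, 3.3, 2.6]
c_2000 = [2.4, 3.9, 3.0, 4.4, 3.2, 2.7, 4.1, 3.5, 3.6, 2.8, 3.8, 3.1]

t_stat, df = pooled_t(c_1990, c_2000)
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
# Compare |t| with the two-sided critical value for df = 20 at alpha = 0.05,
# which is about 2.086 in standard t-tables.
```

The test assumes equal variances at the two dates, as in the pooled t-test discussed in the Statistical Background; if the variances differ markedly, a separate-variance test would be the safer choice.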


    DISCUSSION
Factors Likely to Produce Change in Forest Floors
We have focused thus far on the magnitude of change in forest floor organic mass or C that can be detected in various forest types using common sampling schemes. The fact that the best study designs were able to detect a difference of about 15 to 20% (Tables 1 to 3) is good news if the changes we seek to detect are this large. On the other hand, if it were important to detect smaller changes, then designs commonly used in the past would be inadequate. Here we consider the magnitude of forest floor change likely to result from fire, forest management, climate change, species composition, and N fertilization, to see whether these changes might be detectable.

Fire decreases forest floor mass, and the magnitude of the decrease depends on the intensity of the burn. Hot wildfires can consume much or most of the forest floor organic matter, leaving only an ash residue on top of mineral soil. Cooler fires (such as controlled, prescribed fire) may burn only the litter layer (Oie or L and F horizons), leaving the Oa (H) layer mostly unburned (Fisher and Binkley, 2000). Reported reductions in forest floor organic matter range from about 15%, which may be difficult to detect, to 100% (DeBano et al., 1998), which would readily be detected. Fire regime can be expected to change as a result of changing climate (Neilson, 1993), but the likelihood of widespread fire is difficult to predict. Similarly, an increase in weather extremes could lead to disturbances such as hurricanes, ice storms, and blowdowns (New England Regional Assessment Group, 2001), which could increase forest floor mass and potentially increase mixing with underlying mineral soil.

The effect of forest management on the forest floor has been well studied, especially for clear-cutting. Partial cutting could be expected to cause smaller changes. Large effects on forest floor C storage are possible, in either direction (Johnson, 1992). Increases in forest floor mass after logging range from 22 to 61% (Hendrickson et al., 1989; Mattson and Swank, 1989; Johnson et al., 1985), presumably due to the incorporation of logging residues left on site. Reported decreases in forest floor mass have ranged from 17 to 71% (Mattson and Smith, 1993; Brais et al., 1995; Johnson et al., 1995), probably due to disturbance from logging equipment and the dragging of logs, which mix forest floor material into the mineral soil (Martin, 1988; Ryan et al., 1992) where it is no longer sampled as part of the forest floor. Attributing all forest floor C losses after harvest to increased decomposition rates (Sartz and Huttinger, 1950; Trimble and Lull, 1956; Covington, 1981) may not be appropriate (Yanai et al., 2003).

Increasing the productivity of northern forests through N fertilization might lead to detectable increases in forest floor mass if the increase in litter production (Miller et al., 1976) were not offset by a similar increase in decay rate. Forest floor mass increased by 10 to 26% in a Scots pine (Pinus sylvestris L.) forest following N fertilization (Nohrstedt et al., 1989) and C storage in the humus layer of Scots pine and Norway spruce stands increased by 14 to 87% with repeated N fertilization (Makipaa, 1995). Reductions in forest floor microbial activity have often been reported in fertilized forests (Agren et al., 2001). Although studies of the effects of N fertilization on decay rates have been inconsistent (Prescott, 1995), there has been a trend toward slower decomposition of high (>16%) lignin litters following N fertilization (Hunt et al., 1988; Magill and Aber, 1998; Carreiro et al., 2000). Nitrogen fertilization appears to increase the proportion of litter that becomes humus and to slow humus decay in particular (Berg et al., 1987; Berg and Eckbohm, 1991; Magill and Aber, 1998; Cotrufo et al., 2001), which would have a greater influence on forest floor mass than would effects during early stages of decay. Thus, fertilization of northern coniferous forests may well result in a measurable increase in forest floor mass.

Global warming could be expected to reduce forest floor mass because of the strong influence of temperature on rates of litter decomposition. Several transect experiments have shown that decomposition rates are positively correlated with mean annual temperature (Berg et al., 1998; Moore et al., 1999), and artificial warming of litter layers has consistently resulted in increased rates of CO₂ evolution and mass loss (Peterjohn et al., 1993; Hobbie, 1996; Winkler et al., 1996). Forest floors from cold sites appear to be most responsive, with Q₁₀ values >2 at temperatures <5°C (Kirschbaum, 1995; Niklinska et al., 1999; Bottner et al., 2000). These responses to a step change in temperature may be short-lived, however; longer incubations or comparisons along natural temperature gradients indicate much smaller responses to warmer temperatures (Giardina and Ryan, 2000). In addition, responses to temperature may be limited if moisture or aeration is inadequate (Haynes, 1986).
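
To give a sense of scale, a Q10 value can be converted into a rate multiplier for a given amount of warming. The sketch below uses the standard definition of Q10 (the factor by which a rate increases per 10°C); the function name and the warming values are illustrative assumptions, and the calculation ignores the moisture and acclimation effects noted above.

```python
def q10_rate_multiplier(q10, warming_c):
    """Factor by which a process rate changes for a warming of warming_c
    degrees C, given its Q10 (rate ratio per 10 degrees C)."""
    return q10 ** (warming_c / 10.0)


if __name__ == "__main__":
    # Illustrative warming scenarios, not values from the cited studies.
    for warming in (1.0, 2.0, 4.0):
        factor = q10_rate_multiplier(2.0, warming)
        print(f"Q10 = 2, +{warming:.0f} C: rate x {factor:.3f}")
```

With a Q10 of 2, for instance, a hypothetical warming of 2°C corresponds to a decay rate roughly 15% higher.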

These expected reductions in forest floor C as a result of faster decay may be offset by increased rates of litter input under changing climate. On a global basis, aboveground litterfall mass is positively correlated with mean annual temperature (Vogt et al., 1986), and increases in litterfall can be expected with the increases in net primary production predicted under likely climate change scenarios (VEMAP, 1995). In a meta-analysis of 32 ecosystems worldwide (Rustad et al., 2001), a mean experimental increase in soil temperature of 2.4°C resulted in an average increase in aboveground plant productivity of 19%, compared with an average increase in soil respiration of 20%. Thus, although either effect alone could be expected to produce a measurable effect on forest floor mass, the combined effect of increased litter production and increased decomposition rates might be undetectable.
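
The offsetting effect can be made concrete with a deliberately simple model. The sketch below assumes a single forest floor pool at steady state with first-order decay, so that mass equals litter input divided by the decay constant; it further assumes that the reported productivity response stands in for litter input and the soil respiration response for the decay constant. None of these assumptions comes from the studies cited, and the function name is hypothetical.

```python
def steady_state_change_pct(input_change_pct, decay_change_pct):
    """Percentage change in steady-state mass M = I / k when litter input I
    and decay constant k each change by the given percentages."""
    ratio = (1 + input_change_pct / 100) / (1 + decay_change_pct / 100)
    return (ratio - 1) * 100


if __name__ == "__main__":
    print(round(steady_state_change_pct(19, 0), 1))   # input effect alone
    print(round(steady_state_change_pct(0, 20), 1))   # decay effect alone
    print(round(steady_state_change_pct(19, 20), 1))  # both together
```

Under this model a 19% increase in input alone raises steady-state mass by 19%, a 20% increase in decay alone lowers it by about 17%, and the two together change it by less than 1%, far below the 15 to 20% detection threshold discussed above.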

Changes in species composition of forests could cause considerable changes in forest floor masses. Shifts in ranges of tree species have been predicted in response to climate change (Iverson and Prasad, 1998; Walther et al., 2002). On a global scale there is a relationship between climate (actual evapotranspiration) and the quality (chemical and physical properties) of the litter (Aerts, 1997); thus global warming could enhance decay rates indirectly by promoting tree species with more readily decomposable litter. Hobbie (1996) reported larger differences in decay rates among different species than within a species at elevated temperatures (4–10°C) and concluded that both factors are important in controlling decomposition rates. The magnitude of the effect of tree species alone can be surmised from the data for common garden experiments in Table 2. Differences in forest floor mass or C among species within a site ranged from 20 to 53% (Thomas and Prescott, 2000), 12 to 83% (Prescott et al., 2000b), and 127 to 1433% (France et al., 1989), so many such differences would be readily detectable.

Increasing the proportion of broad-leaf species might be expected to reduce forest floor mass, because of higher litter quality and faster decay. However, the initial faster decay of broad leaves may be short-lived, producing at least as much humus as needles (Berg and Eckbohm, 1991; Berg et al., 1996; Prescott et al., 2001; Giardina et al., 2001). Other generalizations about faster nutrient cycling and the formation of mull humus following conversion to broad-leaf forests may be largely unwarranted (Binkley and Giardina, 1998). In the studies we surveyed, examples can be found of greater (Binkley et al., 1992a; Binkley, 1983) and lesser (France et al., 1989; Thomas and Prescott, 2000) forest floor under broad-leaf species (Table 2).

Changes in other species, such as earthworms, may have subtle or profound effects on soil organic matter. Depending on the species of earthworms and the quality of the organic matter, earthworms may limit losses of organic matter, modify the quality of organic matter, or reduce organic matter by accelerating decomposition (Hendrix, 1995; Lavelle, 1997). From the limited number of studies in forest systems, it appears that earthworms have great potential to reduce forest floor mass, but not necessarily total soil C. A recent study conducted in a mixed hardwood forest in central New York, for example, found 90% less forest floor mass in plots with worms than in those without (Bohlen et al., 2003). Losses of C from the forest floor were accompanied by increased C in the mineral soil, so forest floor losses should not be used directly to account for total change in C storage. Vimmerstedt and Finney (1973) found a dramatic loss of humus in reforesting strip-mine spoils in the 2 yr following introduction of earthworms. Again, these losses reflect both decomposition and incorporation of organic matter into the mineral soil. A longer-term study in northern Minnesota followed forest floor and soil C losses for 14 yr, during which earthworm numbers increased dramatically from 0 at the beginning of the study to nearly 600 m⁻² (Alban and Berry, 1994). Forest floor mass decreased by about 85%, and total soil C (to a depth of 50 cm) decreased by 0.6 Mg ha⁻¹ yr⁻¹. In all of these studies the duration of the effect of newly introduced earthworms is unknown. Experiments conducted in tropical soils on various time scales have shown that earthworms significantly reduced organic matter in the short term, but in the long term they reduced decomposition rates compared with soils without earthworms (Martin, 1991). It is unknown to what extent protection of organic matter by earthworms operates in temperate soils.

Designing Experiments for Detection of Change
Commonly, we are interested in questions about populations of forest stands at broad scales. This suggests that defining a stand as the experimental unit presents the strongest case for inference. A change in mass for a single forest or watershed cannot be easily extrapolated with statistical confidence to other sites, although extrapolations can be made by subjective inference or by combining several single-site studies. When the experimental unit is a stand, subsampling within each stand will typically be necessary to estimate mean forest floor C for the stand (i.e., the experimental unit value input into the statistical test for change). The objective of the subsampling (i.e., within stand) design is to obtain the best estimate of this mean. A systematic sample is the most efficient within-stand subsampling design because it minimizes redundancy of information contributed by neighboring locations by maximizing the distance between sample locations. Rectangular grids are most practical in a field setting, although less efficient than triangular grids. Webster and Oliver (1990, p. 272–273) concisely review these issues of subsampling efficiency.

For the paired design, the mean difference in forest floor C is the stand measurement of interest. Because we cannot sample exactly the same point locations for the two sampling dates, the question arises of how to locate the plots or points within the stand in the second sample. Although it is not necessary to pair at the subsample level, intuitively there would seem to be some advantage to this because each individual subsample would then be paired with a nearby location expected to yield a similar response. Forming the subsample pairs by locating the second subsample point as close as possible to the first point is a logical procedure. Judgment may be invoked to rule out apparently different, though spatially proximate, locations. That is, there is no requirement that the paired subsampling location be selected at random, and there is good reason not to do so if local heterogeneity is apparent.
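
The advantage of pairing nearby locations can be illustrated with the variance of a difference. The sketch below assumes equal standard deviations on the two dates and a hypothetical correlation `rho` between paired measurements; neither quantity is reported in this section, and the function name is invented for the example. Under those assumptions, the standard deviation of a paired difference is smaller than that of a difference between independently located measurements by a factor of sqrt(1 - rho).

```python
from math import sqrt


def sd_of_difference(sd, rho=0.0):
    """Standard deviation of (second measurement - first measurement).

    sd:  common standard deviation on each date (assumed equal).
    rho: correlation between the two measurements; 0 corresponds to
         independently located samples, values near 1 to tight pairing.
    """
    return sqrt(2.0 * sd ** 2 * (1.0 - rho))


if __name__ == "__main__":
    sd = 10.0  # hypothetical, in the units of the measurement
    for rho in (0.0, 0.5, 0.75):
        print(f"rho = {rho}: SD of difference = "
              f"{sd_of_difference(sd, rho):.2f}")
```

Because detectable change scales with this standard deviation, a correlation of 0.75 between paired locations would, under these assumptions, halve the change that can be detected with a given number of pairs.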

The second major issue of subsampling within stands is sample size. What is the optimal tradeoff between number of experimental units (i.e., stands) and number of subsampling locations within each stand? Standard formulas provide the recommended allocation (e.g., Kuehl, 2000, Section 5.9). These formulas require estimates for among-stand and within-stand variance, and relative cost of obtaining an experimental unit versus a subsample measurement. For the paired study design, these variance components would be derived from differences of paired locations at the stand level and within-stand level. Reliable estimates of such variance components are difficult to obtain. The studies shown in Table 2 provide estimates for the among-stand component for paired designs. Webster and Oliver (1990, Chapter 13) present a detailed discussion relevant to estimating variance components for subsamples at various distances of separation. Further, they elucidate the relationship between regionalized-variable theory and nested sampling design by connecting information contained in a variogram to components at different spatial scales. In effect, if an appropriate variogram is estimated for a location, the expected contribution of subsampling variability can be approximated for a given spacing between the points of the systematic grid.
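
The allocation tradeoff can be sketched with the textbook two-stage sampling result, in which the variance of the overall mean is the among-stand variance divided by the number of stands plus the within-stand variance divided by the total number of subsamples. The notation, function names, variance components, and costs below are assumptions chosen for illustration; they are not taken from Kuehl (2000) or from the tables.

```python
from math import sqrt


def optimal_subsamples(var_among, var_within, cost_unit, cost_subsample):
    """Subsamples per stand that minimize the variance of the overall mean
    for a fixed total cost of n_units * (cost_unit + n * cost_subsample)."""
    return sqrt((cost_unit / cost_subsample) * (var_within / var_among))


def units_affordable(budget, cost_unit, cost_subsample, n_subsamples):
    """Whole number of stands that fit within the budget."""
    return int(budget // (cost_unit + n_subsamples * cost_subsample))


def variance_of_mean(var_among, var_within, n_units, n_subsamples):
    """Variance of the mean over n_units stands, each with n_subsamples."""
    return var_among / n_units + var_within / (n_units * n_subsamples)


if __name__ == "__main__":
    # Hypothetical variance components and costs (arbitrary units).
    var_among, var_within = 4.0, 16.0
    cost_unit, cost_subsample, budget = 9.0, 1.0, 150.0
    n_sub = round(optimal_subsamples(var_among, var_within,
                                     cost_unit, cost_subsample))
    n_units = units_affordable(budget, cost_unit, cost_subsample, n_sub)
    print(n_sub, n_units,
          round(variance_of_mean(var_among, var_within, n_units, n_sub), 3))
```

The formula shows the expected pattern: more subsamples per stand are warranted when within-stand variance is large relative to among-stand variance, or when establishing a new stand is expensive relative to taking one more subsample.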

When the experimental unit is a stand, the mean of the subsamples, not the individual subsample values, is sufficient information for the statistical analysis. Consequently, the effort allocated to sample processing and analysis could be greatly reduced by compositing or bulking samples within the experimental unit. This procedure is acceptable if the property measured is not affected by disturbing and mixing the soil (Webster and Oliver, 1990, p. 256–258). Compositing permits greater replication of experimental units without significantly increasing costs. Even with pairing, sample sizes were typically too small in the studies we surveyed to detect changes in means of less than about 20%. Therefore, any increase in the number of experimental units would improve the power to detect small differences.

Our emphasis has been on assessing change based on the mean forest floor C. We note that future studies may expand the analyses to focus on patterns of spatial variability within stands (or other experimental units) rather than just on the mean values. For example, if the average forest floor mass of a stand increased by 20% over a period of 20 yr, how was that increase distributed across microsites of low or high initial mass of forest floor? How did the pattern of increase relate to the spatial pattern of the trees? We suggest that the full spectrum of spatially explicit sampling designs be explored in future studies; small increases in sampling effort may add new dimensions to the interpretations of change over time.


    ACKNOWLEDGMENTS
 
This paper was prepared for a symposium on approaches and technologies for detecting changes in forest soil C pools at the annual meeting of the Soil Science Society of America, 22 Oct. 2001. Eric Vance and Leandra Belvins helped to improve the manuscript. Funding was provided by the NSF, USDA-NRICGP, Andrew W. Mellon Foundation, and the Heinz Family Foundation.

Received for publication January 7, 2002.


    REFERENCES



