Published online 2 June 2005
Published in Soil Sci Soc Am J 69:967-975 (2005)
DOI: 10.2136/sssaj2004.0186
© 2005 Soil Science Society of America
677 S. Segoe Rd., Madison, WI 53711 USA

Soil Physics

Accurate Dye Tracer Concentration Estimations Using Image Analysis

Magnus Persson*

Dep. of Water Resources Engineering, Lund Univ., Box 118, SE-221 00 Lund, Sweden

* Corresponding author (magnus.persson{at}tvrl.lth.se)


    ABSTRACT
In this paper, the accuracy of dye tracer concentration estimations using image analysis is examined. The variability before and after application of different image correction methods was investigated in three experiments using a digital camera. In each of these experiments, one correction was applied and the remaining variability after correction was calculated as the SD of uniformly colored patches on a color scale. Correction for inhomogeneous illumination results in a relatively small remaining variability. The variability after correction for different color temperatures (or white point) was larger if the correction was made for image files directly from the camera. However, when using the raw data from the image sensor in the camera, the remaining variability was significantly reduced. A calibration experiment was also conducted, in which photographs of calibration samples of dye-stained soil were taken. Between 72 and 260 samples were prepared for each of three soils. The samples had dye concentrations from 0 to 1.5 g L–1. Effects of exposure settings and calibration model were investigated. The exposure settings only affected the results significantly in one soil which became dark when the dye concentration was high. Overexposure made the image lighter and the root mean square error (RMSE) of the concentration estimate decreased for this soil. By applying a neural network (NN) model, the RMSE of the dye concentration estimates could be as low as 0.0747 to 0.0944 g L–1. Reasonable accuracy (0.10–0.13 g L–1) could also be achieved with a polynomial calibration relationship derived from around 20 soil samples.

Abbreviations: CCD, charge coupled device • HSI, hue, saturation, and intensity • HSV, hue, saturation, and value • NN, neural network • RGB, red, green, and blue • RMSE, root mean square error


    INTRODUCTION
DYE TRACERS have been used for many years by soil scientists investigating the effects of soil heterogeneity, as they allow visualization of spatial flow patterns (e.g., Flury and Flühler, 1995). In field experiments, dye-stained water is infiltrated into the soil, vertical or horizontal sections are subsequently excavated, and the dye patterns are photographed. This method has proven very useful for detecting preferential flow paths in the soil. Dye tracers have revealed preferential flow including vertical plumes (Kung, 1990), flow in fissures and worm channels (Lin and McInnes, 1995), and flow along ped faces and in cracks (Yasuda et al., 2001).

Traditionally, image analysis of the dye photographs has only involved separation between stained and nonstained soil. However, during the 1990s, image analysis improved to the extent that estimation of dye concentration from soil color was possible (Aeby et al., 1997; Ewing and Horton, 1999), and just recently this method has been applied to solute transport studies. Several factors apart from dye concentration will affect the color of the dye-stained soil. The most important ones are the intensity and color temperature of the illumination. Forrer (1997) and Forrer et al. (2000) calculated concentration of the dye Brilliant Blue FCF in field experiments. They applied corrections to the original photographs for geometrical distortion, inhomogeneous illumination, and differences in white balance (which they called color tinge). Small soil samples taken from the photographed sections were analyzed and the dye concentration was determined. A depth-dependent relationship was found between the dye concentration and the color of the soil samples. Stadler et al. (2000) performed dye infiltration experiments in three different frozen soil samples. They used three different calibration relationships in their study, one for each soil. Morris and Mooney (2004) instead defined four different concentration classes based on threshold values of image parameters. Weiler and Flühler (2003) used a more complex method to classify dye-stained soil into three concentration categories. Aeby et al. (2001) and Vanderborght et al. (2002), using fluorescent dyes, applied similar corrections as Forrer (1997) to their photographs. The advantage of fluorescent dyes is that several different dyes can be used simultaneously. However, more advanced equipment is needed, including special lamps and optical filters.

The correction methods described above are not ideal. For an ideal correction method, the resulting variability in the color of uniformly colored objects after correction would be the same as for images taken under identical conditions, that is, the variability will be equal to image noise. Previous studies have focused on image analysis; however, the accuracy of different correction procedures and their influence on dye concentration estimations has not been studied in detail. The choice of camera, lens, and camera setting is also fundamental to the image quality, and thus the accuracy of the concentration estimation.

The objective of this study was to (i) quantify how effective different correction methods are by calculating the remaining variability after correction, and (ii) quantify errors in dye tracer concentration estimates using image analyses in different soil types. Four experiments were conducted. The first three were done to study different correction methods. In each of these three experiments, one correction was made and the resulting variability in the color of uniformly colored objects was calculated. In the fourth experiment, the calibration experiment, photographs of dye-stained soil samples were taken. Different calibration models were tested to describe the relationship between soil color and dye concentration. Recommendations for future experiments are made on the basis of the results.


    THEORY
Color Space and Exposure
Computers and their monitors describe colors as combinations of the additive primary colors red, green, and blue (RGB). This RGB color space is one of the most common and is the one normally used by digital cameras. Two other color spaces commonly used in image analysis are hue, saturation, and intensity (HSI) and hue, saturation, and value (HSV). One advantage of these color spaces is that an object of a specific color has, in principle, the same hue and saturation irrespective of changes in lighting intensity. However, studies have shown that saturation is not completely independent of brightness (i.e., the intensity or value) in these color spaces (Hanbury, 2002).
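The idealized behavior described above can be illustrated with a minimal sketch using Python's standard `colorsys` module. The helper name `rgb8_to_hsv` and the two pixel values are made up for illustration; they are not taken from the study. In this idealized case, halving all three channels leaves hue and saturation unchanged and halves only the value, which is the property that real images satisfy only approximately (Hanbury, 2002).

```python
import colorsys


def rgb8_to_hsv(r, g, b):
    """Convert 8-bit RGB (0-255) to HSV with every component in [0, 1]."""
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)


# A bluish color and the same color at half the brightness (hypothetical values).
bright = rgb8_to_hsv(40, 80, 200)
dim = rgb8_to_hsv(20, 40, 100)

print("bright (H, S, V):", bright)
print("dim    (H, S, V):", dim)
```

Comparing the two printed tuples shows that only the third component (V) differs.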

In the RGB color space, all colors are represented by combinations of red, green, and blue, each of which is defined by a value. In many practical applications, these values are represented by integers in the range 0 to 255 (8 bits). Black is then represented by R = G = B = 0, and white by R = G = B = 255. All perfectly gray shades with no color cast have R, G, and B values that are equal.

When taking a photograph, two parameters determine the exposure, that is, the amount of light reaching the film or sensor: the aperture and the shutter speed. The aperture (f-number, written f/) is defined as the focal length of the lens divided by the diameter of the lens opening.

To determine the correct exposure and white balance settings, a gray card can be used. A gray card is 50% gray and has a reflectance of 18% (the relationship between density and reflectance is logarithmic). It is commonly used in photography to calculate the correct exposure settings, as the camera's light meter is calibrated to this value. If correctly exposed, a photographed gray card should have the values R = G = B = 128.

Correction for White Balance
White balance is the name given to a system of color correction that deals with differing lighting conditions. Normally our eyes compensate for different lighting conditions; for instance, a white object will appear white under different lighting. Each type of light can be represented by a numerical color temperature. The color temperatures of typical outdoor lighting conditions are 4800 to 6200 K, and for shady conditions 6200 to 7800 K. Most traditional films are designed to give accurate colors for daylight conditions, but there are also films specially designed for other light sources. The color temperature of the light hitting the film can be changed by using camera filters. A digital camera must find the so-called white point (i.e., white objects should appear white) to correct other colors cast by the same light, and this is normally handled by the camera's software. In many digital cameras, the white balance can also be preset by taking a photo of a white or gray area. After the white point has been determined, the camera's software calculates the RGB values of each pixel using the white balance as a reference. This correction can also be made for any image using image analysis software. Forrer (1997) described how photographs taken under different lighting conditions can be corrected to a common norm using a reference gray scale visible on all photographs. The median RGB values of fifteen different fields on the gray scale were used to generate a relationship between the images' RGB values and the corrected images with no color cast.
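A gray-scale based correction of this kind can be sketched as follows. This is a minimal illustration and not Forrer's (1997) implementation. It assumes a per-channel piecewise linear mapping between the measured gray-patch values and their norm values, and it uses three made-up patches instead of fifteen. The function name `make_transfer` and all numbers are hypothetical.

```python
def make_transfer(measured, norm):
    """Build a per-channel transfer function from gray-patch pairs.

    measured: channel values of the gray patches as photographed
    norm:     the values those patches should have without color cast
    Values between patches are linearly interpolated; values outside
    the measured range are clamped to the end points (an assumption).
    """
    pairs = sorted(zip(measured, norm))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]

    def transfer(value):
        if value <= xs[0]:
            return float(ys[0])
        if value >= xs[-1]:
            return float(ys[-1])
        for i in range(1, len(xs)):
            if value <= xs[i]:
                x0, x1 = xs[i - 1], xs[i]
                y0, y1 = ys[i - 1], ys[i]
                if x1 == x0:  # duplicate patch value, avoid division by zero
                    return float(y1)
                return y0 + (y1 - y0) * (value - x0) / (x1 - x0)

    return transfer


# Hypothetical dark, middle, and light gray patches photographed with a warm cast.
norm_values = [0, 128, 255]
to_norm_r = make_transfer([30, 140, 250], norm_values)
to_norm_g = make_transfer([20, 120, 230], norm_values)
to_norm_b = make_transfer([10, 90, 200], norm_values)

# The middle gray patch itself should map to a neutral gray.
corrected = (to_norm_r(140), to_norm_g(120), to_norm_b(90))
print(corrected)
```

Each channel gets its own transfer function because a color cast shifts R, G, and B by different amounts.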

Correction of Inhomogeneous Illumination
Photographs of dye-stained soil generally show spatial inhomogeneities in illumination, which can be corrected for using a process called background subtraction. This process requires a reference photograph taken of a gray card (or another uniformly colored object) placed on the soil. This photo provides a map of the spatial distribution of the incident light and is called a flat-field image [F(x,y)]. Two ways of correcting for inhomogeneous illumination have been used in previous studies. Aeby et al. (2001), for instance, used background subtraction directly within the RGB color space. The RGB values of any image I(x,y) can be corrected for inhomogeneous lighting by dividing by F(x,y) and normalizing with the mean value of the flat-field image:

$$I_F(x,y) = \frac{I(x,y)}{F(x,y)}\,\bar{F} \qquad [1]$$
where I_F(x,y) is the image corrected for inhomogeneous illumination and $\bar{F}$ is the mean value of the flat-field image. However, this method might lead to color shifts. Forrer (1997) instead converted the RGB to the HSV color space and then only corrected the V values of the image V(x,y):

$$V_F(x,y) = \frac{V(x,y)}{F_V(x,y)}\,\bar{F}_V \qquad [2]$$
where V_F(x,y) are the corrected V(x,y) values, $\bar{F}_V$ is the mean V value of the flat-field image, and F_V(x,y) are the V values of the flat-field image. After the correction, the images are converted back to the RGB color space using the H, S, and the corrected V values. However, it has recently been shown that the saturation value of the HSV color space is not totally independent of the V value (Hanbury, 2002); thus, background subtraction of HSV images may also lead to minor color shifts.
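The two background subtraction variants can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the code used in the study (which was written in Matlab). Images are represented as nested lists, the function names are hypothetical, all numbers are made up, and clamping the corrected V value to 1.0 is an added assumption.

```python
import colorsys


def correct_channel(image, flat):
    """RGB method: divide one channel by the flat field and rescale by its mean."""
    values = [f for row in flat for f in row]
    mean_flat = sum(values) / len(values)
    return [[i * mean_flat / f for i, f in zip(irow, frow)]
            for irow, frow in zip(image, flat)]


def correct_pixel_hsv(rgb, flat_v, mean_flat_v):
    """HSV method for one 8-bit pixel: rescale V only, keep H and S."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    v_corr = min(1.0, v * mean_flat_v / flat_v)  # clamp is an assumption
    return tuple(round(c * 255) for c in colorsys.hsv_to_rgb(h, s, v_corr))


# Hypothetical 2 x 2 flat field and an image of a uniform object whose
# recorded values follow the same illumination pattern.
flat = [[100, 120], [140, 160]]
image = [[50, 60], [70, 80]]
flattened = correct_channel(image, flat)
print(flattened)

# A pixel lit at half the mean illumination (flat-field V of 0.25 against a mean of 0.5).
pixel = correct_pixel_hsv((20, 40, 100), flat_v=0.25, mean_flat_v=0.5)
print(pixel)
```

In the first example the corrected channel becomes uniform, as expected for a uniformly colored object. In the second, only the brightness of the pixel changes while the ratios between R, G, and B are preserved.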

Image Recording Using Digital Cameras
In a digital camera, the images are recorded using a camera chip called a charge coupled device or CCD (there are also other types of image sensors). On the CCD, several photo sites, or pixels, record the amount of light hitting the CCD. Each pixel can only see one color: red, green, or blue. To produce an RGB image from the CCD data, at least two steps have to be carried out. First, the camera's internal image processing engine interpolates colors from the values of neighboring pixels to calculate a full color for each pixel. The next step is to correct the RGB values for the color temperature of the light. A white point, which is either calculated by the camera's software using information from the pixels or preset by the user, is used to calculate the final image RGB values. Other image enhancements, such as sharpening, are normally also applied to the image. Sharpening is an image filter that makes images appear sharper by increasing contrast near edges (see Russ, 2002); however, it also increases the image noise. The term RAW refers to the raw or unprocessed data as it comes directly off the CCD; that is, no in-camera processing is performed. The advantage of using the raw format is that the image has not been processed or white balanced, which means it can be corrected without loss of information; thus, the raw format can be said to represent the digital negative (Russ, 2002; Bockaert, 2003). For the camera used in this study, the raw data contain 12 bits per color, while the final images only have 8 bits per channel.


    MATERIALS AND METHODS
Experimental Set-Up and Camera Settings
All four experiments were conducted in a laboratory. The only light sources were two 500-W halogen lamps with a color temperature of 3300 K. These were used to illuminate a table on which the calibration boxes and color scales were placed as described below. The lamps were placed on each side of the table at approximately 2-m distance and 45° angles. The placement of the lamps was adjusted to make sure that no reflections were visible to the camera.

All photographs were taken using a Nikon D100 digital camera (Nikon Corporation, Tokyo, Japan) with a 50-mm lens (AF Nikkor 50 f/1.8D). The camera was placed on a tripod at the same level as the table, at approximately 4-m distance. To reduce vibration, the antimirror shock function of the camera was enabled. The camera was connected to a laptop computer via the USB port. The software Nikon Capture Control (Nikon Corporation) was used to remotely control the camera. Using this software, the images taken by the camera are directly transferred to the hard disk of the computer. Another software, Nikon Capture Editor (Nikon Corporation), was used to convert RAW images to TIFF files. All additional image analyses were made using Matlab software (The Mathworks, Inc., Natick, MA).

Variability After Correction of White Balance and Inhomogeneous Illumination
To study how effective various correction methods are, which was the first objective of this paper, three separate experiments were made. In these experiments, the same equipment as in the calibration experiment described below was used, but to isolate the effects of different correction methods, photographs were taken of the uniformly colored patches on Kodak gray and color scales instead of dye-stained soil. Furthermore, in each of the three experiments, only one correction was made, that is, correction of white balance using TIFF files, correction of white balance using CCD raw data, or correction of inhomogeneous illumination, respectively.

The remaining variability in RGB values of the uniformly colored patches was used as a measure of how effective the correction methods were. If a correction method is ideal, the resulting variability after correction would be the same as for images taken under identical conditions, that is, the variability will be equal to image noise.

In the first experiment, the variability associated with the correction for white balance was examined by taking seven photographs of the Kodak gray and color scales: one with the white point measured using the camera's preset function and six with preset color temperatures (3000, 3700, 4000, 4500, 5500, and 6500 K). Camera-produced TIFF files were used. The method presented by Forrer (1997) was used to convert these images to images with no color cast. Fifteen patches of different shades of gray from the Kodak gray scale were used to generate a normalized gray scale, the values of which were linearly stretched between 0 and 255. A piecewise linear transfer function was created to convert the image's RGB values to the normalized values; each of the R, G, and B values was treated separately. The variability of the RGB values of the corrected images was calculated for three patches of the Kodak color scale labeled blue, green, and 3/color. These patches were selected since they are the colors on the scale that most resemble dye-stained soil. For each of the patches, an area of 11 by 11 pixels (this size was chosen arbitrarily) was selected and the median RGB values were determined. The SD of the six median RGB values (one for each image) for each color patch was then calculated. The variability of the RGB values induced by the correction method was then calculated by taking the average of all nine SDs (one for each RGB value and color patch).
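The variability measure described above (median per patch, SD across images, average over patches and channels) can be sketched as follows. This is a minimal illustration with made-up data: the function names are hypothetical, the patches hold three pixels instead of 11 by 11, and the use of the sample SD (n - 1 in the denominator) is an assumption, since the text does not say which SD was used.

```python
from statistics import mean, median, stdev


def patch_median(pixels):
    """Median R, G, and B over the pixels of one patch (list of (R, G, B))."""
    return tuple(median(px[c] for px in pixels) for c in range(3))


def remaining_variability(images):
    """Average SD of the patch medians across images.

    images: one dict per photograph, mapping patch name -> list of (R, G, B).
    One SD is computed per patch and channel, then all SDs are averaged.
    """
    sds = []
    for name in images[0]:
        medians = [patch_median(img[name]) for img in images]
        for c in range(3):
            sds.append(stdev([m[c] for m in medians]))
    return mean(sds)


def make_image(shift):
    """Hypothetical corrected image whose patch colors are offset by `shift`."""
    return {
        "blue": [(10 + shift, 20 + shift, 30 + shift)] * 3,
        "green": [(40 + shift, 50 + shift, 60 + shift)] * 3,
    }


images = [make_image(s) for s in (0, 2, 4)]
print(remaining_variability(images))
```

With three patches and three channels, as in the experiment, the list `sds` would hold nine values.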

The potential of recording the digital raw data and applying corrections for white balance during the process of translating the CCD raw data to image RGB values was tested in the second experiment. For this test, the Kodak scales were illuminated with three different light sources of different color temperature (3300, 5000, and 6500 K), and one photograph was taken using each light source. Since the light sources also had different luminance, the exposure of each of the three photographs was determined using a gray card. Thus, there was no need to correct these images for differences in illumination. The CCD raw data were opened in Nikon Capture Editor, and a piecewise linear transfer function between the image RGB values and the norm values was applied to the image. This was done with the curve function of Nikon Capture Editor, using a procedure identical to that described above and the same norm values. The software then processed the CCD raw data and produced an 8-bit RGB image. For each image, the same three patches of the color scale (blue, green, and 3/color) were selected and the median RGB values of an area of 11 by 11 pixels were calculated as described above.

The third experiment was made to study different background subtraction methods. Four photographs were taken of the Kodak scales placed in different positions on the table. For these photographs, only one of the halogen lights was turned on to produce inhomogeneous lighting conditions. Again, the same three patches of the color scale were selected and the median RGB values of an area of 11 by 11 pixels were calculated. A flat-field image was taken using a gray card. The RGB values of the four images were then corrected for inhomogeneous illumination using both the RGB and the HSV background subtraction methods. The variability of the corrected RGB values was calculated as described above.

Calibration Experiment
To determine the relationship between dye concentration and image color, a calibration experiment was conducted for three different soils named after their site of origin: Löddeköpinge, Lund, and Revinge. Selected properties of the soils used are given in Table 1. Known amounts of water, dye, and soil were mixed and packed into small transparent calibration boxes made of Plexiglas (0.01 m thick), which were placed on a table. Photographs of the dye-stained soil were taken through the transparent front wall of the calibration boxes to reduce the influence of soil surface roughness. The boxes were 0.10 m wide, 0.05 m high, and 0.025 m deep. The dye concentration of these samples, Cw (g L–1 of water), varied between 0 and 5 g L–1, and the water content θ varied between 0.20 and 0.30 m3 m–3 (±0.0008 m3 m–3). The total number of samples was 260 for the Löddeköpinge soil, 252 for the Lund soil, and 72 for the Revinge soil. The tracer used was the food-grade dye pigment Vitasyn-Blau AE 85 (Swedish Hoechst, Gothenburg, Sweden), which chemically is almost identical to the frequently used dye Brilliant Blue FCF. The dye has been used in several field experiments due to its good visibility, low toxicity, and weak adsorption on soils (e.g., Flury and Flühler, 1994).


Table 1. Some selected soil properties of the soils used.

 
A flat-field image was produced by taking the average RGB values of four photographs of a gray card covering the calibration boxes. The gray card was rotated 90 degrees between each of the four photographs. This was done to reduce the effects of the surface roughness of the gray card. The halogen lamps gave a fairly homogeneous illumination; however, small variations could be detected in the RGB values over the flat-field image (around ±5%). The HSV background subtraction method was used to correct the images of the calibration boxes for the inhomogeneous illumination. To ensure that the coordinates were the same for each image, two points with known coordinates were marked on the table. Using the reference points, all images including the flat-field image were cropped to the same size. Since the camera was left in place on a tripod during the experiments, the differences in the coordinates of the reference points between images before cropping were in the range of one to two pixels. The pixel size of the images was 0.0005 by 0.0005 m.

In previous studies, it has been noted that for the highest dye concentration, the R values are typically very low—below 10. Since the relative noise is higher for small values of RGB, this leads to larger errors in the dye concentration estimates (M. Persson, S. Haridy, J. Olsson, and J. Wendt, 2004, unpublished data). One way of overcoming this problem is to overexpose the photographs. This means that more light is allowed to hit the image sensor, making all colors appear lighter. In this study, three different exposures were tested; the aperture setting (f/4) was not changed to avoid possible effects of lens sharpness. Photographs of all the calibration boxes were taken using the exposure determined with the gray card as well as two alternative exposures, taken with different shutter speeds, which allowed 60 and 100% more light, respectively, to hit the sensor.
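Because the aperture was held constant, the extra light comes entirely from a longer shutter time. The following sketch shows the arithmetic, assuming that the amount of light reaching the sensor is proportional to the shutter time. The base shutter time of 1/50 s is a made-up value for illustration and the function name is hypothetical.

```python
import math


def exposure_change(base_time_s, extra_light_fraction):
    """Return (new shutter time in s, change in photographic stops).

    extra_light_fraction: 0.6 means 60% more light than the base exposure.
    Assumes light on the sensor is proportional to shutter time at fixed aperture.
    """
    factor = 1.0 + extra_light_fraction
    return base_time_s * factor, math.log2(factor)


base_time = 1 / 50  # hypothetical gray-card exposure, not taken from the study
for extra in (0.6, 1.0):
    new_time, stops = exposure_change(base_time, extra)
    print(f"{extra:.0%} more light: {new_time:.4f} s, +{stops:.2f} stops")
```

Under this assumption, 100% more light corresponds to exactly one stop of overexposure and 60% more light to roughly two thirds of a stop.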

Because of the experimental conditions, no corrections had to be made for geometrical distortion, differences in color temperature between photographs, or surface roughness. The only correction needed during the calibration experiment was for inhomogeneous lighting. For the calibration experiment, the CCD raw data were saved from the camera. Using the software Nikon Capture Editor, the raw data were converted to TIFF files (8 bits) for further image analysis. This format was chosen since it is a lossless format. When converting from the raw CCD data, sharpening was turned off to reduce image noise. The color temperature of the halogen lamps is about 3300 K. The white point was measured using the camera's preset function by taking a photograph of a gray card. This white point was then used when converting all the raw CCD data.

Image Color and Dye Concentration Relationships
From the images of the calibration boxes, the mean and SD of the RGB values were calculated for each soil sample. When calculating these values, an area covering around 2000 pixels was chosen. To determine the relationship between the RGB values of the images and the dye concentration, Cs [Cs = Cw × θ (g L–1 of soil)], a calibration function must be selected. Normally, R has by far the highest correlation with Cs, and the relationship between them is usually said to be logarithmic. For instance, Forrer (1997) suggested a relationship between the logarithm of Cs and a second-order polynomial in R, G, and B:


$$\log(C_s) = a + bR + cG + dB + eR^2 + fG^2 + gB^2 + hRG + iRB + jGB \qquad [3]$$
where a to j are empirical coefficients. This relationship has also been used by Forrer et al. (2000) and Stadler et al. (2000). Depending on soil type, some of the variables in Eq. [3] may not be significant. In the present study, a linear relationship between R and Cs was found for two soils, which suggested an alternative model for the data:

$$C_s = a + bR + cG + dB + eR^2 + fG^2 + gB^2 + hRG + iRB + jGB \qquad [4]$$
where a to j are empirical coefficients. Ewing and Horton (1999) used the HSI color space to relate image parameters to Cs. Since the coordinates of the HSI color space and the RGB color space are directly related to each other, the coordinates contain the same information. However, linear relationships can sometimes be found using some of the values in one color space but not in another, so testing alternative color spaces can be useful. A similar regression analysis was therefore also made for the HSI color space, that is, the RGB values of Eq. [3] and [4] were replaced by their respective HSI values.
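The calibration quantities above can be sketched in standard-library Python. A second-order polynomial in R, G, and B has ten terms (one constant, three linear, three squared, and three cross products), which matches the ten coefficients a to j. The ordering of the terms, the use of the natural logarithm in the log model, the function names, and all data values below are assumptions made for illustration. The straight-line fit only covers the simplest special case, Cs as a linear function of R.

```python
import math


def soil_concentration(cw, theta):
    """Cs (g per L of soil) from Cw (g per L of water) and water content theta."""
    return cw * theta


def poly_terms(r, g, b):
    """Ten terms of a second-order polynomial in R, G, B (assumed ordering)."""
    return [1.0, r, g, b, r * r, g * g, b * b, r * g, r * b, g * b]


def predict_cs(coeffs, r, g, b, log_model=False):
    """Evaluate the polynomial; for the log model, invert a natural log (assumed)."""
    value = sum(c * t for c, t in zip(coeffs, poly_terms(r, g, b)))
    return math.exp(value) if log_model else value


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


# Highest concentration in the calibration set: Cw = 5 g/L at theta = 0.30.
cs_max = soil_concentration(5.0, 0.30)

# Made-up, perfectly linear calibration data (R decreases as Cs increases).
r_values = [120, 100, 80, 60]
cs_values = [0.0, 0.5, 1.0, 1.5]
intercept, slope = fit_line(r_values, cs_values)

# The fitted line expressed as a ten-coefficient model with only a and b nonzero.
coeffs = [intercept, slope] + [0.0] * 8
print(cs_max, intercept, slope, predict_cs(coeffs, 90, 70, 60))
```

Fitting all ten coefficients would require a multiple linear regression on the rows returned by `poly_terms`, which is omitted here to keep the sketch short.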

To extract all information from the image parameters, a more powerful model than Eq. [3] and [4] may be needed. A NN essentially mimics brain function by acquiring knowledge through a learning process; this allows it to find the optimal weights for the different connections between the individual nerve cells (neurons). Mathematically, an NN can be treated as a universal function approximator. The NN is a nonlinear model that makes use of a parallel programming structure capable of representing arbitrarily complex nonlinear processes that relate the inputs and outputs of any system (Hsu et al., 1995).

To develop and train a NN involves (i) choosing a training set that contains input–output pairs; (ii) defining a suitable network (number of layers and number of neurons in each layer); (iii) training the network to relate the inputs to the corresponding outputs by estimating the NN weights; and (iv) testing the identified NN.

The NN is trained using a self-organizing learning process that minimizes the error between the NN output and the target values. The objective of the training is to find the weights of each neuron that will result in the minimum error.

General properties of NNs, as well as their applications within hydrology, water resources, and soil science, have been thoroughly covered in a number of publications (e.g., Bishop, 1995; Pachepsky et al., 1996; Maier and Dandy, 2000). For background information, the reader is referred to this literature; here, only specific properties of the NN employed are given.

A two-layer (one hidden and one output layer) feed-forward NN was trained by a back-propagation algorithm using Levenberg-Marquardt optimization (Hagan and Menhaj, 1994). Back-propagation can be explained as the adjustment of NN weights and biases by back-propagating the differences between the NN output and the actual target. Before NN application, the original input and target data sets were standardized by subtracting the mean and dividing by the SD to ensure that every input receives equal attention during the training (e.g., Maier and Dandy, 2000). Typically, the data available for NN calibration are split into three parts: one for training, one for validation, and one for testing. For the image data sets, 20% of the calibration samples in each soil were randomly selected for validation and 20% for testing. This division was made to allow early stopping, a common and practical method employed to avoid overfitting while ensuring proper generalization (e.g., Bishop, 1995). Early stopping means that the NN is trained with the training set and its performance is checked against the held-out validation set after each epoch (parameter adjustment based on one sequence through all values). When a consistent increase in the error for this set is observed, training is stopped and the NN is considered trained. After that, the quality of the NN's ability to estimate Cs is checked using the entire data set as input. The entire set was used to make the comparison with the regression models easier, since the performance of the regression models is calculated using the entire data set. All NN simulations were run using the NN toolbox of Matlab (The Mathworks Inc., Natick, MA).
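The data preparation and stopping rule described above can be sketched as follows. This is not the Matlab toolbox procedure used in the study; it is a minimal standard-library illustration in which the held-out set used for stopping is called the validation set. The function names, the random seed, the `patience` parameter (how many epochs without improvement count as a "consistent increase"), and the error values are assumptions.

```python
import random
from statistics import mean, stdev


def standardize(values):
    """Subtract the mean and divide by the (sample) SD."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def split_indices(n, frac_val=0.2, frac_test=0.2, seed=0):
    """Randomly split sample indices into training, validation, and test sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_val, n_test = round(n * frac_val), round(n * frac_test)
    return idx[n_val + n_test:], idx[:n_val], idx[n_val:n_val + n_test]


def early_stopping_epoch(val_errors, patience=5):
    """Return the epoch with the lowest validation error seen before the error
    failed to improve for `patience` consecutive epochs."""
    best, best_epoch, bad = float("inf"), 0, 0
    for epoch, err in enumerate(val_errors):
        if err < best:
            best, best_epoch, bad = err, epoch, 0
        else:
            bad += 1
            if bad >= patience:
                break
    return best_epoch


train, val, test = split_indices(260)  # 260 samples, as for the Löddeköpinge soil
print(len(train), len(val), len(test))

# Made-up validation errors that start to rise after the third epoch.
print(early_stopping_epoch([5.0, 4.0, 3.0, 3.5, 3.6, 3.7], patience=3))
```

In a real training loop the validation error would be computed after each epoch, and the weights from the best epoch would be kept.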

A log-sigmoid transfer function was allocated to every neuron in the hidden layer. The function is defined as

$$ov = \frac{1}{1 + e^{-iv}} \qquad [5]$$
where iv is the input value to the neuron and ov the output value from the neuron. A pure linear transfer function was allocated to the output layer.
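A forward pass through a network of this shape (log-sigmoid hidden layer, linear output layer) can be sketched in a few lines. The weights and inputs below are arbitrary illustrative numbers, not trained values from the study, and the function names are hypothetical.

```python
import math


def logsig(iv):
    """Log-sigmoid transfer function: maps any input value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-iv))


def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """One hidden log-sigmoid layer followed by a single linear output neuron.

    w_hidden: one list of input weights per hidden neuron
    b_hidden: one bias per hidden neuron
    w_out:    one output weight per hidden neuron
    """
    hidden = [
        logsig(sum(w * x for w, x in zip(ws, inputs)) + b)
        for ws, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Three standardized inputs (R, G, B) and two hidden neurons, arbitrary weights.
rgb_std = [0.5, -0.2, 0.1]
w_hidden = [[-1.0, 0.3, 0.0], [0.8, -0.5, 0.2]]
b_hidden = [0.1, -0.1]
print(forward(rgb_std, w_hidden, b_hidden, w_out=[1.5, -0.7], b_out=0.2))
```

Because the output layer is linear, the network output is not restricted to the (0, 1) range of the hidden neurons.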

Determining the number of neurons and how to divide them into separate layers is a delicate issue. Large (or complex) NNs require a considerable amount of data to generalize well and are computationally intensive; small (or simple) NNs may not be able to reproduce intricate input-output relationships (e.g., Bishop, 1995). There were three inputs to the NN: R, G, and B. To determine the optimal number of neurons in the hidden layer, the principle of constructive algorithms was used. This method essentially starts testing the number of neurons from a minimum and adds neurons until performance ceases to improve. Each NN was trained 20 times and the average output (henceforth referred to as output) was compared with the targets. The root mean square error (RMSE) and the coefficient of determination (r2) between target and output resulting from the trained NN with different numbers of hidden neurons were compared, and the design associated with the smallest RMSE was accepted as the most suitable NN design. The number of hidden neurons was increased from one to 20. The minimum RMSE combined with the maximum r2 was found using an NN with seven hidden neurons. Thus, seven hidden neurons were used in the following simulations.
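The evaluation steps above can be sketched as follows. The sketch assumes that r2 is the squared Pearson correlation between target and output (the text does not state which definition was used). The function names are hypothetical and the RMSE table at the end contains made-up numbers purely to show the selection step.

```python
import math


def rmse(targets, outputs):
    """Root mean square error between targets and outputs."""
    n = len(targets)
    return math.sqrt(sum((t - o) ** 2 for t, o in zip(targets, outputs)) / n)


def r_squared(targets, outputs):
    """Squared Pearson correlation between targets and outputs (assumed definition)."""
    n = len(targets)
    mt, mo = sum(targets) / n, sum(outputs) / n
    cov = sum((t - mt) * (o - mo) for t, o in zip(targets, outputs))
    var_t = sum((t - mt) ** 2 for t in targets)
    var_o = sum((o - mo) ** 2 for o in outputs)
    return cov * cov / (var_t * var_o)


def average_outputs(runs):
    """Average the outputs of repeated trainings, sample by sample."""
    return [sum(col) / len(col) for col in zip(*runs)]


def best_hidden_size(rmse_by_size):
    """Number of hidden neurons with the smallest RMSE."""
    return min(rmse_by_size, key=rmse_by_size.get)


# Two made-up training runs for three samples, averaged before scoring.
targets = [0.0, 0.5, 1.0]
output = average_outputs([[0.1, 0.4, 1.1], [-0.1, 0.6, 0.9]])
print(rmse(targets, output), r_squared(targets, output))

# Hypothetical RMSE values for three network sizes.
print(best_hidden_size({1: 0.30, 7: 0.08, 20: 0.09}))
```

Averaging over repeated trainings reduces the influence of the random initial weights on the comparison between network sizes.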


    RESULTS AND DISCUSSION
Variability after Correction of White Balance and Inhomogeneous Illumination
In Table 2, the results of the three first experiments are presented. The table shows the average SD of the RGB values of the blue, green, and 3/color patches on the Kodak color scale for all correction methods. Variability of the RGB values is affected by image noise. For the uniformly colored patches of the color scale, the SD of the RGB values was about 0.5 to 2.5, with an average of 1.8. Image noise cannot be corrected for, so if the studied correction methods were ideal, that is, removing all variability introduced by differences in the conditions in which the different photographs were taken, the average SD should also be 1.8 for the corrected RGB values. Of course, it should be realized that if the images are not corrected at all, the variability will be much higher.


Table 2. Influence of different sources of variability of RGB values of uniformly colored objects.

 
It can be seen that the variability of RGB values after correction of the white balance on the TIFF files is relatively high. However, it should also be noted that the RGB values in the uncorrected images were far from the correct values; the variability of the uncorrected images (calculated in the same way as for the corrected images described above) was 20.7. Generally, the errors were slightly smaller for those images that had a color temperature close to the actual one (3300 K). When the correction of white balance is instead made on the raw CCD data files, the variability is, as expected, significantly smaller and very close to image noise. This is probably the greatest advantage of using a digital camera under field conditions where the color temperature of the light changes during the day.

The RGB and HSV methods of correcting for inhomogeneous lighting gave similar results. The uncorrected images had extremely different RGB values, with a variability of 27.0. Both correction methods gave accurate RGB values after correction. Still, the HSV background subtraction method is preferred since it, at least in theory, gives less color shift than the RGB background subtraction method. The RGB method is, on the other hand, easier to use since the images need not be converted to the HSV color space and then back to the RGB color space.

The variability of the RGB values introduced by the different correction methods should not be seen as universal. It will depend on the difference between the corrected and uncorrected images. For example, the variability after correction for inhomogeneous illumination will be larger for more heterogeneous illumination patterns. The variability presented in Table 2 should instead be seen as a guideline when choosing which correction method to apply. Another interesting issue is the order in which the corrections should be applied. However, in the present study, the experimental conditions were such that only one type of correction was needed in each experiment. Thus, it is not possible to determine the optimum order of correction methods.

Calibration Experiment Results
It is almost impossible to mix soil, dye, and water and then repack it homogeneously into boxes to a specific bulk density. Therefore, many soil samples showed visual variation in soil color. This was especially true for the Lund soil, which had the highest clay content. In the end, a representative area, 2000 pixels large, was selected for each soil sample. The median RGB values were calculated from this area for every calibration sample.
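A minimal sketch of the median calculation follows, with five invented pixels standing in for the roughly 2000 pixels of a representative area. One stray bright pixel is included to show that the median is insensitive to it.

```python
from statistics import median


def median_rgb(pixels):
    """Return the per-channel median of a list of (R, G, B) pixels."""
    reds, greens, blues = zip(*pixels)
    return median(reds), median(greens), median(blues)


# Invented pixel values; the third pixel is an outlier.
area = [(120, 90, 80), (118, 92, 79), (200, 150, 140), (121, 91, 81), (119, 89, 80)]
print(median_rgb(area))  # (120, 91, 80)
```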

For all the soils, the R values gave the best correlation with Cs. In Fig. 1, the RGB values for the three soils are plotted against Cs. For clarity, only the values at the 1/30-s shutter speed are included. Both the Lund and Löddeköpinge soils display a linear relationship between R and Cs. In previous studies, this relationship has been logarithmic (Ewing and Horton, 1999; Forrer et al., 2000; M. Persson, S. Haridy, J. Olsson, and J. Wendt, 2004, unpublished data), which was also observed for the Revinge soil. From the limited data available in the literature, including the data presented here, it seems that in light-colored soils the relationship is logarithmic, while in darker soils (higher organic matter and water content) the relationship is linear. The G value also depends on Cs, but to a much smaller extent than R. The B value had a very small correlation with Cs for all the soils.



Fig. 1. Red (R), green (G), and blue (B) values of the dye-stained soil samples plotted against dye concentration (Cs).

 
The regression analysis considered only the parameters in Eq. [3] and [4] that had a correlation with Cs > 0.20. In Table 3, the resulting r² and RMSE of the models are presented. As expected, Eq. [3] gave the best results for the Revinge soil, but for the Lund and Löddeköpinge soils, Eq. [3] and [4] performed almost equally well. The results of the regression analysis using the HSI color space were similar to those using the RGB color space. In general, the results were slightly better using the HSI color space for the Lund soil, but for the Löddeköpinge and Revinge soils, the RGB color space was better. Since the differences between the regression analyses in the RGB and HSI color spaces were negligible, the HSI results are not presented in the table.
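The screening step can be sketched as follows. This is an illustration only: the data are invented, the candidate terms shown are examples rather than the actual terms of Eq. [3] and [4], and comparing the absolute value of the Pearson correlation with the 0.20 threshold is an assumption (R decreases with Cs, so its correlation is negative).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_terms(terms, cs, threshold=0.20):
    """Keep the names of the terms whose |correlation| with cs exceeds threshold."""
    return [name for name, values in terms.items() if abs(pearson(values, cs)) > threshold]


# Invented calibration data: Cs in g/L and the corresponding color values.
cs = [0.0, 0.3, 0.6, 0.9, 1.2, 1.5]
r = [180, 160, 141, 119, 101, 80]
g = [150, 146, 141, 137, 133, 128]
b = [120, 121, 121, 120, 120, 121]
terms = {"R": r, "G": g, "B": b, "R^2": [v ** 2 for v in r]}
print(select_terms(terms, cs))  # ['R', 'G', 'R^2']
```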


Table 3. Performance of the different calibration models.

 
The NN model gave better results in all cases except for the Revinge soil. The probable reason for this is the limited data set for this soil (72 samples), since the NN model needs many data points for training. As a general rule of thumb, the NN needs three times as many data points as it has weights. Since the NN had 28 weights, the Revinge data set was actually too small for the NN to be trained properly. For the Lund and Löddeköpinge soils, the NN gave very good results; however, it is questionable whether the improvement justifies the extra work involved. As mentioned above, the NN requires a large data set, and the training process is not straightforward. Thus, the NN model is probably only justifiable in cases where high accuracy is needed.
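The arithmetic behind this rule of thumb can be written out. The figure of 28 weights is consistent with counting only the connections of a network with three inputs, seven hidden neurons, and one output (3 × 7 + 7 × 1), without bias terms; that reading is an inference from the numbers given here, not something stated explicitly.

```python
def connection_weights(n_inputs, n_hidden, n_outputs=1):
    """Count the connection weights of a one-hidden-layer network (biases excluded)."""
    return n_inputs * n_hidden + n_hidden * n_outputs


weights = connection_weights(3, 7)   # R, G, B inputs; seven hidden neurons
required = 3 * weights               # rule of thumb: three data points per weight
revinge_samples = 72                 # sample count given in the text

print(weights)                       # 28
print(required)                      # 84
print(revinge_samples >= required)   # False
```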

A more useful NN approach in the future could be to try to find a universal model that fits many different soil types. Input to such a model could, in addition to RGB values, include soil physical parameters such as clay and organic matter content, and probably also information on soil color. Similar studies, in which soil physical parameters are used to predict various phenomena in soil physics using NNs, have been presented by Pachepsky et al. (1996), Persson et al. (2002), and Persson and Uvo (2003).

The ability to derive a calibration relationship from a limited data set was also tested. Twenty-one samples of each soil were randomly selected, seven each from the ranges 0 to 0.5, 0.5 to 1.0, and 1.0 to 1.5 g L–1. The parameters of Eq. [3] and [4] were optimized using the limited data set, and the errors when applying these parameters to the entire data set were calculated. The RMSE was found to increase by 19, 22, and 12% for the Löddeköpinge, Lund, and Revinge soils, respectively. Thus, acceptable results can be achieved using a fairly limited data set.
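The stratified random selection can be sketched as below. The sample data are invented, the function name is hypothetical, and the treatment of samples lying exactly on a range boundary (assigned to the upper range, with 1.5 g/L included in the last range) is an assumption, since the text does not specify it.

```python
import random


def stratified_subset(samples, bins, per_bin, seed=0):
    """Randomly pick per_bin samples from each concentration range.

    samples is a list of (cs, rgb) tuples; bins is a list of (low, high)
    ranges. Each range is half-open, except the last, which includes its
    upper limit.
    """
    rng = random.Random(seed)
    chosen = []
    for index, (low, high) in enumerate(bins):
        is_last = index == len(bins) - 1
        members = [
            s for s in samples
            if low <= s[0] < high or (is_last and s[0] == high)
        ]
        # rng.sample raises ValueError if a range holds fewer than per_bin samples.
        chosen.extend(rng.sample(members, per_bin))
    return chosen


# Invented calibration samples: Cs from 0 to 1.5 g/L in steps of 0.05.
samples = [(i / 20, (180 - 3 * i, 150 - i, 120)) for i in range(31)]
bins = [(0.0, 0.5), (0.5, 1.0), (1.0, 1.5)]
subset = stratified_subset(samples, bins, per_bin=7)
print(len(subset))  # 21
```

The calibration parameters would then be fitted to `subset` and evaluated against all of `samples` to obtain the change in RMSE.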

The purpose of photographing the calibration samples with different exposure settings was to increase the R value of the samples with the highest Cs. In Table 4, the range of RGB values for the calibration boxes is presented. As seen in Table 3, the performance of the different models for the Lund and Löddeköpinge soils did not change very much with the exposure setting. However, for the Revinge soil, the results were clearly improved when the photographs were overexposed by using a longer exposure time. This can be explained by the fact that the Revinge soil had the lowest R values for the samples with the highest Cs. When examining the residuals of the Cs estimations, it was found that the errors in the Revinge soil increased with increasing Cs; in the other soils there was no trend in the residuals. Clearly, the accuracy of the Cs estimations can be improved by overexposing images of certain soils.


Table 4. Range in RGB (red, green, and blue) values of the dye-stained soil samples for different exposures.

 
The RMSE values presented here are in the same range as those presented in previous studies, for example, 0.152 g L–1 (Forrer, 1997) and 0.057 g L–1 (M. Persson, S. Haridy, J. Olsson, and J. Wendt, 2004, unpublished data). The main source of error is probably that the samples were not completely and homogeneously mixed, leading to variations in bulk density and water content. This was especially true for the Lund soil, which had the highest clay content. The SD of the RGB values of the calibration samples was between 1.8 and 3.5. Part of this variability can be explained by image noise (see Table 2); the remaining variation comes from small-scale (pixel-to-pixel) variation of soil color. To reduce the variability, a smoothing procedure is normally used; both averaging and median filters have been applied in previous studies. An averaging filter takes the average of the RGB values in a selected area and assigns it to the center pixel of that area. This procedure reduces the pixel-to-pixel noise. Uncertainty could be reduced by applying averaging filters of different sizes (M. Persson, S. Haridy, J. Olsson, and J. Wendt, 2004, unpublished data). Median filters, used by, for example, Vanderborght et al. (2002), replace each pixel's RGB values with the median values of a selected region. This procedure reduces extreme RGB values without reducing the resolution. Several other methods to reduce image noise can be found in Russ (2002).
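The two filter types can be compared with a small sketch operating on a single color channel stored as a nested list. The pixel values are invented, and clipping the window at the image border is an assumption of this example; image-processing libraries offer several other border conventions.

```python
from statistics import mean, median


def window_filter(channel, reducer, radius=1):
    """Apply reducer to the (2 * radius + 1)^2 neighborhood of every pixel.

    Windows are clipped at the image border (an assumption of this sketch).
    """
    rows, cols = len(channel), len(channel[0])
    filtered = []
    for i in range(rows):
        filtered_row = []
        for j in range(cols):
            window = [
                channel[y][x]
                for y in range(max(0, i - radius), min(rows, i + radius + 1))
                for x in range(max(0, j - radius), min(cols, j + radius + 1))
            ]
            filtered_row.append(reducer(window))
        filtered.append(filtered_row)
    return filtered


# Invented R channel with one extreme pixel in the center.
red = [
    [100, 101, 99],
    [100, 250, 100],
    [101, 100, 99],
]
averaged = window_filter(red, mean)
medianed = window_filter(red, median)
print(round(averaged[1][1], 1))  # 116.7
print(medianed[1][1])            # 100
```

In this example the averaging filter spreads the extreme value over the neighborhood, whereas the median filter removes it.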

Errors in Dye Concentration Measurements Due to Variability in RGB Values
To study the influence of different correction methods on the accuracy of the dye concentration estimations, it would be necessary to conduct a calibration experiment in which photographs of dye-stained soil samples are taken under several different lighting conditions. In the present study, the photographs of dye-stained soil samples were corrected only for a slight spatial variation in illumination. However, it is possible to calculate the magnitude of the errors resulting from the variability that remains after the different correction methods are applied. This was done by generating an artificial data series of 16 different RGB value combinations in the concentration range 0 to 1.5 g L–1 (with an increment of 0.10 g L–1). Then, 32 new RGB combinations were calculated by either adding or subtracting the SDs presented in Table 2. Finally, the RMSE was calculated for these 32 modified RGB combinations. The results are presented in Table 5. The resulting RMSE was similar for the different shutter speeds and calibration models. Therefore, only the RMSE calculated using the logarithmic model for the 1/30-s shutter speed is presented.
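The error calculation described above can be sketched as follows. The two model functions are hypothetical linear stand-ins, not the calibration relationships fitted in this study, and applying the same SD to all three channels at once is an assumption.

```python
import math


def forward(cs):
    """Hypothetical color response: (R, G, B) of a soil at concentration cs (g/L)."""
    return (180.0 - 60.0 * cs, 150.0 - 15.0 * cs, 120.0)


def inverse(rgb):
    """Hypothetical calibration: estimate cs (g/L) from the R value only."""
    return (180.0 - rgb[0]) / 60.0


def propagated_rmse(sd, forward, inverse):
    """RMSE of cs estimates after shifting 16 RGB combinations by +sd and -sd."""
    concentrations = [0.1 * i for i in range(16)]  # 0 to 1.5 g/L in steps of 0.1
    errors = []
    for cs in concentrations:
        rgb = forward(cs)
        for sign in (1, -1):
            shifted = tuple(value + sign * sd for value in rgb)
            errors.append(inverse(shifted) - cs)
    # 16 concentrations x 2 signs give 32 modified combinations.
    return math.sqrt(sum(e * e for e in errors) / len(errors))


print(round(propagated_rmse(1.8, forward, inverse), 3))  # 0.03
```

With this linear stand-in the result is simply the SD divided by the slope (1.8 / 60); a logarithmic calibration would make the error depend on concentration.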


Table 5. Root mean square errors in dye concentration estimations resulting from variability in RGB (red, green, and blue) values. Only results using the 1/30-s shutter speed and the logarithmic model (Eq. [3]) are presented.

 
Overall, the relationship between soil color and dye concentration for the Lund soil was the most sensitive to changes in RGB values and the one for the Löddeköpinge soil was the least sensitive. Again, the advantage of correcting for color temperature differences using CCD raw data instead of using image files is evident.

Recommendations for Future Experiments
The following recommendations for dye tracer experiments can be made based on the results presented in this study. First, the exposure settings should be determined. A photograph of (at least) two soil samples representing the maximum and minimum dye concentrations expected should be taken before the experiments. The light should be measured using a gray card. If some of the RGB values are too low or too high, the exposure should be changed so that all the RGB values fall within an acceptable range (say, between 20 and 230). Alternatively, the concentration range can be changed. If (and only if) high accuracy is needed, many (at least 100) calibration samples should be taken and an NN calibration model used. However, fully acceptable results can be achieved with a limited number of samples (around 20) and calibration Eq. [3] or [4]. During tracer experiments using ambient light, a flat-field image of a gray card covering the dye-stained area should be taken. As soon as the lighting conditions change, a new flat-field image should be taken. If high accuracy is needed, I recommend that an artificial light source be used. Corrections for inhomogeneous illumination can be made using background subtraction with either Eq. [1] or [2]. If the white point or color temperature varies between photographs, as it will during experiments using ambient light, a digital camera should be used and both the exposure and the white point should be determined for each photo (or photo session). Using the raw format, the correction for different white points can be made with minimal information loss. Thus, using a digital camera will lead to more accurate dye concentration estimations since the uncertainty introduced by differences in color temperature will be significantly reduced.
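The exposure check in the first recommendation can be expressed as a short sketch. The function name and the RGB values of the two test photographs are invented; the limits of 20 and 230 are the example range suggested above.

```python
def out_of_range(samples, low=20, high=230):
    """Return the (label, channel, value) triples that fall outside [low, high]."""
    problems = []
    for label, rgb in samples.items():
        for channel, value in zip("RGB", rgb):
            if not low <= value <= high:
                problems.append((label, channel, value))
    return problems


# Invented median RGB values of the two test samples.
test_shots = {
    "minimum Cs": (185, 160, 130),
    "maximum Cs": (10, 95, 120),
}
print(out_of_range(test_shots))  # [('maximum Cs', 'R', 10)]
```

An empty list would indicate that the exposure setting is acceptable for both samples; here the low R value of the darkest sample suggests a longer exposure or a narrower concentration range.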


    CONCLUSIONS
Four separate experiments were conducted to study the performance of different image correction methods and the relationship between soil color and dye concentration. In the first three experiments, the variability of RGB values associated with image noise, correction for white balance, and background subtraction method was investigated in detail. It was shown that the RGB and HSV background subtraction methods performed equally well. The variability remaining after correction of white balance was considerably lower if the correction was made using CCD raw data instead of using the TIFF files, which were processed by the camera's software.

The accuracy of using image analysis for estimating dye tracer concentration was investigated for three different soils. Between 72 and 260 samples of each soil were prepared and photographed using three different exposure settings. Because of the controlled experimental conditions, images only needed to be corrected for inhomogeneous lighting. Three different calibration models relating the images' RGB values to dye concentration Cs were tested: (i) a relationship between the logarithm of Cs and a second-order polynomial in R, G, and B; (ii) a relationship between Cs and a second-order polynomial in R, G, and B; and (iii) an NN model. For the Revinge soil, the logarithmic model gave a lower RMSE than the linear model, whereas the other soils had similar RMSE values with both models. In almost all cases, the NN gave the best estimations; however, the improvements were not great. The different exposure settings did not greatly affect the results either. The exception was the Revinge soil, where overexposed images gave a much lower RMSE. This was because the R values of the correctly exposed images were very low, around 10 for the highest Cs, which led to increased uncertainty. Recommendations on how laboratory and field experiments should be planned and conducted to achieve the highest accuracy were also presented.


    ACKNOWLEDGMENTS
 
This study was funded by the Swedish Research Council. I also thank Johan Wendt for his help during the laboratory experiments.

Received for publication June 4, 2004.

