Comparison of spectro-biophysical and yield parameters of cotton in irrigated lowlands of Amudaria River

This study aims defining the best predictors of biophysical parameters and yield with vegetation indices derived from Landsat 8 OLI surface reflectance data. The study was conducted in 2015 at five crop fields in Kulavat canal irrigation system in Khorezm province, Uzbekistan. The Environment for Visualizing Images (ENVI) ver. 4.5 and R programming software ver. 1.0.143 were used to process, calculate seven vegetation indices (VIs) and predict biophysical parameters and yield of cotton.

The trend analysis show that in-situ measured biophysical parameters for the whole growth stage of cotton follows the 3rd order polinomial curve (R2 = 0.96-0.99). The NDVI, SAVI, TVI and RVI had linear interrelationship between each other with strong positive correlation of R2>0.9 (highly significant with p-value=0). The VIs showed a logarithmic function relationship with crop height (crht), power function relationship with green biomass (gbm) and leaf area index (LAI), and linear function relationship with the fraction of photosyntetically active radiation below the plant canopy (FPAR) during the entire growing period of cotton. Among seven VIs tested in this study, the NDVI/SAVI and GCI explained 88 and 91 % of variation in crht, respectively. These three indices also well explained gbm variation (R2=0.86). The TVI was slightly better explained FPAR than NDVI and SAVI (all R2>0.87). The NDVI, SAVI and TCG explained 82 % of variation in LAI. Among all VIs, GCI, NDGI and RVI were found to be the best predictor of cotton yield during August, explaining 76-79 % variability (p<0.001).

Based on spectro-biophysical analysis, VIs derived from RS data on July and August (anthesis and peak growth stages of cotton) is more reliable to use for modeling cotton yield (seed and lint yields together). Therefore, field data collection is recommended to perform during these months taking into account in-field crop condition and remotely sensed data acquisition date. In addition, September 5-20 is the second important period (i.e., cotton pick-up) to conduct yield data collection for establishment of relationships between cotton yields with VIs (July-August).


INTRODUCTION
Population growth and climate variability necessitate using available land and water resources effectively to maintain sustainable agricultural production in the Central Asian countries, especially in lowlands of the Amudarya and Syrdarya Rivers. Socio-economic transformation leaded to the intensification of agriculture and expansion of irrigation and melioration infrastructure since 1960s. Whereas mismanagement of land and especially water resources caused land degradation and yield loss. In addition, climate change (extreme events) becoming a major obstacle that decline food production, water contamination, and economic damages.
During the last 40 years, the expansion of cropland in Uzbekistan has slackened, but land is now used much more intensively (Goscomstat. Uzbekistan in figures. The State Committee of Republic Uzbekistan on statistics, 2015); [Djanibekov, 2008]: in the 1980s, on average a hectare of cropland produced 1.8 tn., but now it produces 2.2 tn. (e.g., 18 % more). To understand land use dynamics and to be able to predict possible future developments, constant monitoring is needed. In this context, in-situ phenological observation of crop development and growth is very important. Based on proper and in-time phenology observations, more stable crop yields and quality can be recommended for land users, which can facilitate future improvement of crop, water and land management. Keeping aforementioned statements, the main goal of the research is to assess usability of remote sensing derived vegetation indices as predictors of in-situ observed cotton biomass, leaf area index (LAI), fraction photosynthetically active radiation -(PAR) and yield.

MATERIALS AND METHODS OF RESEARCHES Study area
The study was conducted in Kulavat canal irrigation command area (ICA) in Khorezm province, located in the north-western part of Uzbekistan ( fig. 1 a, b). Cotton, winter wheat, rice, maize, sorghum, watermelons, and alfalfa are cultivated in irrigated lands of 42 000 ha within the study area that have regularly shaped fields [Löw, Duveiller, 2014]. In the region, multiple cropping practices are common, e.g., a double cropping sequences with a summer crop (rice, maize, vegetables etc.) following harvest of winter wheat.

Field selection and in-situ measurements
As generally applied in agriculture, conventional (RS) methodology is based on qualitative analysis of information derived from "training areas" (e.g. ground-truthing, GT). This training areas should provide quantification within and between field variability and thus has to be representative for the given region in further up-scaling (mapping) of agricultural crop types, their biophysical characteristics (height, leaf area index, biomass etc.), and yield using near-real-time and historical (archival) satellite images. However, vegetation status is affected by variable environmental conditions (soil type, fertility, water availability etc.) and agricultural management (sowing, time and amount of irrigation and fertilization, tillage, pesticide application, etc.). Keeping this statement in mind, field measurements were performed at different parts of the study area ( fig.  1c). For detailed biophysical measurements, three test sites within each five fields cultivated by cotton were established depending on access to the fields and per agreement by farmers (table 1). In total six field trips were conducted during April-October, 2015 in order to measure farm-byfarm (or pixel-by-pixel) crop biophysical parameters such as plant density, aboveground plant height, wet biomass/crop yield, fPAR and LAI values.

Earth observation (EO) data acquisition and processing
In total 9 scenes of Landsat 8 1 archive data were acquired in 2015 through EarthExplorer at http://earthexplorer.usgs.gov (table 2). This archive Landsat 8 data with 30 m spatial resolution have radiometric, geometric and atmospheric calibration and directly can be used to calculate various vegetation indices (VIs). The Environment for Visualizing Images (ENVI) ver. 4.5 was used to make layer stacking. At the first step, 7 bands of each 9 images were stacked one by one ( fig. 2, left) following 1 . The output file of individual image had a geographic extent that encompasses only the data extent where all 7 bands overlap. At the second step, created single file from individual images re-stacked again so that it contains all bands from nine images (e.g., 9*7=63 bands, fig. 2, right). This single file, containing 63 bands was used in order to calculate various VIs using R programming software. After layer stacking of all 63 bands at the extent of original image dimensions, the image size was increased on 10 times (e.g., 7.0 GB) from individual layer stacked image (800 MB), thus cropping (cutting) the stacked image within the boundary of focus area was performed to reduce the image size (memory). The 'raster', 'rgdal' and 'sp' packages were installed in R from the freely available source (e.g., https://cran.r-project.org/web/packages/). (left) and 63 bands in single file (right)

Calculation of vegetation indexes
This section focuses on vegetation indices (VIs), which can be broadly defined as combinations of surface reflectance at two or more wavelengths designed to highlight a particular property of vegetation. The VIs based on spectral reflectance have been used in agricultural research since 1969 [Tucker, 1979] in order to find functional relationships between canopy characteristics such as biomass, leaf area index (LAI) etc. and remote sensing observations. Canopy light reflectance properties mainly based on the absorption of light at specific wavelengths are associated with specific plant characteristics. In general, unhealthy vegetation has a lower photosynthetic activity, causing increased visible reflectance and the reduced near-infrared reflectance (NIR) from the vegetation. The spectral reflectance in the visible (VIS) wavelengths (400-700 µm) depends on the absorption of light by leaf chlorophyll and associated pigments such as carotenoids and anthocyanins [Babar et al., 2006]. Reflectance in the VIS range is low because of the high absorption of light energy by these pigments. The reflectance in the NIR wavelengths (700-1300 µm) is high because of the multiple scattering of light by different leaf tissues [Knipling, 1970].
To date, more than 150 VIs have been published in the scientific literature [Pettorelli, 2013]. Among these VIs, commonly used indices, such as normalized differential vegetation index (NDVI), soil adjusted vegetation index (SAVI), normalized difference greenness index (NDGI), transformed vegetation index (TVI), ratio vegetation index (RVI), green chlorophyll index (GCI) and tasseled cap transformations at greenness (TCTG) have been selected in this study (table 3) as predictors of biophysical variables of cotton.  [Huete, 1988]  In order to develop spectro-biophysical and yield relationships, reflectance as well as vegetation values are needed to be extract from layer stacked and cropped RS data based on three plot located pixels within each field (see table 1). Anderson et al. [1993] suggested that it is a more desirable when sample point data is compared with vegetation index value obtained for a single pixel (900 m 2 ).

RESULTS OF RESEARCHES AND THEIR DISCUSSIONS Trend analysis of biophysical parameters
In-situ measured biophysical paramters of cotton (Gossypium hirsutum L.) is given in fig. 3. Cotton height (crht, fig. 3A) follows approximately an exponantial curve from about 7±2 cm (5−8 true leaves covering up to 2 % of land area on May 11−15) up to 88±16 cm (time of the peak growth stage covering an area of 75−90 % on August 5−15). Thereafter, the crht was remained constant with slightly decrease on about 87±14 cm (after 1 st and 2 nd pick up of cotton on September 28) due to drying up and plant disturbance during the harvest. In general, the height of cotton for the whole growth stage follows the 3 rd order of polinomial curve (R² = 0.96).
Plant density (crds, fig. 3B) was ranged from 9 to 120 plants per meter square before the plant singling ( fig. 4). Once plant singling was completed, the crds was remained constant -12±4 plants m -2 with further shedding on about 11±2 plants m -2 (after 2 nd pick up of cotton on September 28). Similarly as the height, the crds of cotton for the whole growth stage is following the 3 rd order polinomial curve (R² = 0.99).
Crop green biomass (gbm, fig. 3C) was ranged from 5 to 120 gram per meter square at the initial stage of cotton development (greater biomass was due to high crop density as plant singling was not yet performed duing the measurement). Cotton wet biomass was increased exponentially untill the peak growth stage (e.g., 3.9±1.5 kg/m 2 on August 5−15). The gbm was then decreased up to 1.9±1.0 kg/m -2 on September 26−30 due to crop drying up after defoliation and harvest 1 . Similarly as the height, the crds of cotton for the whole growth stage is following the 3 rd order polinomial curve (R² = 0.99).
As a cotton plant develops, new leaves appear and expand, increasing sunlight interception. Hence, the fraction of photosyntetically active radiation below the plant canopy (fpar, fig. 3D) decreases from about 1670±225 µmol m -2 s -1 (when fraction of crop canopy cover of land area is low 2 % on May 11−15 and thus less amount of photosynthetically useful radiation is intercepted by the canopy) up to 135±100 µmol m -2 s -1 (time of the peak growth stage covering un area of 75−90 % on August 5−15 where greather amount of photosynthetically useful radiation is intercepted by the canopy). Thereafter, the fpar was increasing on about 480±230 µmol m -2 s -1 (when canopy cover is decreased on 30−45 % after the 1 st and 2 nd pick up of cotton on September 28) due to drying up of plant and less or no remained leaves after the harvest. Such a decrease and increase of the fpar during the growth period of cotton, follows the 3 rd order of polinomial curve (R² = 0.98).
On the contrary to the fpar, leaf area index (lai, fig. 3E) increased most quickly from 0.6±0.2 to 4.0±1.1 m 2 m -2 between 60 to 130 DAP with average rate of increase 0.06 m 2 m -2 day -1 . It is obvious, that, the lai increases exponentially per decrease of the fpar owing interception of the photosynthetically useful radiation by canopy ( fig. 5). Once canopy cover decreasing, the fpar starts to increase thus lai reducing progressively at the premature aging of the cotton. Further, the photosynthetic capacity of the plant reduces due to water stress, low fertility and other stresses and the value of lai reached to 1.5±0.8 m 2 m -2 on September 28. Similarly to the aforementioned biophysical variables, the lai during the growth period of cotton, follows the 3 rd order of polinomial curve (R² = 0.99).

. Relationship between leaf area index (lai) and photosyntetically active radiation (fpar) during the growth period of cotton for 15 plots
Cotton yield variability Cotton yields, measured using a handheld scale for all 5 fields x 3 plots are illustrated in fig. 6. Cotton yield was ranged from 1.3±0.4 (C4L1KV-C4L3KV, Khonka district) to 3.3±0.3 t ha -1 (C2H1KV−C2H3KV, Urgench District). Lower coton yields through C3M1KV (1.6±0.2 t ha -1 ) and C4L3KV could be associated as 20−45 % of balls and squares were injured by cotton bollworm (Helicoverpa armigera Hbn.) and aphis (Aphis gossypii Glov.) ( fig. 7) as well as cells, where heavy rainfall on ~May 7−8, 2015, feed on the meristems injured, and the resulting leaves appear crinkled and have holes in them ( fig. 8). Whereas, higher cotton yield at the C2H1KV-C2H3KV in farm "A. Karimov" could be the reason of a better farm management. In general, the yield variability of cotton is corresponding to the environmental condition governed within these fields.

Fig. 6. In-plot variation of the cotton yield within the five fields
Note: Here, cotton yild comprises lint and seed yields and could be higher than actually reported yield by farmers as latter include various losses; for location and decsription of the field ID refer to table 1

Range and dynamics of the vegetation indices (VIs)
The VIs curves for cotton, plotted as a function of time are given in fig. 9. The shape of the VIs curves during the growth period of cotton follows the 3 rd order polinomial curve, which are similar as curves of the biophysical variables described in previous section.
The NDVI values ( fig. 3A) ranged from 0.07±0.01 on April 19 (as 92−98% of land surface is barren during the germination stage of cotton) to 0.80±0.11 on August 9 (dense vegetation during the peak growth stage of cotton). The values of the SAVI was 1/3 fold less than NDVI (in avergae) and ranged from 0.05±0.01 (on April 19) to 0.53±0.07 (on August 9) (fig. 3B).
The NDGI ( fig. 3C) ranged from -0.11±0.00 (on April 19) to 0.18±0.08 (on August 9). Once cotton developed and canopy cover expand, variation of the NDGI values (difference between minimum and maximum) became high, e.g., 0.23−0.26 (July−August) due to higher reflectance in the green channel during the period.
The range of RVI ( fig. 3E) from 1.16±0.02 on April 19 (e.g., more bare soil) to 10.4±3.54 on August 9 (e.g., dense vegetation) is within the range of the findings of Jackson & Huete [1991], where RVI values of cotton in their study ranged from 1.43 (bare soil) to 20.2 (near-maximum green vegetation). The RVI is thus a non-linear measure of vegetation that is very sensitive to variations in areas with high vegetative cover but much less sensitive in areas with sparse cover.
The GCI ( fig. 3F) ranged from 0.43±0.03 (on April 19) to 5.86±1.68 (on August 9). Cotton after a week starts to germinate, however, major part of the soil surface still bare, which result smaller TCG value (i.e., 1270±670 on April 19). Once plant starts to grow up, the TCG would shift up until the peak vegetation stage (i.e., 2060±340 on August 9) and declines as the crop matured and then would 'tassel out' with senescence on September 26 ( fig. 3G).

Spectro-biophysical relationships
The relationships between the various vegetation indices (VIs), and biophysical quantities (plant height, biomass, LAI/fPAR and yield), both linear and non-linear models have been developed based on the best-fit R 2 values [Thenkabail, 2003]. Pearson's correlation coefficient (R 2 ) and asymptotic p-values (approximated by using the t or f-distributions) for all possible pairs of spectro-biophysical parameters are calculated using the R software.

Crop height, biomass, fPAR and LAI related with VIs
Correlation coefficients and p-values for all possible pairs of spectro-biophysical parameters of cotton are given in table 4. It is obvious that the NDVI, SAVI, TVI and RVI of Landsat 8 (OLI) data had linear interrelationship between each other with strong positive correlation of R 2 >0.9 (highly significant with p-value=0). It can be explained as these indices are based on reflectance values in the red and NIR wavelengths. For all the seven spectral reflectance indices, significant correlations (R 2 ≥0.92, p=0) were attained only when these indices were compared at the anthesis and the peak crop growth stages (e.g., July and August shown).
When all the temporal spectro-biophysical variables were considered together for the entire growing period of cotton, positive logarithmic and power relationships between NDVI and crht, gbm and lai was observed ( fig. 10 A, B, D). The saturation of NDVI due to large amounts of gbm and lai was evident [Brandão et al., 2015;Gutierrez et al., 2012] by the low NDVI changes at high biomass content (gbm>2.0 kg) and leaf area index value (lai>2.3 m 2 m -2 ) at the peak growth stage. At this point, NDVI tended to reach an asymptote at values between 0.81 and 0.88. In contrast, the linear relationship between NDVI and fpar was negative where NDVI increasing per decrease of fpar ( fig. 10 C). In general, the spectral indices showed a logarithmic function relationship with crht, power function relationship with gbm and lai, and linear function relationship with fpar during the entire growing period of cotton (table 5). Among 7 VIs tested in this study, the NDVI/SAVI and GCI explained 88 and 91 % of variation in crht, respectively. These three indices also well explained gbm variation (R 2 =0.86). The TVI was slightly better explained fpar than NDVI and SAVI (all R 2 >0.87). The NDVI, SAVI and TCG explained 82 % of variation in lai. Obtained regression equations with higher R 2 (table 5) can be used to predict cotton biophysical variables. However, crop types must be first discriminated before individual spectro-biophysical quantitative relationships are applied to spatially map within and between farm variability in every farm and every pixel within a farm [Thenkabail, 2003].

Yield related with VIs
Among all VIs, GCI, NDGI and RVI were found to be the best predictor of cotton yield during August, explaining 76−79 % variability (p<0.001, table 6). The VIs-cryd relationships can be improved certainly if crop yield data from the farm "Iroda-Ogiljon" (e.g., C3M1KV, C3M2KV and C3M3KV) is omitted in the regression analysis. This is because, although, cotton at this field was better developed and had relatively higher density, biomass, LAI (refer to fig. 3) and thus higher VIs, the yield was rather low (e.g., 1.5−1.8 t ha -1 ) compared to other fields (refer to fig. 6).

CONCLUSIONS
It is interesting to mention that yield/production forecasting depends upon the data collection technique from ground-based field visits that constituted sample surveys based on crop harvesting experiments. These yield surveys are extensive as plot yield data are collected through stratified multistage random sample techniques. From the data obtained in this way yields can be forecasted at the regional and national level. However, such a technique has three major drawbacks: (i) it is time-consuming, subjective, and prone to significant discrepancies as a result of insufficient ground observations that cause poor crop production assessment; (ii) the outcomes are usually made available to the government and public after several months of the harvesting of the crop; and (iii) it is costly, depending on the survey areas.
Currently, the ground-based data collection method is in practice in Uzbekistan and district/province branches of the Ministry of Agriculture and Water Resources (MAWR) and State Committee on statistics of the republic of Uzbekistan are responsible organizations for that. These organizations collect data at the basic administrative level/unit (i.e. massifs) and then aggregate at the district, region and country-levels.
In this context, remote sensing-based methods have already been proven as an effective alternative for mapping crop area, biophysical parameters and yield forecasting. The benefits of remote sensing technology include: (i) spatial coverage over a large geographic area; (ii) availability during all seasons; (iii) relatively low cost, since some optical images are freely available (i.e., MODIS, Landsat); (iv) efficient analysis; (v) they provide information in a timely manner; and (vi) they are capable of delineating detailed spatial distributions of areas under crop cultivation.
One of the most important issues is that regardless the employed method (i.e., ground or remote sensing-based) the user requires fast, reliable (accurate), less costly, and least labor-intensive ways; and also forecasting should take place prior to harvesting of the crop.
In this study, our objective was to quantify within and between field variability of vegetation status (i.e., crop height, biomass, fPAR, LAI) and yield using vegetation indices from historical and near-real-time Landsat-8 Operational Land Imager (OLI) Sensors. The best regression equations developed from the spectro-biophysical analysis can be used for mapping crop areas and forecasting its biophysical variables and production and thus can provide the basis of the use of remote sensing imagery for precision-farming applications.
Based on spectro-biophysical analysis, VIs derived from RS data on July and August (anthesis and peak growth stages of cotton) is more reliable to use for modeling cotton yield (seed and lint yields together). Therefore, field data collection is recommended to perform during these months taking into account visual observation of in-field crop condition; canopy cover under diverse strata (environmental condition) coupled with Landsat-8 data acquisition date. In addition, September 5-20 is the second important period (i.e., cotton pick-up) to conduct yield data collection for establishment of relationships between cotton yields with VIs (July−August).