Principal component regression analysis demonstrates the collinearity-free effect of sub estimated climatic variables on tree growth in the central Amazon

Introduction: Climatic variables show a seasonal pattern in the central Amazon, but the intra-annual variability effect on tree growth is still unclear. For variables such as relative humidity (RH) and air vapor pressure deficit (VPD), whose individual effects on tree growth can be underestimated, we hypothesize that such influences can be detected by removing the effect of collinearity between regressors. Objective: This study aimed to determine the collinearity-free effect of climatic variability on tree growth in the central Amazon. Methods: Monthly radial growth was measured in 325 trees from January 2013 to December 2017. Irradiance, air temperature, rainfall, RH, and VPD data were also recorded. Principal Component Regression was used to assess the effect of micrometeorological variability on tree growth over time. For comparison, standard Multiple Linear Regression (MLR) was also used for data analysis. Results: Tree growth increased with increasing rainfall and relative humidity, but it decreased with rising maximum VPD, irradiance, and maximum temperature. Therefore, trees grew more slowly during the dry season, when irradiance, temperature and VPD were higher. Micrometeorological variability did not affect tree growth when MLR was applied. These findings indicate that ignoring the correlation between climatic variables can lead to imprecise results. Conclusions: A novelty of this study is to demonstrate the orthogonal effect of maximum VPD and minimum relative humidity on tree growth.

The Amazon rainforest has a significant impact on both the water and carbon cycles because of its enormous extent (~5.1 × 10⁶ km²) and the large amount of carbon stored in its vegetation, about 86 Pg (Saatchi, Houghton, Alvalá, Soares, & Yu, 2007). Tree growth can be defined as the increase in biomass through time, and it is often estimated by measuring the increment of stem diameter over time, a proxy of biomass gains at the ecosystem level (Wagner, Rossi, Stahl, Bonal, & Herault, 2012; Wagner et al., 2014; Dias & Marenco, 2016; Antezana-Vera & Marenco, 2020). As tree growth is greatly affected by factors that affect photosynthesis, sun-induced fluorescence, a proxy of ecosystem photosynthesis, has been used to estimate the effect of environmental factors on the total carbon gain of the ecosystem (Lee et al., 2013; Yang et al., 2018; Green et al., 2020).
DOI 10.15517/rbt.v69i2.44489
Tree growth can be affected by intrinsic factors (e.g., genetic make-up) and environmental factors, such as nutrient availability, irradiance, temperature, rainfall, and soil water content (SWC). The influence of environmental factors on tropical tree growth has been widely studied (Wagner et al., 2012; Mendes, Marenco, & Magalhães, 2013; Wagner et al., 2014; Dias & Marenco, 2016; Méndez, 2018). However, the drivers of tree growth are rather difficult to elucidate because they are often correlated (Bowman, Brienen, Gloor, Phillips, & Prior, 2013; Wagner et al., 2014). Therefore, the effects of climatic parameters, such as temperature, rainfall or SWC, on tree growth are still under investigation in the Amazon (Laurance et al., 2009; Wagner et al., 2014; Dias & Marenco, 2016). Rainfall and SWC seem to be the major factors that affect tree growth in the Amazon region, but it is still under debate whether Amazonian trees grow faster in the wet season than in the dry season.
Although in most studies tree growth or ecosystem photosynthesis seems to decrease in the dry season (Méndez, 2018; Wagner et al., 2014; Yang et al., 2018; Antezana-Vera & Marenco, 2020), the opposite effect has also been reported (Laurance et al., 2009; Green et al., 2020). Whereas Laurance et al. (2009) reported that tree growth was fastest during the dry period and positively correlated with maximum temperatures (T_max), Wagner et al. (2014) found no significant effect of T_max on tree growth.
Climatic parameters are often correlated, and hence collinearity can lead to imprecise results by increasing the Variance Inflation Factor (VIF), as an increase in VIF can lead to false non-significant effects. Moreover, because of collinearity, the sign of a regression coefficient may change (from positive to negative or vice versa, Montgomery, Peck, & Vining, 2012). This is important because the significance and sign of regression coefficients are crucial to understand the effect of climatic variables on tree growth. Principal Component Analysis (PCA) is commonly used to deal with the collinearity problem, whereby a new set of independent variables (orthogonal components) is computed from the original regressors. However, PCA's disadvantage is the lack of a direct association between a response variable and the extracted components. To overcome this difficulty, PCA's orthogonal components can be used to perform Principal Component Regression (PCR). An accurate estimate of the effect of climatic variability on tree growth is essential due to the influence of the Amazon forest on the global carbon balance and regional climate. Thus, this study aimed to determine the collinearity-free effect of climatic variability on tree growth in the central Amazon. We hypothesize that the influence of highly correlated climatic variables such as relative humidity and VPD on tree growth can only be detected after removing the effect of collinearity.
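The link between correlation and the VIF can be illustrated with a minimal sketch. It assumes the simplest case of a model with only two regressors, where both share the same VIF = 1 / (1 − r²), r being their Pearson correlation. The monthly RH and VPD values below are hypothetical and are not data from this study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_regressors(x1, x2):
    """VIF shared by both regressors in a two-regressor model: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

# Hypothetical monthly means (illustration only): RH in %, VPD in hPa.
rh_mean = [88.0, 86.0, 84.0, 79.0, 75.0, 78.0]
vpd_mean = [4.1, 4.9, 5.6, 7.8, 9.4, 8.0]

# Strongly (negatively) correlated regressors give a VIF far above 10.
print(vif_two_regressors(rh_mean, vpd_mean))
```

With more than two regressors, r² is replaced by the R² of each regressor regressed on all the others, so this two-variable form is only a lower-dimensional illustration.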

Study site and plant material:
The study was conducted from January 2013 to December 2017 (the experimental period) at the Tropical Forest Experiment Station (ZF2 Reserve), in central Amazonia, located 60 km north of Manaus (02°36'21" S & 60°08'11" W). The area is a terra-firme rainforest plateau at about 120 m above sea level. Annual rainfall is 2 420 mm, with a mild dry season whose driest months run from July through September (≤ 100 mm per month). The soil is an Oxisol with low fertility, clay texture and pH of 4.2 to 4.5. At this site, tree density is high, about 600 trees ha⁻¹ (> 10 cm diameter at breast height, DBH), canopy height can reach 35-40 m, and most trees are less than 30 cm in diameter, while leaf area index varies from 4.7 in the dry season to 5.0 in the wet season. Mean wood density is about 0.75 g cm⁻³, and species diversity is high (Dias & Marenco, 2016). At a site 30 km from Manaus, Prance, Rodrigues, and Silva (1976) recorded 179 species of trees in one hectare (≥ 15 cm DBH).
During the experimental period, air temperature (T), photosynthetically active radiation (PAR), RH, and rainfall data were recorded daily above the forest canopy, at the top of a 40-m-tall observation tower (02°35'20" S & 60°06'55" W). Temperature, RH and PAR data were logged at 15 min (PAR) or 30 min intervals (T and RH) with specific sensors (Humitter 50y, Vaisala Oyj, Finland; Lincoln, NE, USA) connected to a data logger (Li-1400, Li-Cor), while daily rainfall data were collected with a tipping bucket gauge (ECR-100, Em5b, Decagon Devices, Pullman, WA, USA). The PAR data were integrated over a 24-h period to obtain a daily value (mol m⁻² day⁻¹). We also computed VPD and potential evapotranspiration (EVT) and measured soil water content (SWC, %, v/v). VPD was obtained as $\mathrm{VPD} = \mathrm{VP_{sat}} - \mathrm{RH} \times \mathrm{VP_{sat}}$, where VP_sat is the saturation vapor pressure, $\mathrm{VP_{sat}}\,(\mathrm{kPa}) = 0.61365 \exp[17.502\,T / (240.97 + T)]$, and T (°C) is the air temperature (Buck, 1981). EVT was obtained as $\mathrm{EVT} = 0.0023\,R_a\,(T_{mean} + 17.8)\,(T_{max} - T_{min})^{0.5}$, where R_a denotes the extraterrestrial radiation (Hargreaves & Samani, 1985). Undisturbed soil samples were collected at a depth of 100 to 200 mm every two weeks to determine SWC after drying the samples at 105 °C, and then a mean monthly value was obtained.
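The two derived variables can be sketched in a few lines. This is a minimal illustration, assuming that RH enters the VPD formula as a fraction (0 to 1) and that R_a is supplied in evaporation-equivalent units (mm day⁻¹); neither unit convention is stated in the text, and the function names and input values are hypothetical.

```python
import math

def saturation_vapor_pressure_kpa(t_celsius):
    """Saturation vapor pressure (kPa) from air temperature (deg C), Buck (1981) form."""
    return 0.61365 * math.exp(17.502 * t_celsius / (240.97 + t_celsius))

def vpd_kpa(t_celsius, rh_fraction):
    """VPD = VP_sat - RH * VP_sat, with RH assumed to be a fraction (0-1)."""
    vp_sat = saturation_vapor_pressure_kpa(t_celsius)
    return vp_sat - rh_fraction * vp_sat

def hargreaves_evt(r_a, t_mean, t_max, t_min):
    """EVT = 0.0023 * R_a * (T_mean + 17.8) * (T_max - T_min)^0.5.

    r_a is assumed to be in mm/day (evaporation equivalent), so the
    result is in the same units.
    """
    return 0.0023 * r_a * (t_mean + 17.8) * math.sqrt(t_max - t_min)

# Hypothetical inputs for illustration only.
print(vpd_kpa(25.0, 0.80))                      # VPD in kPa (multiply by 10 for hPa)
print(hargreaves_evt(15.0, 26.0, 31.0, 21.0))   # EVT in mm/day
```

Multiplying the VPD result by 10 converts kPa to the hPa units used in the discussion of the results.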
In this study we collected data from 325 trees from more than 48 species (Digital Appendix), which had a mean diameter at breast height (DBH, diameter at 1.3 m from the ground) of 23.1 ± 11.8 cm. From tree diameter, tree height was estimated to be 22.5 ± 5.2 m (Nogueira, Nelson, Fearnside, França, & Oliveira, 2008). In these trees we measured radial growth at breast height at monthly intervals over 60 months (2013-2017) using stainless steel dendrometer bands, which had been installed three years before the beginning of the study.

Statistical analyses:
To assess the effects of monthly microclimatic variability on tree growth, Principal Component Regression (PCR) was used. We used this approach to remove the effect of collinearity among climatic variables. In this analysis, we used detrended tree growth (T_GC, hereafter referred to simply as tree growth) instead of undetrended tree growth (T_GR, i.e., raw data), because a time-related trend in growth data can affect PCR results (Monserud & Marshall, 2001). Detrending was accomplished by using a first-order autoregression (Montgomery et al., 2012). Then the whole data set (N = 325 trees) was randomly split into two subsets: one with 75 % of the trees (244 trees) was used to estimate the regression coefficients, and the remaining 25 % (81 trees) was used for validation. Prior to PCR analysis, the climatic data were standardized. In the PCR model, instead of including all the examined factors, we used only those that together explained most of the variance (i.e., those with eigenvalues equal to or greater than one). For comparison, the significance of the regression coefficients based on standard Multiple Linear Regression (MLR) was also computed. An MLR model can be represented as in Equation 1 (Montgomery et al., 2012). (Equation 1) In Equation 1, y_i denotes the dependent variable, x_i the regressor, β_0 the intercept, β_j the slope of the regression, and ϵ the error term, with β_0 given by Equation 2. (Equation 2) For standardized regressors, with mean x̄_j and standard deviation s_j, y_i and β_0 are rewritten accordingly, the mean square error (MSE = s²) being an estimator of σ², with s = √MSE. When the regressors are highly correlated, principal components can be used to transform those regressors into a new set of uncorrelated (mutually orthogonal) variables.
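The random split and the standardization step described above can be sketched as follows. This is a minimal illustration, not the procedure run in Statistica: the seed, the function names, and the use of the sample standard deviation (n − 1 denominator) are assumptions.

```python
import random
import statistics

def split_trees(tree_ids, train_fraction=0.75, seed=0):
    """Randomly split tree identifiers into calibration and validation subsets."""
    ids = list(tree_ids)
    random.Random(seed).shuffle(ids)          # reproducible shuffle (seed is arbitrary)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]

def standardize(values):
    """Center on the mean and scale by the sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)             # n - 1 denominator (assumed)
    return [(v - mean) / sd for v in values]

train, valid = split_trees(range(1, 326))     # 325 trees, hypothetical IDs 1..325
print(len(train), len(valid))                 # 244 81
```

Rounding 0.75 × 325 = 243.75 to the nearest integer reproduces the 244/81 split reported in the text.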
In terms of standardized regressors, PCR can be computed as follows (Montgomery et al., 2012). In Equation 17, T is a matrix whose columns are the eigenvectors (derived from the X data), while in Equation 20, the columns of Z form a new set of orthogonal scores (i.e., the z-scores), which are termed principal components (Montgomery et al., 2012). The α̂ coefficients and the covariance of α̂ are given by Equation 21 and Equation 22, respectively. In Equation 27, the MSE is obtained from the regression of Y on the u principal components retained in the reduced model, while t_jm denotes the j-th element of the eigenvector t_m (m = 1, …, u). The significance of the principal component estimator (b_pc) can be tested on individual coefficients by using the t statistic with n − k − 1 degrees of freedom, where k represents the number of principal components in the reduced model, as described in Equation 28. Statistical analyses were carried out using Statistica 7.0 (StatSoft Inc., 2004).
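For orientation, the quantities named in this section can be written in their standard textbook form. The block below is a sketch under stated assumptions and does not reproduce the numbered equations of the original article: X is taken to hold the centered and scaled regressors, y the centered response, p the number of regressors (a symbol introduced here), and λ_m the eigenvalues of X'X. The sum in the variance runs over the retained components only.

```latex
\begin{align*}
y_i &= \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij} + \epsilon_i
  && \text{(MLR model)} \\
X'X &= T \Lambda T', \qquad \Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_p)
  && \text{(eigenvectors in the columns of } T\text{)} \\
Z &= XT, \qquad Z'Z = \Lambda
  && \text{(principal components)} \\
\hat{\alpha} &= (Z'Z)^{-1} Z'y = \Lambda^{-1} Z'y, \qquad
\mathrm{Var}(\hat{\alpha}) = \sigma^2 \Lambda^{-1} \\
b_{pc} &= T_u \hat{\alpha}_u, \qquad
\widehat{\mathrm{Var}}(b_{pc,j}) = \mathrm{MSE} \sum_{m \in \text{retained}} \frac{t_{jm}^2}{\lambda_m} \\
t_{n-k-1} &= \frac{b_{pc,j}}{\sqrt{\widehat{\mathrm{Var}}(b_{pc,j})}}
\end{align*}
```

Because each variance term is divided by λ_m, dropping components with small eigenvalues is what removes the inflation of the standard errors.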
RESULTS
The undetrended radial tree growth was 0.105 ± 0.11 mm per month (N = 325 trees), with a lower radial increment during the driest season (Fig. 1A). A preliminary analysis showed that when all 13 factors were included in the PCR model (full model), no effect on tree growth was observed, even though the regression explained 23.7 % of the total variance (F(13,46) = 1.10, P = 0.382, R² = 0.237). In fact, the full PCR model corresponds to the MLR model of tree growth on all the climatic variables (full model MLR). The principal component analysis (PCA) showed that the first four of the 13 factors extracted (in bold face in Fig. 2) together accounted for 92.9 % of the total variance and had eigenvalues higher than one (λ_j = 8.16, 1.86, 1.05, and 1.01), whereas the values of λ_5 to λ_13 were lower than 1.0 (Fig. 2). Therefore, we retained the first four factors (Kaiser criterion) and used their corresponding eigenvectors to obtain the z_j scores, hereafter referred to as principal components (z_j). In comparison with the full PCR model, the significance of the four-component model improved (F(4,55) = 2.45, P = 0.056, R² = 0.151). By reducing the complexity of the model, the proportion of variance in tree growth explained by the regression was also reduced (23.7 against 15.1 %). Moreover, the four-factor model showed that only the principal components z_1 and z_3 had a significant effect on tree growth, whereas z_2 and z_4 did not (z_1: P = 0.03, z_2: P = 0.46, z_3: P = 0.04, and z_4: P = 0.86). Therefore, only z_1 and z_3 were retained for further analysis; this model is hereafter referred to as the reduced model. As expected, the significance of the reduced model improved further when only the two significant components were retained (F(2,57) = 4.73, P = 0.012, R² = 0.142), with both regression coefficients being significant (α_1 = -0.003927, P = 0.03; α_3 = -0.010267, P = 0.04).
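The figures reported above can be cross-checked with simple arithmetic. The sketch below assumes n = 60 monthly observations (implied by the reported degrees of freedom) and uses the usual relation between R² and the overall F statistic; the function name is introduced here for illustration.

```python
def f_from_r2(r2, k, n):
    """Overall F statistic of a regression with k regressors and n observations."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

# Eigenvalues of the four retained factors, as reported in the text.
retained_eigenvalues = [8.16, 1.86, 1.05, 1.01]
n_variables = 13  # with standardized variables, the eigenvalues sum to 13

explained = sum(retained_eigenvalues) / n_variables
print(round(explained, 3))                    # 0.929

n_months = 60
print(round(f_from_r2(0.237, 13, n_months), 2))  # full model, F(13,46)
print(round(f_from_r2(0.151, 4, n_months), 2))   # four-component model, F(4,55)
print(round(f_from_r2(0.142, 2, n_months), 2))   # reduced model, F(2,57)
```

The recomputed F values agree with the reported ones to within the rounding of R², which illustrates why the smaller model is more significant despite explaining less variance: far fewer degrees of freedom are spent on the numerator.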
As z_1 and z_3 were retained in the reduced PCR model, only Factor 1 and Factor 3 are shown (Fig. 2); for further information, tree growth (T_GC) is included in Fig. 2 as a supplementary variable. In Fig. 2, taking T_GC as a reference point, the climatic variables separate into three groups. The first group comprised RH_min, RH_mean, rainfall, and SWC, which together with T_GC lie in quadrant III of the factor plane. This suggests a positive correlation between each of them and T_GC. The second group (PAR, EVT, T_max and VPD_max) is located in quadrant I (i.e., diagonally opposite to T_GC), thereby indicating a negative correlation between the variables of this group and T_GC. The third group included variables located in the adjacent quadrants, i.e., quadrant II (RH_max) and quadrant IV (T_min, T_mean, VPD_min and VPD_mean), indicating a low correlation between each of these variables and T_GC. We further investigated the relationship between the climatic variables and T_GC after removing the effect of collinearity (i.e., by PCR).
The PCR regression coefficients (α_1 and α_3) were used to compute the beta coefficients, b_pc (the subscript pc stands for principal components), and their SE (Equation 25 and Equation 27). The coefficients b_pc based on standardized regressors are shown in Table 1, while the coefficients (b_j) obtained by MLR are shown in Table 2.
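The back-transformation from component coefficients (α) to regressor coefficients (b_pc) can be sketched with NumPy. This is an illustrative implementation on synthetic data, not the Statistica workflow used in the study; the function `pcr`, the variable names, and the simulated regressors are all assumptions.

```python
import numpy as np

def pcr(X, y, keep):
    """Principal component regression on standardized regressors.

    X: (n, p) raw regressors; y: (n,) response; keep: indices of the
    retained components, ordered by decreasing eigenvalue (0-based).
    Returns b_pc, their standard errors, and the t statistics.
    """
    n, _ = X.shape
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardize
    yc = y - y.mean()                                    # center the response
    eigvals, T = np.linalg.eigh(Xs.T @ Xs / (n - 1))     # correlation matrix
    order = np.argsort(eigvals)[::-1]                    # largest eigenvalue first
    T = T[:, order]
    Tk = T[:, keep]
    Zk = Xs @ Tk                                         # retained component scores
    ss = np.sum(Zk ** 2, axis=0)                         # Z_m'Z_m (orthogonal columns)
    alpha = (Zk.T @ yc) / ss                             # component coefficients
    resid = yc - Zk @ alpha
    mse = resid @ resid / (n - len(keep) - 1)
    b_pc = Tk @ alpha                                    # back to regressor scale
    se = np.sqrt(mse * (Tk ** 2 / ss).sum(axis=1))       # SE of each b_pc
    return b_pc, se, b_pc / se

# Synthetic example: x2 is nearly a copy of x0 (strong collinearity).
rng = np.random.default_rng(0)
n = 60
x0 = rng.normal(size=n)
x1 = rng.normal(size=n)
x2 = x0 + 0.1 * rng.normal(size=n)
X = np.column_stack([x0, x1, x2])
y = 1.0 * x0 + 0.5 * x1 + 0.5 * rng.normal(size=n)

b_full, se_full, _ = pcr(X, y, keep=[0, 1, 2])   # equivalent to OLS
b_red, se_red, t_red = pcr(X, y, keep=[0, 1])    # smallest component dropped
print(se_full, se_red)
```

Keeping all components reproduces ordinary least squares on the standardized regressors, whereas dropping the component with the smallest eigenvalue shrinks the standard errors of the collinear regressors, which mirrors the contrast between Table 2 and Table 1.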
After removing the effect of collinearity, we found that tree growth was significantly responsive to variation in PAR (x_1), rainfall (x_2), T_max (x_3), RH_mean (x_4), RH_min (x_5), VPD_max (x_6), SWC (x_7), and EVT (x_8), whereas T_min, T_mean, RH_max, VPD_min, and VPD_mean had no significant effect on tree growth (Table 1). Tree growth increased with a rise in rainfall, SWC, RH_mean, and RH_min, whereas it decreased with increasing PAR, T_max, EVT, and VPD_max, as shown in Equation 29 (based on standardized regressors, Table 1) and Equation 30 (regressors in the original scale). To validate the PCR model (Equation 30) we used the growth data of 81 trees; the R² derived from the validation data set was slightly higher than the R² of the data used to build the model (R² = 0.17 vs. 0.12, Fig. 3).
Table 1. EVT: potential evapotranspiration, PAR: photosynthetically active radiation, T: temperature, T_max: mean maximum T, T_mean: mean T, T_min: mean minimum T, RH: relative humidity, RH_max: mean maximum RH, RH_mean: mean RH, RH_min: mean minimum RH, VPD: air vapor pressure deficit, VPD_max: mean maximum VPD, VPD_min: mean minimum VPD, VPD_mean: mean VPD, and MSE: mean square error. Climatic data were standardized prior to statistical analysis. For T_GC, N = 244.
Significant P values are in bold face.

(Equation 30)
By applying the standard MLR approach, we found that none of the climatic parameters modified tree growth (F(13,46) = 1.10, P = 0.382, R² = 0.237, Table 2). Furthermore, in comparison with the results obtained with the reduced PCR model (only factor 1 and factor 3), the MLR coefficients (b_j) had much larger SE (Table 2). For instance, the SE of RH_mean and VPD_mean were more than two orders of magnitude higher than those obtained by PCR, due to the effect of collinearity. Besides having larger SE, some MLR coefficients (T_max, RH_min and EVT) had the opposite sign to those obtained by PCR. In retrospect, using the PCR result to discard from the MLR model those climatic variables with no significant effect on tree growth (Table 1) did not yield any significant regression coefficient (F(8,51) = 1.55, P = 0.16, R² = 0.19).

DISCUSSION
Most of the climatic variables assessed had a significant effect on tree growth. The exceptions were T_min, T_mean, RH_max, VPD_min and VPD_mean, which did not modify tree growth. Thus, these results only partially support our hypothesis, as highly correlated variables such as RH_max, VPD_min and VPD_mean did not influence tree growth even after collinearity was removed. We found that PCR explained 12 % of the total variance (R² = 0.12), which is not unexpected, as many factors can affect tree growth (Bowman et al., 2013). For instance, Wagner et al. (2012) found that only about 9 % of the variation in tree growth can be attributed to seasonal climate variability, which is slightly lower than the proportion of total variance explained by climatic variability in our study. The standard MLR explained 23.7 % of the total variance in tree growth. Nevertheless, due to the large standard error (SE) associated with each coefficient (Table 2), none of the regression coefficients was significant. On the other hand, even small-magnitude coefficients obtained by PCR, such as those of RH_mean and RH_min, showed a significant effect on tree growth. This finding supports our hypothesis, as the detrimental effect of collinearity became evident when the correlation between climatic variables was disregarded and the data were subjected to MLR. The large SE of the MLR coefficients undermined the predictive power of the MLR model, and therefore the influence of the climatic variables on tree growth was underestimated. For instance, the VIF of RH_mean, RH_max, and VPD computed by MLR (Table 2) were over three orders of magnitude greater than those of genuinely independent orthogonal regressors (Table 1), which magnified the SE up to 200-300 times (e.g., RH_max, VPD_mean and VPD_max) as compared with the SE obtained by PCR. We also found that some MLR coefficients (e.g., T_max and RH_min) had the opposite sign to those obtained by PCR.
The misleading effect of collinearity can occur because the variance of a regression coefficient (say b_1) is inversely proportional to the amplitude of the regressor [i.e., $\mathrm{var}(b_1) = \sigma^2 / \sum (x_i - \bar{x})^2$]. Hence, when the variance is very large and the actual value of a coefficient is close to zero, a regression coefficient with the opposite sign can result (Montgomery et al., 2012). This is remarkable because it can be concluded that a variable x_j has a positive (or negative) effect on Y, when in fact the opposite is true. Tree growth increased with rising mean and minimum RH, whereas it decreased with increasing VPD_max and EVT. Thus, by observing the VIF values presented in Table 2, it is tempting to discard from the regression model not only RH but also VPD: firstly, because a VIF value above 10 is indicative of strong collinearity among regressors (Montgomery et al., 2012), and secondly, because it may be expected that the effects of these variables are already included within the effect of temperature. However, discarding these variables from the model may weaken its predictive strength, as EVT, RH_mean, RH_min and VPD_max had a truly independent effect on tree growth. Collinearity dramatically increases the VIF, making it difficult to quantify the individual contribution of a regressor with a small but real independent effect on a dependent variable, such as tree growth (Montgomery et al., 2012; Bowman et al., 2013).
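The role of the VIF in the variance expression above can be made explicit with a short worked example. The general form below is the standard textbook result for a model with several regressors, where R_j² (a symbol introduced here) is the coefficient of determination obtained when x_j is regressed on the remaining regressors; the value 0.9 is chosen only because it corresponds to the VIF threshold of 10 mentioned in the text.

```latex
\begin{align*}
\mathrm{var}(b_j) &= \frac{\sigma^2}{\sum_i (x_{ij} - \bar{x}_j)^2} \cdot \mathrm{VIF}_j,
\qquad \mathrm{VIF}_j = \frac{1}{1 - R_j^2} \\
R_j^2 = 0.9 \;&\Rightarrow\; \mathrm{VIF}_j = \frac{1}{1 - 0.9} = 10
\;\Rightarrow\; \text{SE inflated by } \sqrt{10} \approx 3.16
\end{align*}
```

Since the standard error scales with the square root of the VIF, the coefficient of a regressor that is almost fully explained by the others can lose significance, or change sign, even when its true effect is nonzero.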
Tree growth responded positively to an increase in rainfall intensity, whereas T_max and PAR had a negative effect on growth rates. The effect of T_max on tree growth found in this study agrees with the results of Way and Oren (2010), who reported that tree growth of tropical species can be negatively affected by warming. On the other hand, our results disagree with those of Wagner et al. (2014) and Laurance et al. (2009). Wagner et al. (2014) reported no effect of T_max, whereas Laurance et al. (2009) found a positive effect of maximum temperature on tree growth. This discrepancy can be ascribed to differences in environmental conditions during data collection. For instance, Green et al. (2020) reported that ecosystem photosynthesis increased in the central Amazon when VPD increased from 0.1 to 10 hPa. In tropical rainforests, the optimum temperature for photosynthesis is about 29 °C (Liu, 2020), with decreasing photosynthetic rates at higher temperatures. This can help explain the decline in tree growth with rising T_max. Besides the effect of temperature on photosynthesis, a rise in temperature also affects transpiration via the effect of temperature on water viscosity (Darcy's law). In fact, at this experimental site, EVT can increase in the dry season when temperatures are higher (Antezana-Vera & Marenco, 2020).
There are reports associating tree growth or ecosystem photosynthesis with temporal variability in rainfall in the Amazon region (Lee et al., 2013; Méndez, 2018; Yang et al., 2018) or in VPD (Lee et al., 2013; Green et al., 2020). Studies assessing the effect of rainfall seasonality on tree growth in the Amazon have led to different results. Dias and Marenco (2016) and Silva et al. (2003) found no increase in tree growth during the wet season, whereas Wagner et al. (2014), Méndez (2018), and Antezana-Vera and Marenco (2020) reported that tree growth increased with an increase in rainfall intensity. Likewise, Lee et al. (2013) and Yang et al. (2018) reported a decline in photosynthesis-related activity during the dry season. Altogether these results indicate that the magnitude of the effect of drought on tree growth is related to the length of the dry season. In this study, we demonstrated that PCR can be a useful tool for separating the effects of correlated climatic variables. We provide evidence that an increase in VPD_max (from 17 hPa in the wet season to 23 hPa in the dry season) leads to a reduction in tree growth. Interestingly, this effect was only observed after removing the effect of collinearity. Marenco et al. (2014) showed that photosynthesis of canopy leaves (22-27 m tall trees) is closely related to stomatal conductance (g_s). They reported that g_s increased and reached its maximum value at a VPD of 16 hPa, and then declined and became almost null at a VPD of 28 hPa. Likewise, Mendes and Marenco (2017) observed that g_s increased with increasing VPD in the range of 5 to 10 hPa. These results show that the effect of VPD on photosynthesis depends on the level of atmospheric moisture. Green et al. (2020) reported that ecosystem photosynthesis can increase at VPD values lower than 10 hPa. Lee et al. (2013), on the other hand, estimated that ecosystem photosynthesis declined as VPD progressively increased from 3.5 hPa (wet season) to 32 hPa in the dry season, which is in agreement with the results of our study.
Solar radiation is intrinsically associated with tree growth via its effect on photosynthesis, and it has been reported that in tropical rainforests an increase in solar radiation can lead to an increase in tree growth (Wagner et al., 2014). In contrast, we found that an increase in PAR leads to a decline in tree growth, which agrees with the results of Yang et al. (2018) and Méndez (2018). Yang et al. (2018) observed a decrease in solar-induced fluorescence during the drought of 2015-2016 in the Amazon region. Likewise, in a study carried out at the same experimental site, Antezana-Vera and Marenco (2020) found that transpiration significantly increased with an increase in PAR and VPD. An increase in transpiration rates does not imply an increase in g_s and photosynthesis. In fact, most of the time, g_s decreases as transpiration increases in response to an increase in VPD (Dai, Edwards, & Ku, 1992), which can explain the negative effect of PAR and VPD on tree growth reported in this study.
Our results are relevant because of the global importance of the Amazon forest and because of the effects of ongoing climate change, which has increased the temperature (about 0.16 °C per decade) and altered rainfall distribution, ranging from lower rainfall intensity (longer dry seasons) in Eastern and Southern Amazonia to higher rainfall intensity in the Northern Amazon (Marengo et al., 2018). The dry season is associated with a rise in solar radiation, temperature, and VPD (Lee et al., 2013; Green et al., 2020), which ultimately can lead to a decline in photosynthesis (Lee et al., 2013; Marenco et al., 2014; Yang et al., 2018). Because most of the climatic variables are correlated, assessing their collinearity-free effect is important to accurately quantify the climatic drivers of tree growth. Increased dry season length has been forecast for some parts of the Amazon (Marengo et al., 2018), which ultimately may reduce tree growth, not only by reducing soil water availability, but also by increasing VPD and reducing RH. Our results demonstrate that trees of the central Amazon grow more slowly during the dry season, not only due to a drop in rainfall intensity, but also in response to an increase in maximum temperature, evapotranspiration, and maximum vapor pressure deficit, and a decline in mean and minimum relative humidity. To our knowledge, this is the first time the collinearity-free effect of RH_min, RH_mean, EVT and VPD_max on tree growth in the Amazon region has been evaluated. The novelty of this study is to demonstrate the orthogonal effect of VPD_max and RH_min on tree growth in the central Amazon, which enhances the current knowledge of the ecophysiology of Amazonian trees.
Ethical statement: authors declare that they all agree with this publication and made significant contributions; that there is no conflict of interest of any kind; and that we followed all pertinent ethical and legal procedures and requirements. All financial sources are fully and clearly stated in the acknowledgements section. A signed document has been filed in the journal archives.