# GEOG 566

May 21, 2017

### Geographically Weighted Regression Analysis of Tropical Forest

Filed under: 2017,Tutorial 2 2017 @ 10:10 pm
• Is crown width linearly related to height in space in the study area?
1. Name of the tool or approach that you used.
• I used the Geographically Weighted Regression (GWR) tool in ArcMap to assess whether the linear relationship between crown width and height varies across the study area. Because the data have not been georeferenced, Height here is the sum of the tree height and the elevation (ground height).
• GWR constructs a separate equation for every feature in the dataset, incorporating the dependent and explanatory variables of the features that fall within the bandwidth of each target feature. The shape and extent of the bandwidth depend on user input for the Kernel type, Bandwidth method, Distance, and Number of neighbors parameters, with one restriction: when the number of neighboring features would exceed 1000, only the closest 1000 are incorporated into each local equation. GWR gives the best results on datasets with several hundred features; it is not an appropriate method for small datasets.
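The idea of one local equation per feature can be sketched in a few lines of NumPy. This is a minimal illustration, not ArcMap's implementation: it assumes a Gaussian kernel with a fixed bandwidth, and the coordinates, crown widths, and heights are synthetic values invented for the example (the function name `gwr_fixed_gaussian` is hypothetical as well).

```python
import numpy as np

def gwr_fixed_gaussian(coords, x, y, bandwidth):
    """Fit one weighted least-squares line per feature (fixed Gaussian kernel)."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])            # intercept + one explanatory variable
    params = np.empty((n, 2))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)  # distance to target feature i
        w = np.exp(-0.5 * (d / bandwidth) ** 2)         # nearer features get more weight
        XtW = X.T * w                                   # equivalent to X.T @ diag(w)
        params[i] = np.linalg.solve(XtW @ X, XtW @ y)   # local intercept and slope
    return params

# Synthetic data (assumption): the crown-height slope is positive in the
# western half of a 100 x 100 plot and negative in the eastern half.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 100, size=(300, 2))
crown = rng.uniform(2, 10, size=300)
true_slope = np.where(coords[:, 0] < 50, 1.5, -1.5)
height = 20 + true_slope * crown + rng.normal(0, 0.5, size=300)

params = gwr_fixed_gaussian(coords, crown, height, bandwidth=10.0)
west = params[coords[:, 0] < 25, 1].mean()   # mean local slope, far west
east = params[coords[:, 0] > 75, 1].mean()   # mean local slope, far east
print(west, east)
```

Each row of `params` holds the local intercept and slope for one feature, which is the kind of per-feature output that GWR writes to its attribute table.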

1. Brief description of steps you followed to complete the analysis.
• The first step is to load the dataset into ArcMap. As a reminder, this is the same dataset I used in Exercise 2, and it was generated from UAV visual images. Agisoft PhotoScan was used to build a 3D point cloud from the images, and the point cloud was then processed in the Fusion software to derive forest structure measurements such as number of trees, height, and canopy width. The result is a table of 804 detected trees with their height and crown width. Unfortunately, because the images have not been georeferenced yet, the height in this tutorial is the sum of tree height and elevation.
• The second step is to open the Geographically Weighted Regression (GWR) tool, which appears as shown in Figure 1. We then choose the input features, the dependent variable, and the explanatory variable. More than one explanatory variable can be selected, but in this exercise I chose only one. My dependent variable is Height (ground height + tree height), and my explanatory variable is crown width.

Figure 1. Geographically Weighted Regression Tool.

• Next, we choose the kernel type, bandwidth method, and number of neighbors. By default, the Kernel type is “Fixed”, the Bandwidth method is AICc, and the Number of neighbors is 30. I kept the defaults because I think they are appropriate for my dataset. With a Fixed kernel and the AICc method, the tool searches for the bandwidth distance itself, so the Number of neighbors value is not used in that configuration. Distance is a component of the analysis whenever the Kernel type is Fixed, so it is especially important that the coordinates are in a projected coordinate system rather than in geographic degrees.
• To run the analysis, click “OK”. The full result can be inspected in the attribute table of the output layer (see Figure 2). ArcMap adds the output as a new layer symbolized by the Residual Standard Deviation (RSD) of each feature, with different colors indicating the RSD range (see Figure 3).
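The same tool can also be run from Python through `arcpy` instead of the dialog. The sketch below assumes an ArcGIS Python environment; the shapefile and field names (`trees.shp`, `trees_gwr.shp`, `Height`, `CrownWidth`) are hypothetical placeholders, not the names used in my project.

```python
try:
    import arcpy  # only available inside an ArcGIS Python environment
except ImportError:
    arcpy = None

# Hypothetical file and field names, for illustration only.
IN_FEATURES = "trees.shp"
OUT_FEATURES = "trees_gwr.shp"
DEPENDENT = "Height"        # ground height + tree height
EXPLANATORY = "CrownWidth"  # single explanatory variable

def gwr_args():
    """Positional arguments mirroring the tool dialog defaults (Fixed kernel, AICc)."""
    return (IN_FEATURES, DEPENDENT, EXPLANATORY, OUT_FEATURES, "FIXED", "AICc")

def run_gwr():
    """Run the GWR tool; requires ArcGIS with the Spatial Statistics toolbox."""
    if arcpy is None:
        raise RuntimeError("arcpy is not available; run this inside ArcGIS")
    return arcpy.GeographicallyWeightedRegression_stats(*gwr_args())
```

Calling `run_gwr()` inside ArcGIS should write the output feature class that Figures 2 and 3 are based on; outside ArcGIS it only raises an error.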

Figure 2. Attribute table of the Geographically Weighted Regression result.

Figure 3. Distribution of Residual Standard Deviation in study area.

• However, it is more interesting and useful to map the coefficient of the explanatory variable than the Residual Standard Deviation (RSD). To do so, we change the layer properties: right-click the layer and choose “Properties”. On the “Symbology” tab, change the value field from Residual Standard Deviation (RSD) to the coefficient of the explanatory variable (see Figure 4). The number of classes and the colors can also be changed. I chose three classes to distinguish trees with a positive linear relationship, trees with a negative linear relationship, and trees whose coefficient is close to zero. The adjusted layer is shown in Figure 5.
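The three-class symbology amounts to a simple threshold rule on the coefficient field. Here is a minimal sketch of that rule; the tolerance `tol` that defines "close to zero" and the sample coefficients are assumptions for illustration, not the class breaks used in Figure 5.

```python
import numpy as np

def classify_coefficients(coef, tol):
    """Return -1 (negative), 0 (near zero), or 1 (positive) for each coefficient."""
    coef = np.asarray(coef, dtype=float)
    return np.where(coef < -tol, -1, np.where(coef > tol, 1, 0))

# Hypothetical local coefficients for four trees.
coef = [-0.8, -0.05, 0.02, 0.6]
classes = classify_coefficients(coef, tol=0.1)
print(classes.tolist())  # [-1, 0, 0, 1]
```

In the map, these classes would correspond to the red (-1), white (0), and blue (1) points.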

Figure 4. Layer properties: Symbology.

Figure 5. Distribution of coefficient of the explanatory variable.

1. Brief description of results you obtained.
• The results are shown in Figures 2, 3, 4, and 5. The attribute table in Figure 2 contains all the values related to the regression analysis: observed and predicted values, the coefficient of the explanatory variable, the intercept, standard error, residual standard deviation, etc. Overall, crown width and Height are positively related, which means Height increases with every one-unit increase in crown width. In other words, the bigger the crown, the taller the tree.
• However, looking at the result for each individual feature (tree), some trees have a positive linear relationship between crown width and Height (ground height + tree height), and others have a negative one. Their distribution is shown in Figure 5. Red points indicate trees with a negative linear relationship between crown width and Height, which means that trees with bigger crowns have lower Height. Blue points indicate trees with a positive linear relationship, which means that trees with bigger crowns have greater Height. White points indicate trees whose coefficients are close to 0, whether slightly positive or slightly negative.
• The differences in the sign of the relationship might be due to several factors, such as different tree species, different elevation, or errors from the Fusion software analysis. Tropical forest has hundreds of tree species with different characteristics: some trees have a big crown and a tall trunk, while others have a big crown and a short trunk. In addition, elevation can have a significant effect because the Height used here is the total of ground height (elevation) and tree height. Trees with a positive linear relationship might be located in areas of higher elevation (hill, mount, or peak), while trees with a negative linear relationship might be located in areas of lower elevation (watershed or valley). Coefficients close to zero might result from the data not being georeferenced and from the fact that each local equation in Geographically Weighted Regression also includes data from neighboring features (trees), both of which can affect the coefficient value.
• In general, the R-squared is quite low: most features (trees) have an R-squared below 0.2. To improve the regression, I think I need to georeference the data, because Height as the sum of ground height (elevation) and tree height can affect the regression model (intercept, coefficient, etc.) between crown width and Height. I could also add explanatory variables such as “tree species” to improve the fit of the linear model.
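To illustrate how including elevation in Height can change the sign of the coefficient, here is a small simulation. All numbers are synthetic assumptions: tree height rises with crown width, while elevation is assumed to fall with crown width, as if the wider-crowned trees stood lower on a slope.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200
crown = rng.uniform(2.0, 10.0, n)                         # crown width (m)
tree_height = 5.0 + 2.0 * crown + rng.normal(0, 1.0, n)   # wider crown, taller tree
elevation = 300.0 - 5.0 * crown + rng.normal(0, 1.0, n)   # assumed: wide crowns sit downslope
total_height = elevation + tree_height                    # the "Height" used in this tutorial

# np.polyfit returns coefficients from highest degree down, so [0] is the slope.
slope_tree = np.polyfit(crown, tree_height, 1)[0]
slope_total = np.polyfit(crown, total_height, 1)[0]
print(round(slope_tree, 2), round(slope_total, 2))
```

With these assumptions the slope for tree height alone comes out near +2, while the slope for the combined Height comes out near -3, even though every tree follows the same positive crown-height relationship. This is one way a negative local coefficient could arise from elevation rather than from the trees themselves.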
1. Critique of the method – what was useful, what was not?
• The method was really useful for generating a regression model for every feature (tree). Because the output is a new layer, it helps to understand how the coefficient and other values, such as standard error and residual standard deviation, are distributed in space across the study area.
• However, the dependent and explanatory variables should be numeric fields containing a variety of values. Linear regression methods such as GWR are not appropriate for predicting binary outcomes (e.g., when all values of the dependent variable are either 1 or 0). In addition, the regression model will be misspecified if it is missing a key explanatory variable, which in my case could be elevation or tree species.

Source

http://desktop.arcgis.com/en/arcmap/10.3/tools/spatial-statistics-toolbox/geographically-weighted-regression.htm