GEOG 566

Advanced spatial statistics and GIScience

May 5, 2017

Tutorial 1: Use of the Optimized Hot Spot Analysis for explaining patterns of Socio-demographic and spatial variables of survey responses regarding flood-safety in neighborhoods

Filed under: Tutorial 1 @ 11:52 pm
1. Question Asked

How do socio-demographic and spatial variables explain patterns of survey responses about perceptions and attitudes regarding flood safety in neighborhoods?

The locations of the survey responses and the values of one variable (“perception of flooding reaching your home”) are shown in Figure 1.1.

Figure 1.1. Location of survey responses and variable values (variable: “perception of flooding reaching your home”)

2. Name of the tool or approach used

Several tools are used to set up the inputs for the Optimized Hot Spot Analysis. I optimized the analysis because my initial hot spot analysis, run without optimization, gave only trivial results and no clear spatial pattern.

Four tools are used to accomplish our objective:

  • For finding the “Analysis Field” used to optimize the calculations:
    • Generate Near Table in ArcMap

  • For missing data:
    • Replace all missing values in numerical variables with the mean, using R, and/or
    • Replace all missing values in categorical variables with the median, using R

  • For the spatial analysis:
    • Optimized Hot Spot Analysis in ArcMap

  • For creating a prediction surface map from the results of the Hot Spot Analysis:
    • Kriging in ArcMap

3. A brief description of steps followed to complete the analysis

I followed these steps:

  • For finding an “Analysis Field” for optimizing calculations:

Find the distances from the neighborhood property lots to the nearest streams.

  • Use the Generate Near Table tool in ArcMap. For “Input Features” use the property lot shapefile. For “Near Features” use the streams shapefile.

  • In the output table, identify “IN_FID”, which is the ID of the input feature (the property lot ID).

  • Join the output table to the input features table based on “OBJECTID” of the input features and “IN_FID” of the output table.

  • The distances are now in the input features table. Next, export the table and copy the variables of interest into the survey dataset as additional variables.
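The distance-and-join steps above can be sketched outside ArcMap in plain Python. This is only an illustration of the idea, not the ArcMap implementation: the lot coordinates, stream vertices, and survey values are hypothetical, and the coordinates are assumed to be planar (projected). The field names IN_FID, NEAR_FID, and NEAR_DIST mirror the output table of the tool.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, then clamp to the segment
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def near_table(lots, streams):
    """One row per lot: the closest stream and the distance to it."""
    rows = []
    for lot_id, point in lots.items():
        best = None
        for stream_id, vertices in streams.items():
            for a, b in zip(vertices, vertices[1:]):
                d = point_segment_distance(point, a, b)
                if best is None or d < best[1]:
                    best = (stream_id, d)
        rows.append({"IN_FID": lot_id, "NEAR_FID": best[0], "NEAR_DIST": best[1]})
    return rows


# Hypothetical lots (OBJECTID -> point) and streams (ID -> polyline vertices)
lots = {1: (0.0, 5.0), 2: (11.5, 3.0), 3: (4.0, -3.0)}
streams = {100: [(0.0, 0.0), (10.0, 0.0)], 200: [(12.0, -5.0), (12.0, 5.0)]}
table = near_table(lots, streams)

# Join on OBJECTID = IN_FID and add the distance to the survey records
survey = {1: {"perception": 3}, 2: {"perception": 1}, 3: {"perception": 2}}
for row in table:
    survey[row["IN_FID"]]["dist_stream"] = row["NEAR_DIST"]
```

Restricting the `streams` dictionary to a subset of streams gives the distance to only those streams.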
  • For missing data:

I used a very simple approach: replace all missing values in numerical variables with the mean, and replace all missing values in categorical variables with the median. The task, covering 108 variables, was coded in R; the original listing showed the steps for only some of the variables.
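As an illustration of this rule (not the R code used for the analysis), here is a minimal Python sketch. The variable names and values are hypothetical, missing values are assumed to be stored as `None`, and categorical variables are assumed to be ordinal codes so that a median is meaningful.

```python
import statistics


def impute(values, kind):
    """Replace None with the mean (numerical) or the median (categorical codes)."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # nothing to estimate from; leave the column as-is
    if kind == "numerical":
        fill = statistics.mean(observed)
    elif kind == "categorical":
        # Assumption: median_low is used so the fill is always an observed
        # category code, never a halfway value such as 2.5
        fill = statistics.median_low(observed)
    else:
        raise ValueError(f"unknown variable kind: {kind}")
    return [fill if v is None else v for v in values]


# Hypothetical survey columns: (values, kind)
survey = {
    "dist_stream": ([120.0, None, 80.0, 100.0], "numerical"),
    "flood_perception": ([1, 3, None, 2, 3, None], "categorical"),
}
imputed = {name: impute(vals, kind) for name, (vals, kind) in survey.items()}
```

Using the lower median is a choice made for this sketch; a plain median would return a value between two categories whenever the number of observed responses is even and the two middle codes differ.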

  • For the spatial analysis:

Now we are ready to perform the hot spot analysis using the “Optimized Hot Spot Analysis” tool of ArcMap. The tool has an “Analysis Field (optional)” parameter, which in our case is quite useful.

For my problem:

Input Features: the categorical survey responses with their corresponding locations.

Analysis Field (optional): the distance from each property to the streams. This continuous variable was calculated in the first step of this section.
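To show what the hot spot statistic measures, here is a minimal Python sketch of the Getis-Ord Gi* z-score with binary fixed-distance weights. It is a simplified illustration under stated assumptions, not the ArcMap tool: the points, values, and distance band are hypothetical, and the optimized tool additionally chooses the scale of analysis and corrects the significance for multiple testing, which this sketch does not do.

```python
import math


def getis_ord_gi_star(points, values, band):
    """Gi* z-scores with binary weights: w_ij = 1 when distance(i, j) <= band.

    Each point is part of its own neighborhood (distance 0), which is what
    distinguishes Gi* from Gi.
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for xi, yi in points:
        w = [1.0 if math.hypot(xi - xj, yi - yj) <= band else 0.0
             for xj, yj in points]
        sum_w = sum(w)
        sum_w2 = sum(wj * wj for wj in w)
        numerator = sum(wj * vj for wj, vj in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        # A zero denominator means no variation or an all-inclusive neighborhood
        scores.append(numerator / denominator if denominator > 0 else 0.0)
    return scores


# Hypothetical transect: high values of the analysis field at one end
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0), (5.0, 0.0)]
values = [10.0, 10.0, 10.0, 1.0, 1.0, 1.0]
scores = getis_ord_gi_star(points, values, band=1.0)
```

In this example the second point, surrounded by high values, scores above 1.96 (a hot spot at the 95% level before any correction), and the fifth point mirrors it as a cold spot.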

  • For creating a prediction surface map:

For this purpose, I used the results of the Hot Spot Analysis as input to the “Kriging” tool of ArcMap.

For my problem:

Input point features: the output features of the Hot Spot Analysis.

Z value field: the “GiZ scores” from the Hot Spot Analysis results.
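The interpolation step can be sketched as ordinary kriging in plain Python. This is a minimal sketch under stated assumptions, not the ArcMap Kriging tool: the semivariogram is an exponential model with hypothetical sill and range and zero nugget (ArcMap fits the model to the data), all sample points are used for every prediction, and the z-scores are made up.

```python
import math


def variogram(h, sill=1.0, range_=10.0):
    """Exponential semivariogram with zero nugget (assumed parameters)."""
    return sill * (1.0 - math.exp(-3.0 * h / range_))


def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ordinary_kriging(points, z, target):
    """Predict z at target; returns (prediction, weights)."""
    n = len(points)
    # Semivariances between samples, bordered by the unbiasedness
    # constraint (weights sum to 1) and its Lagrange multiplier
    a = [[variogram(distance(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(distance(p, target)) for p in points] + [1.0]
    weights = solve(a, b)[:n]
    return sum(w * zi for w, zi in zip(weights, z)), weights


# Hypothetical hot spot results: locations and their Gi* z-scores
points = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
z_scores = [2.5, 1.0, -0.5, -2.0]
center_value, center_weights = ordinary_kriging(points, z_scores, (5.0, 5.0))
```

Evaluating `ordinary_kriging` over a regular grid of targets gives the prediction surface. With a zero nugget, the prediction at a sample location reproduces the sample value.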

4. Brief description of obtained results

The following results correspond to one variable of the survey responses (the survey contains 104 categorical variables to be analyzed).

Three scenarios of spatial correlation have been analyzed.

Scenario 1: Patterns of residents’ perceptions of flooding reaching their home correlated with the distance to all streams (Willamette River, Marys River, Millrace) surrounding the neighborhood. The results are shown in Figure 4.1.

Figure 4.1. Patterns of residents’ perceptions of flooding reaching their home correlated with the distance to all streams (Willamette River, Marys River, Millrace) surrounding the neighborhood.

The hot spot analysis yields patterns correlated with the distance to all streams (Willamette River, Marys River, Millrace) surrounding the neighborhood.

Scenario 2: Patterns of residents’ perceptions of flooding reaching their home correlated with the distance to the two major rivers (Willamette River and Marys River) surrounding the neighborhood. The results are shown in Figure 4.2.

Figure 4.2. Patterns of residents’ perceptions of flooding reaching their home correlated with the distance to the two major rivers (Willamette River and Marys River) surrounding the neighborhood.

The hot spot analysis yields patterns correlated with the distance to the two major rivers (Willamette River and Marys River) surrounding the neighborhood.

Scenario 3: Patterns of residents’ perceptions of flooding reaching their home correlated with the distance to a seasonal stream (Millrace) crossing the neighborhood. The results are shown in Figure 4.3.

Figure 4.3. Patterns of residents’ perceptions of flooding reaching their home correlated with the distance to a seasonal stream (Millrace) crossing the neighborhood.

The hot spot analysis yields patterns correlated with the distance to a seasonal stream (Millrace) crossing the neighborhood.

5. Critique of the method

  • Generate Near table:

This method is useful for finding the shortest distance between two features. This information was quite useful for my analysis.

  • Replace missing data:

Missing data was one of the issues I encountered when analyzing my data. I first tried to fill in the missing values using the “VIM” package in R, which offers several imputation methods, but this did not work in my case: I kept getting error messages about problems inverting matrices. This problem may be due to the small range of values in the categorical variables. So I used a very simple approach: replace all missing values in numerical variables with the mean, and replace all missing values in categorical variables with the median (IBM Knowledge Center, n.d.). This helped me a lot.

  • Optimized Hot Spot Analysis:

For my problem, I found the “Optimized Hot Spot Analysis” tool more useful than the “Hot Spot Analysis (Getis-Ord Gi*)” tool because the “Analysis Field” allowed me to find and optimize cluster formation in my data.

  • Kriging:

This ArcMap tool allowed me to map cluster formation based on the hot spot analysis outputs. It gives a better visualization of spatial patches.

References

Getis, A., & Ord, J. K. (1992). The Analysis of Spatial Association by Use of Distance Statistics. Geographical Analysis, 24(3), 189–206. https://doi.org/10.1111/j.1538-4632.1992.tb00261.x

IBM Knowledge Center – Estimation Methods for Replacing Missing Values. (n.d.). Retrieved May 8, 2017, from https://www.ibm.com/support/knowledgecenter/en/SSLVMB_20.0.0/com.ibm.spss.statistics.help/replace_missing_values_estimation_methods.htm

 
