GEOG 566






         Advanced spatial statistics and GIScience

June 11, 2018

Exploring recreational movement behavior through hidden Markov models

Filed under: 2018,Final Project @ 3:55 pm

Research Question Asked

Glacier Bay National Park and Preserve (GLBA), located in southeast Alaska, contains over 2.7 million acres of federally designated terrestrial and marine wilderness (National Park Service, 2015). Recreation users access GLBA Wilderness primarily by watercraft; the park lacks formal trail networks in its wilderness, and terrestrial connectivity is fragmented by the park’s water resources. First designated as wilderness in 1980 through the Alaska National Interest Lands Conservation Act, the park’s wilderness has been managed under a 1989 Wilderness Management Plan (National Park Service, 1989). Much has changed in Alaska and GLBA since that time, and the park is currently updating its Wilderness Management Plan to adapt its practices to modern management contexts. Park managers are particularly interested in developing a better understanding of the wilderness experiences of water-based backcountry overnight users in GLBA Wilderness. As such, a dataset of global positioning system (GPS) tracks of water-based visitor travel patterns was collected during the summer 2017 use season to record the spatial and temporal behaviors of recreationists in GLBA Wilderness.

The movement ecology paradigm provides a useful, organizing framework for understanding and conducting path analysis on GPS tracks of recreationists in GLBA Wilderness. Formally proposed in 2008, the movement ecology paradigm was designed to provide an overarching framework to guide research related to the study of organismal movement, with specific emphasis on guiding questions of why organisms move and how movement occurs through the lens of space and time (Nathan et al., 2008). The framework emphasizes understanding components of the movement, looking for patterns among those components, and understanding meaning behind movement through the underlying patterns (Figure 1). Ultimately, the target understanding is the movement path itself, which can be understood through quantification of the external factors, internal factors, and capacity for movement and/or navigation by the moving organism (Figure 2, Nathan et al., 2008). Through employing a movement ecology approach to the study of overnight kayaker movements in a protected area, individual movement tracks can be broken down into relevant components, the components of the path can be studied for patterns, and ultimately internal and external factors can be explored for influence or explanation of the movement path.

The following two aspects of the movement ecology framework were the focus of this study:

Movement States (Figure 1): The movement ecology framework operationalizes movement as a series of step lengths and turning angles that can be used to identify underlying behavioral states. This follows the assumption that organisms move for specific reasons, and that those reasons are reflected in the distance and direction an organism travels in a set amount of time. The focus of this study was to transform raw GPS data into a path of step lengths and turning angles and to identify underlying behavioral states using the step length and turning angle data.

Figure 1. Organizing framework for path analysis approach for this study (figure from Nathan et al., 2008): A) Understanding movement as a series of movement steps and associated turning angles, and B) Understanding values in the series of step lengths and turning angles as characteristic of behavioral states.

External Factors Affecting Movement (Figure 2): In the framework of Nathan et al. (2008), external factors are one of the components influencing the movement path. In this study, external factors were operationalized as landscape-level features that may influence where a recreationist travels in Wilderness and the type of behavior in which the recreationist engages. Two environmental variables, distance to shore and bathymetry, were explored for their influence on the movement paths.

Figure 2. Movement ecology framework (figure from Nathan et al., 2008). The focus in this study is the exploration of two external factors, bathymetry and distance to shoreline, and their relationship to the movement path and associated behavioral states.

To apply the above-mentioned aspects of the movement ecology framework to my study of recreationist behavior in GLBA Wilderness, the following research questions were asked:

  • What are the mean step length and mean turning angle values for emergent behavioral states observed among the movement patterns of recreational kayakers?
  • How does bathymetry influence the transition probability between emergent behavioral states?
  • How does distance to shoreline influence the transition between emergent behavioral states?
  • Which external factor, distance to shoreline or bathymetry, has more explanatory power for transition probabilities in emergent behavioral states?

Description of Datasets Used

Dependent Variables Dataset:

GPS Dataset: The dataset for analysis was a test group of five GPS tracks taken from a sample of 38 GPS tracks collected during the summer of 2017 (Figure 3). Recreation grade, personal GPS units were administered to a sample of recreationists entering GLBA wilderness via personal kayak, June through August 2017. Study participants were asked to carry the GPS unit for the duration of their trip and return the unit at the end of their trip. GPS units continuously recorded movement throughout the trip.

Temporal Resolution and Extent: Units recorded a GPS point at varying intervals, determined as a function of travel speed. When speed was recorded at 0 miles per hour (MPH), the GPS units recorded an X,Y location point every 60 seconds. When speed was recorded at 1 MPH, the units recorded an X,Y point every 15 seconds. When speed was 2 MPH or greater, the units recorded an X,Y point every 8 seconds. The test group of GPS tracks was collected on trips taken in late June and early July 2017. The recorded tracks had between two and four days of data. For the purposes of this analysis, each of the test batch tracks was down-sampled by aggregating the data into one-minute time bins, with the X and Y values averaged across each minute. In this way, the temporal resolution of the data during the analysis phase was one minute.
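The one-minute binning described above can be sketched in a few lines. This is an illustrative Python version only; the original processing was done in R, and the function name `downsample_to_minutes` and the tuple layout of the input are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime


def downsample_to_minutes(fixes):
    """Average X and Y within each one-minute bin.

    `fixes` is an iterable of (timestamp, x, y) tuples, where timestamp is a
    datetime (hypothetical input layout). Returns a list of
    (minute_start, mean_x, mean_y) tuples sorted by time.
    """
    bins = defaultdict(list)
    for ts, x, y in fixes:
        # Truncate the timestamp to the start of its minute.
        minute = ts.replace(second=0, microsecond=0)
        bins[minute].append((x, y))

    result = []
    for minute in sorted(bins):
        pts = bins[minute]
        mean_x = sum(p[0] for p in pts) / len(pts)
        mean_y = sum(p[1] for p in pts) / len(pts)
        result.append((minute, mean_x, mean_y))
    return result


# Made-up fixes: two fall in the 09:00 minute, one in the 09:01 minute.
example = [
    (datetime(2017, 6, 28, 9, 0, 5), 10.0, 20.0),
    (datetime(2017, 6, 28, 9, 0, 45), 12.0, 22.0),
    (datetime(2017, 6, 28, 9, 1, 10), 14.0, 24.0),
]
binned = downsample_to_minutes(example)
```

Note that a minute containing no fixes simply produces no row in this sketch, so any gaps would still need to be handled before a step requiring a regular time series.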

Spatial Resolution and Extent: At each time interval (described above), the GPS units recorded X and Y coordinates. Coordinates were recorded in decimal degrees in the GCS_WGS_1984 geographic coordinate system. For analysis, the data were projected into the NAD 1983 UTM Zone 8N coordinate system. The spatial extent of the dataset is the park boundary for GLBA.

Independent Variables Datasets:

Bathymetry: Bathymetry was incorporated using a 25-meter raster layer of the bathymetry underlying GLBA’s marine Wilderness. The bathymetry data layer was accessed from the publicly available National Park Service data clearinghouse at www.irma.nps.gov. The downloaded data layer was added to an existing base map in ArcMap 10.3 that contained the analysis area (Glacier Bay National Park Wilderness) and the aggregated point shapefile of test batch movement data. The Extract Values to Points (Spatial Analyst) tool in ArcGIS was used to attach the raster cell value underlying each movement point to that point for analysis.

Distance to Shoreline Dataset: A landcover vector dataset was accessed from the publicly available National Park Service data clearinghouse at www.irma.nps.gov. The downloaded data layer was added to an existing base map in ArcMap 10.3 that contained the analysis area (Glacier Bay National Park Wilderness) and the aggregated point shapefile of test batch movement data. Using the Joins and Relates option available in ArcGIS, the point-based GPS data were joined to the polygon (landcover) data. During the join, the distance from each point to the landcover layer (the shoreline) was measured and added to the dataset as a new field. In this way, the distance from shoreline, in meters, was calculated for each GPS data point for use in analysis.
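The geometry behind a distance-to-shoreline field can be illustrated with a small point-to-polyline calculation. This is a sketch under simplifying assumptions, not the ArcGIS spatial join itself: the shoreline is a single polyline of projected (x, y) vertices in meters, and the function names and coordinates are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the line segment a-b (same units as input)."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b are the same vertex.
        return math.hypot(px - ax, py - ay)
    # Position of the perpendicular foot along the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_shoreline(p, shoreline):
    """Minimum distance from p to any segment of the shoreline polyline."""
    return min(
        point_segment_distance(p, a, b)
        for a, b in zip(shoreline, shoreline[1:])
    )


# Made-up shoreline with a right-angle bend, coordinates in meters.
shore = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0)]
d_mid = distance_to_shoreline((50.0, 30.0), shore)       # nearest to first segment
d_side = distance_to_shoreline((150.0, 50.0), shore)     # nearest to second segment
d_corner = distance_to_shoreline((-30.0, -40.0), shore)  # nearest to an end vertex
```

Because the calculation uses planar distances, it only gives meters when the coordinates are in a projected system such as UTM, not in decimal degrees.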

Figure 3. Map displaying datasets used in analysis. The yellow points on the map represent the aggregated shapefile of movement data used in the analysis. The blue area represents the bathymetry data layer. The distance from each yellow point to the nearest land cover class was calculated through a spatial join to operationalize the distance to shore dataset.

Hypotheses

For each research question, the following hypotheses were developed:

  • What are the mean step length and mean turning angle values for emergent behavioral states observed among the movement patterns of recreational kayakers?
    • Two behavioral states will emerge from analysis of the step length and turning angle movement data, one state representing a movement-oriented state in which step lengths are longer and turning angles are narrower and the second state representing a resting-oriented state in which step lengths are shorter and turning angles are wider. Specific values for step length and turning angle for each behavioral state were not hypothesized.
  • How does bathymetry influence the transition probability between emergent behavioral states?
    • As bathymetry increases (i.e., as the depth of the ocean increases), kayakers will be more likely to be in a movement-oriented state rather than a resting-oriented state.
  • How does distance to shoreline influence the transition between emergent behavioral states?
    • As distance to shoreline increases (i.e., as the kayaker is further away from the shoreline), kayakers will be more likely to be in a movement-oriented state rather than a resting-oriented state.
  • Which external factor, distance to shoreline or bathymetry, has more explanatory power for transition probabilities in emergent behavioral states?
    • Distance to shoreline is likely to have greater explanatory power than bathymetry. This hypothesis rests on an informal assumption: because kayakers can see the shoreline and landscape features, their behavioral state is more likely to be influenced by distance to shoreline than by bathymetry, which is a harder environmental variable to perceive while kayaking. The hypothesis is therefore based on assumptions about human perception of environmental characteristics.

Approaches

The overarching analytic approach for this study was the use of hidden Markov models to operationalize behavioral states within the test batch of movement paths (Langrock et al., 2012; Michelot et al., 2016). An R package, moveHMM (Michelot, Langrock, & Patterson, 2016), was the primary analytic tool used. The hidden Markov model approach requires that the measured data, in this case time-stamped X and Y coordinates, actually represent movement. The literature on the application of hidden Markov models to movement behavior suggests this requirement can be met by assuming that spatial inaccuracy does not exist within the data (i.e., the measured coordinates represent actual movement behaviors) and by regular time-stamped sampling of the GPS data (i.e., no missing data). By assuming that the measured data represent a known process (in this case, movement), the hidden Markov model uses patterns in the measured data to reveal the “hidden” underlying states. In this application, the hidden states being modeled are two behavioral states defined by combinations of co-occurring step lengths and turning angles. Additionally, covariates can be included in the model to explore whether an environmental variable that changes in space and time along the track is associated with shifts between behavioral states. Data processing and analytic approaches comprised three main phases, described below.
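To make the role of a covariate concrete: in a two-state model, a logit link is a common way to let a covariate shift the transition probabilities, and moveHMM uses a multinomial logit link of this kind. The Python sketch below is illustrative only. The function name and all coefficient values are hypothetical and are not estimates from this study.

```python
import math


def transition_matrix(covariate, beta_12, beta_21):
    """2x2 transition probability matrix for one time step.

    beta_12 and beta_21 are (intercept, slope) pairs on the logit scale for
    the state 1 -> state 2 and state 2 -> state 1 transitions. The values
    passed in below are made up for illustration.
    """
    def logistic(eta):
        return 1.0 / (1.0 + math.exp(-eta))

    p12 = logistic(beta_12[0] + beta_12[1] * covariate)
    p21 = logistic(beta_21[0] + beta_21[1] * covariate)
    # Each row gives the probabilities of moving from one state to states 1 and 2.
    return [[1.0 - p12, p12],
            [p21, 1.0 - p21]]


# Hypothetical coefficients: a positive slope on 1 -> 2 means that leaving
# state 1 (resting) becomes more likely as the covariate grows.
shallow = transition_matrix(0.0, beta_12=(0.0, 1.0), beta_21=(-2.0, 0.0))
deep = transition_matrix(5.0, beta_12=(0.0, 1.0), beta_21=(-2.0, 0.0))
```

With a steep slope, the probability of remaining in state 1 drops toward 0 over a small range of the covariate, which is the kind of abrupt curve a transition probability plot can show.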

Phase 1 Summary: Relating Environmental Covariates to Step Length and Turning Angle Data (Exercise 2)

ArcGIS tools were used to spatially join environmental covariates (bathymetry and distance to shore) datasets to the point data prior to generating step length and turning angles. ArcGIS was also used as a data visualization tool to generate an overall understanding of the spatial extent of datasets being used.

Phase 2 Summary: Generation of Step Lengths and Turning Angles from GPS Data (Exercise 1)

The “prepData” function in the moveHMM R package was used to convert the series of X and Y coordinates into a series of step length, turning angle, and averaged covariate values. The “prepData” function requires that the GPS data be regularly sampled, and that each observation has a unique numeric ID code associated with the data in the data frame; data processing and summarizing tools in R were used to meet these requirements. Additionally, an R function was used to convert the data from a latitude and longitude geographic coordinate system to a projected UTM coordinate system so that prepData would return meaningful step lengths in meters.
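For readers unfamiliar with these two quantities, the sketch below shows one way to compute them from projected coordinates. It is a Python illustration of the general calculation, not the prepData implementation; the function name and sample points are assumptions made for the example.

```python
import math


def steps_and_angles(xy):
    """Step lengths and turning angles for a list of projected (x, y) points.

    Returns (steps, angles): steps[i] is the distance from point i to point
    i + 1, and angles[i] is the change in heading at point i + 1, in radians,
    wrapped to (-pi, pi]. Positive angles are counterclockwise (left) turns.
    """
    pairs = list(zip(xy, xy[1:]))
    steps = [math.hypot(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in pairs]
    # Heading of each step; it is not meaningful for a zero-length step.
    headings = [math.atan2(y1 - y0, x1 - x0) for (x0, y0), (x1, y1) in pairs]

    angles = []
    for h0, h1 in zip(headings, headings[1:]):
        d = h1 - h0
        # Wrap the heading difference into (-pi, pi].
        while d <= -math.pi:
            d += 2.0 * math.pi
        while d > math.pi:
            d -= 2.0 * math.pi
        angles.append(d)
    return steps, angles


# Made-up track in meters: northeast, then due north, then due west.
track = [(0.0, 0.0), (3.0, 4.0), (3.0, 9.0), (-2.0, 9.0)]
steps, angles = steps_and_angles(track)
```

A track of n points yields n - 1 step lengths and n - 2 turning angles, since a turn needs both an incoming and an outgoing step.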

Phase 3 Summary: Fitting the moveHMM Models and Evaluating Results (Exercise 3, Parts 1 and 2)

The “fitHMM” function was used from the moveHMM R package to run the hidden Markov models, generate behavioral probabilities, and run an AIC analysis on the fit models.

Results

Step Length and Turning Angle Generation:

The step length and turning angle histograms suggest that across the five test tracks processed using the moveHMM tool, the majority of step lengths across the tracks were between 600-700 meters per minute and kayakers generally traveled in a straight direction turning infrequently. The step lengths were greater than originally anticipated, and after further investigation I learned this is likely because kayakers have the option of taking a day boat back to the visitor center from their backcountry location rather than paddling back to the visitor center. This new information was a surprise to receive so late in the analysis, but likely explains why the step lengths of 600-700 meters per minute occur in the dataset. Future exploration of these techniques on this dataset will need to account for this discovery, likely through elimination of the day boat portion of the GPS track from analysis. The 0-100 meter per minute step length bin likely represents stoppage time. The disparity between the two step length categories suggests that in the case of water-based recreationists, step length may be a good metric for examining changes in behavioral state in future bivariate analyses.

Turning angle histograms suggest that, for the most part, kayakers travel in a relatively straight direction with little variation away from that direction. When kayakers do turn, they tend to turn toward the left. This appears to be the result of making a loop trip in which kayakers initiate their trips following the eastern coast of GLBA inlet and finish their trips following the western coast of GLBA inlet.

Figure 4. Histograms of generated turning angles and step lengths for movement path two.

Best Fitting Models:

Two models were fit to the step length and turning angle data using the moveHMM tool: a two-behavioral state model with distance to shore as a covariate for behavior transition and a two-behavioral state model with bathymetry as a covariate for behavior transition. The decision to model a two-behavioral state model came from review of the step length and turning angle histograms and conversation with my research advisors, instructor, and classmates. Figure 5 reports the model outputs for these model runs.

Figure 5. Results from the model outputs for the best fitting parameters with distance to shore as a covariate (left panel) and bathymetry as a covariate (right panel). The distance to shore model suggests a two-state model, with one state characterized by shorter step lengths (mean = approximately 1 meter) and wider turning angles (mean = 3.1 radians; moveHMM reports turning angles in radians, so this is close to π, a near reversal of direction) and a second state characterized by longer step lengths (mean = approximately 224 meters) and narrow turning angles (mean = -0.014 radians). The bathymetry model suggests a two-state model, with one state characterized by shorter step lengths (mean = approximately 2.5 meters) and wider turning angles (mean = 3.1 radians) and a second state characterized by longer step lengths (mean = approximately 262 meters) and narrow turning angles (mean = -0.007 radians).

Visual Presentation of Results and AIC Analysis

Results of the AIC analysis suggest that model 2, the bathymetry model, is favored over model 1, the distance to shore model, given AIC values of 72708.23 and 82097.23, respectively (the lower AIC value indicates the favored model). Given these results, the remaining visualizations and interpretation of results are provided for Model 2, with bathymetry as a covariate.
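The comparison above can be reproduced from the two reported AIC values. The Python sketch below is illustrative; only the two AIC values come from this analysis, and the helper `aic` simply restates the standard definition (twice the number of parameters minus twice the maximized log-likelihood), with made-up inputs in its usage line.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2 * n_params - 2 * log_likelihood


# AIC values reported for the two fitted models.
aic_values = {
    "distance to shore": 82097.23,
    "bathymetry": 72708.23,
}

# The favored model is the one with the lowest AIC.
best = min(aic_values, key=aic_values.get)

# Difference of each model from the best model (delta AIC).
delta = {name: round(value - aic_values[best], 2)
         for name, value in aic_values.items()}

# Usage of the helper with made-up inputs (not values from this study).
toy = aic(log_likelihood=-10.0, n_params=3)
```

The distance to shore model sits 9389 AIC units above the bathymetry model, a very large gap, although both models were fit to the same five tracks and share the data limitations discussed in this post.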

Figure 6. Density histogram of step length states for the Bathymetry Model. The density distribution and model fit curves do not suggest practically significant visual differences in the step length distributions for states 1 and 2. However, the figure suggests that the distribution for state 2, the movement-oriented state, has a higher density across step lengths from slightly greater than 0 to approximately 300 meters than state 1, which peaks right around a step length of 0.

Figure 7. Density histogram of turning angle states for the Bathymetry Model. Turning angles for State 2 cluster around 0, which is expected given the narrow turning angle movement mean reported for this state. The curve of State 1 is unexpected, given that the mean of the turning angle movement reported for this state is around 3.

Figure 8. Distribution of behavioral states across time for track ID 2. The figures show that the majority of the movement track is spent in state 2, the moving state, and that state 1, the resting state, occurs infrequently throughout the movement track.

Figure 9. Transition probability matrix for the influence of bathymetry on transitions between behavioral states. The four transition probability plots show that as bathymetry increases (water depth increases), the likelihood of staying in state 1, the resting state, decreases dramatically at a small increase in bathymetry and then remains at 0. Similarly, at small increases in bathymetry, the probability of transitioning from state 1, a resting state, to state 2, a moving state, increases rapidly and then stays at a probability of 100%.

Figure 10. Three separate outputs for Track 2 that begin to tell a story of what may be going on in the data. A) Example output for the location of behavioral states in space for track ID 2. In theory, the figure would show blue for portions of the track during which the individual is in state 2 and orange for portions of the track during which the individual is in state 1. At this scale, state 1 is hard to see, but there are small clusters around the beginning and end of the trip that show some orange track pieces. This figure shows that for individual 2, the majority of the track is in state 2 behavior rather than state 1 behavior. B) A display of the track (the orange color has no relevance) overlaid on a satellite image of the surrounding area. The few instances of state 1 resting behavior in the track co-occur in space with access to a glacier, suggesting that landscape features other than bathymetry and distance to shore may be better predictors of behavioral state changes. C) Temporal display of track three, with the transition in color along the track displaying the passage of time. The color gradient changes somewhat rapidly in a small amount of space where the state 1 behavior occurs.

For each research question, the following results were found:

  • What are the mean step length and mean turning angle values for emergent behavioral states observed among the movement patterns of recreational kayakers?
    • Result: A two-behavioral state model was developed, with the following parameter estimates for step length and turning angle by state derived from the best fitting model, the bathymetry model: one state characterized by shorter step lengths (mean = approximately 2.5 meters) and wider turning angles (mean = 3.1 radians, the unit in which moveHMM reports turning angles) and a second state characterized by longer step lengths (mean = approximately 262 meters) and narrow turning angles (mean = -0.007 radians).

 

  • How does bathymetry influence the transition probability between emergent behavioral states?
    • Result: As bathymetry increases, the likelihood of staying in a stationary behavioral state, if already in a stationary behavioral state, decreases rapidly. Similarly, the likelihood of transitioning between a stationary state and a movement-oriented state increases rapidly and abruptly as bathymetry increases.

 

  • How does distance to shoreline influence the transition between emergent behavioral states?
    • Result: As distance to shore increases, the likelihood of staying in a stationary behavioral state, if already in a stationary behavioral state, decreases rapidly. Similarly, the likelihood of transitioning from a stationary state to a movement-oriented state increases rapidly and abruptly as distance to shore increases. (Results not pictured in this exercise)

 

  • Which external factor, distance to shoreline or bathymetry, has more explanatory power for transition probabilities in emergent behavioral states?
    • Result: Through an AIC analysis comparing the Bathymetry Model and the Distance to Shore Model, the Bathymetry Model was favored over the Distance to Shore Model as the preferred model for modeling two-state movement behavior. This result is counter to my hypothesis that distance to shore would be the favored model.

Despite these findings, I am not overly confident in the model outputs and conclude that additional work is needed to fully apply and understand hidden Markov models for movement behavior. First, the sample comprises only five movement tracks, and on review these tracks are quite distinct from each other. Additionally, embedded in the data is the potential for track segments recorded while recreationists were riding the day boat rather than independently kayaking. This affects not only the step lengths and turning angles but also the underlying motivations related to external factors that served as the original premise for this work. I also think the model would perform better on a larger sample of data, which would help iron out some of the disparities in the parameter estimates. Finally, the modeling exercise assumes that one distribution for each state is appropriate for the whole population; any variability among individual tracks is therefore minimized in the model. This is only a helpful approach if the researcher believes that all actors behave similarly. That is a big assumption for my data, and given this modeling exercise I do not think it necessarily holds.

Significance of results to science and resource managers

Conceptually, I think the application of hidden Markov models and the use of the moveHMM tool have great potential for exploring movement data outside the realm of animal-based movement, particularly for natural resource managers. The study of human movement in recreation settings through the collection of GPS track data is a relatively new methodological development. As such, analytic methods applied to spatial data in the social sciences to date have focused on providing descriptive summaries of track lengths and travel times and summarizing the data in aggregate. Using a movement ecology approach provides a lens through which to understand individual movement patterns and to look at variation in both space and time within a track. These methods also have the potential to help researchers and managers understand how landscape-level features may influence movement patterns and what the characteristics of movement patterns are in certain places and at certain times. As researchers and managers begin to build empirical relationships between behavioral states and landscape-level features, managers can engage in more directed landscape restoration, visitor use management, and park planning.

Learning: What did you learn about software a) Arc-Info, b) Modelbuilder and/or GIS programming in Python, c) R, d) other?

Through this class, I worked primarily in ArcGIS and R, exploring new tools and developing additional skills in both software packages. Through working in ArcGIS, I explored tools for completing spatial joins with both raster and vector data, data manipulation and formatting challenges and limits, and began to develop a rudimentary process for working between ArcGIS and R. Through the course, I moved beyond my prior knowledge of ArcGIS, using tools through the course exercises that I had not previously used such as the generate random points tool, extract values to points tool, spatial join with distance calculation, and mosaic raster tool.

The majority of my work in this class was performed in R, an analytic programming language in which my only prior experience was completing homework for STAT 511 last term. My proficiency in R grew immensely through this course. Skills I gained through this course include opening, manipulating, and saving Excel and CSV data files in R, map-based data visualization tools, data summary tools, and several movement-based analytic packages including adehabitat and moveHMM. A central learning opportunity for me in this course was exposure to the various ways that R can be used to analyze and manipulate spatial data. Moreover, prior to this class, I had no concept of the depth of the spatial analytical packages available through R. I was introduced to those packages through recommendations by classmates, and the experience has changed how I will approach analysis in the future.

I did not work with Modelbuilder in the course, nor did I work with Python. I have worked with those programs in the past and was happy to develop new skills in R and additional skills in ArcGIS.

Learning: What did you learn about statistics, including a) hotspot, b) spatial autocorrelation, c) regression, and d) multi-variate methods?

Through the presentations of my classmates and through my own project work in this course, I expanded my understanding of how to apply statistical concepts to spatial data analysis. Through several presentations given by my classmates during Tutorial exercises, I have developed a working knowledge of autocorrelation, including what the analysis seeks to identify, how the term “lag” is operationalized in the analysis, and tools/packages for completing autocorrelation analysis. To date, this concept has eluded me, and while I did not work directly with autocorrelation analysis for my own project work, I feel the exposure gained in this class has helped me move forward in understanding what an autocorrelation analysis seeks to accomplish.

For multi-variate statistical methods, I took a deep dive into exploring hidden Markov models as a mechanism for understanding transitions among behavioral states and the potential for relationships with environmental covariates. I also ran an AIC analysis to identify a favored HMM. Through Exercise 2, I was exposed to new thinking about how to explore relationships between landscape variables, and began to develop a new way of thinking about how landscape-level variables are related and why I might be interested in knowing that relationship when looking at the GPS tracks of use. I was also excited to be exposed to geographically weighted regression through Sam’s presentation, and look forward to exploring this analytic tool in the future. I liked the ability of the tool to bring in the spatial component of the data as a sort of third dimension to the analysis.

More than anything, this course has introduced me to new ways of thinking about spatial data generally, and exposed me to the wide range of possibilities available for future analyses. One of the greatest values of the course has been hearing the presentations of my classmates and learning about their research problems, questions, and tools used.

References

Langrock, R., King, R., Matthiopoulos, J., Thomas, L., Fortin, D., & Morales, J. (2012). Flexible and practical modeling of animal telemetry data: hidden Markov models and extensions. Ecology 93(11): 2336-2342.

Michelot, T., Langrock, R., & Patterson, T. (2017). An R package for the analysis of animal movement data. [online]. https://cran.r-project.org/web/packages/moveHMM/vignettes/moveHMM-guide.pdf.

Michelot, T., Langrock, R., & Patterson, T. (2016). MoveHMM: An R package for the statistical modelling of animal movement data using hidden Markov models. Methods in Ecology and Evolution 7: 1308-1315.

Nathan, R., Getz, W.M., Revilla, E., Holyoak, M., Kadmon, R., Saltz, D., & Smouse, P.E. (2008). A movement ecology paradigm for unifying organismal movement research. PNAS 105(49): 19052-19059.

National Park Service. (1989). Wilderness Visitor Use Management Plan: Glacier Bay National Park and Preserve.

National Park Service. (2015). Glacier Bay: Wilderness Character Narrative. Available: https://www.nps.gov/glba/learn/news/wilderness-character-narrative-released.htm.



3 Comments »

  1.   leatherl — June 15, 2018 @ 2:12 pm    

    Such a cool application of this tool! I’m curious what other explanatory variables you might have available to include, or ideally *want* to include both qualitatively and quantitatively– especially because movement ecology is designed to understand the behavior of animals that can’t tell us their motivations! Do the GPS tracks have accompanying interviews or similar that could be used to validate your analysis (e.g., “I wanted to stay in sight of the shoreline”)? Are there fine-scale weather data that could also be paired with the movement tracks? e.g., high wind speed and rain might result in more turn angles close to each other as a kayaker fights a storm, or movement stopping as a kayaker rides out the storm on shore.

  2.   jonesju — June 15, 2018 @ 7:14 am    

    Very good work. I can see that the HMM transition probability model would be a good approach, but you need to debug your efforts. Why did you choose two states of 1 m vs. 100 m? Why not 0 to 100 and 100 to 500m or something that relates better to actual paddling rates? Turning angle distributions were asymmetrical, so turning was mostly to the left (paddler was going counterclockwise around the lake). I hope you pursue the HMM approach, and to identify areas where you would expect paddlers to move faster/slower, as you have done with your excellent visualizations. Keep up the good work!

  3.   swanssam — June 15, 2018 @ 6:49 am    

    Nicely done! I’m curious if previous work has shown that bathymetry and/or distance to shoreline play an important role in recreational kayaker’s behavior, and what other potential variables could be driving the behavior you observed? For example, National Parks often have several famous landscape features that visitors are attracted to (e.g. Half Dome in Yosemite, Old Faithful in Yellowstone). Is there a similar landscape, view, or feature of Glacier Bay that could be causing kayakers to travel faster or slower? Anyways, great job and good luck in your future endeavors!
