GEOG 566






         Advanced spatial statistics and GIScience

June 11, 2018

The Hidden Behavioral States May Still Be Hidden: Exploring the Applicability of Hidden Markov Models and Environmental Covariates for Modeling Movement Data (Exercise 3 Part 2)

Filed under: 2018,Exercise/Tutorial 3 2018 @ 10:10 am

Question Asked

Overall, my aim for Exercise 3 was to determine the extent to which environmental covariates (of the type explored in Exercise 2) could be related to spatially explicit behavioral states defined by the step length and turning angle data generated from GPS tracks of movement in Exercise 1. As described in my Exercise 3 Part 1 blogpost, to operationalize the behavioral states, paired step length and turning angle measurements were generated from the raw GPS tracks, following the steps in my Exercise 1 blogpost. Histograms of the step length and turning angle distributions for five sample tracks suggested that two states may be emerging from the data: 1) a state characterized by small step lengths and wide turning angles and 2) a state characterized by large step lengths and very narrow (near zero) turning angles. These two potential behaviors resemble the behavioral states used to describe the movement of animals, to which hidden Markov model approaches have been applied.
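To make the two observed quantities concrete, here is a minimal sketch of how step lengths and turning angles can be derived from consecutive positions. It assumes projected coordinates in meters and plain Euclidean geometry; the function name and the toy track are illustrative and are not taken from moveHMM, whose prepData function does this work in the actual analysis.

```python
import math

def steps_and_angles(xs, ys):
    """Return step lengths and turning angles for a track of projected points."""
    # Step length: straight-line distance between consecutive fixes.
    steps = [math.hypot(xs[i + 1] - xs[i], ys[i + 1] - ys[i])
             for i in range(len(xs) - 1)]
    # Heading of each step, measured from the x-axis.
    headings = [math.atan2(ys[i + 1] - ys[i], xs[i + 1] - xs[i])
                for i in range(len(xs) - 1)]
    # Turning angle: change in heading between successive steps,
    # wrapped into [-pi, pi] so that 0 means "straight ahead".
    angles = []
    for h0, h1 in zip(headings, headings[1:]):
        d = h1 - h0
        angles.append(math.atan2(math.sin(d), math.cos(d)))
    return steps, angles

# Toy track: two steps east, then one step north (a 90 degree left turn).
toy_steps, toy_angles = steps_and_angles([0, 1, 2, 2], [0, 0, 0, 1])
```

A track with n positions yields n - 1 step lengths and n - 2 turning angles, which is why the first and last observations of a track carry missing angle values.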

Through coursework and collaboration in this class, my lab mate Jenna and I discovered the moveHMM R package, which uses hidden Markov model theory to fit state-based models of behavior (here, a two-state model) driven by step length and turning angle observations. The hidden Markov model approach requires that the measured data, in this case time-stamped X and Y coordinates, actually represent movement. The literature on the application of hidden Markov models to movement behavior suggests this requirement can be met by assuming that spatial inaccuracy is negligible (i.e., the measured coordinates represent actual movement) and by sampling the GPS data at regular time intervals (i.e., no missing data). Under these assumptions, the hidden Markov model uses patterns in the measured data to reveal the “hidden” underlying states. In this application, the hidden states are two behavioral states defined by combinations of co-occurring step lengths and turning angles. This approach is well documented in the movement ecology literature for understanding transitions between two states of movement in animals tracked with GPS or telemetry. Given its applicability to the study of animal movement, Jenna and I thought it would be an interesting approach for understanding the movement behavior of people on the landscape.
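The idea of recovering hidden states from observed steps can be illustrated with a small Viterbi decoder. This is a sketch under stated assumptions, not the moveHMM implementation: it uses step lengths only (no turning angles), gamma emission densities parameterized by mean and standard deviation, and hypothetical parameter values and transition probabilities chosen purely for illustration.

```python
import math

def gamma_logpdf(x, mean, sd):
    """Log density of a gamma distribution given its mean and sd (x > 0)."""
    shape = (mean / sd) ** 2
    rate = mean / sd ** 2
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1) * math.log(x) - rate * x)

def viterbi(steps, params, trans, init):
    """Most likely state sequence for a list of positive step lengths."""
    n_states = len(params)
    # Log-score of the best path ending in each state at the first step.
    score = [math.log(init[s]) + gamma_logpdf(steps[0], *params[s])
             for s in range(n_states)]
    back = []
    for x in steps[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            cands = [score[p] + math.log(trans[p][s]) for p in range(n_states)]
            best_prev = max(range(n_states), key=lambda p: cands[p])
            pointers.append(best_prev)
            new_score.append(cands[best_prev] + gamma_logpdf(x, *params[s]))
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Hypothetical values: state 0 "resting" (mean 1 m), state 1 "moving" (mean 200 m).
params = [(1.0, 1.0), (200.0, 100.0)]
trans = [[0.9, 0.1], [0.1, 0.9]]
decoded = viterbi([0.5, 1.2, 0.8, 180.0, 220.0, 250.0], params, trans, [0.5, 0.5])
```

In this toy track the three short steps decode to state 0 and the three long steps to state 1. Note that a step length of exactly 0 would break the gamma density used here, so stationary fixes need separate handling.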

Additionally, covariates can be included to explore whether an environmental variable that changes in space and time alongside the step length and turning angle data correlates with shifts between behavioral states. In this way, landscape-level environmental data of the type explored in Exercise 2 (i.e., vegetation type and elevation) can be related to spatially explicit behavioral data. Note that bathymetry and distance to shoreline are the environmental variables used in Exercise 3. In Exercise 2 I explored land-based vegetation cover class and elevation; since then, I was able to get the bathymetry data originally intended for Exercise 2 to work with my dataset in ArcGIS. Therefore, instead of continuing with land cover vegetation class and elevation, I explore distance to shoreline and bathymetry as behavioral covariates in Exercise 3 Part 2, as these data co-occur in space with the GPS movement data.

Given the above background, my research questions are as follows:

  1. How does bathymetry influence the transition probability between a movement-based behavioral state and a stationary behavioral state?
  2. How does distance to shoreline influence the transition between a movement-based behavioral state and a stationary behavioral state?
  3. Which covariate, if either, has more explanatory power?

Tool/Approach Used

I used the R package moveHMM (Michelot, Langrock, & Patterson, 2016) to fit the hidden Markov models, generate behavioral state probabilities, and compare the fitted models using AIC. For data preparation, I used the fitdistrplus R package to choose distributions and estimate their parameters (see Exercise 3 Part 1 post) and ArcGIS Spatial Analyst tools (see Exercise 2 post) to relate bathymetry and distance to shoreline data to each step length and turning angle observation in the dataset.

Description of Steps Used to Complete the Analysis

Many of the preparatory data wrangling and formatting steps used to set up the Exercise 3 Part 2 analysis are documented in the blogposts for Exercises 1, 2, and 3 Part 1. The list below points to steps in this analysis that were described in those earlier posts. New analytic steps are described afterward:

Previously completed workflows:

See Exercise 1 blogpost for a description of the steps used to generate step length and turning angles from raw GPS data for this analysis using the “prepData” function in moveHMM.

See Exercise 2 blogpost for a description of how raster-based elevation data were related to point-based movement data using the Extract Values to Points tool in ArcGIS. Similarly, see Exercise 2 blogpost for a description of how distance to shoreline was calculated for each point-based movement observation using the Spatial Join tool in ArcGIS.

See Exercise 3 Part 1 blogpost for a description of how initial distribution and parameter estimates were generated for step length and turning angle inputs for the moveHMM tool.

New analytic workflows:

The new steps completed to run the moveHMM tool, explore model fits, visualize results, and calculate model AIC values are as follows. The steps described below are adapted from published moveHMM workflows (Michelot, Langrock, & Patterson, 2017; Michelot, Langrock, & Patterson, 2016).

  1. Prior to fitting the moveHMM model, define the distributions and initial parameter values for step length and turning angle. The default distribution for step length is gamma and the default distribution for turning angle is von Mises. If other distributions are used, they must be specified in the fitHMM function.
  2. Run the model using the “fitHMM” function. Define the input data (which must be generated through the prepData function), the number of behavioral states, the initial parameter values (define these beforehand and then pass them in the call), and the covariates. This step generates numeric and text output reporting the model parameter estimates for each behavioral state.
  3. Generate a visual summary of the model, including density histograms for step length and turning angle, transition probabilities for behavioral states, and spatially explicit behavioral state occurrences, through the “plot(model)” function.
  4. Generate visual summaries of state transitions for each track through “plotStates(model)”.
  5. Run AIC on the model(s) to provide a measure of which model is statistically favored.
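One practical detail for step 1 is the parameterization of the gamma distribution. As I understand the two packages, fitdistrplus reports gamma fits as shape and rate, while moveHMM expects the gamma step length parameters as a mean and standard deviation, so values need converting before they are used as starting values. The sketch below shows that conversion; the function names and example numbers are hypothetical.

```python
import math

def gamma_shape_rate_to_mean_sd(shape, rate):
    """Convert gamma (shape, rate) to (mean, sd)."""
    return shape / rate, math.sqrt(shape) / rate

def gamma_mean_sd_to_shape_rate(mean, sd):
    """Convert gamma (mean, sd) back to (shape, rate)."""
    return (mean / sd) ** 2, mean / sd ** 2

# Hypothetical fit: shape = 4, rate = 0.02 per meter.
example_mean, example_sd = gamma_shape_rate_to_mean_sd(4.0, 0.02)
round_trip = gamma_mean_sd_to_shape_rate(example_mean, example_sd)
```

Here a shape of 4 and a rate of 0.02 correspond to a mean step of 200 m with a standard deviation of 100 m. Passing shape and rate where a mean and standard deviation are expected would give the optimizer very different starting values than intended.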

Description of Results Obtained

Overall, the results obtained using this tool varied wildly from one set of starting parameters to the next. The results presented below are those that I think best represent the data and shed light on the research questions that I posed at the beginning of the exercise. Results are presented for each of the analysis steps listed above.

Results: Define distribution parameters for step length and turning angle and run model

This part of the exercise turned out to be more difficult than anticipated. I had hoped that, by thoroughly exploring my data, both in aggregate and segmented into data-driven two-state bins, the parameter estimates defined by the movement data would ultimately map well within the moveHMM analytic environment. Unfortunately, this was not the case. The first round of parameter estimates I tried in the model fitting process were those derived from the gamma and wrapped Cauchy distributions identified in Exercise 3 Part 1. The result was a model that only included 1 state (state 2) and had infinite standard deviation estimates for the model parameters. These results were not expected, and suggested that the model was not a good fit for the data.

Given that the original, data-driven estimates did not result in a well-fitting model, I began trying various combinations of parameter estimates seen in the literature and developed through my own knowledge of travel rates. Additionally, I reverted to the two default distributions in the moveHMM tool, gamma for step length and von Mises for turning angle, as the tool seemed to perform better with those distributions. Table 1 reports the initial set of parameter estimates derived from the fitdistrplus tool and a second set, developed through trial and error, that was ultimately used in model development.

Table 1. Parameter estimates for two model fitting trials.

Figure 1. Model outputs for the initial parameters (left panel). The initial parameters produced only one behavioral state (State 1); State 2 had an estimated mean of 0, which essentially indicates a non-state because the gamma distribution is defined only for positive values and therefore cannot have a mean of 0.

Given the poor results from the initial parameters, the best fitting model parameters were used to fit two models for exploration in this exercise: a two-behavioral state model with distance to shore as a covariate for behavior transition and a two-behavioral state model with bathymetry as a covariate for behavior transition. Figure 2 reports the model outputs for these model runs.

Figure 2. Results from the model outputs for the best fitting parameters with distance to shore as a covariate (left panel) and bathymetry as a covariate (right panel). The distance to shore model suggests a two-state model, with one state characterized by shorter step lengths (mean = approximately 1 meter) and wider turning angles (mean = 3.1; moveHMM reports angles in radians, so this is close to π, a near reversal of direction) and a second state characterized by longer step lengths (mean = approximately 224 meters) and narrow turning angles (mean = -0.014 radians). The bathymetry model also suggests a two-state model, with one state characterized by shorter step lengths (mean = approximately 2.5 meters) and wider turning angles (mean = 3.1 radians) and a second state characterized by longer step lengths (mean = approximately 262 meters) and narrow turning angles (mean = -0.007 radians).

Visual Presentation of Results and AIC Analysis

Results of the AIC analysis suggest that model 2, the bathymetry model, is favored over model 1, the distance to shore model, given AIC values of 72708.23 and 82097.23, respectively (a lower AIC indicates the favored model). Given these results, the remaining visualizations and interpretation are provided for Model 2, with bathymetry as a covariate.
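As a reminder of what the comparison measures, AIC is defined as 2k − 2 log L, where k is the number of estimated parameters and L is the maximized likelihood. The sketch below computes AIC from a log-likelihood and then ranks the two AIC values reported above; the function name and dictionary are illustrative, and the log-likelihoods of the actual models are not reported in this post.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: lower values are favored."""
    return 2 * n_params - 2 * log_likelihood

# AIC values reported in this post for the two fitted models.
reported_aic = {"distance to shore": 82097.23, "bathymetry": 72708.23}

# The favored model is the one with the smallest AIC.
favored = min(reported_aic, key=reported_aic.get)

# Delta AIC: difference between each model and the favored model.
delta_aic = {name: value - reported_aic[favored]
             for name, value in reported_aic.items()}
```

The difference between the two models is about 9389 AIC units. AIC values are only comparable when both models are fit to the same observations, which is the case here as long as neither covariate has missing values that drop rows from one fit but not the other.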

Figure 3. Density histogram of step length states for the Bathymetry Model. The figure suggests that the distribution for state 2, the movement-oriented state, has a higher density than state 1 across step lengths from slightly greater than 0 to approximately 300 meters, while state 1 peaks right around a step length of 0.

 

Figure 4. Density histogram of turning angle states for the Bathymetry Model. Turning angles for State 2 cluster around 0, which is expected given the near-zero mean turning angle reported for this state. The curve for State 1 is unexpected, given that the mean turning angle reported for this state is around 3. Note that moveHMM reports angles in radians, so a mean near 3 is close to π and sits at the edge of the plotted range.
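Because turning angles are circular, a von Mises density with a mean near π looks different from one centered on 0: it is highest at both edges of the [-π, π] range and lowest in the middle. The sketch below evaluates such a density to illustrate the shape. The concentration value (kappa = 1) is a hypothetical choice, since the fitted concentration is not reported in this post, and the Bessel function is approximated with a truncated series.

```python
import math

def bessel_i0(kappa, terms=50):
    """Modified Bessel function of the first kind, order 0 (series approximation)."""
    total = 0.0
    for m in range(terms):
        total += (kappa / 2) ** (2 * m) / math.factorial(m) ** 2
    return total

def von_mises_pdf(angle, mean, kappa):
    """Von Mises density for an angle in radians."""
    return math.exp(kappa * math.cos(angle - mean)) / (2 * math.pi * bessel_i0(kappa))

mean_angle, kappa = 3.1, 1.0  # mean from the post; kappa is an assumption
density_at_edge = von_mises_pdf(math.pi, mean_angle, kappa)
density_at_zero = von_mises_pdf(0.0, mean_angle, kappa)

# Midpoint-rule check that the density integrates to 1 over [-pi, pi].
n = 2000
width = 2 * math.pi / n
total_mass = sum(von_mises_pdf(-math.pi + (i + 0.5) * width, mean_angle, kappa)
                 for i in range(n)) * width
```

With these values the density near ±π is several times higher than the density at 0, which is the mirror image of the State 2 curve that peaks at 0.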

Figure 5. Example output for the location of behavioral states in space for track ID 2. In theory, the figure shows blue for portions of the track during which the individual is in state 2 and orange for portions of the track during which the individual is in state 1. At this scale, state 1 is hard to see, but there are small clusters around the beginning and end of the trip that show some orange track pieces. This figure shows that for individual 2, the majority of the track is in state 2 behavior rather than state 1 behavior.

Figure 6. Distribution of behavioral states across time for track ID 2. The figures show that the majority of the movement track is spent in state 2, the moving state, and that state 1, the resting state, occurs infrequently throughout the movement track.

Figure 7. Transition probability matrix showing the influence of bathymetry on transitions between behavioral states. The four transition probability plots show that as bathymetry increases (water depth increases), the likelihood of staying in state 1, the resting state, decreases dramatically at a small increase in bathymetry and then remains at 0. Similarly, at small increases in bathymetry, the probability of transitioning from state 1, a resting state, to state 2, a moving state, increases rapidly and then stays at a probability of 100%.
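The shape of these curves follows from how a covariate enters the model. As I understand the moveHMM documentation, transition probabilities are linked to covariates through a multinomial logit, which for two states reduces to a logistic curve for each off-diagonal probability. The sketch below shows that relationship; the coefficient values are hypothetical and are not the fitted values from my models.

```python
import math

def transition_matrix(covariate, beta_12, beta_21):
    """Two-state transition matrix with a logistic link on one covariate.

    beta_12 and beta_21 are (intercept, slope) pairs for the 1->2 and
    2->1 transitions; staying probabilities are the complements.
    """
    def logistic(eta):
        return 1.0 / (1.0 + math.exp(-eta))

    p12 = logistic(beta_12[0] + beta_12[1] * covariate)
    p21 = logistic(beta_21[0] + beta_21[1] * covariate)
    return [[1.0 - p12, p12],
            [p21, 1.0 - p21]]

# Hypothetical coefficients: leaving the resting state becomes more likely with depth.
beta_rest_to_move = (-2.0, 1.5)
beta_move_to_rest = (-3.0, 0.0)
shallow = transition_matrix(0.0, beta_rest_to_move, beta_move_to_rest)
deep = transition_matrix(10.0, beta_rest_to_move, beta_move_to_rest)
```

A large slope produces exactly the kind of abrupt jump from near 0 to near 1 seen in the figure. A very steep fitted curve can also be a sign that the covariate separates the states almost perfectly in a small sample, which is worth checking before interpreting it as a behavioral effect.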

Returning to my research questions for this exercise, I have reached the following conclusions:

  1. How does bathymetry influence the transition probability between a movement-based behavioral state and a stationary behavioral state?

As bathymetry increases, the likelihood of staying in a stationary behavioral state, if already in a stationary behavioral state, decreases rapidly. Similarly, the likelihood of transitioning from a stationary state to a movement-oriented state increases rapidly and abruptly as bathymetry increases.

  2. How does distance to shoreline influence the transition between a movement-based behavioral state and a stationary behavioral state?

(Results not pictured in this exercise) As distance to shore increases, the likelihood of staying in a stationary behavioral state, if already in a stationary behavioral state, decreases rapidly. Similarly, the likelihood of transitioning from a stationary state to a movement-oriented state increases rapidly and abruptly as distance to shore increases.

  3. Which covariate, if either, has more explanatory power?

Through an AIC analysis comparing the Bathymetry Model and the Distance to Shore Model, the Bathymetry Model was favored over the Distance to Shore Model as the preferred model for modeling two-state movement behavior.

Despite being able to answer the three research questions I originally posed, I am not yet confident that these results, or the modeled two-state behaviors, are valid in their current estimations. The critique below explains why I am cautious about the presented results.

Critique of Method

Conceptually, I think the application of hidden Markov models and the use of the moveHMM tool have great potential for exploring movement data outside the realm of animal-based movement. The types of analyses that the tool is capable of are exciting, particularly given the ability to relate covariates to the movement data. However, I have several critiques of the method that arose through my experimentation in Exercise 3:

  1. Additional guidance is needed for establishing initial distributions and parameters in the model: the moveHMM package documentation states that establishing the movement parameters for each behavioral state, prior to running the model, is the most important component of the modeling process. Given that caveat, it seems that a high degree of familiarity with the movement data, and the population under study, is needed in order to run the two-state behavioral model. Therefore, the model in and of itself does not seem overly exploratory in nature, but rather a tool to identify the transition probabilities between two known states, not a way to identify the behavioral states themselves. The literature does not present hidden Markov models in this light, and I think that the amount and degree of background knowledge on the expected model outcomes should be discussed more candidly in the literature on the application of hidden Markov models to movement data. It essentially felt like I was trying random combinations of parameters in order to get the models to run. Part of the struggle is certainly a function of being a beginner to this type of modeling.
  2. Better error messaging is needed in moveHMM: when working with the tool, the errors returned were difficult to interpret and, given the newness of the tool, there is not yet an established online community providing responses to hiccups in its use. Therefore, when I was getting errors about value estimates outside of the parameters or missing fields, I had difficulty identifying exactly where I was going wrong, given that I was using the published vignettes as guides for working with my own data. More descriptive error messages and guidance would be helpful when using this tool again in the future.
  3. Small sample size and combined sample leading to erratic results: the sample of tracks modeled represents only 5 movement patterns, and in reviewing these movement patterns they are quite distinct from each other. I think the model would perform better on a larger sample of data to help iron out some of the disparities in the parameter estimates. Additionally, the modeling exercise presented assumes that one distribution for each state is appropriate for the whole population; therefore, any variability among the individual tracks is minimized in the model. This is only a helpful approach if the researcher believes that all actors behave in a similar way, which is a big assumption for my data, and given this modeling exercise I do not necessarily think it holds.

For these reasons, the numeric results presented should not be interpreted literally, but rather as an exercise in how the tool could be applied in the future with additional data and a more involved initial understanding of the behaviors of interest.


© 2018 GEOG 566   Powered by WordPress MU    Hosted by blogs.oregonstate.edu