GEOG 566

         Advanced spatial statistics and GIScience

Archive for My Spatial Problem 2017

April 24, 2017

Japanese tsunami marine debris biota

Filed under: 2017,Final Project,My Spatial Problem 2017 @ 12:49 pm


Research problem: Japanese Tsunami Marine Debris species

Six years ago, the devastating Tohoku earthquake and tsunami struck the coast of Japan. Since then, it has become evident that hundreds of coastal species from Japan have crossed the Pacific Ocean on tsunami debris, including species that have become invasive and are known to cause ecosystem or economic damage elsewhere. As of January 2017, scientists have documented the arrival of more than 650 debris items, referred to as Japanese Tsunami Marine Debris Biofouling Items (JTMD-BF) 1-651. Debris items include docks, buoys, boats, pallets, and wooden structures. An item was identified as JTMD if it 1) had clear identification, such as a serial or registration number, that was linked to an object lost during the tsunami of 2011; 2) had clear biological evidence of originating primarily from the Tohoku coast of Japan; or 3) showed a combination of these factors. The BF items are a subset of all the debris lost, as some debris items that were known to be lost from Japan remain undocumented; it is possible they ended up in remote locations. A huge effort by taxonomists to identify the species on JTMD has generated a comprehensive species list. Currently, there are around 300 taxa that have been collected on JTMD from North American and Hawaiian coastlines since 2012.

I am interested in several spatial aspects of the dataset as research questions. First, I want to explore the geographic distributions of the species with invasion history compared to those without invasion history. I also want to look at a few species of interest and map their native and non-native distributions on the same map for comparison. Lastly, I want to explore the different methods of transport, or vectors, that have been documented for 31 of the JTMD species, to better understand different dispersal mechanisms and patterns.

Dataset: JTMD Database

My work is part of an international effort to evaluate the risks associated with JTMD and associated species. I contributed to the development of a database of life history, distributional, and environmental attributes of many JTMD species. This database is being used for reference and analysis, and can aid in efforts to assess the risk associated with JTMD species and to determine which species are of highest concern for establishment in certain areas. The database contains a little over 100 JTMD species, with 26 attributes per species. It has information for native geographic regions and non-native (if applicable) geographic regions, as well as vector information, meaning the methods of transport that species have used to disperse to different regions of the world. The regions are classified according to the Marine Ecoregions of the World. The database also has survival temperature and salinity regimes, reproductive information, habitat, depth, trophic level, abundance data, and more. It does not include temporal information, but these species arrived on debris items in pulses for years after the tsunami event, which is another interesting aspect of the phenomenon.

Hypotheses: While the JTMD species are largely native to the Northwest Pacific (near Japan), they are also native to regions from all over the world. I hypothesize that most species without invasion history are native to the NW Pacific, and those with invasion history are native to regions all over the world. I expect to see this represented when their native regions are mapped. For the species that have been found outside their native range, I expect to see that some are only just outside native ranges in a few areas, but some with a history of invasion are more globally distributed.

For the individual maps of species of interest, I am not sure how they will compare, but I expect some to be more global, and some to be more localized, and only invasive in one realm.

For the vector types, I expect to see a lot of overlap of geographic regions between different types of vectors, because there is a lot of overlap in documentation. What I mean by overlap is that most species that were introduced to a certain region via ballast water were also introduced via aquaculture/fisheries trade, for example, and that can result in similar looking maps for different vector types.

Results: Please refer to these links for more detail and clearer maps. JTMD species native region distributions: Reva Gillman_geog566exercise1; JTMD species of interest individual range maps: Reva Gillman_Exercise2; and JTMD vector distribution: Reva Gillman_exercise3_geog566


JTMD species native regions

The following map shows the total number of JTMD species native to each region, with a legend on the side specifying what range each color represents.

Results: The most prevalent region that JTMD species are native to is the Temperate Northern Pacific. This seems obvious: JTMD originated from Japan, so we expect most species to be native to that same region of the globe. The next most prevalent native region for JTMD species is the Eastern Indo-Pacific, the region southeast of Japan. After that, however, the prevalent native regions begin to span the globe: the Temperate Northern Atlantic and the Tropical Eastern Pacific. At the other end is the least prevalent region, the Southern Ocean; only one JTMD species is native to this cold southern region.




Looking at Individual Species geographic ranges

What are the geographic distributions of some of the ‘riskier’ species that arrived on Japanese Tsunami Marine Debris (JTMD)? What do their native and non-native ranges look like, and how do they compare with each other?

First, I had to choose 4 species with clear invasion history to focus on, out of the 104 JTMD species. Out of the 31 with invasion history, I chose Asterias amurensis (seastar) and Hemigrapsus sanguineus (Japanese shore crab), since they are known to be an issue in other regions. I then chose two species with large non-native spread: Crassostrea gigas (Japanese oyster), an economically important aquaculture species, and Teredo navalis, the shipworm with an almost global distribution that has burrowed in wooden ships all over the globe for hundreds of years.

I worked within ArcGIS ArcMap 10.4.1. I used the Layer Properties to symbolize the polygons in the shape file, so that each realm appears in a different color according to whether it was documented as a native or non-native region for each species.

Crassostrea gigas geographic distribution



Teredo navalis geographic distribution

JTMD species vector distribution:

The maps below show the number of species that were introduced to each realm via each vector type. As you can see by looking at the first map for natural dispersal, the most prevalent regions are Temperate Northern Atlantic, and Temperate Northern Pacific. For the second map of moveable structures, the most prevalent regions are Temperate Northern Pacific, and Southern Australia/ New Zealand.

For solid ballast, the most prevalent region for this vector is Southern Australia/ New Zealand. For recreation, the most prevalent regions are Temperate Northern Atlantic, along with Southern Australia/ New Zealand.

For the method of transport of ballast water, the most prevalent regions are Temperate Northern Pacific, Temperate Northern Atlantic, along with Southern Australia/ New Zealand, and Southern Africa.  For aquaculture and fisheries trade, the most prevalent regions are Temperate Northern Pacific, Temperate Northern Atlantic, along with Southern Australia/ New Zealand.

Approaches: I worked within ArcGIS ArcMap 10.4.1. For the first part, I visually represented the native region frequency of JTMD species within 12 different coastal geographic regions of the world. For the second part, the individual species maps, I used the Layer Properties to symbolize the polygons in the shape file, so that each realm appears in a different color according to whether it was documented as a native or non-native region for each species. For the last part of my project, I visually represented the vector distribution frequency of JTMD species within the same 12 coastal geographic regions, with a different map for each vector type.
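The tally behind a frequency map like these can be sketched in a few lines of Python. This is only an illustration of the counting step, not the ArcMap workflow described above: the function name is hypothetical, and the species-to-realm records are placeholders rather than rows from the JTMD database.

```python
from collections import Counter


def count_species_per_realm(records):
    """Count distinct species per realm.

    records: iterable of (species, realm) pairs. A species listed twice
    for the same realm is counted once.
    """
    unique_pairs = set(records)
    return Counter(realm for _species, realm in unique_pairs)


# Placeholder records (hypothetical, NOT taken from the JTMD database).
native_records = [
    ("species_A", "Temperate Northern Pacific"),
    ("species_B", "Temperate Northern Pacific"),
    ("species_B", "Temperate Northern Atlantic"),
    ("species_C", "Temperate Northern Pacific"),
    ("species_C", "Temperate Northern Pacific"),  # duplicate row, counted once
]

counts = count_species_per_realm(native_records)
for realm, n_species in counts.most_common():
    print(f"{realm}: {n_species}")
```

The same function would work for the vector maps if it were fed one list of (species, realm) records per vector type. The resulting counts could then be joined to the realm polygons for display.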

Significance: Human-mediated transport of marine species across the globe through ballast water and hull fouling has been a concern for some time. JTMD is unique in comparison to these marine vectors, in that it can transport large numbers of marine species across ocean basins. Shipping routes are direct and arrive in known locations and at measurable frequencies, whereas JTMD, which is propelled by winds and currents and travels at much slower speeds than ships, can arrive almost anywhere at any time. JTMD is potentially the transport vector with the most random distribution yet described. Due to the slow rates of transport by currents rather than propulsion, the effects of drag and dislodgement are reduced on JTMD in comparison with ship hulls. Furthermore, JTMD transports large numbers of adults, rather than the larval stages that are more common in ballast water. As of January 2017, only one JTMD species, the striped beakfish Oplegnathus fasciatus, has been observed free-living along the west coast of North America (in Oregon and Washington). At this time, we do not know if any of these JTMD species will become established outside of their current distributional range as a result of the earthquake and tsunami. But by studying the invasion histories of the JTMD species, and the geographic distributions of both their native and non-native areas, we can better understand this large dispersal event and inform vector management response for future events.

The maps that were created during this class are useful for my research, and are really nice ways to visualize the native regions of the JTMD species, the distribution of the JTMD species, and the vector distribution of the JTMD species. I can potentially use these maps for reports and thesis work on JTMD species in the future.

My Learning: I started out the class as a beginner in spatial analysis, and with no experience in GIS,  or Arc-Info. Being a beginner, I had to do a lot of self-exploration, tutorial watching, and asking for help from peers and teachers. I am happy to report that I gained a lot of comfort using ArcMap, and can now get a shape file into ArcMap, get data as another layer over the shape file, and join data, manipulate the layer properties, and create maps with legends in ArcMap. I am happy with these skills, and I feel that I learned a lot about beginner ArcMap usage, and feel more comfortable talking about GIS mapping with others now.

Statistical knowledge: I didn’t use statistical analysis with my project, as my data was for very large realms of the marine regions of the ocean, and didn’t lend itself to statistical analysis, so I did exploratory analysis and visualizing as means of analysis. However, by talking to others in the class, and learning about what their methods were, I learned some new methods for spatial analysis, like variograms, that I can potentially use in the future.

Comments: I got very intuitive comments from students on my tutorials. One suggestion was to look at frequency of occurrence for each debris item, and map that, which I wasn’t able to incorporate because I don’t have the necessary data, but it was a good suggestion if we did have it. I also used the comments from Julia on my exercises to guide my exploration for future exercises. For example one of Julia’s suggestions was to map the number of species introduced to each region by a vector, and that is exactly what I ended up doing to visualize the vector distribution of JTMD species.



April 7, 2017

Topographic constraints on climate drive range shifts in Great Basin small mammals

Filed under: My Spatial Problem 2017 @ 8:55 pm

A description of the research question that you are exploring.

In 2015, Elsen and Tingley published “Global mountain topography and the fate of montane species under climate change” in the journal Nature Climate Change. They combined two data sets, the global data set of mountain ranges from Natural Earth’s physical vectors and a high-resolution near-global DEM (SRTM30), to evaluate the assumption that available non-vertical surface area decreases with increasing elevation in mountain ranges. This is an important question in community ecology and conservation, as it applies directly to a known relationship between species diversity and area established by the field of island biogeography and more widely applied in recent decades. Specifically, the relationship states that as physical space increases, the number of species that space can hold increases. Elsen and Tingley addressed this question across 182 of the world’s largest mountain ranges. However, they did not apply their analytical methods to any of the mountains in the Basin and Range province, except for the Sierra Nevada mountain range, which forms its western border, and the Rocky Mountains on its eastern border. The Basin and Range is composed of north-south trending mountain ranges that have been the focus of several studies addressing questions relevant to island biogeography. Additionally, modern (2009-2016) and historical (1929-1931) small mammal surveys have established small mammal community composition for a number of these mountain ranges. In order to better understand the dynamics underlying the distribution of small mammal species along elevation gradients in these ranges, I would like to perform an analysis similar to Elsen and Tingley’s on three ranges for which historical and modern small mammal communities have been studied: the Toiyabe mountain range, the Ruby Mountain range, and the Snake range. Specifically, these analyses would address the question: to which hypsographic classification are species range shifts subject?
I hope to better contextualize the range dynamics of species in response to climate and vegetation changes. Furthermore, understanding how available area changes with elevation in these ranges will enable better predictions about a species’ ability to remain within its thermal envelope by moving upslope.


A description of the dataset you will be analyzing, including the spatial and temporal resolution and extent.

Predominantly, I plan to use the SRTM30 DEM in combination with Landsat data to evaluate area as a function of elevation. I am not sure which dataset will be best to use for vegetation analysis. I have spatially explicit data on the distribution of small mammal species along elevation. However, the small mammal species data are not continuous along the elevation gradient, or within habitat/elevation bands.

The following description of the SRTM30 dataset has been copied from


SRTM30_PLUS V1.0 November 11, 2004

INTRODUCTION: This data consists of 33 files of global topography in the same format as the SRTM30 products distributed by the USGS EROS data center. The grid resolution is 30 seconds which is roughly one kilometer. Land data are based on the 1-km averages of topography derived from the USGS SRTM30 gridded DEM data product created with data from the NASA Shuttle Radar Topography Mission. GTOPO30 data are used for high latitudes where SRTM data are not available. Ocean data are based on the Smith and Sandwell global 2-minute grid between latitudes +/- 72 degrees. Higher resolution grids have been added from the LDEO Ridge Multibeam Synthesis Project and the NGDC Coastal Multibeam Data. Arctic bathymetry is from the International Bathymetric Chart of the Oceans (IBCAO) [Jakobsson et al., 2003]. All data are derived from public domain sources and these data are also in the public domain. The pixel-registered data are stored in 33 files with names corresponding to the upper left corner of the array shown below.

The USGS SRTM30 data and documentation is available at


Hypotheses: predict the kinds of patterns you expect to see in your data, and the processes that produce or respond to these patterns.

Elsen and Tingley produced four categories of hypsographic classification based on the relationship between area and elevation for mountain ranges. These categories are diamond, hourglass, inverse pyramid, and pyramid. They describe mountains whose area increases then decreases with elevation, decreases then increases, increases, or decreases, respectively. They found that approximately 68% of the 182 ranges they analyzed did not fall into the pyramid category. Importantly, the pyramid category describes “the dominant assumption in ecology and conservation that area decreases monotonically with elevation from a mountain range’s base (Elsen and Tingley 2015).” Despite this finding, the Basin and Range province is the result of 40 million years of geologic and geomorphic processes, which include continental rifting, erosion, mantle uplift, Pleistocene pluvial lake bank erosion, glaciation, and Holocene warming and drying. Of these forces, I expect the listric normal faulting that characterizes the Basin and Range to have resulted in pyramid-shaped mountains. I also expect that ranges will be spatially autocorrelated with respect to the skew of east-facing versus west-facing slopes.


Approaches: describe the kinds of analyses you ideally would like to undertake and learn about this term, using your data. 

I would like to learn how to overlay mountain range polygons atop a high-resolution digital elevation model. I will need to learn how to use the R package “raster” to make the hypsographic curves described above and by Elsen and Tingley. I would also like to use Arc to make maps.
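To make the idea of a hypsographic curve concrete, here is a minimal sketch in Python rather than R. It sums cell area per elevation band and then labels the curve using only the four verbal definitions given above. It is a rough heuristic, not Elsen and Tingley's classification procedure. The function names, the elevation values, the 500 m band width, and the constant 1 km² cell area are all assumptions for illustration. Real 30 arc-second cells change in area with latitude, and real curves are noisy enough to need coarse bands or smoothing.

```python
def area_by_elevation_band(elevations_m, band_width_m, cell_area_km2):
    """Return [(band_lower_edge_m, area_km2), ...], lowest band first."""
    lowest = min(elevations_m)
    n_bands = int((max(elevations_m) - lowest) // band_width_m) + 1
    areas = [0.0] * n_bands
    for z in elevations_m:
        areas[int((z - lowest) // band_width_m)] += cell_area_km2
    return [(lowest + i * band_width_m, a) for i, a in enumerate(areas)]


def classify_shape(areas):
    """Label an area-by-elevation sequence (lowest band first)."""
    steps = [hi - lo for lo, hi in zip(areas, areas[1:])]
    if all(s <= 0 for s in steps):
        return "pyramid"            # area only decreases with elevation
    if all(s >= 0 for s in steps):
        return "inverse pyramid"    # area only increases with elevation
    peak = areas.index(max(areas))
    if all(s >= 0 for s in steps[:peak]) and all(s <= 0 for s in steps[peak:]):
        return "diamond"            # increases, then decreases
    trough = areas.index(min(areas))
    if all(s <= 0 for s in steps[:trough]) and all(s >= 0 for s in steps[trough:]):
        return "hourglass"          # decreases, then increases
    return "unclassified"           # too noisy for this simple rule


# Placeholder DEM cell elevations in metres (hypothetical, not SRTM30 data).
cells = [1500, 1600, 1650, 2100, 2150, 2200, 2250, 2900]
curve = area_by_elevation_band(cells, band_width_m=500, cell_area_km2=1.0)
shape = classify_shape([area for _edge, area in curve])
print(curve, shape)
```

In practice the list of cell elevations would come from the DEM cells that fall inside one mountain range polygon.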


Expected outcome:

I would like to elucidate the area-elevation relationship to which small mammal species, and all species, are subject in the mountains of the Great Basin. Ideally, I would also love to evaluate available area for habitat type based on the distribution of vegetation type and % cover along elevation. In this regard, I would like to produce maps of habitat types and graphs that illustrate the percent cover of a habitat type as a percent of total available space on the mountain. It would also be nice to show which microhabitat features are associated with climate/weather measurement stations in place on these mountain ranges. In addition, I would be excited if I could make predictions about the constraints on a species’ ability to move upslope based on available area. This could be represented as a plot of species richness against a combined elevation-area score.



The Great Basin has a unique geological and physiographic history in North America and has experienced an accelerated pace of human impact throughout the Anthropocene. The combination of human land-use practices and climate change has significantly altered small mammal community composition, pushing it outside of its range of natural variability. Understanding the physical constraints on a species’ ability to persist on the landscape is critical to predicting how shifting community dynamics are constrained by the physical environment.


Your level of preparation:

I feel comfortable using R and exploring new packages and I am typically able to find the support I need when I am trying to understand or apply the tools available in a new R package.  I have limited experience with Arc-Info and/or Arc-GIS, however I do have knowledge of a number of the basic rules to follow when using these tools. I have extremely limited experience with Python.

Socio-Demographic and People’s Intentions Relationship in a Neighborhood System Adapting to Floods

Filed under: 2017,My Spatial Problem 2017 @ 1:58 pm

Research question.

For this problem I want to answer:

How do socio-demographic and spatial variables explain patterns of survey responses about attitudes regarding flood-safety in neighborhoods?

Description of the dataset.

The dataset for the study has the following characteristics:

  • Data is obtained from Household voluntary survey
  • Convenience sampling of households located within FEMA’s 100-year and 500-year flood hazard map, all surveyed at once.
  • Each participant answers socio-demographic and predefined intentions’ questionnaire
  • A printed coded survey questionnaire was mailed out to residents. The code is used to identify resident addresses.
  • 103 variables have been collected
  • Most of the variables are categorical

      An example of a typical variable to be analyzed:

  • Variable: Suppose your current home was to flood, how confident are you in the following possible conditions? – I will be able to evacuate my home before flooding begins (This variable is expressed in 5 categories of discrete values without hierarchy):
  • Categories:
    • Very confident
    • Confident
    • Neutral
    • Somewhat confident
    • Not confident at all

The spatial data consist of a map identifying properties within the boundaries of FEMA’s 100-year and 500-year flood hazard map, as shown in Figure 1. The survey was mailed out to randomly selected properties within these map boundaries.

Figure 1. Map for South Corvallis affected properties according to FEMA’s 100 year and 500 year flood categories.


Attitudes regarding flood-safety in neighborhoods are clustered according to socio-demographic and spatial factors.


Principal Component Analysis (PCA) and Factor Analysis (FA) are applied to analyze the collected data.

Figure 2. Principal component (

For Geographic pattern and cluster analysis I will test:

  • Average Nearest Neighbor
  • Spatial Autocorrelation (Moran’s I)
  • High/Low Clustering (Getis-Ord General G)
  • Cluster and Outlier Analysis (Anselin Local Moran’s I)
  • Hot Spot Analysis (Getis-Ord Gi*)
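As an illustration of what the global Moran's I statistic in this list measures, the sketch below computes it directly from its textbook formula, I = (n / S0) · Σᵢ Σⱼ wᵢⱼ (xᵢ − x̄)(xⱼ − x̄) / Σᵢ (xᵢ − x̄)², where S0 is the sum of all weights. It is not the ArcGIS tool. The four "households", their numeric scores, and the binary neighbor weights are hypothetical, and a categorical survey response would first have to be coded as a number for this to apply.

```python
def morans_i(values, weights):
    """Global Moran's I for numeric values and an n x n spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / s0) * cross / sum(d * d for d in dev)


# Four hypothetical households along a street; 1 marks adjacent neighbors.
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = [1, 1, 5, 5]    # similar scores sit next to each other
alternating = [1, 5, 1, 5]  # neighbors always differ

i_clustered = morans_i(clustered, w)
i_alternating = morans_i(alternating, w)
print(i_clustered, i_alternating)
```

Positive values indicate that similar responses sit near each other, and negative values indicate that neighbors tend to differ. The GIS tools add a significance test on top of this statistic.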

Expected outcome:

I would like to find statistical relationships between the categorical variables collected from the survey and their spatial locations that explain how patterns form. I would also like to produce maps of these relationships within FEMA’s 100-year and 500-year flood hazard map shown in Figure 1.


This research will contribute to policy and decision making for neighborhood adaptation to climate change. Identifying patterns in attitudes regarding flood-safety in neighborhoods is important for planning adaptation to different flooding scenarios in order to minimize personal risks.

Level of preparation:

a) Arc-Info: Intermediate

b) Model Builder and/or GIS programming in Python: Intermediate

c) R, or other relevant  spatial analysis software: Beginner-Intermediate

Spreading of Red Blotch disease in vineyards

Filed under: 2017,My Spatial Problem 2017 @ 1:37 pm


Spatial patterns in disease spreading in agriculture are important to understand, for example to learn what causes the infection and how to avoid further spreading. In this project, we focus on the Red Blotch virus in vineyards located in Oregon. Red Blotch affects the sugar content in the grapes, which changes the taste of the produced wine, and symptoms involve red blotches on the leaves. Using remote sensing, we might be able to develop an early warning system and get a better spatial understanding of how this disease spreads. Nowadays, the virus is spotted when the red blotches appear, which is in general too late to remove the plant and avoid spreading. Another method is to apply a PCR (Polymerase Chain Reaction) test, which can detect infection in the leaves by looking at its DNA or RNA. Both of these methods are inefficient in terms of time, labor, and money. Using an Unmanned Aerial Vehicle (UAV), better known as a drone, gives us a bird’s-eye perspective on the vineyards with the flexibility of controlling the spatial resolution (flying higher or lower) and temporal resolution (flying whenever you want). This is the big advantage of UAVs over satellite-based imagery.

The first goal in this research is to develop the early warning system, but closely related, and maybe even dependent on that, is the second goal of understanding the spatial distribution of the disease spread.

The research question for this class will be: Is there a spatial correlation in the spread of the Red Blotch disease in vineyards in study sites in Oregon?


Methods and Materials

We will be using UAVs to acquire the aerial imagery. Two different sensors will be used: a multispectral and a hyperspectral camera. These differ in the number of bands, with 5 and 270 bands respectively. The hyperspectral bands span 400-1000 nm. The advantage of hyperspectral over multispectral is the precision of the bands, which gives us more detailed spectral information on the vineyards. Hyperspectral is used for its ability to detect chlorophyll fluorescence, which is highly correlated with plant health.

For the multispectral imagery we use the MicaSense RedEdge multispectral camera, which has a spatial resolution of 8 cm/pixel at a flying altitude of 400 ft. Flying lower or higher makes this resolution finer or coarser.

The hyperspectral sensor is the Nano-Hyperspec from Headwall Photonics. Its spatial resolution is about 5 cm, depending on the flying height above ground level.
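The trade-off between flying height and pixel size can be sketched with one line of arithmetic. For a fixed camera and lens pointing straight down over flat ground, the ground sample distance scales linearly with height above ground. The snippet below scales the 8 cm/pixel at 400 ft figure quoted above. The function name is hypothetical, and the nadir-view and flat-terrain conditions are assumptions.

```python
def ground_sample_distance_cm(altitude_ft, ref_gsd_cm=8.0, ref_altitude_ft=400.0):
    """Scale a reference pixel size linearly with flying height above ground.

    Assumes the same camera and lens, a nadir view, and flat terrain.
    """
    return ref_gsd_cm * altitude_ft / ref_altitude_ft


for altitude in (100.0, 200.0, 400.0):
    print(f"{altitude:.0f} ft -> {ground_sample_distance_cm(altitude):.1f} cm/pixel")
```

Halving the flying height halves the pixel size, at the cost of covering less ground per image.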

Throughout the growing season, starting in May until October, we will be flying every month or twice a month over a couple of vineyards. The exact location is still to be determined.



I expect that there is a high spatial correlation in the spreading of the disease. The virus is transmitted by grafting, but also a vector can play a role as vineyards seem to infect neighbors.

For my main research goal, I expect that we will be able to develop an early detection method as some symptoms of the disease are preceded by physiological changes in the plants, which we will hopefully be able to detect.



In this class, I would like to learn more about the spatial correlation and how to deal with that. In previous research, I have dealt with comparisons between mean values from spatial data, assuming independence in measurements. However, when you are analyzing a problem that is spatially correlated you need to take that into account.


Expected outcome

For this class I will be producing a method to determine which vineyards are affected and which ones are at risk of being affected. Most likely, this will be in the form of a map. For this risk map, we need to know the statistical relationship between infected and non-infected plants, and how large the probability of infection is.



Red blotch can significantly decrease the sugar content in the grapes, which can change the taste of the wines. At the moment, farmers remove vineyards that are infected. The earlier we know which vineyards are infected, the earlier they can be removed and new plants can be planted.


Level of Preparation

  1. Arc-info advanced
  2. Modelbuilder and/or GIS programming in Python novice, but taking GIS-science III, GIS programming in Python
  3. R advanced beginner, some experience from STATS 511 and 512
  4. Matlab advanced

Stream Geomorphic Change Detection and Analysis in an Oregon Coast Range Basin

Filed under: My Spatial Problem 2017 @ 1:07 pm


My master’s thesis involves quantifying geomorphic change in streams after the addition of a large wood jam for salmon habitat restoration purposes. Large wood has been shown to have many positive effects on salmon habitat, including creating pools and backwater areas that are essential for juvenile salmon survival. This past summer, I surveyed seven stream reaches within a single basin in the Oregon Coast range to capture the instream and floodplain topography before the addition of a large wood restoration structure. I plan to survey these sites again next summer to capture changes in topography caused by the interaction of wood, water, and sediment.

The overall goal of my thesis is to examine how upstream drainage area and bankfull width affect the amount of geomorphic change after large wood (LW) is added to a stream reach. Since I only have the first half of my data gathered at this point, I will use cross-sectional data from adjacent LW sites gathered by a previous master’s student for my analysis. This will let me develop the methods that I will eventually employ with my own data once I gather the second round of survey data next summer.

The questions I am hoping to answer are:

  • What are the most effective methods for making an accurate raster interpolation from survey points?
  • What is the best way to quantify geomorphic change?
  • How does stream size affect the amount of geomorphic change caused by the addition of a LW jam?


To conduct my analysis, I will use topographic data that I gathered using a Total Station over two separate summers, 2015 and 2016. The 2015 survey represents the stream topography before the LW was added, while the 2016 represents the topography after the LW was added and has interacted with the stream bed over one winter high flow season.

These topographic data consist of XYZ values that make up stream cross sections. The cross sections were spaced one half bankfull width apart. Points on each cross section were taken at the vertical and horizontal inflection points. The points were collected at a relatively fine scale, with an average of one point collected per 0.25-1 meter. Points were collected at a higher density in areas with more variation in elevation, usually the stream channel, and were more widely spaced on the floodplain surrounding the channel, where the ground is usually flatter. These values were exported as a .csv to ArcGIS after the survey was completed.

The survey points were not georeferenced. Instead, a control network of known survey points was established to ensure repeatability for revisit surveys. The first benchmark point was designated as the datum and assigned an arbitrary coordinate of 10000 m, 10000 m, 1000 m. Then other benchmarks were surveyed in using the high precision Total Station setting.

Possible sources of error in the data can be linked to field data collection methods. Examples include: the Total Station not being set up exactly level and aligned over the benchmark points, the benchmark point moving slightly, or the survey rod not being held exactly perpendicular to the ground surface on the point that is being marked. All of these variables can introduce small errors into the resulting survey points. Much care was taken to eliminate as much uncertainty in the data as possible, but it is certain that there were at least a few centimeters of error incorporated throughout the survey.

Figure 1. Points gathered at the mainstem Mill Creek site in 2015 and 2016.


My research focuses on geomorphic change at two levels: the reach level and the basin level. There have been many studies that examine geomorphic change induced by large wood jams at the reach level. Based on these previous studies, I would expect the channel to become more heterogeneous in terms of elevation and width after the addition of a LW jam. I would also expect the stream to get wider due to more inundation of the floodplain during winter high flows. Pools will form and deepen upstream of the jam.

At the basin level, I would expect LW jam reaches in streams of intermediate size to experience the most geomorphic change. This is based on the balance between two driving factors: percent contact between the LW and the stream channel (greatest at small sites) and size and duration of overbank flows (greatest at large sites). Since the data gathered by the previous master’s student were not intended to compare differences in geomorphic change regulated by scale, it will be hard for me to draw conclusions from this dataset, but I can still attempt to compare the amount of change between the three stream reaches.


Last term in GIS II, I experimented with different interpolation methods in ArcGIS to determine which method would produce the most accurate raster, based on comparison of interpolated raster elevation values and known survey point elevations. I also incorporated TIN editing processes to create more accurate DEMs. I can then use these rasters to compare the stream geomorphology before and after the addition of a LW jam.

In this class I want to continue to refine my interpolation methods and incorporate geomorphic change detection analyses and statistics, including accounting for interpolation error and, ideally, field data collection error as well. I have recently been reading some papers that present new ideas for minimizing interpolation error, including removing the trend in the data before interpolating and then adding the trend back in afterwards, so I am interested in exploring this method.
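To make the detrending idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the trend is taken to be a straight line along the local x axis (assumed here to point roughly down-valley in the arbitrary survey coordinates), and inverse distance weighting stands in for whichever ArcGIS interpolator is eventually chosen. The function names are hypothetical.

```python
import math

def fit_linear_trend(xs, zs):
    """Least-squares fit of z = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_z = sum(zs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs))
    b = sxz / sxx
    a = mean_z - b * mean_x
    return a, b

def idw(points, values, target, power=2.0):
    """Inverse-distance-weighted estimate at target from (x, y) points."""
    num = 0.0
    den = 0.0
    for (x, y), v in zip(points, values):
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return v  # target coincides with a survey point
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den

def interpolate_detrended(points, elevations, target):
    """Remove a linear trend in x, interpolate the residuals, add the trend back."""
    xs = [p[0] for p in points]
    a, b = fit_linear_trend(xs, elevations)
    residuals = [z - (a + b * x) for (x, _), z in zip(points, elevations)]
    return idw(points, residuals, target) + (a + b * target[0])
```

If the surveyed surface were a perfect plane sloping in x, the residuals would all be zero and the estimate at any target would fall exactly on that plane, which is the behavior that interpolating the raw elevations directly does not guarantee.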

Once I am satisfied with my DEMs, I will use ArcGIS to subtract the “before” and “after” DEMs to create a DEM of difference (DoD). Then I will use either ArcGIS or R to quantify the amount of geomorphic change (percent or raw) that occurred.
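The DoD step can be sketched in a few lines of Python. This is an illustration rather than the ArcGIS workflow itself: rasters are plain nested lists, and the thresholding rule (treating change smaller than the combined error of the two surfaces as noise, with errors added in quadrature) is one commonly used convention that I am assuming here, not a method taken from this project. All names are hypothetical.

```python
import math

def dem_of_difference(before, after):
    """Cell-by-cell elevation change (after minus before)."""
    return [[a - b for b, a in zip(row_b, row_a)]
            for row_b, row_a in zip(before, after)]

def min_level_of_detection(err_before, err_after):
    """Combined elevation error of two DEMs, added in quadrature (assumed rule)."""
    return math.sqrt(err_before ** 2 + err_after ** 2)

def threshold_dod(dod, min_lod):
    """Set change smaller than the detection limit to zero."""
    return [[d if abs(d) >= min_lod else 0.0 for d in row] for row in dod]

def net_volume_change(dod, cell_area):
    """Net volume change: sum of cell elevation changes times cell area."""
    return sum(d for row in dod for d in row) * cell_area
```

For example, with per-DEM errors of 0.03 m and 0.04 m the detection limit is 0.05 m, so a 0.02 m difference in a cell would be discarded while a 0.5 m difference would be kept as change.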


  • Figures quantifying geomorphic change between the two surveys in terms of stream elevation and width
  • Statistical relationships between amount of geomorphic change before and after the addition of a LW jam
  • Statistical relationships between amount of geomorphic change at sites of different size (based on upstream drainage area and bankfull width)
  • Error analyses to determine how much of the change is “real”, incorporating field data collection and interpolation error


Many engineered LW studies have been conducted that examine LW jam effects on stream geomorphic change and juvenile salmonid populations. There have been many reach-scale analyses and a few larger-scale studies that synthesize data from many watersheds but there has been little quantification of the geomorphic and biological responses to engineered large wood jams on the watershed scale within an individual basin. This kind of research can help refine stream restoration efforts and optimize resource allocation.


ArcGIS: Intermediate level (I spent last term working with it. I can perform most basic manipulations and figure out what I don’t know how to do from the help menus and the internet.)

Python: Beginner level (I have used it in two classes to perform basic calculations and plot data/results.)

R: Very limited experience (I used it a few times in my undergraduate Statistics class but that was 7 years ago so I will probably be very rusty.)

Comparison of near-shore and on-shore predictions of tsunami inundation

Filed under: Final Project,My Spatial Problem 2017 @ 1:06 pm

Research Question

A tsunami is a set of ocean waves caused by any large, abrupt disturbance of the sea surface. As evidenced by the 2004 Indian Ocean and 2011 Japan tsunamis, local tsunamis can destroy coastal communities in a matter of minutes. Tsunamis rank high on the scale of natural disasters. Since 1850 alone, tsunamis have been responsible for the loss of over 420,000 lives and billions of dollars of damage to coastal structures and habitats [1]. Predicting when and where the next tsunami will strike is currently impossible. Once the source of a tsunami is known, however, its arrival time and impact can be predicted with modeling and measurement technology. These predictions are vital for coastal communities to make the necessary preparations to mitigate tsunami damage.

My research involves developing a methodology for making fast near-shore tsunami predictions based on arbitrarily defined off-shore conditions. The purpose of this methodology is to provide ensemble predictions of near-shore inundation to quantify the uncertainty of tsunami predictions. While novel, this methodology is limited to predicting inundation at near-shore locations and cannot predict it at on-shore locations. Thus the question becomes whether or not we can apply the near-shore uncertainty estimates to on-shore locations.

For this class project, I focused on comparing different near-shore locations to various on-shore locations. I examined what I defined as “major inundation events” and examined if each of these events corresponded to a significantly large wave in the near-shore. Thus my research question is:

“Do variations in the near-shore inundation time series correspond to variation for on-shore inundation?”

or more specifically for this case:

“Do predictions for major flooding events on-shore during tsunami inundation correspond to predictions of large waves near-shore?” 

Data Set

I used near-shore and on-shore tsunami simulation data that predicted sea surface elevation and fluid velocities for a stretch of coast in southern California near Port Hueneme. There were 27 possible data sets, each with a different initial tsunami source. For this study I focused on the scenario where a large tsunami was generated near Alaska. The simulation data were produced by the Method of Splitting Tsunami (MOST) model, which is the standard model used at the NOAA Center for Tsunami Research (NCTR) [2]. The inundation data cover various simulated tsunami events over a duration of 10 hours (600 min) and a geographical area bounded by the coordinates (-119.2469462, 34.1384259) and (-119.1949874, 34.2000000). There are 3600 time steps for each data set on a 562 by 665 grid. The spacing between grid points in longitude and latitude is uneven, but the data can be projected onto an even grid. A time series of sea surface elevation and fluid velocities is available for each geographical point (see Figure 1).

Figure 1 – Simulated tsunami inundation data using MOST v4 at Port Hueneme, CA for a simulated Cascadia source. Image is taken from the Performance Based Tsunami Engineering (PBTE) data explorer.

In addition to the tsunami data, I also have the topographical/bathymetric data for the region (which does not change in time) with the same spatial resolution as the gridded data. This data is presented in Figure 2.

Figure 2 – Topographic/bathymetric data for Port Hueneme, CA in geographic coordinates (left) and UTM projected coordinates (right).


To strengthen my research, it would be quite nice if every major flooding event corresponded to a large near-shore wave such that I could extrapolate uncertainty predictions in the near-shore to the on-shore. Thus I will phrase my hypothesis like so:

“Each prediction for large on-shore flooding events relates to a near-shore large wave prediction.” 


Many approaches were used to analyze this data. Most of them, unfortunately, failed to produce any meaningful results. At first, I attempted to analyze all of the spatial data – at a given time step – all at once, treating the data essentially as raster layers. For the raster layers I attempted to calculate autocorrelation using Moran’s I and cross-correlations between the inundation data and the topographical data. In both cases, the procedures failed to produce any meaningful results.

After analyzing raster layers failed, I determined that perhaps extracting time series was the way to go. I then selected the points of interest along the coast near Port Hueneme, California. These points were arbitrarily selected but were also ensured to be somewhat well distributed along the coastline and were within the domain of the dataset. Figure 3 shows the locations of the points of interest (POIs) along the coast of Port Hueneme, CA. A total of 12 points were selected and were all meant to represent some building or port location. Their names and geographical coordinates were stored in a .csv file as shown in Figure 4.

Figure 3 – POIs where time series of inundation were extracted along coast near Port Hueneme, CA

Figure 4 – CSV file storing locations and their geographic coordinates

The .csv file was read into R by using the “read.csv” function and this table was subsequently converted to a data frame using the procedure highlighted in “Tutorial 2: Automated Autocorrelation of Tsunami Inundation Time Series at Various Points of Interest.”

After the time series of inundation were extracted for each point of interest, autocorrelations and cross-correlations between the different time series were calculated. Unfortunately, these forms of analysis also failed to produce any meaningful results. The data mostly showed that autocorrelation was present, but this was not useful for determining similarity or difference in the variations of the different time series. Differencing between time series was also applied but was found not to be appropriate for the data.
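For readers unfamiliar with these measures, the following is a minimal Python sketch of a lagged cross-correlation between two equally sampled series; calling it with the same series twice gives the autocorrelation. It is not the R procedure from the tutorial mentioned above, and the function names are hypothetical.

```python
import math

def cross_correlation(x, y, lag):
    """Sample correlation between x[t] and y[t + lag] for lag >= 0."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sx = math.sqrt(sum((v - mean_x) ** 2 for v in x))
    sy = math.sqrt(sum((v - mean_y) ** 2 for v in y))
    cov = sum((x[t] - mean_x) * (y[t + lag] - mean_y) for t in range(n - lag))
    return cov / (sx * sy)

def best_lag(x, y, max_lag):
    """Lag (in time steps) at which y best matches a delayed copy of x."""
    return max(range(max_lag + 1), key=lambda k: cross_correlation(x, y, k))
```

In principle, the lag with the highest cross-correlation between a near-shore and an on-shore series would estimate the travel time between the two points, although for this dataset that kind of summary did not turn out to be informative.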

All of these failed attempts led to a less polished approach. For each time series, the 5th and 95th percentiles were calculated and plotted on the corresponding time series plot, as shown in Figure 5.

Figure 5 – Time series of inundation at each location with 5th and 95th percentiles of inundation levels plotted as the blue and red dashed lines respectively.

Here we define a significantly large wave as any wave at the first 3 locations (the near-shore locations) whose positive maximum exceeds the 95th percentile, shown as the red line. For the 9 on-shore locations, flood events were defined as the inundation peaks in the time series, and significant flood events as those whose heights exceed the 95th percentile, again the dashed red line. In each case, the timings of the big waves or inundation events were tabulated, and we define two variations as related if they occurred within 10 minutes of one another. This 10 minute value was derived from the roughly 10 minute periodicity of the waves in the near-shore time series. Additionally, we calculated the distance between each pair of points using a plane approximation, since the locations are within 10 km of one another.
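The event definitions above can be sketched in Python as follows. This is an illustrative reconstruction, not the code used for the project: the percentile is computed by linear interpolation between ranks (an assumption; other conventions give slightly different thresholds), a peak is taken to be a simple local maximum, and all function names are hypothetical.

```python
def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def significant_peaks(times, series, q=95.0):
    """Times of local maxima that exceed the q-th percentile of the series."""
    threshold = percentile(series, q)
    peaks = []
    for i in range(1, len(series) - 1):
        is_local_max = series[i] >= series[i - 1] and series[i] > series[i + 1]
        if series[i] > threshold and is_local_max:
            peaks.append(times[i])
    return peaks

def match_events(onshore_times, nearshore_times, window=10.0):
    """Map each on-shore event time to the near-shore peaks within the window."""
    return {t: [s for s in nearshore_times if abs(s - t) <= window]
            for t in onshore_times}
```

An on-shore event whose list of matches comes back empty would count against the hypothesis that every major flooding event relates to a large near-shore wave.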


Table 1 shows the timings of significant waves for near-shore locations and significant inundation events for on-shore locations. Based on our definitions, it was found that all inundation events on land corresponded to some near-shore large wave. However, the reverse is not true: not every large wave found in the near-shore corresponded to a large flooding event. Notice that almost every big flooding event corresponded to large waves at all 3 near-shore locations: Port Hueneme Jetty, Channel Islands Harbor Jetty, and Port Hueneme. The exceptions are the red and light blue flooding events. The red event was observed at the Beachcomber Tavern, Manda’s Deli, and the Naval Construction Battalion Center. However, the red events were only considered significant waves at the Port Hueneme locations, and the light blue event was only considered significant at the center of Port Hueneme and not at the jetty.

Table 1 – Timings (in minutes) for significant waves or inundation events. Cells highlighted with the same color indicate related events.

Using the distances between points shown in Table 2, we hypothesize that a certain maximum distance should be used to match variations from near-shore to on-shore. However, we can see that the Port Hueneme Jetty was actually closer to the Beachcomber Tavern than the center of Port Hueneme was. We do, however, see that the Channel Islands Harbor Jetty was much further away, and it may be pertinent not to compare locations that are more than 1 kilometer from one another.

Table 2 – Distances between near-shore locations and on-shore locations. Distance values in kilometers.
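A plane approximation of the kind used for these distances can be sketched in Python as below. The equirectangular formula and the mean Earth radius of 6371 km are my assumptions about what such an approximation would look like; the original calculation may have been done differently (for example, on UTM coordinates).

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant

def planar_distance_km(lon1, lat1, lon2, lat2):
    """Equirectangular (flat-plane) distance between two lon/lat points in km."""
    mean_lat = math.radians((lat1 + lat2) / 2.0)
    # East-west separation shrinks with the cosine of latitude.
    dx = math.radians(lon2 - lon1) * math.cos(mean_lat) * EARTH_RADIUS_KM
    dy = math.radians(lat2 - lat1) * EARTH_RADIUS_KM
    return math.hypot(dx, dy)
```

Applied to the two corner coordinates of the model domain given earlier, this yields a diagonal of a little over 8 km, which is consistent with treating the study area as flat.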


Tsunami inundation data have been very difficult to measure in the field throughout history due to the inherent dangers of having people or instrumentation in the inundation zone. This fact, combined with the relative infrequency of tsunami events, forces managers and disaster planners to rely on prediction data from tsunami models [3]. The problem with this, however, is that these models are deterministic and do not give a good idea of what the uncertainty of these predictions is. This can be problematic when relying on these predictions to perform risk analyses.


The tool being developed in my other research is limited to predicting near-shore variation in tsunami inundation and cannot predict on-shore variation. Preliminary results from this study suggest that most variations of on-shore inundation can be captured by simply looking at large wave events in the near-shore, which adds to the usefulness of the aforementioned methodology, whose details are beyond the scope of this class. However, it should be noted that one has to be careful when selecting which near-shore points to use to extrapolate variability for on-shore measurements. This area will require much more study before uncertainty predictions can actually be extrapolated.

Software Learning

Before this class began I had no experience in R or Python. Throughout this class I have had the opportunity to learn a great deal of R and some Python. Overall I think it has been a useful and rewarding experience.

Statistics Learning

I was able to pick up some statistical techniques regarding auto-correlations, cross-correlations, and wavelet analysis in a general sense, despite their lack of meaning for my overall project. The main thing I learned, however, was that working with big data is difficult and that most traditional data analysis techniques appear to fail when confronted with such a large data set. The experience of having to poke around and decompose the data into usable forms was very educational for me, but I still think I have a long way to go before being comfortable with big data analysis.

Response to Commentary

Most of the commentary I received was on the use of the techniques of auto-correlation, cross-correlation, and wavelet analysis. However, in the end I found that none of these techniques were useful for what I was trying to determine so unfortunately I could not apply any of the suggestions or comments mentioned.

There was, however, one question in the comments that stated, “Finally, I was wondering if the areas with highest inundation are also the most the vulnerable?” This is a good question because high inundation would usually highlight a dangerous situation, but vulnerability is not so simply defined. Vulnerability can really be thought of as risk*exposure. Thus, the area would only be vulnerable if there were people or structures of interest in the area with high inundation. Simply analyzing predictions of tsunami inundation unfortunately cannot answer these questions or determine vulnerability. Assessing both the natural and human systems will be required to perform a vulnerability analysis, and that is beyond the scope of this study.


[1] NOAA. Web. Retrieved 20 March 2017.

[2] Titov, V. V., and Synolakis, C. E.: Numerical modeling of tidal wave runup, Journal of Waterway, Port, Coastal, and Ocean Engineering, 124 (4), 157-171, 1998.

[3] Wegscheider, S., Post, J., Zosseder, K., Mück, M., Strunz, G., Riedlinger, T., Muhari, A., and Anwar, H. Z.: Generating tsunami risk knowledge at community level as a base for planning and implementation of risk reduction strategies, Nat. Hazards Earth Syst. Sci., 11, 249-258, doi:10.5194/nhess-11-249-2011, 2011.

REVISED: Spatial Relationships of Vegetation in Restored and Remnant Salt Marshes, Salmon River Estuary, Oregon

Filed under: My Spatial Problem 2017 @ 12:56 pm

My Spatial Problem Blog Post


  1. A description of the research question that you are exploring.

Salmon River is one of the smallest estuaries on the northern Oregon coast (800 ha), with the largest proportion of tidal marsh (400 ha) for any Oregon estuary. It borders Tillamook and Lincoln counties, and is designated as an Important Bird Area by The National Audubon Society. Conservation research at Salmon River Estuary has been a focus of government, non-profit, and educational institutions since the 1970’s due to concern over salmonid habitat and the impacts of sea level rise on the coast. Salmon River consists of public and protected wetlands that have been restored and protected since the U.S. Forest Service removed dikes from three sites in 1978, 1987, and 1996. Tidal flow to the ocean is currently unobstructed on sites that were previously used as pastureland and/or diked. One wetland on the estuary was never diked and is used as a reference marsh for field research, to determine functional equivalency of restored marshes. Salmon River Estuary has been a site for place-based restoration and studies of community recovery over the last 40 years (Flitcroft et al. 2016). Vegetation has been monitored at this site since dike removal; however, survey records from past researchers are still being deciphered. If sufficient plot data can be recovered and confirmed, vegetation patterns over the last 40 years will be analyzed further.

Factors influencing the richness and environmental integrity of Salmon River are associated with the physiognomic and taxonomic features of the plant community. This study focuses on the spatio-temporal distribution patterns of Salmon River vegetation to explore how remnant and restored marshes differ in terms of biodiversity and species composition. I expect that different durations of tidal exclusion through dike establishment will reveal differences between sites in the context of plant species composition and distribution.

  2. A description of the dataset you will be analyzing, including the spatial and temporal resolution and extent.

I have collected species data (ocular estimations of percent coverage) from 1 m2 plots on transects from four tidal marshes: Mitchell Marsh (dike removed 1978), Y Marsh (dike removed in 1987), Salmon Creek Marsh (dike removed 1996), and one remnant marsh adjacent to Y marsh as a control (never diked). I also collected soil samples at each sampled transect plot, and tested them for salinity, conductivity, and bulk density, as well as nitrogen content. Transect plots were square shaped plots, 50 m apart in increasing distance from the tide.  My objective for data analysis was to describe the spatial patterns of the vegetation communities in tidal marshes of Salmon River Estuary after dike removal. I surveyed a total of 74 square meter plots on transects.

Stohlgren plots, also known as modified Whittaker plots (MW, 1,000 square meters), were established at each marsh site to collect data on species abundance for comparison with transect data. MW plots were implemented to test for patterns of diversity at multiple scales beyond what the square meter transect plots may capture within the same site. The restored and remnant sites have three MW plots each, for a total of 12 MW plots. The MW plots were placed at a random distance 50 m along and 20 m offset from the sampled transects at each marsh for a stratified random sample design. MW plots are a minimum of 50 meters apart, depending on where they were placed along the transect. At each MW plot, percent cover and presence/absence of species were estimated (with the aid of 1 meter square grids). Samples of all species identified were collected and pressed, and are being examined to confirm identification. Elevation data were obtained from LiDAR surveys in 2015, and used to pinpoint elevation for all plots.

The data sets for my project include spreadsheets that describe percent vegetation cover, elevation, and soil characteristics per transect plot. MW plots were not sampled for soil and thus only have percent vegetation cover and elevation. The spatial grain of my study is one square meter, represented by the size of my smallest sampling frame for plots. With the nested sample plots and my stratified random sampling techniques, I have multiple spatial extents for this project. The extent could be considered the length of a transect (which varies by tidal marsh) or the area of a MW plot (1,000 m2), or it could arguably be extended to the entire Estuary. There are some interesting temporal aspects to my study as well; three of the four tidal marshes have experienced successive dike removal. These marshes have been surveyed for vegetation cover post dike removal and every 5 years subsequently. Incorporating these historical data will add dimension to my spatial and temporal analysis of variation at my study site.

3. Hypotheses: predict the kinds of patterns you expect to see in your data, and the processes that produce or respond to these patterns.

Are tidal marshes that were restored earlier more similar to the Reference Marsh in terms of environmental conditions (soil, elevation), and species composition when compared to those restored more recently?

I predict that restored marshes will be significantly different from the Reference Marsh. I also predict that time since dike removal will not be a strong indicator of similarities between restored marshes and the reference marsh. I anticipate that sites which have experienced recent dike removal have soils with higher salinity and conductivity compared to remnant marsh plots.

Does species richness captured differ between restored and reference salt marshes?

I predict that Reference Marsh has higher richness compared to restored marshes. I  expect that plots from the reference marsh (Transect C, adjacent to Y Marsh) will be more diverse and heterogeneous than tidal marshes that have been diked. I predict sites that have experienced dike removal more or less recently will both have different species composition and be less diverse compared to reference sites. I hypothesize that the reference marsh will be the most diverse, with the highest richness and spatial heterogeneity of species throughout, compared to the other low marshes that have been diked.

Do Species Area relationships differ between restored and reference salt marshes?

I predict that the Reference Marsh has a greater number of species over area compared to restored marshes. I also predict that there will be spatial correlation of plant species at a larger scale, with fine scale patchiness within my site, suggesting that there may be ‘nesting’ of species or smaller pockets of diversity within the marsh, with similarities in species assemblages occurring at a larger scale.

Do field methods, specifically nested-rectangular (Modified Whittaker) plots and non-nested-square (Transect) plots capture species richness differently?

I predict that Modified Whittaker plots will capture more species than Transect plots, since MW plots will be able to address species richness at greater scales.

4. Approaches: describe the kinds of analyses you completed using your data.

I ran Mantel tests and an ISA (Indicator Species Analysis), and produced species area curves and a map of my site, to compare and contrast differences in species assemblages by site. I used PC-Ord, Excel, and ArcMap to complete these analyses.

5. Results: What did you produce/find?

I conducted a Mantel test on all of my data to determine the scale at which my plot data were spatially autocorrelated (self-related). I found that none of my plots were spatially autocorrelated (Mantel’s r statistic Transect: r = 0.037797, p = 0.182182; MW plot: r = 0.027994, p = 0.164164; fail to reject the null hypothesis of no relationship). This is a little surprising, but may be indicative of noise within my dataset, and variation of species at this scale. It is possible that autocorrelation is not detectable at this scale, or perhaps I need a larger dataset with less proportional noise to sort out the autocorrelation signal. I was, however, able to detect spatial autocorrelation at the 1,000 square meter scale for the Modified Whittaker plots (r = 0.638224, p = 0.0001), suggesting that there may be more fine scale patchiness, variation, or nestedness among plant species at each of the SRE tidal marshes. Salt may also be a confounding factor that drives spatial diversity of vegetation, in addition to dike removal, as salt is a limiting factor for some salt marsh plants; not all species are equally tolerant of it.

For the ISA (Indicator Species Analysis) test I completed to determine which species were associated with which tidal marsh environments, I found that (1) Triglochin maritima, Sarcocornia perennis, and Distichlis spicata were significant indicators of salty, restored conditions, (2) Deschampsia caespitosa, Juncus arcticus var. littoralis, Potentilla pacifica, Glaux maritima, and Hordeum brachyantherum were significant indicators of upland, high elevation conditions, and (3) Carex lyngbyei was a significant indicator of restored conditions. I broke my dataset up and conducted a Mantel test for each of the groups, using only plots that recorded the presence of at least one species in each of the groups. I did not find any significant autocorrelation with any of the strong indicator species either (all of which were found in a high proportion of the plots surveyed). I am curious whether spatial autocorrelation patterns would begin to emerge if my plots were located closer to each other, and/or if I had surveyed more plots over a larger area.

An example of PC-Ord output for a Mantel test. The r statistic indicates the strength of the correlation between species cover and distance; the p value indicates the significance of that correlation in comparison to chance (computed by Monte Carlo randomizations).
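To show what the r statistic and the Monte Carlo p value mean in practice, here is a minimal Python sketch of a Mantel test. It is not PC-Ord's implementation: it assumes two square, symmetric distance matrices over the same plots (for example, geographic distance and community dissimilarity), a one-sided test for positive correlation, and 999 random permutations. All names are hypothetical.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def upper_triangle(matrix, order):
    """Above-diagonal entries after reordering rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def mantel(dist_a, dist_b, n_perm=999, seed=0):
    """Return (r, p): matrix correlation and one-sided permutation p value."""
    rng = random.Random(seed)
    identity = list(range(len(dist_a)))
    flat_a = upper_triangle(dist_a, identity)
    r_obs = pearson(flat_a, upper_triangle(dist_b, identity))
    hits = 0
    for _ in range(n_perm):
        order = identity[:]
        rng.shuffle(order)  # relabel the plots in the second matrix
        if pearson(flat_a, upper_triangle(dist_b, order)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)
```

The p value is simply the share of random relabelings that correlate at least as strongly as the observed arrangement, so with 999 permutations the smallest attainable value is 0.001.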

I also produced a map in Arc, to visualize the patterns of diversity I saw in my data. I coded each of my plot locations (for Transects and Modified Whittaker plots) by the dominant plant strategy type: Carex lyngbyei dominant (a plot that had at least 50% cover of Carex lyngbyei), salt dominant (at least 50% cover of one or more salt tolerant species), or mid-high marsh dominant (at least 50% cover of a mid-high marsh species that is not Carex lyngbyei). I think the map helps to visualize spatial patterns of diversity well, on a broader scale, by grouping plots into different assemblies or types/guilds based on their content.

Map of my site, indicating species assemblages by plot location; The Reference Marsh is dominated by high elevation species, and the previously diked marshes are all Low Marsh environments, dominated by C. lyngbyei and salt-tolerant species.

I created Species Area curves to examine how species richness increased over an incremental increase in area sampled (from 1 to 10, 100, and 1,000 square meters). A Species-Area curve represents the nonlinear relationship between species richness and scale of area (often modeled as a power law); as the scale of area sampled increases, you may be more likely to find new species. A steeper slope for a species area curve indicates higher richness (number of species) and a shallow curve indicates lower richness. One of the things that can distinguish a species-area curve from a species accumulation curve is the relative ‘nestedness’ of the environment being sampled. As I mentioned earlier, ‘nestedness’ is a measure of structure and distribution of species across a location. An example of nestedness would be a location that may have a few species overall, with subsets of locations with more species, or pockets of diversity and heterogeneity. When nestedness is high, the slope of the species area curve is reduced relative to the species accumulation curve. The opposite occurs when nestedness is low. In this case, the slopes of the species area curves I sampled are lower than the slopes of the species accumulation curves for all sites. This suggests that in terms of the spatial patterns of diversity at Salmon River, there may be nesting or patchiness occurring within tidal marshes, where there may be fewer species overall, especially at restored sites, with pockets of diversity. This inference is consistent with what I observed in the field.
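One common way to put a number on the slope of a species-area curve is to fit a power law, S = c * A^z, by linear regression on log-transformed values. The Python sketch below illustrates this under the assumption that the power-law form is appropriate; it is not how the curves in this post were produced, and the function name is hypothetical.

```python
import math

def fit_species_area(areas, richness):
    """Fit S = c * A**z by least squares on log10(S) versus log10(A)."""
    xs = [math.log10(a) for a in areas]
    ys = [math.log10(s) for s in richness]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    z = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    c = 10 ** (my - z * mx)  # intercept back-transformed from log space
    return c, z
```

With richness counted at the nested areas of 1, 10, 100, and 1,000 square meters, the fitted exponent z could be compared between marshes: a larger z means richness rises faster as the sampled area grows.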

As you can see from this Species-Area graph, the reference marsh has the highest richness of all marshes sampled, though YMarsh is a close second as plot size dimension goes up. Mitchell Marsh and Salmon Creek marsh are both comparably low, suggesting that as scale of area increases, species richness does not increase by much. This is also a slight deviation from the species accumulation curves, which suggested Mitchell Marsh may be more diverse than YMarsh at the 1 meter scale on transect plots. While diversity and richness seem to vary within restored marshes by scale, the Reference Marsh has consistently higher richness, and Salmon Creek has consistently low richness.

This graph shows the trend in species accumulation: the average number of species encountered per square meter transect plot over the cumulative number of plots sampled. These data are the result of a combinatorial procedure in PC-Ord that accounts for every possible combination of x number of plots sampled. You can see from this graph that the square meter plots on the reference marsh have a significantly higher number of cumulative species compared to all of the restored marshes. Salmon Creek has the lowest, and Mitchell and YMarsh are intermediate.
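The idea of averaging over every possible combination of plots can be sketched in Python as follows. This is an illustration of the concept rather than PC-Ord's routine, and it assumes each plot is represented as a set of species codes; the function name is hypothetical.

```python
from itertools import combinations

def accumulation_curve(plots):
    """Mean number of distinct species over every combination of k plots."""
    curve = []
    for k in range(1, len(plots) + 1):
        # Pool the species in each combination of k plots and count them.
        totals = [len(set().union(*combo)) for combo in combinations(plots, k)]
        curve.append(sum(totals) / len(totals))
    return curve
```

The number of combinations grows very quickly with the number of plots, so for a full set of transect plots an exhaustive enumeration like this would be impractical, and averaging over random subsamples would be a reasonable substitute.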

I also compared species accumulation (effort curve) on the square meter plots from the MW plots I sampled and found that species accumulation patterns were overall consistent. The only difference here is that Mitchell Marsh was found to have higher species accumulation over area than YMarsh, which in this case was comparable to Salmon Creek’s low diversity. The difference in YMarsh diversity could be from the placement of MW plots or patch variation; both Salmon Creek and YMarsh have large patches of C. lyngbyei. Ultimately this shows that there is variability in species richness within Restored Marshes, but consistently high richness on the reference marsh.

I conducted a multivariate analysis with my field data using the non-metric multidimensional scaling technique which is ideal for ecological data that is non-normally distributed. NMS avoids linear assumptions and uses ranked distances to linearize relationships within the dataset. This allows the user to see a wider variety of structures within the ordination and make a number of insightful observations and conclusions. Now if any one particular graphic could summarize my entire thesis, this would probably be it, and I will do my best to highlight the most salient features here. First I would like to point out that between each of the tidal marshes, which are represented by these amorphous colored convex hulls, there is very little overlap between all sites, and the reference marsh, on the left here in red, is the most divergent from any other marsh.

The reference marsh is also closely associated with a number of high marsh species that are indicative of native, or ‘reference communities.’ The Reference marsh is high up on the elevation axis, also demonstrating that it has high elevation throughout. YMarsh, the blue convex hull at the bottom, has the longest, widest convex hull, which means that elevation varies throughout the site; this is why you see sample units on the lower end of the elevation axis and towards the middle of the elevation axis. The species that are coded here and directly associated with the YMarsh site are halophytic, or salt tolerant, suggesting that these plants are found at low elevations in areas with high soil salinity and conductivity, which describes the conditions of YMarsh. Mitchell Marsh and Salmon Marsh overlap in this case, likely because they both have lower soil salinity and conductivity values from freshwater influence, and mostly mid to low elevation ranges, with the exception of a few high elevation outliers. Both of these sites seem to be associated with introduced species (reed canary grass, PHAR) or pasture grasses like Agrostis stolonifera.

Predominantly, there were many instances of homogeneous patches of Carex lyngbyei at both of these sites, and at YMarsh as well. In fact, the only point at which all three restored sites converge is at the end of the ‘restoration’ axis over Carex lyngbyei. Carex lyngbyei is also divergent from all other species sampled, because it often occupies monotypic swaths of marsh, and is found on the lower end of the restoration axis, based on my coding schematic; areas like the reference marsh were coded with a low number, and restored marshes were given codes with numerical values increasing with the chronological order of dike removals. Carex lyngbyei sits at the end of the axis associated with the highest ‘restoration axis’ values, as it represents a strong pattern within all marshes that have experienced dike removal at any point in time; thus it is indicative of restored ecosystems. Unfortunately, there were no significant differences found between sites and any of the other soil characteristics we sampled for, but this may be more related to sampling techniques and would be interesting to revisit further. We only collected soil from a surface depth of 10 cm, so perhaps if our samples were collected at a deeper level, we would see stronger patterns related to pH, Bulk Density, and C:N ratios. In summary, the reference tidal marsh vegetation is richer, more diverse, and more complex (heterogeneous) in the number and variety of species (high marsh/low marsh) than restored salt marsh vegetation at Salmon River Estuary, across field methods. Roughly 40 years later, Carex lyngbyei has persistently dominated restored areas post dike removal, which marks a significant departure from patterns of species assemblage on the reference marsh.

NMS ordination for Transect plots, examining species distribution over tidal marshes, elevation and soil salinity.

I also conducted an NMS analysis with my Modified Whittaker (MW) plot data, and the patterns I observed in species associations with environmental characteristics from the transect method were consistent here as well, though we did not collect soil samples for the MW plots. There is one small exception, where the Mitchell Marsh convex hull overlaps with the YMarsh convex hull. This has more to do with the coincidence of similar species found within MW plots on those sites: YMarsh and Mitchell Marsh both had a large presence of C. lyngbyei. In this case, Mitchell Marsh also had instances of salt tolerant species within the MW plots, suggesting that there is variation within Mitchell not only at different scales but at different extents of sampling. This also reinforces the notion that Mitchell Marsh has saltier soils compared to Salmon Creek Marsh, despite both of them having freshwater influences. Also, despite differences in shape, all of the restored marshes’ convex hulls converge over Carex lyngbyei, which is the species most strongly correlated with disturbed and restored conditions.

MW NMS that shows consistent species patterns over elevation and tidal marshes. Soil samples were not collected for MW plots.
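The convex hulls discussed above can be reproduced outside PC-Ord once ordination scores are exported. The following is a minimal sketch, assuming each sample unit has a pair of scores (ordination axis, elevation axis); the `site_scores` values are hypothetical and are not the study data. It builds each site's hull with Andrew's monotone chain and compares hull areas as a rough measure of how widely a site spreads in ordination space.

```python
def convex_hull(points):
    """Return hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when the path o -> a -> b turns counter-clockwise.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each half is the first point of the other half.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula; returns 0.0 for fewer than three vertices."""
    n = len(vertices)
    if n < 3:
        return 0.0
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


# Hypothetical ordination scores per sample unit: (axis 1, elevation axis).
site_scores = {
    "Reference": [(0.1, 0.8), (0.3, 0.9), (0.2, 1.0), (0.25, 0.9)],
    "YMarsh": [(-1.0, -0.9), (1.0, -0.8), (0.0, 0.1), (0.2, -0.5)],
}
hull_areas = {site: polygon_area(convex_hull(pts)) for site, pts in site_scores.items()}
```

With these made-up scores the YMarsh hull is much larger than the Reference hull, which mirrors the qualitative pattern described for the ordination plots.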


6. Significance: How is your spatial problem important to science? to resource managers?

Over 40 years later, the tidal marshes of Salmon River Estuary are still very different, and it’s possible they were different to begin with, based on their unique geographies, that influence salt inundation and soil patterns. Salmon River Estuary salt marshes also appear to have responded to and developed from disturbance differently; each is still following a different restoration pathway 40 years later. Soil salinity, elevation, and inundation patterns (channels) vary by geography among these salt marsh sites in the SRE, and have likely played a role in determining species composition by site. Extensive stands dominated by dense cover of C. lyngbyei represent an alternate stable state for vegetation of SRE salt marshes, and would be an important component to understanding novel community functions, as they relate to restoration and future scenarios. Species assemblages vary both by biogeography (soil, elevation, location) and land use history (pasture use, diking, dike removal).

The spatial problem I have chosen to investigate is of importance to scientists, as it provides further insight into how Pacific Northwest coastal estuaries recover from land use and disturbance, a phenomenon that has not yet been thoroughly studied. This work is valuable to land managers and conservationists who are tasked with coastal wetland mitigation in the PNW, as this case study serves as one of the few examples of long term estuary research on the Oregon coast. Estuaries have historically served as habitat and resources for keystone salmonid fish species, invertebrates, migratory birds, waterfowl, and mammals (such as beavers), particularly in the Pacific Northwest (Klemas 2013). Restoring these habitats is critical for protecting wildlife, managing wetland resources and eco-services, and maintaining our shorelines, especially as we face sea level rise from impending climate change. Threats to environmental stability in the case of wetlands can also harm their cultural value. Wetlands have inherent eco-beauty and are among the many natural systems associated with outdoor recreation. If wetlands are disturbed via pollution, compaction, or compositional change in vegetation, little is understood about the certainty of recovery in the context of reference conditions.

Factors influencing the richness and environmental integrity of estuaries like Salmon River are associated with the physiognomic and taxonomic features of the plant community. This study focuses on the spatio-temporal distribution patterns of Salmon River vegetation to explore how remnant and restored marshes differ in terms of biodiversity and species composition. Salmon River Estuary is especially noteworthy due to its unique management history. Salmon River is exceptional compared to other Oregon coastal wetlands, as it was federally protected before any industries could establish influence. Arguably, Salmon River has avoided most disturbance from development because of its relatively small size; there have been no instances of dredging or jetty construction for the purposes of navigation. Previous use as pastureland, with dikes established in the 1960s and removed from 1978 onwards, encapsulates the majority of known human influence on the marsh. Beginning in 1978, periodic on-site vegetation surveys and long term ecological research at Salmon River have created intimate knowledge of the estuary and promoted ecological sustainability.

There has been a dramatic shift with regard to wetland protection and how our government and the public view wetlands. Over the last few decades, policies promoting wetland conversion and development have been exchanged for protection and regulation initiatives. Wetland management goals today are largely focused on restoration to compensate for loss and damage, which has forged new industries tasked with wetland recovery and monitoring. However, some mitigation project datasets and sites are too small to collect useful data or make a meaningful impact on a large environmental scale. It is necessary to amass a variety of high quality data on larger wetland areas over longer periods of time to address how natural recovery processes may be employed for wetland conservation. Salmon River is an excellent long term case study for examining the prospect of rehabilitation for ecosystem functionality and reference conditions (Frenkel and Moran 1991; Frenkel 1995; Flitcroft et al. 2016).

To recap, there seems to be a false association between restoration and reference conditions in the case of Salmon River. Though Salmon River Estuary is an example of successful restoration, it does not mirror pre-disturbance ecosystem structure. Thus it seems challenging to manage for pristine environments: pre-disturbance conditions are often not well known (or pristine, for that matter), the impacts of disturbance may persist over long periods of time (like C. lyngbyei), and both intact and disturbed wetlands are changing constantly, so it’s impossible to protect them from undue influence. However, we can continue to define and promote functionality in ecosystems, as function may change with structure. As a result, from the work I have done, I would recommend adapting our expectations for Salmon River and for the passive, deliberate restoration of estuaries. I would also recommend continuing to restore for function and to monitor structural changes so we can understand and infer novel function in ecosystem context (Gray et al. 2002).

7. Your learning: what did you learn about software (a) Arc-Info, (b) Modelbuilder and/or GIS programming in Python, (c) R, (d) other?

I have previous experience with Arc that I have developed and expanded upon at Oregon State while pursuing my MS in Geography. I was able to do some work in Arc with mapping my data, but I utilized knowledge that I had acquired previously. Ultimately, I have analyzed most if not all of my data thus far in PC-Ord, under the guidance of Dr. Bruce McCune (the software creator), which enables a variety of ordination and multivariate analysis options. I learned how to conduct Mantel tests, ISA, and other multivariate comparisons within PC-Ord.  I did not learn much additionally about Python or R, except what other students were able to complete from their tutorials.

8. What did you learn about statistics, including (a) hotspot, (b) spatial autocorrelation (including correlogram, wavelet, Fourier transform/spectral analysis), (c) regression (OLS, GWR, regression trees, boosted regression trees), and (d) multivariate methods (e.g., PCA)?

I learned that statistics programs like PC-Ord can be preferable for datasets that have lower or more fine scale spatial resolution; my data were difficult to use in Arc because their format was not easy to interpolate. As a result, I learned a lot about Principal Components Analysis techniques to tease out patterns in my data and look at how species assemblage patterns vary by environmental conditions and site treatments (dike removal). However, I found it helpful to visualize my data in a map, even though I was limited to the point locations of my plots, and to categorize them based on spatial patterns from my plot data (percent cover of species). I also learned how to conduct a Mantel test on my data at a variety of spatial scales to look for autocorrelation.
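For readers unfamiliar with the Mantel test mentioned above, here is a minimal sketch of the idea in Python rather than PC-Ord. It correlates the off-diagonal entries of a geographic distance matrix with those of a community dissimilarity matrix, then permutes plot labels to get a one-sided p-value. The plot coordinates and cover values are hypothetical, and the absolute difference in cover is only a stand-in for whatever dissimilarity measure an analysis actually uses.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Off-diagonal entries of `matrix` after relabelling rows/columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(d1, d2, permutations=999, seed=0):
    """Mantel r and a one-sided permutation p-value (positive association)."""
    identity = list(range(len(d1)))
    x = upper_triangle(d1, identity)
    r_obs = pearson(x, upper_triangle(d2, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)
        # Small tolerance so floating-point noise does not hide ties with r_obs.
        if pearson(x, upper_triangle(d2, perm)) >= r_obs - 1e-12:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical plot coordinates (m) and percent cover of one species per plot.
coords = [(0, 0), (10, 0), (25, 5), (40, 0), (60, 10), (90, 0)]
cover = [80, 75, 60, 50, 30, 10]
geo = [[math.hypot(ax - bx, ay - by) for bx, by in coords] for ax, ay in coords]
diss = [[abs(a - b) for b in cover] for a in cover]
r_obs, p_value = mantel(geo, diss)
```

Running the same function on subsets of plots within increasing distance classes is one way to look at autocorrelation at several spatial scales.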

From other students’ tutorials, I learned about geographically weighted regression (GWR), and how one may examine clustering of particular environmental conditions with the GWR tool in Arc. GWR can show that certain environmental characteristics (like canopy cover, understory vegetation cover, elevation) show positive or negative correlation in different locations. I also learned about hotspot analysis from student tutorials and found that it can be used to infer spatial relationships between environmental variables and location. Hotspot analysis can be useful for looking at density of populations or biodiversity.

9. How did you respond to comments from your peers and mentors/instructors?

I received useful comments from my peers and the instructor (Dr. Julia Jones) about considering how salt inundation on the tidal marshes I study may have a causal relationship to differences in species assemblages, in addition to restoration treatment (diking and dike removal). It’s important to acknowledge confounding factors within my data, as my study is inductive. Dr. Jones also suggested originally that I investigate spatial autocorrelation with Mantel tests, to see how variable species assemblages are within my data. I was able to incorporate feedback that helped me with the formation of my analysis and interpretation of my results.

Literature Cited

Adamus, P.R., J. Larsen, and R. Scranton. 2005. Wetland Profiles of Oregon’s Coastal Watersheds and Estuaries. Part 3 of a Hydrogeomorphic Guidebook. Report to Coos Watershed Association, US Environmental Protection Agency, and Oregon Depart. of State Lands, Salem.

Borde, AM, Thome, RM, Rumrill S, Miller, LM. 2003. Geospatial Habitat Change Analysis in Pacific Northwest Coastal Estuaries. Estuaries 26: 1104-1116.

Flitcroft, RL, Bottom, DL, Haberman, KL, Bierley, KF, Jones, KK, Simenstad, CA, Gray, A, Ellingson, KS, Baumgartner, E, Cornwell, TJ and Campbell, LA. 2016. Expect the unexpected: Place-Based Protections can lead to Unforeseen benefits. Aquatic Conservation: Marine and Freshwater Ecosystems 26: 39-59.

Mather, P.M. 1976. Computational methods of multivariate analysis in physical geography. J. Wiley & Sons, London. 532 pp.

McCune, B. and J. B. Grace. 2002. Analysis of Ecological Communities. MjM Software, Gleneden Beach, Oregon, USA.

National Soil Survey Center, Natural Resources Conservation Service, U.S. Department of Agriculture. 2009. Soil Survey Field and Laboratory Methods Manual. Soil Survey Investigations Report No. 51.

Stohlgren, TJ, Falkner, MB, and LD Schell. 1995. A Modified-Whittaker nested vegetation sampling method. Vegetatio 117: 113-121.

Weilhoefer, C, Nelson, WG, Clinton, P, Beugli, DM. 2013. Environmental Determinants of emergent macrophyte vegetation in Pacific Northwest estuarine tidal wetlands. Estuaries and Coasts 36: 377-389.


Habitat use of blue whales in New Zealand’s industrial South Taranaki Bight region

Filed under: My Spatial Problem 2017 @ 12:44 pm

Research objectives

My research focuses on the ecology of blue whales (Balaenoptera musculus brevicauda) in New Zealand’s South Taranaki Bight region (STB). Despite the recent documentation of a foraging ground in the STB (Torres et al. 2015), blue whale distribution remains poorly understood in the southern hemisphere. The STB is New Zealand’s most industrially active marine region, and the site of active oil and gas extraction and exploration, busy shipping traffic, and proposed seabed mining (Torres 2013). This potential space-use conflict between endangered whales and industry warrants further investigation into the spatial and temporal extent of blue whale habitat in the region. My goals are to investigate the relationship between blue whale presence and their environment, and subsequently to examine how their space-use overlaps with industry presence. Specifically, I intend to:

  • Quantify the relationship between sea-surface temperature (SST), chlorophyll-a (chl-a), krill density, and blue whale presence
  • Investigate the spatial overlap between blue whale presence and oil and gas extraction platforms, the Trans-Tasman Resources Ltd. proposed seabed mining site, and shipping traffic

Map of New Zealand with the South Taranaki Region indicated by the white box.


A blue whale surfaces in front of an oil rig in the South Taranaki Bight. Photo by Kristin Hodge.


I will be working with data collected during vessel-based surveys in February of 2014, 2016, and 2017 in the STB. Blue whale sighting location and group size were recorded by observers during the surveys. To record the oceanographic conditions throughout the water column, profiles of water column depth, temperature, and salinity were recorded using a Sea-Bird microCAT (SBE 911plus) Conductivity, Temperature and Depth (CTD) sensor approximately every hour during survey and at every blue whale sighting. Krill density and patch size will be quantified from hydroacoustic backscatter data collected with a Simrad EK60 echosounder (Simrad ES120-7DD splitbeam transducer, 120 kHz transceiver, 250 W, 1.024 ms pulse length, 0.5 s ping rate). The echosounder data have not yet been processed; however, I hope to do so this term so that the prey data can be included in these analyses.

In addition to the in situ data collected during our surveys, I plan to incorporate satellite imagery of SST and chl-a concentration for the region. I will use satellite data generated from the NASA Moderate Resolution Imaging Spectroradiometer (MODIS).

I have been provided with the locations of the oil and gas drilling platforms and the proposed site for the iron sands seabed mine. I will use ship automatic identification system (AIS) data for the shipping traffic layer.


  • Blue whale presence will show a positive relationship with chl-a concentration and krill density
  • Blue whale presence will show a negative relationship with SST
  • There will be apparent inter-annual differences in blue whale sighting distribution, reflecting the strong El Niño conditions seen in 2016
  • Blue whale presence will overlap spatially with industrial activities


I plan to use ArcMap to visualize blue whale presence and the layers of oceanographic data I have described previously (in situ SST and prey density, remote-sensed SST and chlorophyll-a). I will then use R to compute either a generalized linear model (GLM) or generalized additive model (GAM) to evaluate the association between blue whales and these oceanographic variables, with blue whale presence as the response variable.
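To make the modelling step concrete, here is a minimal sketch of a binomial GLM with a logit link (logistic regression), written in plain Python instead of R. It does not cover the GAM option. The data are hypothetical: SST and chl-a are coded as -1 (low) or +1 (high), and each cell lists how many survey points had whales present or absent. The function name `fit_logistic` and the gradient-ascent fitting are illustrative choices; in R the equivalent would normally be a `glm` call with a binomial family.

```python
import math


def fit_logistic(X, y, learning_rate=0.5, n_iter=2000):
    """Fit a logit-link binomial GLM by gradient ascent on the log-likelihood.

    X is a list of predictor rows (no intercept column); the returned list is
    [intercept, coefficient_1, coefficient_2, ...].
    """
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for row, yi in zip(X, y):
            eta = beta[0] + sum(b * v for b, v in zip(beta[1:], row))
            p = 1.0 / (1.0 + math.exp(-eta))
            err = yi - p  # score contribution of this observation
            grad[0] += err
            for j, v in enumerate(row):
                grad[j + 1] += err * v
        beta = [b + learning_rate * g / n for b, g in zip(beta, grad)]
    return beta


# Hypothetical survey cells: (sst_code, chl_code, n_present, n_absent).
cells = [(-1, 1, 3, 1), (-1, -1, 2, 2), (1, 1, 2, 2), (1, -1, 1, 3)]
X, y = [], []
for sst, chl, n_present, n_absent in cells:
    X += [(sst, chl)] * (n_present + n_absent)
    y += [1] * n_present + [0] * n_absent

intercept, b_sst, b_chl = fit_logistic(X, y)
```

In this toy dataset the fitted SST coefficient is negative and the chl-a coefficient is positive, which is the direction the hypotheses above predict; real data may of course behave differently.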

For the overlap between blue whale presence and industry, I intend to use ArcMap to visualize this spatial overlap by creating buffers around the stationary platforms and the proposed mining site and examining how often blue whale sightings took place within those buffers.
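The buffer check described above can also be sketched without ArcMap. The example below is a rough illustration, assuming sightings and industry sites are latitude/longitude points and that great-circle distance on a spherical Earth is accurate enough at this scale. All coordinates and the 10 km buffer radius are hypothetical placeholders, not the real platform locations.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def fraction_within_buffer(sightings, sites, radius_km):
    """Share of sightings that fall within radius_km of at least one site."""
    inside = sum(
        1
        for s_lat, s_lon in sightings
        if any(haversine_km(s_lat, s_lon, p_lat, p_lon) <= radius_km for p_lat, p_lon in sites)
    )
    return inside / len(sightings)


# Hypothetical coordinates (lat, lon) for illustration only.
industry_sites = [(-39.95, 173.30), (-39.85, 174.20)]
whale_sightings = [(-39.96, 173.32), (-39.95, 173.30), (-39.80, 173.30), (-40.50, 174.00)]
share_inside = fraction_within_buffer(whale_sightings, industry_sites, radius_km=10.0)
```

Repeating the calculation for several radii shows how sensitive the overlap estimate is to the chosen buffer size.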


I intend to produce map figures that will show oceanographic measurements interpolated over our study area, overlaid with our blue whale sighting locations for each year of study. I hope to be able to report the model results that quantify the impact of our measured oceanographic conditions on blue whale presence.

My priority for this course is the habitat analysis and modeling. The industry overlap will be more of a visual examination at this stage, and more quantitative analyses of impacts will take place subsequently once a foundational understanding of blue whale habitat use in the region has been established.


Despite their enormous size and once-large global population, relatively little is known about blue whale distribution and habitat use due to their elusive nature and relative inaccessibility for study. Blue whales have extremely high metabolic demands in addition to employing an energetically expensive lunge-feeding foraging strategy (Croll et al. 1998, Goldbogen et al. 2011, Hazen et al. 2015). The ability to consistently locate dense patches of prey is therefore critical to blue whale survival, making the documentation of foraging grounds such as the STB region important. The STB region presents a unique opportunity for studying this species in a location where they seem to be consistently found in high abundance and relatively close to shore. But beyond their relative accessibility, the understanding of blue whale habitat use on a foraging ground will contribute to the body of knowledge on these endangered and little-studied whales.

Under the New Zealand threat classification system, blue whales are currently listed as ‘Migrant’ in New Zealand waters. However, our preliminary analyses point toward the possibility of a resident population of blue whales in New Zealand. Observations of foraging, breeding, and nursing behaviors suggest that the STB likely supports multiple critical life history functions. The strong industrial presence in the region and the ongoing push for industry expansion make gaining an understanding of the spatial and temporal extent of blue whale habitat critical for management decisions.

A blue whale mom and calf surface in the South Taranaki Bight. Photo by Dawn Barlow.


I have some experience with Arc and R from coursework and my own preliminary analyses. I think that with more time and more practice I will grow more confident in my ability to use them. I have used MATLAB some, but mostly as a platform for running more specialized software programs specific to my field. I have no experience in Python.


Croll DA, Tershy BR, Hewitt RP, Demer DA, Fiedler PC, Smith SE, Armstrong W, Popp JM, Kiekhefer T, Lopez VR, Urban J, Gendron D (1998) An integrated approach to the foraging ecology of marine birds and mammals. Deep Res II 45:1353–+

Goldbogen JA, Calambokidis J, Oleson E, Potvin J, Pyenson ND, Schorr G, Shadwick RE (2011) Mechanics, hydrodynamics and energetics of blue whale lunge feeding: efficiency dependence on krill density. J Exp Biol 214:131–146

Hazen EL, Friedlaender AS, Goldbogen JA (2015) Blue whales (Balaenoptera musculus) optimize foraging efficiency by balancing oxygen use and energy gain as a function of prey density. Sci Adv 1:e1500469–e1500469

Torres LG (2013) Evidence for an unrecognised blue whale foraging ground in New Zealand. New Zeal J Mar Freshw Res 47:235–248

Torres LG, Gill PC, Graham B, Steel D, Hamner RM, Baker S, Constantine R, Escobar-Flores P, Sutton P, Bury S, Bott N, Pinkerton M (2015) Population, habitat and prey characteristics of blue whales foraging in the South Taranaki Bight, New Zealand.

Nitrate Abundance in 10 Oregon Watersheds as linked to percentage of Red Alder Forest Cover

Filed under: My Spatial Problem 2017 @ 11:31 am

Research Description

I am exploring aquatic chemistry among 10 streams in the Oregon Coast Range. I am running anion analysis over a period of 1 year, sampling monthly. These streams are divided into three categories: three are located east of Mary’s Peak in the Rock Creek Watershed, three are near the top of Mary’s Peak (highest in elevation), and four are sampled on the Oregon Coast within 200 meters of the Pacific Ocean. I would like to perform a comparison analysis of anions between the three categories, and determine whether there is any significant difference in anion abundance between the three locations and over time (6 months of data collected so far).

Dataset Analysis

The dataset I will be analyzing includes temporal anion data (for 6 months, project still in progress) and GPS points for all 10 site locations. The anions analyzed include chloride, sulfate, fluoride, and nitrate. Nitrate may be the most interesting as far as ecological impact is concerned (it is known to be limiting in aquatic ecosystems), but chloride has also been abundant in the data so far.


I hypothesize that, due to the influence of red alder in the forested watersheds of the Oregon Coast Range, the Oregon Coast and Rock Creek watershed streams will have a higher amount of measurable nitrate than those streams sampled near the top of the watershed (near Mary’s Peak). I also predict that nitrate will be higher in the fall (November and December) than in the spring (March and April) due to the “flushing” of nutrients from the forest floor after summer low flows.

Analysis Approach

I would like to perform statistical analysis on these streams as they relate to nitrate using R and map the differences using ArcGIS. Preferably, I would like to delineate the 10 watersheds and then compare the percentage of red alder in each watershed with the nitrate concentrations measured in the stream.
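One simple way to compare red alder cover with nitrate across watersheds is a rank correlation, which does not assume a linear relationship. The sketch below computes Spearman's rho in plain Python (the same statistic R returns from `cor` with the Spearman method). The alder percentages and nitrate concentrations are hypothetical values for nine watersheds, not measured data.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical watershed summaries: percent red alder cover and mean nitrate (mg/L).
alder_pct = [5, 12, 20, 33, 41, 48, 55, 60, 72]
nitrate_mg_l = [0.08, 0.15, 0.12, 0.30, 0.41, 0.38, 0.55, 0.62, 0.90]
rho = spearman_rho(alder_pct, nitrate_mg_l)
```

With only nine or ten watersheds, the correlation is best read alongside a scatter plot rather than as a stand-alone test.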

Expected Outcome

I would like to produce a map showing nitrate export at all 10 study sites, as well as statistical evidence of differences among the categorical zones. This map will include the delineated watersheds; because one watershed is “nested”, the output will likely include 9 watersheds, ranked by red alder abundance and nitrate concentration. I’d also like to explore a statistical analysis comparing nitrate concentrations among the three categorical locations.
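For the three-location comparison, one option that makes few distributional assumptions with so few streams is a permutation test on the one-way ANOVA F statistic. The sketch below is illustrative only: the nitrate values are hypothetical, the group sizes (3, 3, 4) follow the stream counts described above, and the function names are my own.

```python
import random


def f_statistic(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    values = [v for g in groups for v in g]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def permutation_anova(groups, permutations=999, seed=1):
    """Observed F and the share of label shuffles with F at least as large."""
    f_obs = f_statistic(groups)
    pooled = [v for g in groups for v in g]
    sizes = [len(g) for g in groups]
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= f_obs - 1e-12:
            hits += 1
    return f_obs, (hits + 1) / (permutations + 1)


# Hypothetical mean nitrate (mg/L) per stream, grouped by location category.
nitrate_by_zone = [
    [0.05, 0.08, 0.06],        # near the top of the peak
    [0.30, 0.42, 0.35],        # Rock Creek watershed
    [0.50, 0.61, 0.45, 0.58],  # coast
]
f_obs, p_value = permutation_anova(nitrate_by_zone)
```

Because monthly samples from the same stream are not independent, this sketch uses one summary value per stream; a repeated-measures or mixed model would be needed to use the monthly series directly.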


Baseline data collection for anions in Oregon Coast Range streams over a year has not been done before. In the face of a changing climate, it is important to document and characterize stream chemistry for forest managers in the future. Increases in nitrate in these streams can affect local flora and fauna, as well as increase nutrient loading to the ocean. This loading can affect macroinvertebrate and salmon habitat, which can alter the ecosystem dynamics and services that currently exist.

Level of preparation

I have basic knowledge of R, but I think it would be the most useful tool for statistical analysis of the data. Arc-Info is a program I at one time used daily, but it has been many years since. I would like to be able to map the anion data between the 10 sites but may struggle in the beginning linking the data with Arc. I have no experience with Python or other relevant spatial analysis software.

Patch dynamics for short-interval disturbances of beetle outbreak and wildfire

Filed under: My Spatial Problem 2017 @ 11:05 am

Research Question

What controls high-severity patch size during wildfire when landscapes have large portions of dead forest from a prior disturbance of beetle outbreak?


Fire is a complex landscape process that is controlled by the heterogeneity of vegetation/fuels, topography, and weather.  The landscapes created by fire may dictate forest structure and function for decades. In central interior British Columbia, lodgepole pine (Pinus contorta var. latifolia) forests have been the epicenter of a recent regional outbreak of mountain pine beetle (Dendroctonus ponderosae), which has resulted in widespread tree mortality and potentially increased forest vulnerability to wildfire. While lodgepole pine landscapes are accustomed to large-scale singular disturbances of beetle outbreaks and wildfire, beetle outbreaks followed by wildfires are less familiar. Warm winters and an abundance of mature lodgepole pine have facilitated the unprecedented scale of beetle activity in the region. The resulting tree mortality alters vegetation/fuel structure, fire behavior, and subsequent fire severity. It is unclear if the resulting landscape structure from beetle outbreak influences the subsequent patchwork from fire.  A number of regional studies have examined the controls of post-fire patch size under a single disturbance of wildfire (Collins and Stephens 2010, Harvey et al. 2016, Reilly et al. 2017).  An important knowledge gap I have identified is the role of underlying disturbance structure on patch dynamics during wildfire events.

In this course, my primary objective is to determine the controls of patch size for high severity patches under conditions of consecutive landscape scale disturbances of beetle outbreak and fire.  In order to do this, there are a number of intermediary steps/questions to ask and answer:

  1. What are the patch sizes following beetle outbreaks?
  2. What are the patch sizes following wildfire?
  3. What patches are high-severity fire?
  4. What are the topographic roughness characteristics associated with each high severity patch?
  5. What is the radiative output (proxy for burning conditions) for each high-severity patch during day of burn?
  6. What is the dominant vegetation type and moisture level associated with each high severity patch prior to disturbance?
  7. What is the relationship between each explanatory variable – topography, burning conditions, vegetation, and beetle outbreak patch size, to the response variable –high-severity patch size?
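Questions 1 to 3 come down to finding contiguous groups of cells in a classified raster and measuring their size. The sketch below shows that idea with a breadth-first flood fill on a small binary grid. It is not the Fragstats implementation; the grid is hypothetical, and the 0.09 ha cell area simply follows from a 30 m Landsat pixel (30 m x 30 m = 900 m²).

```python
from collections import deque

CELL_AREA_HA = 0.09  # one 30 m x 30 m pixel


def patch_sizes(grid, eight_neighbor=True):
    """Cell counts of each contiguous patch of 1s, in row-major discovery order."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if eight_neighbor:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]  # diagonals also connect cells
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for dy, dx in steps:
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < rows and 0 <= nx < cols and grid[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(size)
    return sizes


# Hypothetical high-severity mask (1 = high severity, 0 = other).
high_severity = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1],
]
sizes_cells = patch_sizes(high_severity)
sizes_ha = [n * CELL_AREA_HA for n in sizes_cells]
```

The neighbor rule matters: with diagonal connections the grid above has two patches, while the four-neighbor rule splits it into three.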


Figure 1. Post-fire photo of a fire that burned through beetle-killed forest in 2014 with various patches of fire severity.

 Description of datasets

  1. The spatial extent of all layers is defined by the fire perimeters for three different fires that burned in 2012, 2013, and 2014.
  2. Tree mortality from the bark beetle outbreak has been identified by calculating a Normalized Difference Vegetation Index (NDVI) layer derived from Landsat 30m resolution from one year prior to each fire event.
  3. Fire severity patches will serve as the response variable. Fire severity patches were generated from calculating the Relative differenced Normalized Burn Ratio (RdNBR) from Landsat 30m resolution (Miller and Thode 2007). RdNBR is calculated by generating an NBR image for both pre- and post-fire, then calculating the dNBR, which is then used to calculate the RdNBR. The purpose of RdNBR is to minimize the bias of the pre-fire conditions.  This relativization process allows for comparison across fire events with minimal biasing from differences in pre-fire conditions.
  4. The pre-disturbance landscape structure including vegetation and moisture types will be based on a polygon shape file with the biogeoclimatic zones for the region. Biogeoclimatic zones are a specific classification system for British Columbia, Canada. Zones are based on climax vegetation while accounting for climatic and edaphic conditions (Meidinger and Pojar 1991).
  5. The gently rolling landscape of the central interior offers little topographic relief. While topography may not be a strong control over patch size, we will account for topographic variability with a layer for topographic roughness. Topographic roughness was calculated from a 3-by-3 moving window using a 25 m Digital Elevation Model (DEM).
  6. Weather is considered a strong control for fire severity and patch size. Day of burn weather conditions are based on the thermal band hotspot data from MODIS. The hotspot data is a point shape file that includes a recording of radiative energy for specific point locations.
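The RdNBR calculation in item 3 can be written out per pixel as follows. This is a sketch under stated assumptions: NBR is computed from near-infrared and shortwave-infrared (SWIR2) reflectance, NBR and dNBR are scaled by 1000 as is conventional, and the relativization divides dNBR by the square root of the absolute pre-fire NBR, following the form given by Miller and Thode (2007). The reflectance values are hypothetical.

```python
import math


def nbr(nir, swir2):
    """Normalized Burn Ratio from NIR and SWIR2 reflectance."""
    return (nir - swir2) / (nir + swir2)


def rdnbr(nbr_pre, nbr_post):
    """Relative differenced NBR, with dNBR scaled by 1000.

    nbr_pre and nbr_post are unscaled NBR values in [-1, 1]. Dividing the
    scaled dNBR by sqrt(|nbr_pre|) is equivalent to dividing by
    sqrt(|scaled pre-fire NBR / 1000|).
    """
    dnbr_scaled = 1000.0 * (nbr_pre - nbr_post)
    denominator = math.sqrt(abs(nbr_pre))
    if denominator == 0.0:
        return float("nan")  # undefined where pre-fire NBR is exactly zero
    return dnbr_scaled / denominator


# Hypothetical reflectances for one pixel before and after fire.
pre = nbr(nir=0.40, swir2=0.10)    # healthy canopy: high NIR, low SWIR2
post = nbr(nir=0.15, swir2=0.25)   # burned: NIR drops, SWIR2 rises
pixel_rdnbr = rdnbr(pre, post)
```

Thresholds for classifying the result into severity classes depend on the region and field calibration, so none are suggested here.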


My ecological hypothesis is that burning conditions (i.e. weather during day of burn) exert strong control over patch size.  Fire conducive weather (dry and windy conditions) controls large patch development by facilitating crown fire in lodgepole pine dominated landscapes.  Beetle outbreaks change the landscape structure by killing trees, which results in decreased continuity of canopy fuels. The lack of canopy fuels may constrain fire spread to the surface and thus limit patch size. However, fire weather may still override the changes in fuel arrangement and continuity created by beetle outbreak, where dry windy conditions produce large patches regardless of underlying patch characteristics from the beetle outbreak.


  1. Calculate patch variables for tree mortality from the beetle outbreak using Fragstats (McGarigal et al. 2012). Characterize patch size, complexity, and configuration.
  2. Calculate patch variables for high-severity patches using Fragstats (McGarigal et al. 2012). Characterize patch size, complexity, and configuration.
  3. Overlay layers and extract explanatory variables (topographic roughness, radiative energy, beetle outbreak patch size) that correspond to individual patches in ArcMap.
  4. Conduct multiple regression analysis to determine which variables influence high severity patch size using R statistical software.
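Step 4 can be illustrated with a small ordinary least squares fit. The sketch below solves the normal equations with Gaussian elimination in plain Python, which is fine for a handful of predictors; in R the equivalent would normally be an `lm` call. The patch table is hypothetical: each row holds topographic roughness, radiative energy, and beetle patch size for one high-severity patch, and the units are placeholders.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(X, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical patch table: (roughness, radiative energy, beetle patch size in ha).
predictors = [
    (2.1, 150.0, 12.0), (3.4, 90.0, 30.0), (1.8, 210.0, 8.0), (2.9, 60.0, 45.0),
    (4.0, 120.0, 20.0), (2.5, 300.0, 15.0), (3.1, 180.0, 60.0), (1.5, 75.0, 5.0),
]
high_severity_ha = [40.0, 22.0, 65.0, 15.0, 30.0, 95.0, 48.0, 18.0]
coefficients = ols(predictors, high_severity_ha)
```

Patch sizes are often strongly right-skewed, so a log transform of the response may be worth checking before interpreting coefficients.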

Expected Outcomes

I would like to produce maps showing characteristics of the landscape. I would also like to have analysis showing if there is a relationship between any of these variables and patch size.


Overlapping disturbances of beetle-outbreak and wildfire will influence landscape structure and function for decades in central interior British Columbia, Canada.  It is important to characterize patch dynamics to generate ecological context for overlapping disturbances. This research can provide insight on patterns and controls for fires that were allowed to burn unhindered by suppression, which is important for understanding contemporary disturbance regimes and providing guidance for management decisions.

Level of Preparation

  • Arc: fairly proficient
  • Model builder and Python: have used both, but probably a bit rusty; novice
  • R: growing knowledge; able to conduct statistical analysis and process Landsat imagery


Collins, B. M., and S. L. Stephens. 2010. Stand-replacing patches within a “mixed severity” fire regime: Quantitative characterization using recent fires in a long-established natural fire area. Landscape Ecology 25:927–939.

Harvey, B. J., D. C. Donato, and M. G. Turner. 2016. Drivers and trends in landscape patterns of stand-replacing fire in forests of the US Northern Rocky Mountains (1984–2010). Landscape Ecology 31:2367–2383.

McGarigal, K., A. Cushman, and E. Ene. 2012. FRAGSTATS v4: Spatial patterns analysis program for categorical and continuous maps. University of Massachusetts, Amherst, Massachusetts.

Meidinger, D. V, and J. Pojar. 1991. Ecosystems of British Columbia. Page B.C.

Miller, J. D., and A. E. Thode. 2007. Quantifying burn severity in a heterogeneous landscape with a relative version of the delta Normalized Burn Ratio (dNBR). Remote Sensing of Environment 109:66–80.

Reilly, M. J., C. J. Dunn, G. W. Meigs, T. A. Spies, R. E. Kennedy, J. D. Bailey, and K. Briggs. 2017. Contemporary patterns of fire extent and severity in forests of the Pacific Northwest, USA (1985-2010). Ecosphere 8:e01695.

Addressing Spatial Patterns in Multispectral Imagery of SW White Pine Seedlings Grown in Common Garden Boxes

Filed under: 2017,Final Project,My Spatial Problem 2017 @ 8:49 am

1. The research question I explored in GEOG 566 is: “What relationship do spectral reflectance signatures of southwestern white pine (P. strobiformis) seedlings in common garden boxes have with box number and distance from center of box?”

Another way to word the question is: “How do spectral responses differ among the boxes that seedlings are grown in, or with the position of the seedlings within the box?”

2. My raw data consist of 500 photos taken from a UAV of common garden boxes in northeastern Arizona. I used Agisoft Photoscan to compile the images into a 5 layer orthomosaic (Figure 1) where each layer represents a discrete band from the multispectral sensor (Table 1). Processed mosaic images are georeferenced using RTK GPS coordinates for targets which were arranged around the perimeter of the AOI. Once georeferencing is completed, the stacked orthomosaic is exported from Photoscan as a TIFF file which can be viewed and manipulated further in ArcMap or R. Two additional spectral index layers (NDVI and TGI) were easily created from the other bands using raster algebra (Figure 2).
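The raster algebra behind the two index layers can be sketched per pixel as follows. NDVI is the standard normalized difference of near-infrared and red reflectance. For TGI the sketch uses the general triangular greenness form, -0.5 * [(λr - λb)(R - G) - (λr - λg)(R - B)]. The band-center wavelengths (475, 560, 668 nm) are assumptions about the sensor, since Table 1 is not reproduced here, and the reflectance values are hypothetical.

```python
# Assumed band centers in nm; replace with the values from the sensor's band table.
BLUE_NM, GREEN_NM, RED_NM = 475.0, 560.0, 668.0


def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    return (nir - red) / (nir + red)


def tgi(red, green, blue):
    """Triangular Greenness Index for one pixel, using the assumed band centers."""
    return -0.5 * ((RED_NM - BLUE_NM) * (red - green) - (RED_NM - GREEN_NM) * (red - blue))


def apply_index(index_fn, *bands):
    """Apply a per-pixel index function across equally sized 2-D band grids."""
    return [
        [index_fn(*pixel) for pixel in zip(*band_rows)]
        for band_rows in zip(*bands)
    ]


# Hypothetical 2 x 2 reflectance grids for each band.
blue = [[0.04, 0.05], [0.06, 0.08]]
green = [[0.10, 0.09], [0.08, 0.09]]
red = [[0.05, 0.06], [0.09, 0.10]]
nir = [[0.50, 0.45], [0.30, 0.20]]

ndvi_layer = apply_index(ndvi, nir, red)
tgi_layer = apply_index(tgi, red, green, blue)
```

On a full orthomosaic the same arithmetic would be applied to whole arrays at once with a raster library; nested lists are used here only to keep the example dependency-free.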

Analyses were conducted on the orthomosaic raster which is in .TIF format and is 336MB in size. The extent of the scene is about 10m x 29m and includes 1697 seedlings. The pixel width of the image (spatial resolution) is about 5mm. This fine scale allows for individual leaves to be detected in the image.

Figure 1: True color composite orthomosaic image of common garden boxes containing southwestern white pine (P. strobiformis) seedlings in Kaibab National Forest, Arizona, USA.

Table 1: MicaSense RedEdge multispectral sensor band designations and spectral information.

Figure 2: Normalized Difference Vegetation Index (NDVI) layer for one common garden box.

3. Based on visual inspection of the NDVI hot spot analysis (Figure 3), there appear to be patterns at both the box and plot levels. Specifically, the healthiest seedlings are mostly grouped in the center of boxes and the least healthy ones are mostly grouped around the edges. Also, the healthiest and least healthy seedlings seem to occur in different boxes nearly all of the time.

Based on this, my hypothesis was that spatial statistics would reveal patterns between the spectral reflectance signatures of individual plants and individual boxes and/or the location of the plant within its box.

Furthermore, I hypothesized that by regressing on these spatial variables, the distribution of spectral responses would resemble more what we expect in a common garden experiment: randomness.

Figure 3: Hot spot analysis reveals visual patterns in mean crown NDVI both across and within boxes.

4. Following the initial hot spot analysis of mean crown NDVI in ArcMap, I used ordinary least squares regression (OLSR) to view the distributions of my spatial variables compared to NDVI. The results support the notion that hypothesis 1 is correct. Next, I used geographically weighted regression (GWR) to test the first hypothesis and weigh the variables. Because the multiple regression seems to bring out the randomness in the spectral data, hypothesis 2 was also well supported.

5. The results of my OLSR analysis (Figure 4) supported my first hypothesis. The box number (BOX_) variable has an obviously non-normal distribution, and the distance from center of box variable (BOXCENTER) has a skewed distribution. The other two variables, distance to nearest neighbor (NN) and distance to plot center (PLOTCENTER), appeared to be normally distributed. As a result, I was able to confidently move forward and look into the second hypothesis only for the two spatial variables mentioned (BOX_ and BOXCENTER).
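For readers unfamiliar with ordinary least squares, the fit of NDVI against a single spatial variable can be sketched as follows. This is a simplified, single-predictor illustration in plain Python and not the ArcMap tool used for the analysis; the helper `ols_fit` and the four data points are hypothetical.

```python
def ols_fit(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (intercept, slope, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variance in y explained by the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return intercept, slope, 1.0 - ss_res / ss_tot


# Hypothetical values: distance from box center (m) and mean crown NDVI.
boxcenter = [0.0, 1.0, 2.0, 3.0]
crown_ndvi = [0.80, 0.70, 0.60, 0.50]
intercept, slope, r2 = ols_fit(boxcenter, crown_ndvi)
```

A negative slope in a fit like this would correspond to the visual pattern in the hot spot map, where seedlings near box edges tend to have lower NDVI.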

Figure 4: Distributions of 4 spatial variables vs. NDVI.

To investigate my second hypothesis I created GWR maps of my seedlings, first with BOX_ as a lone explanatory variable (Figure 5), then with BOX_ and BOXCENTER together (Figure 6). The resulting data are much less obviously spatially autocorrelated, suggesting that the second hypothesis could also be confirmed.

Figure 5: GWR of NDVI using box number as an explanatory variable reveals a large reduction in the number of clumped outlier datapoints and an increase in the apparent ‘randomness’ of spectral signatures.

Figure 6: Though the effect is less pronounced than the box number regression, adding distance to box center to the GWR did seem to further increase randomness, especially at the east and west extremes of the area of interest.

6. My analyses are significant for a few main reasons. For my own research, they mean that I now have a protocol for processing this sort of dataset in a way that allows me to account for spatial patterns. I did not expect to see a box effect of this magnitude but I am much more equipped to address it moving forward.

For the larger project I am part of, it means that we need to account for (or at least test for) the box effect in the context of all analyses. If the boxes are really as different from one another as my spectral data suggest, then some of the assumptions other members of my research team are making could be invalid. One way to investigate this could be testing soil moisture content across many boxes to look for differences.

For science as a whole, there is promise that I will be able to reliably phenotype seedlings based on relative drought resistance in the next two years. Even more exciting, my workflow can also be used in other experiments with seedlings grown in common garden boxes, including disease resistance screening. If effective, these techniques could greatly reduce the man hours required to conduct these screenings and allow for larger, more comprehensive experiments without requiring as much funding.

7. I learned immensely in this course, especially about tools in the “spatial statistics” toolbox in ArcMap. I had not conducted hot spot analysis, OLSR, or GWR with my own data before. After doing so, I feel much more capable of carrying out and interpreting these.

Also, I was able to create a complex and versatile R script over the course of the term. Within R I learned to test for and execute principal components analysis (PCA) and to automate many of the steps in my processing workflow. Without a doubt, the latter will save me hundreds of hours. For that, I am simultaneously impressed, proud, and grateful.

8. Though I knew about the spatial statistics toolbox, I did not know how any of its functions worked nor did I know how to interpret the results. After completing this term, I know how the data need to be formatted to analyze patterns, map clusters, and model spatial relationships in ArcMap. Also, though I still have much to learn, I can now explain what the results of some of these analyses mean within the context of actual data.

With regard to learning how to use R, I have been in an exponential period of learning since I started using it this past February. Just this term I have been introduced to several packages that have exciting applications within my research interests. For more details about how I used R to automate my data analysis, see my tutorial 2 here.

9. The comments I received about my tutorials and presentations have been helpful for me to stay on track with my analysis. Periodically, Dr. Jones re-oriented me on the specific question I was asking and how I could refrain from spurious analyses. Because I have so many variables and no experience doing this type of work, that orientation was especially useful and in the end instrumental in the volume of what I was able to accomplish.

My peers were helpful because they gave me a sounding board to bounce my ideas off of in the small group presentations. By reading over the synopses of my short talks I was able to get an idea of how well I explained my study and results. I’ve tailored this final edit to my spatial problem with their comments in mind in order to give the best final presentation possible.

Sensitivity of Forest Structure Parameters Estimation to Unmanned Aircraft System (UAS) Data Processing Algorithms

Filed under: 2017,My Spatial Problem 2017 @ 12:40 am


The emission of carbon dioxide (CO2) is one of the greatest factors causing climate change. Forests, as the largest terrestrial carbon reservoir, play an important role in reducing CO2 emissions and preventing climate change, so it is necessary to manage them sustainably. In sustainable forest management, forest managers must frequently monitor and measure forest structure parameters, such as number of trees, tree locations, tree heights, crown width, biomass, and forest carbon. However, these forest structure parameters, including forest carbon, are usually measured through conventional field measurements, which are costly and labor intensive. New technology, such as Unmanned Aircraft Systems (UAS), can be used as an alternative method to monitor and measure forest structure parameters. Several programs are designed to process UAS visual imagery and measure forest structure parameters from it, for example Agisoft PhotoScan, Pix4D, Fusion, TreVaw, TrEx, and Watershed.

Research question:

The purpose of my research is to evaluate several programs that are designed to measure forest structure parameters, such as number of trees, tree locations, diameter at breast height (DBH), tree heights, and crown width, which are needed to estimate biomass and forest carbon. I will base my comparisons on manually segmented tree inventory counts, measured using Fusion software, that are randomly sampled from a hierarchical design. The research will answer two questions: (1) How does accuracy differ between data processing programs when measuring forest structure, biomass, and carbon? (2) What is the best program and model for measuring forest structure, biomass, and carbon? However, for the purpose of this class, I might eliminate some forest structure parameters and focus on number of trees and tree locations.


I hypothesize that all software used in this study can be used to measure forest structure parameters and to estimate forest biomass and carbon, but that the programs will differ in accuracy because each uses a different algorithm.


This research will use Unmanned Aircraft System (UAS) visual imagery taken with a Sony NEX-5R camera mounted on a fixed-wing UAS. The data were gathered in Siberut National Park in West Sumatra Province, Indonesia. There are 3 flights with 300 images in each flight. The image resolution is 350 x 350 inch. I will discuss with my committee members whether or not I should use all the images. I think I will focus on one flight for this class.


This study will use a modeling approach to measure forest structure parameters based on Unmanned Aircraft System (UAS) data. The sampling plots will be selected hierarchically using ArcGIS. The 3D point clouds will be generated using Agisoft PhotoScan and Pix4D. Then, based on the 3D point clouds, forest structure parameters, which include number of trees, tree locations, diameter at breast height (DBH), tree heights, and crown width, will be measured using several forest data processing programs (Fusion, TreVaw, TrEx, and Watershed). In addition, I will develop allometric equations and use those equations to statistically estimate biomass and carbon based on tree height and DBH or canopy width. The allometric equations will be based on the linear relationship between forest structure parameters (tree heights, DBH, and canopy width), forest biomass, and forest carbon. I will use RStudio to do the statistical analysis (RMSE and linear regression models). Again, for the purpose of this class I will only focus on number of trees and tree locations. I also will not use all the programs for this class; I will probably focus on ArcGIS, Agisoft, Fusion, and TreVaw.
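The RMSE comparison between automatic and manual segmentation can be sketched as follows. This is an illustrative Python version (the analysis itself is planned for RStudio); the function name `rmse` and the per-plot tree counts are hypothetical.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical tree counts per sampling plot, for illustration only.
manual_counts = [12, 15, 9, 20]  # manual segmentation (reference)
auto_counts = [10, 15, 11, 19]   # one automatic program
error = rmse(auto_counts, manual_counts)
```

Computing one RMSE per program against the same manual reference gives a single number per program that can be ranked, with lower values indicating closer agreement with the manual counts.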

Image 1. The example of hierarchically selected sampling plots using ArcGIS.

Image 2. The example of 3D point cloud generating process in Agisoft PhotoScan.


Expected outcome:

If the hypothesis is true, then the accuracy of each program will differ because each uses a different algorithm. One program and model will perform best when compared to manual segmentation using Fusion. However, none of the automatic segmentations should differ greatly from the manual segmentation (this will be tested statistically).


This research will assess and propose UAS as a low cost alternative method for measuring forest structure parameters. Forest structure parameters are usually measured by forest managers using conventional methods of field ground measurements. These conventional methods are costly and labor intensive. By using UAS, forest managers will be able to monitor forest structure parameters in a faster and cheaper way. Furthermore, this research will assess and find the best program and model to estimate forest structure parameters through automatic segmentation. This new method will be very useful, especially when it is used in remote areas with limited accessibility, where it is almost impossible to do field ground measurements.

Level of Preparation:

I have some experience using ArcGIS, but only at a basic level, and I need to learn more about this software. I used RStudio in two statistics classes (ST 511 and ST 512), and now I am taking ST 513. I do not have much experience with the other software that I will use in this research (Agisoft PhotoScan, Pix4D, Fusion, TreVaw, TrEx, and Watershed), but I will learn how to use it.


April 6, 2017

Evaluating interspecific competition between wolves and cougar in northeast Oregon

Filed under: My Spatial Problem 2017 @ 3:36 pm

Background & Research Question(s)

The research I am conducting is focused on understanding and quantifying the competitive interactions between two apex predators, wolves and cougar. Populations of large carnivores have been expanding across portions of their historical range in North America, and sympatric wolves (Canis lupus) and cougars (Puma concolor) share habitat, home ranges, and prey resources (Kunkel and Pletscher 1999, Husseman et al. 2003, Ruth 2004), suggesting these coexisting predators may be subject to the effects of interspecific competition (interference or exploitative). Interspecific competition among carnivores can affect the spatial distribution, demography, and population dynamics of the weaker predator (Lawton and Hassell 1981, Tilman 1986), but demonstrating asymmetric competition effects through quantified measures or experiments has remained difficult for highly mobile terrestrial mammals like large carnivores. Generally, it is expected that cougar are the subordinate competitor in wolf-cougar interactions, but the frequency and strength of agonistic interactions can be system-specific and will determine the overall influence either predator has on the population dynamics of the community (Kortello et al. 2007, Atwood et al. 2009, Creel et al. 2001). My goal is to assess the influence wolf recolonization in northeast Oregon has had on cougar spatial dynamics. For this course, I plan to focus on two analyses associated with questions about how wolves influence the distribution of sites where cougar acquire food (prey) resources and how wolves influence cougar movement paths. More specifically, I hope to address:

  1. Is niche partitioning of kill sites evident between wolves and cougar in northeast Oregon?
  2. Has wolf presence/recolonization changed where cougar kill prey?
  3. Has wolf presence/recolonization changed cougar movement dynamics?


The data I’ll be using to address the above questions come from location points downloaded from global positioning system (GPS) collars that were deployed on a sample of wolves and cougar in northeast Oregon (Figure 1). Collars were programmed to obtain 8 locations per day and were downloaded every 2-3 weeks to identify location “clusters” for field investigation as potential kill sites. After identifying “clusters” from location data for each carnivore, we loaded the coordinates for the clusters onto handheld GPS units and systematically searched surrounding areas for prey remains. When prey remains were located, we recorded the spatial coordinates of the kill site and determined the species, age, sex and physical condition of the prey item that was killed. This process was done before (2009 – 2012) and after (2014 – 2016) wolves recolonized a 1,992 km² area (Mt. Emily WMU) and produced a data set of kill site locations for cougar (pre-wolf n = 1,213; post-wolf n = 541) and wolves (n = 158).


Figure 1. Location of the Mt. Emily Wildlife Management Unit in northeast Oregon and global positioning system (GPS) locations for cougars monitored to determine kill site distribution and spatial dynamics pre- (2009-2012) and post-wolf recolonization (2014-2016).


Based on competition theory and emerging evidence on wolf-cougar interactions in other systems (Alexander et al. 2006, Kortello et al. 2007, Atwood et al. 2009), I expect cougar and wolves in northeastern Oregon to exhibit resource partitioning in foraging niche (resource partitioning hypothesis). Representative evidence for resource partitioning between carnivores would be if cougar kill sites occur on the landscape in areas disparate from wolf kill sites. Further, I expect the presence of wolves to affect cougar movement patterns and spatial distribution, which may be evident through a shift in the distribution and space used by cougar between pre- and post-wolf recolonization periods (niche shift [competitive exclusion] hypothesis). Under this premise, I would also expect cougar to alter their movement and space use relative to pre-wolf recolonization patterns (active avoidance hypothesis).

  1. Is niche partitioning of kill sites evident between wolves and cougar in northeast Oregon?

H1: Resource partitioning hypothesis – cougar kill site distribution on the landscape will occur spatially partitioned from wolf kill sites to avoid interference competition with wolves.

  2. Has wolf presence/recolonization changed where cougar kill prey?

H2: Niche shift (competitive exclusion) hypothesis – cougar may demonstrate altered foraging niche (prey acquisition sites will begin to occur on steeper slopes) and prey species selection (increased numbers of mule deer) between time periods with and without wolf presence.

  3. Has wolf presence/recolonization changed cougar movement dynamics?

H3: Active avoidance hypothesis – cougar will alter their spatial distribution and movement paths to avoid interference competition with wolves.


I will be using ArcMap to visualize data and extract site-specific habitat features for kill sites, but am also interested in avenues to address my questions I may not have thought of. I plan to examine changes in cougar kill site characteristics in a logistic regression framework via latent selection difference (LSD) functions (Czetwertynski 2007, Latham et al. 2013). LSDs allow for direct comparison of habitat selection between two groups of interest to contrast the difference in selection through quantifiable measurements of relationship strengths. The model takes the form,

w(x) = exp(β₁x₁ + β₂x₂ + … + βᵢxᵢ)

where w(x) represents the relative probability of wolves (coded as 1) occurring on the landscape compared to cougar (coded as 0), thus using predator species as the dependent variable. A selection coefficient βᵢ is estimated for each predictor variable (xᵢ) from a vector of covariates (x) and is interpreted as the relative difference in selection between wolves and cougar, not the selection or use of a given habitat (Czetwertynski 2007). I will be using the lme4 package (Bates et al. 2011) in program R to estimate coefficients contrasting the differences between cougar and wolves, and between cougars pre- and post-wolf recolonization, as they relate to kill site habitat features. For the first LSD analysis, known cougar and wolf kill sites will be regressed to evaluate the relative difference in selection for features explicit to these sites (Table 1). For the second LSD analysis, pre- (coded as 1) and post-wolf (coded as 0) cougar kill sites will be regressed to evaluate the relative selection differences in cougar kill sites as they relate to wolf presence.
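To make the interpretation of the LSD function concrete, the sketch below evaluates w(x) for two sites once coefficients are in hand. It is an illustration in Python rather than the lme4 model fit itself; the function name, the two covariates, and the coefficient values are all hypothetical and are not fitted results.

```python
import math


def lsd_relative_probability(coefficients, covariates):
    """Evaluate w(x) = exp(b1*x1 + b2*x2 + ... + bi*xi) for one site."""
    linear_predictor = sum(b * x for b, x in zip(coefficients, covariates))
    return math.exp(linear_predictor)


# Hypothetical coefficients for (slope in degrees, distance to road in km).
betas = [-0.05, 0.30]

# Two hypothetical kill sites that differ only in slope.
site_a = [10.0, 1.0]
site_b = [20.0, 1.0]

w_a = lsd_relative_probability(betas, site_a)
w_b = lsd_relative_probability(betas, site_b)

# Ratio of relative probabilities between the steeper and the gentler site.
ratio = w_b / w_a
```

With wolves coded as 1, a ratio below 1 for the steeper site would mean that, relative to cougar, wolf kill sites are less likely on steeper terrain; it says nothing about absolute use of steep terrain by either species.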

I also plan to evaluate cougar movement (step length and turn angles) for differences between pre- and post-wolf time periods (question 3) using Geospatial Modelling Environment (GME version 7.2.0, Beyer 2012) and program R. I will also use R to summarize this data (dplyr, MASS, car packages) and evaluate/test for differences using ANOVA and/or non-parametric tests.
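Step lengths and turn angles can be derived from consecutive GPS fixes as sketched below. This is a plain-Python illustration, not the GME routine; it assumes coordinates are already in a projected system with units of meters (for example UTM), and the function name and the short track are hypothetical.

```python
import math


def step_lengths_and_turn_angles(points):
    """Compute step lengths and turn angles from time-ordered (x, y) fixes.

    Returns (steps, turns): steps in the units of the coordinates, turns in
    degrees within [-180, 180), positive for counterclockwise (left) turns.
    """
    steps = []
    headings = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        steps.append(math.hypot(dx, dy))
        headings.append(math.atan2(dy, dx))
    turns = []
    for h0, h1 in zip(headings, headings[1:]):
        change = math.degrees(h1 - h0)
        # Wrap the change in heading into [-180, 180).
        turns.append((change + 180.0) % 360.0 - 180.0)
    return steps, turns


# Hypothetical track: east 10 m, north 10 m, then west 10 m.
track = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
steps, turns = step_lengths_and_turn_angles(track)
```

A track of n fixes yields n - 1 step lengths and n - 2 turn angles, which can then be summarized by period (pre- and post-wolf) before testing for differences.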

Table 1. Landscape structure characteristics of interest for use in regression and latent selection difference (LSD) function modeling in relation to wolf-cougar prey kill site occurrence in northeast Oregon.

Expected Outcome

I plan to produce visuals (graphs/tables/maps) that depict the statistical relationships between:

  • Wolf and cougar kill sites
  • Cougar kill sites pre- and post-wolf recolonization
  • Cougar movement pre- and post-wolf recolonization


Predation is recognized as a major factor influencing population dynamics in which the direct and indirect effects of predators can influence community level dynamics (Terborgh and Estes 2010, Estes et al. 2011). Predation effects on prey populations are tied to the complexities of intraguild dynamics, as the predation risk for shared prey can vary relative to the nature of predator-predator interactions as well as based on the behavioral responses of prey to predators (Atwood et al. 2009). In addition to addressing key ecological questions regarding predator-predator interactions, the results from this research will provide information on the effect of wolves on cougar populations, and potential effects of this expanded predator system on elk and mule deer populations. Knowledge gained from this study will be critical to effective management and conservation of cougar in Oregon and could be useful to other parts of western North America facing similar changes in community dynamics as wolves continue to expand their range.


  • ArcInfo: intermediate/advanced, comfortable with all ESRI products and finding independent sources where I do have knowledge gaps
  • Modelbuilder and Python: novice, have used both but limited knowledge of their strengths and limitations
  • R: intermediate, comfortable manipulating data, using base statistical packages for descriptive stats and more advanced regression modeling
  • Other: GME (intermediate), QGIS (novice)


Atwood, T.C., E.M. Gese, and K.E. Kunkel. 2009. Spatial partitioning of predation risk in a multiple predator-multiple prey system. Journal of Wildlife Management 73:876-884.

Beyer, H. 2012. Geospatial Modelling Environment (software version

Bates, D., M. Maechler, and B. Bolker. 2011. lme4: linear-mixed-effects models using S4 classes. Version 0.999375-42. Available online:

Estes, J.A., J. Terborgh, J.S. Brashares, M.E. Power, J. Berger, W.J. Bond, S.R. Carpenter, T.E. Essington, R.D. Holt, J.B.C. Jackson, R.J. Marquis, L. Oksanen, T. Oksanen, R.T. Paine, E.K. Pikitch, W.J. Ripple, S.A. Sandin, M. Scheffer, T.W. Schoener, J.B. Shurin, A.R.E. Sinclair, M.E. Soule, R. Virtanen, and D.A. Wardle. 2011. Trophic downgrading of planet Earth. Science 333:301-306.

Creel, S., G. Sprong and N. Creel. 2001. Interspecific competition and the population biology of extinction-prone carnivores. Pages 35-60 in J.J. Gittleman, S.M. Funk, D. Macdonald, and R.K. Wayne (eds). Carnivore Conservation. Cambridge University Press, Cambridge, UK.

Czetwertynski, S.M. 2007. Effects of hunting on the demographics, movement, and habitat selection of American black bears (Ursus americanus). Ph.D. thesis, Department of Renewable Resources, University of Alberta, Edmonton, Canada.

Husseman, J.S., D.L. Murray, G. Power, C.M. Mack, C.R. Wegner, and H. Quigley. 2003. Assessing differential prey selection patterns between two sympatric carnivores. Oikos 101:591-601.

Kortello, A.D., T.E. Hurd, and D.L. Murray. 2007. Interactions between cougar (Puma concolor) and gray wolves (Canis lupus) in Banff National Park, Alberta. Ecoscience 14:214-222.

Kunkel, K.E. and D.H. Pletscher. 1999. Species-specific population dynamics of cervids in a multipredator ecosystem. Journal of Wildlife Management 63:1082-1093.

Latham, A.D.M., M.C. Latham, M.S. Boyce, and S. Boutin. 2013. Spatial relationships of sympatric wolves (Canis lupus) and coyotes (C. latrans) with woodland caribou (Rangifer tarandus caribou) during calving season in a human-modified boreal landscape. Wildlife Research 40:250-260.

Lawton, J.H. and M.P. Hassell. 1981. Asymmetrical competition in insects. Nature 289:793-795.

Ruth, T.K. 2004. Patterns of resource use among cougars and wolves in northwestern Montana and southeastern British Columbia. Ph.D. dissertation, University of Idaho, Moscow, ID, USA.

Terborgh, J. and J.A. Estes. 2010. Trophic cascades: predators, prey, and the changing dynamics of nature. Washington, DC: Island Press.

Tilman, D. 1986. Resources, competition and the dynamics of plant communities. Pages 51-74 in M.J. Crawley ed. Plant Ecology. Oxford: Blackwell Scientific Publications.

Amphibian Disease in the Cascades Range, OR

Filed under: My Spatial Problem 2017 @ 1:06 pm

Carson Lillard
My Spatial Problem


Ranavirus (Rv) and chytrid fungus (Bd) are both deadly pathogens that infect amphibian hosts all over the world. Amphibian die-offs associated with both pathogens have occurred throughout North America, but not much is known about ranavirus in Oregon due to most research being concentrated on Bd. I plan to sample for ranavirus in the Cascades Range, OR. Therefore, one of my objectives is to detect the presence or absence of ranavirus in Oregon, starting in the Cascades Range via eDNA. In preparation, I would like to be able to create an occurrence probability map for ranavirus to determine sites that may contain the virus throughout Oregon to sample. However, I am not sure if needed information is available due to the lack of research on life history of the virus. After data is collected, I would like to assess influence of spatial patterns related to geography, land-use type, pathogen movement, and climate on the distribution of the two diseases, ranavirus and chytrid fungus.

  • Is ranavirus present in Oregon and at what scale?
  • Can ranavirus presence be predicted using statistical methods and life history data?
  • What kind of spatial patterns are associated with ranavirus and chytrid fungus presence, movement, and spread in the Cascades Range?
    After much exploration of data and problems associated with my original spatial problem, I have slightly revised my questions. I originally wondered if I could obtain information about some of the factors that might influence the presence of 3 amphibian diseases (Ranavirus, Bd, and Bsal) and perhaps create a habitat suitability model. To do this, I was going to need information on the current distributions of the pathogens. I quickly realized that most of this baseline information is missing, which makes it hard to create a predictive model. My new approach is therefore to create a model after I collect data on disease prevalence in my area of interest (Deschutes National Forest). I will collect up to 160 samples from the Deschutes National Forest and perform cross-validation by withholding half of the samples and testing a model built from the remaining half against them. By doing this, I can create a region-specific model of predicted occurrence after samples are collected. Once I collect information on pathogen distribution, I would also like to answer some questions using spatial statistics. My collection starts this summer, so I had no data to work with; instead I used artificial data I created.

    Slightly revised questions are:

    • Are ranavirus, Bd, and Bsal present in Deschutes NF, and at what scale?
    • What kind of spatial patterns are associated with pathogen presence, movement, and spread?
      • (Exercise 2; Tutorial 1; I asked a smaller question within this bigger question- “How is Bd presence/absence related to the distance from road or trail?”)
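The validation plan described above (withholding half of up to 160 samples) can be sketched as a simple random split. This is a minimal Python illustration under the assumption that samples are identified by integer IDs; the function name `split_half` and the seed are hypothetical, and no real sample data are involved.

```python
import random


def split_half(sample_ids, seed=0):
    """Randomly split sample IDs into a model-building half and a withheld half."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical IDs for 160 water samples.
sample_ids = list(range(1, 161))
training, withheld = split_half(sample_ids, seed=42)
```

The occurrence model would be fit to `training` only, and its predictions compared against the observed presence or absence in `withheld`.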


Spatial patterns related to geography, land use type, pathogen movement, and climate will affect the presence of ranavirus and chytrid fungus throughout the Cascades Range, OR.

Ranavirus will be more prevalent in areas that see high cattle and/or visitor use.

Ranavirus and chytrid will not be as prevalent in geographically isolated regions.

Bd and ranavirus will be more prevalent at sites where Pseudacris regilla (Pacific chorus frog) occurs, owing to the species' higher mobility.

Ranavirus and chytrid presence and load will vary spatially due to climate, temperature, and elevation.

With a warming climate, pathogens will spread to high elevation ecosystems (if they aren’t already there) and amphibian species will be more at risk of infection in those ecosystems.

Revised hypotheses:

  • I hypothesize that there will be a greater presence of pathogens at lower elevations because of a longer seasonal activity period due to warmer temperatures.
  • I hypothesize that the more amphibian species present the more likely a pathogen is to be present because of likely transmission between hosts.
  • I hypothesize that the size of the water body will influence detection rates of pathogens because of dilution of the pathogen in bigger water bodies.
  • (Exercise 2; Tutorial 1) I hypothesize that Bd presence will be higher in areas that see higher visitor use because of higher transmission rates.


I would like to learn how to map or predict occurrence of pathogens, and understand all the moving parts needed to do so. I would also like to learn the best way to represent and interpret spatial data related to pathogens since they are very complex and context-dependent. I am mainly interested in using ArcGIS programs or learning about other software that is more tailored to my question.

Expected Outcome
I would like to ultimately learn about statistical methods leading to statistical relationships based on pathogen occurrence and spatial patterns as mentioned above. If enough information is available, I would like to produce a map predicting ranavirus occurrence probability in Oregon, or globally, such as the map below showing predicted global Bd occurrence from Xie et al. 2016.


As previously stated, ranavirus and chytrid fungus are both deadly pathogens that infect amphibians all over the world. Amphibian die-offs associated with both pathogens have occurred throughout North America, but not much is known about ranavirus in Oregon due to most research being concentrated on Bd. Ranavirus die-offs are thought to have occurred in Oregon, implying that the virus is present, but no samples have been taken. Knowing whether or not ranavirus is present will give managers and researchers baseline data for making more educated decisions. Interpreting any spatial patterns would also be helpful in learning more about the life history and eco-pathology of both ranavirus and chytrid fungus.


I am intermediate to proficient with ArcMap, having taken GEOG 560 and 561. I am currently in GEOG 562 and learning Python. I also have some novice experience with RStudio from a stats course.

An investigation into the correlation between snowpack and post-wildfire forest greening.

Filed under: 2017,Final Project,My Spatial Problem 2017 @ 10:14 am

Research Question

My spatial question looked into the relationship between winter snowpack and re-vegetation following a severe wildfire. I used variograms and binned scatter plots to characterize this relationship in a visual manner.


To estimate snowpack, I used our research group’s new snow cover frequency (SCF) product, which uses satellite reflectance data from the Moderate Resolution Imaging Spectroradiometer (MODIS) to measure the percentage of snow-covered days over a user-defined span of time. MODIS imagery is acquired daily and is available from the year 2000 to the present. The spatial resolution of MODIS imagery is 500 meters. This project considered October 1 to May 1 to be the relevant timeframe for each year included in the analysis. So in the example below, the highest pixel value for SCF (0.58) corresponds to the interpretation that 58% of the valid days from October 1 to May 1 were snow-covered.
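The idea behind the SCF value for a single pixel can be sketched as below. This is a simplified Python illustration, not the research group's actual product code; it assumes each daily observation is coded 1 (snow), 0 (snow-free), or None (invalid, for example cloud-obscured), and the function name and the short series are hypothetical.

```python
def snow_cover_frequency(daily_obs):
    """Fraction of valid days that were snow-covered, or None if no day is valid."""
    valid = [obs for obs in daily_obs if obs is not None]
    if not valid:
        return None
    return sum(valid) / len(valid)


# Hypothetical 10-day series for one pixel (a real season runs October 1 to May 1).
pixel_series = [1, 1, None, 0, 1, None, 0, 1, 0, 1]
scf = snow_cover_frequency(pixel_series)
```

Here 5 of the 8 valid days are snow-covered, so the pixel's SCF is 0.625; invalid days are left out of both the numerator and the denominator.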

MODIS reflectance data will also be used to estimate forest greening. The Enhanced Vegetation Index (EVI) is calculated using red, near-infrared, and blue wavelength bands to estimate canopy greenness, a quality which depends on leaf area, chlorophyll content, and canopy structure. EVI images are readily available in the form of 16-day composites collected at a resolution of 500 m. For each summer season (June 1 – September 30), EVI data will be condensed across bands to create a maximum summer EVI image which will be utilized in the spatio-temporal analysis.
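The two steps described above, computing EVI and reducing a season of composites to a maximum image, can be sketched as follows. This is a plain-Python illustration that assumes the standard MODIS EVI coefficients (gain 2.5, C1 = 6, C2 = 7.5, L = 1); the function names and the small grids are hypothetical, and the project itself works from ready-made EVI composites.

```python
def evi(nir, red, blue):
    """Enhanced Vegetation Index from surface reflectance values (0-1 scale)."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


def max_composite(images):
    """Per-pixel maximum across a list of equally sized 2D grids."""
    return [
        [max(pixel_values) for pixel_values in zip(*rows)]
        for rows in zip(*images)
    ]


# EVI for one hypothetical pixel.
example_evi = evi(nir=0.5, red=0.1, blue=0.05)

# Three hypothetical 2 x 2 EVI composites from one summer season.
composites = [
    [[0.20, 0.35], [0.10, 0.40]],
    [[0.30, 0.25], [0.15, 0.38]],
    [[0.28, 0.32], [0.12, 0.45]],
]
summer_max = max_composite(composites)
```

Taking the per-pixel maximum keeps the greenest observation of the season for each location, which reduces the influence of composites affected by cloud or smoke.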

Wildfire burn perimeter and severity data will be obtained through the Monitoring Trends in Burn Severity (MTBS) project. Analysis will be constrained to wildfires within the CRB occurring over forested land cover with large areas of high burn severity. This data is available back to 1984, but will only be considered post-2000 due to the limiting data availability of MODIS imagery.

To look at soil texture, I will be using the State Soil Geographic dataset (STATSGO), which includes soil polygon shapefiles along with soil texture information that I have joined to the spatial layer. This layer was created by generalizing more detailed soil survey maps. When these detailed maps were not available, the survey used data on geology, topography, vegetation and climate to predict the probable classification.


The primary relationship driving this study is that between winter snowpack and the following summer’s revegetation after a severe wildfire. My hypothesis was that snowpack is most strongly correlated with forest greening directly following a severe wildfire, and that the strength of this correlation decreases with time following the wildfire. Additionally, I expected to find the snowpack-greening relationship to be strongest where soils are coarse and skeletal with low water-holding capacity.


My research question utilized two approaches. First, I computed a series of variograms and cross-variograms with the general goal of understanding the patchiness of my data. I initially created variograms of both SCF and EVI raster layers for the pre- and post-fire year. Overall I found that the patchiness of both the snow and vegetation data remained consistent across the pre- and post-fire years. I then created cross-variograms between SCF and EVI to compare spatial autocorrelation between my snow cover data and vegetation data to see how these variables change together across space within the perimeter of the Spur Peak wildfire, and to see how this relationship changes temporally.
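To make the idea of a cross-variogram concrete, the sketch below computes an empirical cross-semivariance between two co-located variables, binned by lag distance. It is a plain-Python illustration, not the code used for this project (the variograms here were produced in R). The function name, the fixed-width lag bins, and the toy data are all assumptions.

```python
import math
from collections import defaultdict
from typing import Dict, Sequence, Tuple


def cross_variogram(coords: Sequence[Tuple[float, float]],
                    a: Sequence[float],
                    b: Sequence[float],
                    lag_width: float,
                    max_lag: float) -> Dict[float, float]:
    """Empirical cross-semivariance of variables a and b by lag bin.

    For every pair of points closer than max_lag, accumulate
    (a_i - a_j) * (b_i - b_j) in the bin of their separation distance,
    then divide by twice the number of pairs in that bin.
    Returns {bin centre: cross-semivariance}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            if d >= max_lag:
                continue
            k = int(d // lag_width)  # index of the lag bin
            sums[k] += (a[i] - a[j]) * (b[i] - b[j])
            counts[k] += 1
    return {(k + 0.5) * lag_width: sums[k] / (2 * counts[k])
            for k in sorted(counts)}


# Toy transect: three points 1 unit apart, with b exactly twice a.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
scf = [1.0, 2.0, 3.0]
evi_values = [2.0, 4.0, 6.0]
print(cross_variogram(coords, scf, evi_values, lag_width=1.0, max_lag=3.0))
```

Passing the same variable for both `a` and `b` gives the ordinary semivariogram, which is how the single-variable SCF and EVI variograms relate to the cross-variograms.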

Second, and more in line with my thesis research, I shifted my attention to the Pot Peak wildfire, where I had already been analyzing the post-wildfire snow-revegetation relationship prior to this class using a different approach (Figure 1). Specifically, I plotted EVI against SCF with EVI as the dependent variable. To standardize the changes in snow and vegetation, I’ve plotted the change in EVI against the change in SCF by subtracting each post-fire year condition from the pre-fire year conditions.

To elucidate the potential role that burn severity and soil type have on this relationship, I’ve binned the data in two ways. To monitor potential effects of different soil textures, I delineated and separated the SCF/EVI raster data by soil polygon boundary. To visualize the effect that burn severity might have on this relationship, I used burn severity thresholds recommended by Miller and Thode (2007) to colorize by highly burned, moderately burned and control (low/no burn) pixels. Finally, for every year and for each class of burn severity, I conducted a regression analysis including a regression line and the corresponding R-squared values and p-values.
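The binning and per-class regression described above can be sketched as follows. This is an illustrative stand-in written in Python, not the MATLAB workflow used for the figures. The function names are hypothetical, and the severity cut-offs in the example are placeholders rather than the thresholds from Miller and Thode. P-values are left out because they require a t-distribution, which the Python standard library does not provide.

```python
from statistics import mean
from typing import Dict, Iterable, List, Optional, Tuple

Fit = Tuple[float, float, float]  # (slope, intercept, r_squared)


def classify_burn(severity: float, moderate_min: float, high_min: float) -> str:
    """Assign a pixel to a burn class from its severity value."""
    if severity >= high_min:
        return "high"
    if severity >= moderate_min:
        return "moderate"
    return "control"


def linear_fit(x: Iterable[float], y: Iterable[float]) -> Optional[Fit]:
    """Ordinary least squares fit of y on x with its R-squared."""
    x, y = list(x), list(y)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        return None  # no variation, so the fit is undefined
    slope = sxy / sxx
    return slope, my - slope * mx, sxy ** 2 / (sxx * syy)


def fit_by_class(pixels: Iterable[Tuple[float, float, float]],
                 moderate_min: float, high_min: float) -> Dict[str, Optional[Fit]]:
    """Group (severity, delta_scf, delta_evi) pixels by burn class and fit each."""
    groups: Dict[str, List[Tuple[float, float]]] = {}
    for severity, d_scf, d_evi in pixels:
        cls = classify_burn(severity, moderate_min, high_min)
        groups.setdefault(cls, []).append((d_scf, d_evi))
    # Skip classes with too few pixels to fit a meaningful line.
    return {cls: linear_fit(*zip(*pts))
            for cls, pts in groups.items() if len(pts) >= 3}


# Hypothetical pixels; 300 and 600 are placeholder cut-offs for illustration.
example_pixels = [(700, 0.1, 0.2), (800, 0.2, 0.4), (900, 0.3, 0.6), (100, 0.1, 0.0)]
print(fit_by_class(example_pixels, moderate_min=300, high_min=600))
```

Running the same function once per post-fire year and per soil polygon would reproduce the layout of the scatter-plot grids, with one fit per burn class in each panel.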

Figure 1: Pot Peak wildfire burn severity map. Colored polygons represent different soil types; brighter pixels indicate higher burn severity.

Results, Part 1: Cross variograms

A couple of apparent trends are evident in the time-series of cross-variograms shown below (Figures 2a-d). First, although each year shows two clear bumps in the cross-variogram, the lag distance of those bumps shifts significantly from year to year. My guess would be that the initial shift from pre-fire to the first year post-fire was primarily due to the variation in burn severity across the landscape. The shifts in ensuing years may be due to varying rates of vegetation recovery or perhaps due to changes in snowpack distribution caused by the wildfire. It’s also worth noting that the valley between these two peaks remains fairly consistent at a lag distance of 5000 meters. In tandem, these trends suggest that there may be two patch sizes for this wildfire (of shifting size), but that the between-patch distance remains fairly consistent at about 5000 meters. While I learned quite a bit regarding the usage of variograms and how to interpret them, these findings ultimately didn’t shed much light on the spatial question that I was asking, so I decided to change routes into a completely different toolbox.

Figure 2a: Pre-fire cross variogram between SCF and EVI

Figure 2b: First year post-fire cross variogram

Figure 2c: Second year post-fire cross variogram

Figure 2d: Third year post-fire cross variogram


Results, Part 2: Regression scatter plots

For the second part of my project, I switched gears completely and brought in some of the work I had been doing outside of class on the Pot Peak wildfire in central Washington. I’ve included the scatter plots for all 3 soil types below (each group of 6 plots is a single soil polygon), but only conducted a regression analysis on the Wedge-Fernow soil type (Figure 3a), characterized by a primarily ashy and pumiceous soil texture.

The most noticeable result is the apparent strong relationship that exists in the first year following the wildfire for both the moderate and high burn pixels. The R-squared values were 0.753 and 0.657 for the moderately and highly burned pixels, respectively. The p-values for these pixels were also quite low. Therefore, in this first year post-fire, we see a significant relationship suggesting that for an increase in snow cover frequency, a resulting increase in greenness (EVI) is likely to be observed. Interestingly, this relationship dissolves completely in the second and third years following the wildfire. The R-squared values drop to near zero and the p-values increase significantly. The control plots show no significant correlation in any of the post-fire years.

For the other two soil type polygons (Figures 3b and 3c) no significant trend exists in any years following the wildfire. The only apparent instance where the moderate and high burn pixels have noticeably different relationships between snow cover and greenness following the wildfire is within the Saska-Ramparter soil type (Figure 3c). However, looking back at the entire burn zone (Figure 1), we can see that there is a clear separation of a highly burned patch of land to the north and a moderately burned patch of land to the south. There is likely some topographic factor, perhaps vegetation type or elevation, that is driving this difference as opposed to the scale of burn severity.

Figure 3a: Scatter plots and regression analysis for the Wedge-Fernow soil type polygon. Red points correspond to highly burned vegetation, yellow points correspond to moderately burned vegetation, and blue points correspond to unburned or very-low burned vegetation. Included are the R-squared and associated p-values for both the moderately and highly burned areas.


Figure 3b: Scatter plots for the McCree-Ardenmont soil type

Figure 3c: Scatter plots for the Saska-Ramparter soil type



Snow accumulation has already been shown to influence peak summer forest greenness, especially at moderate elevations (Trujillo et al. 2012). The post-wildfire relationship between snowpack and forest revegetation is critical to understand as current trends of increasing temperatures, more ephemeral snowpack and intensifying wildfire activity are all forecasted to continue. Consequently, an expanding area of western mountain regions is becoming vulnerable to disturbance, revegetation and successional growth.

Forest managers and watershed managers may find the analysis of this research useful in preparing for future climate regimes. As wildfires continue to become more prevalent, having a comprehensive understanding of the ecological impacts of such disturbances will be critical for effective management of post-wildfire landscapes.

The ecological implications of this research are multi-faceted, especially regarding the changing climate affecting western U.S. mountains. Forests in the CRB are significant contributors to carbon sequestration, as western forests are responsible for 20-40% of total carbon sequestration in the contiguous U.S. (Schimel et al. 2002; Pacala et al. 2001). Depending on how successfully forests recover following wildfires, western forests’ role as carbon sinks versus carbon sources may become more uncertain in future climate scenarios.

Software learning

ArcGIS: Coming into this class, I was already very comfortable with ArcGIS tools and capabilities. I did learn about a couple of new tools, including adding XY coordinates to a point shapefile to allow for smoother spatial analysis in R.

R: Before this class, I was only familiar with R in the context of a past statistics course using tabular data. Through this research question, notably the variogram analysis, I became more comfortable reading in raster layers to R, searching for appropriate R packages, and ultimately analyzing spatial data.

MATLAB: For the second half of my project, I primarily worked in MATLAB in creating the scatter plots. Through this process I learned a great deal regarding general plotting syntax, data access from a csv or text file, and categorical data binning.

Statistical learning

While I was already familiar with the creation and interpretation of scatter plots and regression statistics, the concept of semivariograms and cross-variograms was completely new to me. Working together with other students and sharing our thoughts regarding variogram and cross-variogram interpretation proved to be extremely helpful in understanding what the variograms were telling me. Throughout the course of this project, I even observed a couple of seminar talks in which variograms were referenced, so I was thankful to be able to quickly grasp the implications of the variograms that were being displayed.

Response to exercise/tutorial comments

Following Exercise 3, it was suggested that perhaps I should change research methods by stepping away from variograms and instead involve the research that I was already conducting outside of class. This was the primary reason for switching analysis approaches for my final project. Following my Tutorial 2 presentation, we discussed the limitations of my current method, especially the fact that there are so many other influential topographic factors that play into the revegetation of the burn zone. Moving forward, I’m planning on comparing the 5 years before the wildfire with the 5 years after the wildfire over the exact same extent to account for this issue.


Miller, J. D. & Thode, A. E. Quantifying burn severity in a heterogeneous landscape with a relative version of the delta Normalized Burn Ratio (dNBR). Remote Sensing of Environment 109, 66–80 (2007).
Pacala, S. W., Hurtt, G. C., Baker, D. & Peylin, P. Consistent land- and atmosphere-based U.S. Carbon Sink Estimates. Science 292, 2316–2320 (2001).
Schimel, D. et al. Carbon sequestration studied in western U.S. mountains. Eos Trans. AGU 83, 445–449 (2002).
Trujillo, E., Molotch, N. P., Goulden, M. L., Kelly, A. E. & Bales, R. C. Elevation-dependent influence of snow accumulation on forest greening. Nature Geosci 5, 705–709 (2012).


April 5, 2017

My Spatial Problem

Filed under: 2017,My Spatial Problem 2017 @ 8:01 pm

This is the blog page for Spring 2017 – My Spatial Problem posts

Using radar to quantify water in forests

Filed under: 2017,My Spatial Problem 2017 @ 1:54 pm

My research focuses on using radar measurements to isolate the water content of a landscape. This will involve comparing the backscatter from measurements taken during wet periods and dry periods. The areas of interest will mostly be single-age monoculture forest stands, for simplicity in isolating backscatter due to trees (branches, trunks, etc.) from backscatter due to water.

The dataset I’m working with is the Sentinel-1 dataset from the European Space Agency. This dataset covers the whole Earth with 6-day repetition, and has been collecting data since 2014. On the ground, the resolution is anywhere from 5 m to 20 m, depending on the method of scan used and what pre- and post-processing has been done.

My hypothesis is that canopy density influences water interception. I expect the water content of heavily forested areas to remain higher for longer periods than that of more open terrain (e.g. cropland, urban areas) or younger stands of forest.

For this data, it first needs to be converted from slant-range to ground-range in order to be analyzed by geostatistical software. Because of the geometry of radar acquisition, the product will not line up geographically without some processing. In addition, the values in each pixel (digital number) need to be converted to a meaningful number, such as backscatter.
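As an illustration of the digital-number conversion, Sentinel-1 Level-1 products are calibrated by dividing the squared pixel value by the square of a calibration value taken from the look-up table shipped with the product. The sketch below applies that relationship to a single pixel and converts the result to decibels. The function name and the example numbers are assumptions, and in practice a tool such as SNAP would normally handle this step over the whole image.

```python
import math
from typing import Optional


def dn_to_sigma0_db(dn: float, cal_sigma: float) -> Optional[float]:
    """Convert a digital number to sigma-nought backscatter in decibels.

    dn        : pixel digital number (amplitude) from the image
    cal_sigma : sigma-nought calibration value for that pixel, taken from
                the product's calibration look-up table
    """
    if dn <= 0 or cal_sigma <= 0:
        return None  # the logarithm is undefined for non-positive values
    sigma0_linear = (dn ** 2) / (cal_sigma ** 2)
    return 10.0 * math.log10(sigma0_linear)


# Hypothetical values for one pixel.
print(dn_to_sigma0_db(dn=100.0, cal_sigma=1000.0))
```

Working in decibels compresses the very large dynamic range of radar backscatter, which makes wet and dry acquisitions easier to compare.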

Once the processing steps have been completed, I will then need to build a weather database to line up the pixels in the images with how “wet” that pixel was at time of acquisition, as well as the uncertainty in the weather variables. The radar data is measured across the whole landscape, but weather stations are point-measurements, and any variables measured (dew point, precipitation, temperature, etc.) will increase in error with increasing distance from the station. For this course, I will focus on building the weather maps for my area and time.
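One simple way to turn station point measurements into a value at each pixel is inverse distance weighting (IDW). The post does not say which interpolation method will be used, so the sketch below is only an assumed example of the general idea: nearer stations get more weight, which reflects the point that station values become less reliable with distance. The function name, the power parameter, and the station values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def idw(stations: Sequence[Tuple[float, float, float]],
        target: Tuple[float, float],
        power: float = 2.0) -> float:
    """Inverse-distance-weighted estimate at a target location.

    stations : (x, y, value) tuples, e.g. rain accumulation at each station
    target   : (x, y) location of the pixel to estimate
    power    : how quickly a station's influence decays with distance
    """
    numerator = 0.0
    denominator = 0.0
    for x, y, value in stations:
        d = math.dist((x, y), target)
        if d == 0:
            return value  # the target sits exactly on a station
        weight = 1.0 / d ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Two hypothetical stations with 10 mm and 20 mm of rain.
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
print(idw(stations, target=(1.0, 0.0)))
```

IDW does not provide an uncertainty estimate on its own; a geostatistical method such as kriging would be needed to map the uncertainty mentioned above.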

The weather map will most likely be of rain accumulation prior to time of radar data acquisition. By comparing a large number of radar acquisitions, my hope is to capture the full range of water levels for a more robust model.

The expected outcome is to produce a statistical relationship between age of forest (or type of forest) with water quantity (normalized for time since it rained last). I also would like to produce maps, since I will be doing most of the work using maps and analyzing maps.

This problem has the potential to inform managers concerned with water retention. If my prediction is correct and denser forests retain water for longer, then it provides land managers with data on how dense or sparse they should plant if they want to take water interception and retention into account.

I have enough experience in ArcGIS to build my weather map, but I would like to learn how to better streamline the process by using ModelBuilder or Python programming. I don’t have any experience with Python beyond exposure in classes that I have taken. I am pretty good at R on my own, and I know how to find the answers to most problems I come across via internet searches. I am getting more familiar with ENVI and SNAP, which is the native software for processing the Sentinel-1 radar data.


Assessing Ponderosa Pine Plantations in the Willamette Valley

Research Question

My research explores the extent and distribution of ponderosa pine (Pinus ponderosa) plantations in the Willamette Valley, and the understory plant communities associated with them. In this class, I will be substituting my own field data with FIA (forest inventory and analysis) data collected by the USFS. The data I am collecting and assessing includes information about the plantation environments (or plot conditions, for the sake of this class), and vegetation response. I am primarily interested in determining how ponderosa pine canopy density affects understory species cover and richness. 


Willamette Valley ponderosa pine (WVPP) is a race of ponderosa pine that is native to western Oregon. WVPP grew extensively in the Valley pre-white settlement, and could be found in open woodlands alongside Oregon white oak (Quercus garryana). Burning by indigenous peoples, such as the Kalapuya, maintained the open structure of the forest that supported the growth of WVPP. After settlers began suppressing these once-frequent fires, fire-intolerant species, such as Douglas-fir (Pseudotsuga menziesii) began encroaching upon the WVPP range. Settlers also made use of the pine, harvesting it in great quantities for homesteads and other building materials. Soon, only a few small islands of original WVPP remained. These islands were genetically weak, and the species was threatened by potential extinction. 

Very few studies have been conducted on WVPP, so little is known about the understory species associated with west-side ponderosa. I assume that the open ponderosa pine stands maintained by the Kalapuya had unique plant associations that are now nearly non-existent. However, with the quantity of pure ponderosa pine stands increasing in the Willamette Valley, I am interested in assessing these new forest types and exploring the understory plant associations growing within them.


I used a subset of the Forest Service’s massive FIA dataset to replicate what I imagine my own data will look like. The biggest difference between this dataset and my own future data is its geographic location. Because the FIA plots are gridded, and the proportion of land that falls within WV ponderosa pine plantations is so small, there are very few FIA plots in the Willamette Valley that are entirely composed of ponderosa pine. As a substitute, I used plots from Deschutes County, which has a much higher proportion of pure ponderosa forests than the Willamette Valley does. There were 98 plots in Deschutes County that had ponderosa overstories. I used plot and subplot data to find the associated location, inventory year, canopy cover, elevation, slope, aspect, and understory percent cover based on growth form (bare soil, forb, grass, shrub, and seedling). I included the environmental variables elevation, slope, and aspect to help explain variation within understory cover that canopy cover cannot explain.

Ponderosa Pine Plot Dataset


I hypothesize that understory species associated with ponderosa pine forests are adapted to open canopies with abundant light availability; therefore, the greatest percent coverage of all understory growth forms will be present in low-cover canopies. I expect that there will be considerable variance in the understory response that cannot be attributed to canopy conditions, because environmental factors and previous land use will strongly influence the understory composition. However, I hope that including the environmental variables in my correlation assessment will help to identify the influence that canopy cover does exert over the understory.


I will assess the auto-correlation within my primary explanatory and response variables (canopy cover and understory percent cover). I will accomplish this through a Global Moran’s I analysis in ArcMap.
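For reference, Global Moran's I compares how similar values are at nearby locations against the overall variance of the variable. The sketch below computes the statistic with inverse-distance weights between every pair of plots. It is a conceptual illustration only: the function name, the weighting scheme, and the toy data are assumptions, and it omits the z-score and p-value that the ArcMap tool also reports.

```python
import math
from typing import Sequence, Tuple


def morans_i(coords: Sequence[Tuple[float, float]],
             values: Sequence[float]) -> float:
    """Global Moran's I with inverse-distance weights (assumed scheme).

    Points must not share identical coordinates, because the weight
    1 / distance would be undefined.
    """
    n = len(values)
    mean_value = sum(values) / n
    dev = [v - mean_value for v in values]
    weighted_cross = 0.0  # sum of w_ij * dev_i * dev_j over all pairs
    weight_total = 0.0    # sum of all weights w_ij
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            w = 1.0 / math.dist(coords[i], coords[j])
            weighted_cross += w * dev[i] * dev[j]
            weight_total += w
    variance_sum = sum(d * d for d in dev)
    return (n / weight_total) * (weighted_cross / variance_sum)


# Toy transect of four plots, 1 unit apart.
plots = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
gradient = [1.0, 2.0, 3.0, 4.0]      # smooth trend in canopy cover
alternating = [1.0, 4.0, 1.0, 4.0]   # neighbours differ strongly
print(morans_i(plots, gradient), morans_i(plots, alternating))
```

Values are judged against the expected value under no spatial autocorrelation, -1/(n - 1), rather than against zero, so with only a handful of plots even a smooth gradient can give a slightly negative I.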

I will also assess the correlation between understory percent cover and the array of explanatory variables in my dataset through a multiple linear regression in R.

Expected outcome

This project will provide information about the auto-correlation within my primary variables, and will also provide figures describing the relationships between understory cover, canopy cover, and environmental variables, and the significance of each relationship.


My goal for this understory assessment is to provide information to land managers who are interested in managing their plantations in an ecologically beneficial way. Understory communities are frequently overlooked in working forests, but understory species and structural diversity is the most significant element of a forest for many animal species. Understory plants also influence forest microclimates, erosion, and fire patterns. My hope for this project is to identify relationships between canopy cover and understory cover, and to provide this information to managers so that they can make more informed decisions about tree spacing and thinning to facilitate a healthy understory community.

Level of preparation

I have used ArcMap for a few years and am comfortable navigating the software. I have some experience in R, but have not thought much about how it could apply to my project. I have not used the other software mentioned in this class.
