Category Archives: My Spatial Problem – 2020

Gray Whale Foraging and Zooplankton Density

Research Question

The Pacific Coast Feeding Group (PCFG) is a subgroup of the Eastern North Pacific (ENP) population of gray whales (Scordino et al. 2018). The ENP, a population of 24,000-26,000 individuals, migrates along the U.S. west coast from breeding grounds in the lagoons of Baja California, Mexico to feeding grounds in the Bering Sea (Omura 1988). The PCFG, currently estimated at 264 individuals (Calambokidis et al. 2017), stray from this norm and do not complete the full migration, instead choosing to spend their summer feeding months along the Pacific Northwest coast (Scordino et al. 2011). Since gray whales as a species already exhibit specialization (by being the only baleen whale that benthically forages) and since the PCFG display a second tier of specialization by not using the Bering Sea feeding grounds, it seems plausible that individuals within the PCFG might have individual foraging specializations and preferences. Therefore, my research aims to investigate whether individual gray whales in Port Orford exhibit individual foraging specializations. Individual foraging specializations can occur in a number of different ways including habitat type (rocky reef vs sand/soft sediment), distance to kelp, time and distance of foraging bouts, and prey type and density. For this class, my research question is whether prey quantity and/or community drives whale foraging.


Prey data

The prey data were obtained through zooplankton tow net samples collected from a research kayak in the summers of 2016, 2017, 2018 and 2019. Kayak sampling effort varies widely among the four years because weather sometimes created unsafe conditions for the team to collect samples. These samples have been sorted and enumerated to the zooplankton species level so that, for each day when a prey sample was collected, we know the absolute abundances of the prey species community at each sampling station. Additionally, a novel GoPro method has been used to quantify relative zooplankton density at the same sampling stations.

Whale data

The whale data are theodolite tracklines of gray whales that used the Port Orford study area during the summers of 2016, 2017, 2018 and 2019. Since whale tracking occurs at the same sites as prey sampling, we are able to map the prey community present at the particular locations where whales forage. The tracklines have a very fine spatial resolution, as the study area is approximately 2.5 km in diameter, though some of the tracklines extend out to approximately 8 km offshore. Furthermore, as whales forage in the area, photographs are taken of each individual in order to match each trackline with a particular individual. This way, potential individual specializations may be detected if there are repeat tracklines of an individual. These tracklines have been analyzed using Residence in Space and Time (RST), which assigns a behavior to each spatial point. The behaviors are broken down into three categories: foraging, searching, and travelling. Each of these behaviors is assigned a residual value (-1, 1, 0, respectively).


Hypothesis

Gray whales will prefer areas with the densest prey, regardless of the community composition of prey found at different locations.


Approaches

Daily interpolation layers will be created for prey density between the two sites, Mill Rocks and Tichenor Cove. These interpolations will then also be weighted according to the percentage of the community that mysids and amphipods (the two main taxonomic prey groups) made up at each location. This will result in three layers per day: an overall interpolated prey density layer, an interpolated layer weighted by mysid abundance, and an interpolated layer weighted by amphipod abundance. Once these layers have been created, the RST-analyzed whale tracks will be overlaid onto them. Statistical analyses (likely linear mixed models) will be used to determine whether overall prey density or specific prey communities drive gray whale foraging by relating the interpolated layers to the behavior residual values.
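The three daily layers could be sketched roughly as follows. This is a minimal sketch, not the method used in the project: it assumes inverse distance weighting (IDW) as the interpolator, which the post does not specify, and it assumes that "weighting" means multiplying each station's density by the taxon's share of the community before interpolating. The station coordinates, densities, proportions, and the function names `idw` and `daily_layers` are all hypothetical.

```python
import math

def idw(stations, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    stations: list of (x, y, value) tuples.
    """
    num = 0.0
    den = 0.0
    for sx, sy, value in stations:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # exactly on a station: use its observed value
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

def daily_layers(samples, grid):
    """Build the three daily layers on a list of (x, y) grid points.

    samples: list of dicts with keys x, y, density, mysid_frac, amphipod_frac
    (assumed layout, for illustration only).
    """
    total = [(s["x"], s["y"], s["density"]) for s in samples]
    mysid = [(s["x"], s["y"], s["density"] * s["mysid_frac"]) for s in samples]
    amph = [(s["x"], s["y"], s["density"] * s["amphipod_frac"]) for s in samples]
    return {
        "total": [idw(total, gx, gy) for gx, gy in grid],
        "mysid": [idw(mysid, gx, gy) for gx, gy in grid],
        "amphipod": [idw(amph, gx, gy) for gx, gy in grid],
    }

# Made-up values for one sampling day at two stations
samples = [
    {"x": 0.0, "y": 0.0, "density": 100.0, "mysid_frac": 0.8, "amphipod_frac": 0.2},
    {"x": 10.0, "y": 0.0, "density": 20.0, "mysid_frac": 0.5, "amphipod_frac": 0.5},
]
grid = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)]
layers = daily_layers(samples, grid)
```

Each whale track point could then be assigned the layer value of the nearest grid point for that day, giving one prey covariate per layer to relate to the behavior residual.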

Expected Outcomes

There will be many interpolated layers of prey density and weighted prey density. Furthermore, there could be some plots of overlaid tracklines showing where whales prefer to forage and search vs travel.


Significance

This spatial problem is important to science since genetic evidence suggests that there are significant differences in mtDNA between the ENP and PCFG (Frasier et al. 2011; Lang et al. 2014), and therefore it has been recommended that the PCFG should be recognized as being demographically independent. In the face of a proposed resumption of the Makah gray whale hunt as well as increased anthropogenic coastal use, there is a strong need to better understand the distribution and foraging ecology of the PCFG. This subgroup has an important economic value to many coastal PNW towns as many tourists are interested in seeing the gray whales. Therefore, understanding what drives their distribution and foraging habits will allow us to properly manage the areas where they prefer to forage.


Level of Preparation

I have novice/working knowledge of Arc-Info and Modelbuilder. I have never used Python before. I am proficient in R and image processing.

1a. What is A?

A is the behavior displayed by a gray whale at a certain location.

1b. What is B?

B is the density of prey as well as the weighted density of mysids and amphipods.

1c. What is the relationship you want to test between B and A?

Whether prey quantity and/or species drives whale foraging at a certain location.

1d. Why or how does B cause/influence A? What is your hypothetical explanation for why or how B causes A (this is mechanism C)?

Gray whales, like most baleen whales, have a breeding season which they spend in warm waters and a feeding season which is spent in colder, more productive waters. For a subgroup of the Eastern North Pacific gray whale population, this summer feeding season is spent along the NW U.S. coastline (northern California, Oregon, Washington and southern British Columbia). The activities undertaken during each season are so distinct that no feeding is undertaken during the breeding season, and vice versa. As such, gray whales must regain 11-29% of critical body mass during the feeding season in order to prepare themselves for the breeding season, during which they do not feed. Therefore, it seems logical to assume that their distributions and movements should be entirely dictated by where their zooplankton prey is and where it is most abundant or most dense, since their sole purpose during the summer is to feed.

1e. Now, write your question in this form: “How is [the spatial pattern of] A related to [the spatial pattern of] B via mechanism C?”

How is gray whale behavior (specifically foraging) related to the spatial pattern of prey density via optimal foraging theory?

Literature Cited

Calambokidis, J. C., Laake, J. L. and A. Pérez. 2017. Updated analysis of abundance and population structure of seasonal gray whales in the Pacific Northwest, 1996-2015. Draft Document for EIS. 

Frasier, T. R., Koroscil, S. M., White, B. N. and J. D. Darling. 2011. Assessment of population substructure in relation to summer feeding ground use in the eastern North Pacific gray whale. Endangered Species Research 14:39-48.

Lang, A. R., Calambokidis, J. C., Scordino, J., Pease, V. L., Klimek, A., Burkanov, V. N., Gearin, P., Litovka, D. I., Robertson, K. M., Mate, B. R., Jacobsen, J. K. and B. L. Taylor. 2014. Assessment of genetic structure among eastern North Pacific gray whales on their feeding grounds. Marine Mammal Science 30(4):1473-1493.

Omura, H. 1988. Distribution and migration of the Western Pacific stock of the gray whale. The Scientific Reports of the Whales Research Institute 39:1-9.

Scordino, J., Bickham, J., Brandon, J. and A. Akmajian. 2011. What is the PCFG? A review of available information. Paper SC/63/AWMP1 submitted to the International Whaling Commission Scientific Committee.

Scordino, J., Weller, D., Reeves, R., Burnham, R., Allyn, L., Goddard-Codding, C., Brandon, J., Willoughby, A., Lui, A., Lang, A., Mate, B., Akmajian, A., Szaniszlo, W. and L. Irvine. 2018. Report of gray whale implementation review coordination call on 5 December 2018. 

My Spatial Problem: An analysis of health patterns in Pakistan


Agricultural crop residue burning is a common practice among farmers in many developing countries, including Pakistan. It is used to clear the fields of stalks left behind after combines are used to cut the harvest. For most farmers, it serves as a low-cost way to prepare fields for planting because the fires remove most post-harvest vegetative material and reduce the risk of pests and disease. However, this practice also contributes to air pollution by increasing emissions of fine particulate matter (PM2.5). Smoke from these fires can trigger and exacerbate respiratory diseases, especially among children (Chakrabarti et al., 2019; Zhuang et al., 2018; Awasthi et al., 2010; Balmes, 2010). PM2.5 exposure has been linked to increased hospitalizations and asthma-related emergency visits as well as reduced life expectancy (Chen et al., 2017; Adar et al., 2013).

Research question: Is there any association between individual-level health outcomes and exposure to the pollution source, i.e. the burning of agricultural crop residue?

Data: There are two main sources of data. The first is nationwide household survey data for Pakistan for three years: 2011, 2013 and 2014. Around 8000 households were repeatedly surveyed across the three years, and location data in the form of latitude and longitude are available.

Individual responses reported within the health module are used to measure health outcomes – specifically type of illness (if any) experienced in the last 2 months.

The data on agricultural crop residue fires come from NASA’s Moderate-Resolution Imaging Spectroradiometer (MODIS) satellite data on active fires. Daily fire (point) data for Pakistan are available, and a global land cover raster can be used to see which of the fires are occurring over agricultural land.

Land cover raster data from the European Space Agency Climate Change Initiative (ESA CCI) with a 300 m spatial resolution was used to identify fires occurring on cropped land.

Hypotheses: I hypothesize that respiratory illnesses are spatially correlated and associated with higher frequency of exposure to crop fires.

Approaches: I would like to use clustering / hotspot analysis to assess possible spatial correlation, but my concern is that for the entire country it may not be useful, since the exhibited pattern would mirror population distribution and not provide useful results.

I hope to use buffer analysis to measure proximity to fires. To empirically test the association between crop fires and health outcomes I would use the following regression estimation:

$$H_{it} = \beta_1 \,\mathrm{Exposure}_{it} + \beta Z_{it} + \varepsilon_{it}$$

where $H_{it} = 1$ if individual $i$ in survey round $t$ reported experiencing respiratory illness in the 2 months prior to the interview and 0 otherwise. $\mathrm{Exposure}_{it}$ is the number of crop fires within a buffer radius over the past 3 months, and $Z_{it}$ represents a vector of individual- and household-level controls. $\varepsilon_{it}$ is the idiosyncratic error term. I can run three estimations based on buffer distances of 5 km, 10 km and 15 km.
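The exposure variable described above could be computed along the following lines. This is a minimal sketch under stated assumptions: great-circle (haversine) distance stands in for a GIS buffer, "the past 3 months" is approximated as 90 days before the interview date, and the coordinates, dates, and the function names `haversine_km` and `fire_exposure` are hypothetical.

```python
import math
from datetime import date, timedelta

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def fire_exposure(household, interview_date, fires, radius_km, window_days=90):
    """Count fires within radius_km of a household during the window before the interview.

    household: (lat, lon); fires: list of (lat, lon, date) tuples.
    window_days=90 is an assumed stand-in for "3 months".
    """
    start = interview_date - timedelta(days=window_days)
    lat, lon = household
    count = 0
    for f_lat, f_lon, f_date in fires:
        in_window = start <= f_date < interview_date
        if in_window and haversine_km(lat, lon, f_lat, f_lon) <= radius_km:
            count += 1
    return count

# Made-up fire detections around one made-up household
fires = [
    (31.50, 74.30, date(2013, 10, 20)),  # at the household location
    (31.55, 74.30, date(2013, 11, 1)),   # roughly 5.6 km north
    (31.60, 74.30, date(2013, 11, 5)),   # roughly 11.1 km north
    (31.50, 74.30, date(2013, 1, 1)),    # nearby, but outside the time window
]
household = (31.50, 74.30)
interview = date(2013, 11, 15)
exposure = {r: fire_exposure(household, interview, fires, r) for r in (5, 10, 15)}
```

Running the count once per radius gives one exposure column per buffer distance, which matches the three proposed estimations.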

However, my concern is again that number of fires may not be a meaningful assessment of exposure. At present, I do not have another way to measure impact of agricultural crop residue burning.

Expected Outcome: I expect that health outcomes are spatially correlated and in a way that can be explained by patterns of agricultural crop fires.

Significance: The proposed analysis would examine individual-level health outcomes in relation to a prominent source of pollution in South Asia: agricultural crop residue burning. Political inaction on this issue can, to some degree, be tied to the lack of rigorous evidence on the health effects of this practice. While this study would only look at the immediate, short-term impact of exposure, any significant results would show that there is in fact a causal association between burning crop residue and the health of individuals living within some proximity of the fires. Such results may contribute to policy emphasis on addressing a potential environmental concern and to the formulation of agricultural policy that may help farmers switch to alternative harvesting practices.

Level of Preparation: ArcMap – working knowledge; R – between novice and working knowledge; Python – no experience.

I am proficient in Stata and have been using that for data manipulation and basic spatial commands (distance calculations).

Concerns: Most studies have used a fire ‘event’, which makes it easier to assess impact across time and areas (control and treatment). Agricultural crop fires are a continuous phenomenon across the year, though there is a spike in the post-harvest period (Figure below). I am concerned about the time dimension of my data.


Adar, S. D., Sheppard, L., Vedal, S. et al. (2013). Fine particulate air pollution and the progression of carotid intima-medial thickness: a prospective cohort study from multi-ethnic study of atherosclerosis and air pollution. Plos Medicine. Vol 10.

Awasthi, A., Singh, N., Mittal, S., Gupta, P. K., Agarwal, R. (2010). Effects of agriculture crop residue burning on children and young on PFTs in North West India. Science of the Total Environment. Vol 408: 4440 – 45.

Balmes, J. R. (2010). When Smoke gets into your lungs. Proceedings of the American Thoracic Society. Vol 7(2): 98 – 101.

Chakrabarti, S., Khan, M. T., Kishore, A., Roy, D., Scott, S. P. (2019). Risk of acute respiratory infection from crop burning in India: estimating disease burden and economic welfare from satellite and national health survey data from 250,000 persons. International Journal of Epidemiology. pp 1113-1124.

Chen, L., Li, C., Ristovski, Z. et al. (2017). A review of biomass burning and impacts on air quality, health and climate in China. Science of the Total Environment. Vol 579.

Zhuang, Y., Chen, D., Li, R. et al. (2018). Understanding the influence of crop residue burning on PM2.5 and PM10 concentrations in China from 2013 to 2017 using MODIS data. International Journal of Environmental Research and Public Health. Vol 15.

Unmanned aerial vehicle (UAV)-based photogrammetric approach to delineate tree seedlings

A description of the research question that you are exploring.

The use of unmanned aerial vehicle (UAV) platforms has rapidly expanded in forestry remote sensing applications, especially from stand-level to individual tree-level measurement (i.e., precision forestry, forest management planning, biomass estimation and modeling forest growth) (Koch et al., 2006). Among those applications, acquisition of three-dimensional (3D) structural data from high-resolution imagery is a novel and powerful approach made available by advances in UAV technology (Puliti et al., 2015; Zarco-Tejada et al., 2014). Photogrammetrically derived tree canopy/crown information can be used to detect individual trees and, specifically, to monitor tree structure (St-Onge, 1999; Jozkow et al., 2016).

In this study, I am interested in observing how tree crown completeness/voxel size varies as a function of the number of points that are arranged in the 3D point cloud. Since voxel size is important for visualizing tree architecture, the number of points per unit volume (voxel size) is considered an essential measurement in photogrammetric analysis, especially for mapping the 3D structure of trees or determining the desired image quality of tree architecture (Fig. 1). Estimating the optimal number of points or voxels required to delineate tree structure is important in terms of time and cost. In this study, I seek to determine the threshold value for voxel size (or point cloud density) associated with the crown completeness/desired image quality that can explain the architecture of tree seedlings (Fig. 2(a)).

Figure 1: Changing of tree architecture or the crown description with voxel size. (source: Hess et al., 2018).

A description of the dataset you will be analyzing, including the spatial and temporal resolution and extent.

For this study, I will use a hexacopter unmanned aircraft system (UAS) equipped with a high-resolution multispectral camera with five channels (red, blue, green, near-infrared, and red edge) to acquire imagery of the study area. The spatial resolution of the multispectral sensor is 2.5 cm, and the imagery will come from a single day of data collection (temporal resolution). The study area covers approximately four acres and is located in Benton County, Oregon, USA. The raw images collected from the UAV flights will be processed using Agisoft Metashape. Additionally, I will produce a 3D point cloud using the structure-from-motion algorithm in Agisoft Metashape. These products will be used to perform the spatial and statistical analyses in RStudio, ArcGIS Pro, Agisoft Metashape, etc.

Hypotheses: predict the kinds of patterns you expect to see in your data, and the processes that produce or respond to these patterns.

In this study, I am interested in observing how voxel size varies as a function of point cloud density and how voxel size affects the desired image quality (Fig. 2(b)). I expect a relationship between voxel size and the desired image quality of tree architecture (Fig. 2(a)), where increasing the voxel size changes the image quality (Fig. 1).

Therefore, I hypothesize that there is a threshold value for tree crown completeness/desired image quality of tree architecture at a particular voxel size. I expect a change (decrease) in the desired image quality of tree architecture/crown description beyond this threshold value.

Figure 2: (a) Schematic diagram showing desired image quality (crown completeness) with respect to the voxel size for tree seedling (*the shape of this distribution may be changed). (b) Schematic diagram showing the overview of the voxel model. This represents the relationship of point cloud data to the voxel model (Source: Fujiwara et al., 2017).

Approaches: describe the kinds of analyses you ideally would like to undertake and learn about this term, using your data.

Orthomosaic images can be used to detect the centroids and the canopy cover of each tree. I will perform a separate supervised classification to identify the seedlings and other land cover classes. (Alternatively, I will perform a binary classification to identify the seedlings (1) and other land cover (0).) Approximately 70% of these pixels will be used as the training data set, while 30% will be used as validation data. Random Forest (RF) and Support Vector Machine (SVM) classifiers will be used for classification. Further, I will produce confusion matrices to evaluate the performance of the RF and SVM classifiers and identify individual seedlings with their canopy cover in the initial stage. Additionally, the location of each centroid will be assessed.
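The 70/30 split and the confusion-matrix evaluation for the binary case (seedling = 1, other = 0) could be sketched as follows. This is only an illustration of the bookkeeping: the RF and SVM classifiers themselves are not implemented here, the labels are made up, and the function names are hypothetical.

```python
import random

def train_validation_split(items, train_frac=0.7, seed=0):
    """Shuffle items reproducibly and split them into training and validation sets."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(round(train_frac * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def confusion_matrix(truth, predicted):
    """Count true/false positives and negatives for binary labels (1 = seedling)."""
    cm = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for t, p in zip(truth, predicted):
        if t == 1 and p == 1:
            cm["tp"] += 1
        elif t == 0 and p == 1:
            cm["fp"] += 1
        elif t == 1 and p == 0:
            cm["fn"] += 1
        else:
            cm["tn"] += 1
    return cm

def overall_accuracy(cm):
    """Share of validation pixels that were classified correctly."""
    return (cm["tp"] + cm["tn"]) / sum(cm.values())

# Hypothetical pixel IDs split 70/30
train_ids, validation_ids = train_validation_split(range(10))

# Made-up validation labels and classifier output
truth = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
cm = confusion_matrix(truth, predicted)
accuracy = overall_accuracy(cm)
```

Computing the same matrix for each classifier on the same validation pixels keeps the RF and SVM comparison like for like.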

Further, individual tree seedlings can be delineated using the point cloud data. I will examine the spatial arrangement of points as a function of data processing (i.e., by changing the point cloud generation parameters) and define crown completeness with respect to the number of points and voxel size (Fig. 3).

Figure 3: Voxel resolution and point cloud binning process for a square field plot clipped from the ALS data. (Source: Almeida et al., 2019).
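The binning of points into voxels can be illustrated with a short sketch. It assumes a regular cubic voxel grid anchored at the origin and simply counts occupied voxels and points per voxel for several voxel sizes; the coordinates, the sizes, and the function name `occupied_voxels` are hypothetical, and this is not the Metashape workflow itself.

```python
import math

def occupied_voxels(points, voxel_size):
    """Bin (x, y, z) points into cubic voxels; return {voxel index: point count}."""
    counts = {}
    for x, y, z in points:
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        counts[key] = counts.get(key, 0) + 1
    return counts

# Made-up seedling point cloud, coordinates in metres
points = [
    (0.01, 0.01, 0.01),
    (0.02, 0.03, 0.01),
    (0.12, 0.01, 0.01),
    (0.26, 0.26, 0.26),
]

summary = {}
for size in (0.05, 0.2, 0.5):
    voxels = occupied_voxels(points, size)
    summary[size] = {
        "occupied": len(voxels),
        "mean_points_per_voxel": len(points) / len(voxels),
    }
```

Plotting the number of occupied voxels against voxel size is one way to look for the threshold beyond which the crown description starts to degrade.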

Moreover, I am interested in studying the vitality of tree seedlings based on the point cloud map, together with some copula modeling with multispectral data, to evaluate seedling structure/architecture.

Expected outcome: what do you want to produce — maps? statistical relationships?

From this study, I would like to identify the spatial distribution of seedlings and their centroid locations using UAV remote sensing techniques. I will produce maps of the spatial distribution of seedlings with their canopy cover. Additionally, I expect to define a threshold value for voxel size to determine the seedling architecture/crown description. Finally, I will make additional maps showing seedling vitality, using the results obtained from the point cloud data together with copula modeling.

How is your spatial problem important to science? to resource managers?

In sustainable forest management, estimation of tree-level attributes, including tree counts, locations of individual trees, tree crown architecture, and tree heights, is essential for monitoring forest regeneration (Strigul, 2012; Mohan et al., 2017). In particular, assessing the vitality of seedlings is an important task for forest managers. With the proposed method, seedlings can be characterized from remote sensing data cost-effectively and accurately compared to conventional field surveys, which are expensive and time-consuming. The method can also be used to collect periodic data for individual tree seedlings. This helps to understand the growth and vitality of seedlings over time, which is another essential attribute for forest management.

Level of preparation:

(a) Arc-Info: I have experience in ArcMap gained from previous classes and some research projects.

(b) Modelbuilder and/or GIS programming in Python: No previous experience.

(c) R: I have substantial experience in statistical analysis, with limited knowledge of spatial analysis.

(d) Image processing: I have gained some experience with Google Earth Engine and ENVI based on my previous classes.

(e) Other relevant software: Agisoft Metashape


Almeida, D. R. A. D., Stark, S. C., Shao, G., Schietti, J., Nelson, B. W., Silva, C. A., … & Brancalion, P. H. S. (2019). Optimizing the remote detection of tropical rainforest structure with airborne lidar: Leaf area profile sensitivity to pulse density and spatial sampling. Remote Sensing, 11(1), 92.

Fujiwara, T., Akatsuka, S., Kaneko, R., & Takagi, M. (2017). Construction Method of Voxel Model and the Application for Agro-Forestry. Internet Journal of Society for Social Management Systems, Vol. sms17.

Hess, C., Härdtle, W., Kunz, M., Fichtner, A., & von Oheimb, G. (2018). A high‐resolution approach for the spatiotemporal analysis of forest canopy space using terrestrial laser scanning data. Ecology and evolution, 8(13), 6800-6811.

Jozkow, G., Toth, C., & Grejner-Brzezinska, D. (2016). UAS topographic mapping with Velodyne LiDAR sensor. ISPRS Annals of Photogrammetry, Remote Sensing & Spatial Information Sciences, 3(1).

Koch, B., Heyder, U., & Weinacker, H. (2006). Detection of individual tree crowns in airborne lidar data. Photogrammetric Engineering & Remote Sensing, 72(4), 357-363.

Mohan, M., Silva, C. A., Klauberg, C., Jat, P., Catts, G., Cardil, A., … & Dia, M. (2017). Individual tree detection from unmanned aerial vehicle (UAV) derived canopy height model in an open canopy mixed conifer forest. Forests, 8(9), 340.

Puliti, S., Ørka, H. O., Gobakken, T., & Næsset, E. (2015). Inventory of small forest areas using an unmanned aerial system. Remote Sensing, 7(8), 9632-9654.

St-Onge, B. A. (1999). Estimating individual tree heights of the boreal forest using airborne laser altimetry and digital videography. International Archives of Photogrammetry and Remote Sensing, 32(part 3), W14.

Strigul, N. (2012). Individual-based models and scaling methods for ecological forestry: implications of tree phenotypic plasticity. Sustainable forest management, 359-384.

Zarco-Tejada, P. J., Diaz-Varela, R., Angileri, V., & Loudjani, P. (2014). Tree height quantification using very high resolution imagery acquired from an unmanned aerial vehicle (UAV) and automatic 3D photo-reconstruction methods. European journal of agronomy, 55, 89-99.

Spatial Problem Blog Post by Hongyu Lu

  1. A description of the research question that you are exploring.

Landslides tend to recur in certain areas, and some of these areas are crossed by national highways or other frequently used highways. Once a landslide occurs, travelers lose a lot of time detouring, which is very inefficient. In my research, I hope to find the most sensitive areas by analyzing the spatial pattern of a landslide susceptibility map together with a heat map. I will then mark the dangerous areas and produce a high-risk area map.

  • A description of the dataset you will be analyzing, including the spatial and temporal resolution and extent.

The datasets I will analyze are the Oregon statewide landslide map and the landslide susceptibility map, both from the Oregon Department of Geology and Mineral Industries. I expect the spatial pattern of landslide-susceptible areas to be dispersed, so further analysis of this pattern will be needed to identify high-risk areas. I expect the high-risk landslide areas to be clustered, so that they show up clearly on the map.

  • Hypotheses: predict the kinds of patterns you expect to see in your data, and the processes that produce or respond to these patterns.

I expect the pattern to contain hotspots. First, I will run a spatial analysis to produce a hexagon hotspot map, and then use this map to find the areas where landslides occur most often.
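One common statistic behind hotspot maps of binned counts is the Getis-Ord Gi* z-score. The post does not say which statistic will be used, so treating it as Gi* is an assumption. The sketch below uses binary weights (a cell's neighbors plus the cell itself) on a made-up row of seven cells; a real hexagon grid would supply up to six neighbors per cell. The counts and the function name are hypothetical.

```python
import math

def getis_ord_gi_star(values, neighbors):
    """Gi* z-score for each cell, using binary weights that include the cell itself.

    values: landslide count per cell.
    neighbors: for each cell, a list of neighboring cell indices (self excluded).
    Assumes the values are not all equal and that no neighborhood covers every cell,
    since either case would make the denominator zero.
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    z_scores = []
    for i in range(n):
        idx = set(neighbors[i]) | {i}
        w_sum = len(idx)      # sum of binary weights
        w_sq_sum = len(idx)   # sum of squared binary weights (1 squared is 1)
        local_sum = sum(values[j] for j in idx)
        denom = s * math.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
        z_scores.append((local_sum - mean * w_sum) / denom)
    return z_scores

# Made-up landslide counts in a row of seven adjacent cells
counts = [0, 0, 1, 0, 9, 10, 8]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5]]
z = getis_ord_gi_star(counts, neighbors)
hottest = z.index(max(z))
```

Cells with a large positive z-score (for example above 1.96) would be mapped as hot spots, and cells with a large negative z-score as cold spots.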

  • Approaches: describe the kinds of analyses you ideally would like to undertake and learn about this term, using your data.

I would like to learn classification based on spectral differences. I also want to learn how to generate the most efficient alternative route when a main route is closed.

  • Expected outcome: what do you want to produce — maps? statistical relationships? other?

I want to produce two maps: the first is a hotspot map and the second is a spectral classification map.

  • Significance. How is your spatial problem important to science? to resource managers?

Landslides are a very serious hazard worldwide: they not only disrupt traffic but also cause casualties and substantial economic losses. If roads or slopes could be strengthened, or if highways could avoid high-risk areas, it would help a lot. So, it is important to do landslide analysis.

  • Your level of preparation: how much experience do you have with (a) Arc-Info, (b) Modelbuilder and/or GIS programming in Python, (c) R, (d) image processing, (e) other relevant software

ArcInfo: novice

Modelbuilder/Python: novice

R: novice

Image processing: Very basic working knowledge

I also have very basic working knowledge of Google Earth Engine and ArcGIS Pro.

A. My spatial problem: Habitat-induced GPS Fix Success Bias

My master’s thesis is studying habitat quality of black-tailed deer in western Oregon, and the link between habitat selection and survival rates.

Although there are lots of spatial and temporal relationships I want to analyze, for this blog post I will focus on the question: what environmental characteristics (topography, vegetation) are correlated with missed fixes in my study area? The extent of this problem is important to identify and account for in habitat selection studies.

An adult female Columbian black-tailed deer freshly captured and released, sporting a GPS collar programmed to collect a location every 4 hours.
Deer were captured in 4 Wildlife Management Units (WMUs) in western Oregon: the Alsea, Trask, Indigo and Dixon. Fix success test sites are located in the Indigo and Dixon WMUs, and sites reflect variation in topography and vegetation across the entire study area. Fix success testing is still in progress at the time of this post, and the map does not include all of the test sites to be used in the final analysis.

A description of the dataset you will be analyzing, including the spatial and temporal resolution and extent.

Data downloaded from the Lotek 3300S GPS collars include coordinates, date and time. The collars were deployed on black-tailed deer in western Oregon in 4 Wildlife Management Units (WMUs; Alsea, Trask, Indigo, Dixon). Collars stay on the deer for up to 72 weeks, but a deployment may be shorter if the collar malfunctions or the deer dies. Deer capture effort lasted from 2012-2017, and I will be focusing on the winter season only (January 1 – March 31).

Hypotheses: predict the kinds of patterns you expect to see in your data, and the processes that produce or respond to these patterns.

My response variable is fix success rate (number of acquired fixes divided by number of total fix attempts) and my predictor variables are environmental characteristics.

Fix success will likely be higher where there is less obstruction to satellites. I expect that fix success will be negatively related to steep terrain, dense canopy cover, larger trees and forested land cover types. I predict that there will be a positive association between fix success and flat terrain, low canopy cover, small trees, and non-forested land covers.

Examples of predicted relationships of fix success rate with canopy cover and slope.

I am testing the magnitude of GPS bias (location error and missed fixes) by deploying GPS collars for 1 week at test sites in my study area. At the moment we are still collecting data and have one week left to complete the tests.

Test sites have known environmental characteristics, and were specifically chosen in a systematic sampling design to capture variation in topography and vegetation across the study area (n=50). Nine collars were rotated between sites over the course of 6 weeks. Measurements are taken on the ground for each site (technicians were told to focus on a 30x30m area), and variables include: canopy cover, slope, aspect, dominant land cover type, dominant tree species and their associated diameter at breast height, tree height, and whether the stand is even-aged or not. Other variables that are remotely sensed include topographic position index (10x10m) and elevation (10x10m). Collars were programmed to reflect the same fix schedule as the collars deployed on deer: 1 fix every 4 hours.
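The response variable for each test site follows directly from this schedule: one week at one fix every 4 hours gives 42 fix attempts per deployment. A minimal sketch, with made-up site names and acquired-fix counts and hypothetical function names:

```python
def expected_attempts(days, fix_interval_hours):
    """Number of scheduled fix attempts over a deployment."""
    return int(days * 24 // fix_interval_hours)

def fix_success_rate(acquired, attempts):
    """Acquired fixes divided by total fix attempts."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return acquired / attempts

attempts = expected_attempts(days=7, fix_interval_hours=4)

# Made-up acquired-fix counts for two hypothetical test sites
acquired = {"open_meadow": 41, "dense_conifer": 30}
rates = {site: fix_success_rate(n, attempts) for site, n in acquired.items()}
```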

Approaches: describe the kinds of analyses you ideally would like to undertake and learn about this term, using your data.

I will determine the probability of fix success (PFIX) for each stationary collar test site using logistic regression and perform model selection using AICc. I will use the top model to predict the probability of fix success for each pixel on the landscape in my study area using remotely sensed data (GNN 30x30m; Statewide Habitat Layer 30x30m, DEM 10x10m). This “predicted fix success map” will be used as a correction factor in habitat selection analyses (also a logistic regression analysis). Each deer’s GPS location will be weighted by the inverse of the predicted probability of fix success (1/PFIX) for the pixel in which it falls.
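The prediction and weighting steps can be sketched as follows. The coefficients below are invented for illustration and are not fitted values; the covariate names and function names are hypothetical. The AICc helper uses the standard small-sample correction, AICc = AIC + 2k(k+1)/(n-k-1).

```python
import math

def predict_pfix(coefs, covariates):
    """Logistic-regression prediction of fix success probability for one pixel."""
    eta = coefs["intercept"] + sum(coefs[name] * value for name, value in covariates.items())
    return 1.0 / (1.0 + math.exp(-eta))

def aicc(log_likelihood, k, n):
    """AICc for a model with k parameters fitted to n observations."""
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)

# Invented coefficients: canopy cover in percent, slope in degrees
coefs = {"intercept": 3.0, "canopy_cover": -0.02, "slope": -0.03}

pfix_open = predict_pfix(coefs, {"canopy_cover": 0.0, "slope": 0.0})
pfix_dense = predict_pfix(coefs, {"canopy_cover": 90.0, "slope": 30.0})

# Inverse-probability weights for deer locations falling in each pixel
weight_open = 1.0 / pfix_open
weight_dense = 1.0 / pfix_dense
```

Locations recorded where fixes are hard to get receive a larger weight, which compensates for the fixes that were missed there.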

Expected outcome: what do you want to produce — maps? statistical relationships? other?

I want to produce a map of predicted fix success for the study area.

Significance. How is your spatial problem important to science? to resource managers?

Although GPS technology has opened the door for researchers to gain massive amounts of information about their study system, it is not perfect. Habitat-induced bias caused by obstruction to satellites produces location error and missed fixes and can bias habitat selection studies, so that researchers and managers may ultimately misinterpret which habitat variables are actually being selected or avoided.

Columbian black-tailed deer populations are declining across their range, and the species is relatively understudied compared to other ungulates. Because the deer are an important prey item, game species, and charismatic megafauna, Oregon state wildlife managers are interested in determining the causes behind the apparent population decline.

Your level of preparation: how much experience do you have with (a) Arc-Info, (b) Modelbuilder and/or GIS programming in Python, (c) R, (d) image processing, (e) other relevant software

Arc-Info: ArcMap- working knowledge.

Modelbuilder and/or GIS programming in Python: none

R: Working knowledge

image processing: none

other relevant software: FRAGSTATS- very little. I’ve attempted to use the landscapemetrics package and the FRAGSTATS program on my own, but not very successfully.

Exploring Patterns of Past Human Behavior through the Use of 3-Dimensional Models

Research Question

Human behavior in North America can easily be identified through the lens and in the voice of early Europeans who colonized and claimed rights over every living species on this “new found land”, even fellow humans. Through this lens, the stories and lives of early Indigenous communities in the Americas have been silenced and, in the void, we can read about every “discovery” that early settlers made across this “untouched” land. For the first time, there is a chance to intensively study an area in North America that contains the oldest evidence of human activities in the state of Idaho, roughly 16,000 years old (Davis et al. 2019).

My thesis research seeks to better explain the human activities at this archaeological site in order to shed more light on how early humans interacted within their landscape through time. This will create a space for discussions on how Indigenous early settlers used their landscape for over 15,000 years prior to when European “discoveries” were made. There must be something truly unique about this area in Idaho for people to keep coming back and utilizing this landscape. Humans are, after all, animals of patterns and consistency; this is what my project in this class will be focused on.

How is the spatial distribution and patterning of artifacts related to human behavior and natural transforms at one point in time and through time at this archaeological site?

Description of Dataset

The dataset that I am working with is an artifact assemblage of 327 artifacts that have been recorded with x-y-z coordinates within the constraints of a 10-meter x 5-meter x 1-meter excavation interval. The temporal setting has been calculated from charcoal remains of two hearths that date to around 12,000 years ago. Everything else must be dated relative to these hearths. This is only a partial segment of the archaeological site, but my hope is to create a standardized procedure that can be used for any number of artifacts within any constraint and at any location.


Hypothesis 1. The artifacts will exhibit significant clustering in 2-Dimensional and 3-Dimensional perspectives.

This pattern is commonly produced through human use and re-use of specific areas across the site. Utilization of space by humans leaves evidence of use in the form of features or artifacts such as hearths and stone tools.

Hypothesis 2. The clustered areas will show significant differences in artifact type (human behavior) both across the site horizontally and vertically.

  1. Differences in horizontal dispersion show diverse site use during one point in time.
  2. Similarities in horizontal dispersion may present unsystematic variation in my dataset.
  3. Differences in vertical dispersion show diverse site use through multiple points in time.
  4. Similarities in vertical dispersion show similar site use through multiple points in time.

A pattern such as this is caused by segregated use across the site at any one point in time. For example, it would not make logical sense to produce stone tools near an area that is used for processing food, for fear of contamination. Therefore, a processing area should be shown by specific stone tools, bones, and very little debris from stone tool production. On the other hand, a stone tool production area should contain a relatively high amount of stone debris with little evidence for food preparation.

Hypothesis 3. Using observational statistics, the physical distances between clusters and within the clusters will represent how human behavioral practices were employed at the site and how space was utilized by past people.

Hypothesis 2 addresses the similarity of artifact types within each cluster horizontally and vertically; this approach uses feature space. Hypothesis 3 seeks to address the physical spatial relationships between the clusters as separate entities and between the artifacts within each of the clusters. My thinking in this regard is: as people use space in their environment, highly organized clusters may indicate an area of primary refuse (planned use of the environment), whereas, less organized clusters may represent an area of low frequency use (not-planned use of the environment). Furthermore, the physical space between clusters may indicate how past people perceived their environments and the choices made to organize it.


I have begun to apply spatial clustering tests to the dataset that I have available for this project and study. Among these tests, I have explored Ripley’s K, the L-function, and the G-function. For the purpose of calculating accurate clusters in an archaeological setting, I have also explored using k-means clustering and a density-based clustering algorithm. Both have their advantages and disadvantages for my specific aims (e.g. Baxter 2015, Ducke 2015). For the purpose of this class, I wish to expand my knowledge and understanding of these functions as well as my understanding of general statistical methods and procedures. My hope is to identify specific clusters within my dataset and then statistically correlate each artifact within that cluster with each other and then correlate that cluster with the other clusters identified across the site horizontally and vertically.
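To make the Ripley’s K and L-function tests mentioned above concrete, here is a minimal sketch of the naive 2-Dimensional estimator, K(r) = A / (n(n-1)) times the number of ordered point pairs closer than r, with L(r) = sqrt(K(r)/pi). It applies no edge correction, and the coordinates and study area are made up for illustration; a real analysis would more likely use an established implementation with edge correction.

```python
import math


def ripley_k(points, r, area):
    """Naive Ripley's K at distance r (no edge correction)."""
    n = len(points)
    pairs_within_r = 0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= r:
                pairs_within_r += 1
    return area * pairs_within_r / (n * (n - 1))


def l_function(points, r, area):
    """L(r) = sqrt(K(r) / pi); under complete spatial randomness L(r) is about r."""
    return math.sqrt(ripley_k(points, r, area) / math.pi)


# Hypothetical artifact x-y coordinates (meters) in a 10 m x 5 m unit
artifacts = [(1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (9.0, 4.0), (9.1, 4.0)]
AREA = 10.0 * 5.0

k_observed = ripley_k(artifacts, 0.5, AREA)
k_random = math.pi * 0.5 ** 2  # expectation under complete spatial randomness
```

An observed K(r) well above pi * r^2 (equivalently L(r) above r) suggests clustering at that scale. `math.dist` accepts points of any dimension, but the 3-Dimensional version of K would compare against (4/3) * pi * r^3 and use volume in place of area.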

Expected Outcome

As an outcome for this project, I expect to arrive at many statistical relationships, but I wish to emphasize the best ways to visualize these relationships. I seek this type of visualization because, unlike many other studies, one of the most influential factors for these relationships is human behavior and we all know how hard it can be to predict human behavior. Thus, for this reason, I want to express my results in a visual way in order to take the numbers and algorithms out of the data because computations will never fully and accurately predict human behavior or correlation. It is important to recognize the potential ethical problems with representationalism given the strict computational approach I am seeking. Therefore, to put computed numbers and a purely human visualization of data on a more even playing field, I seek both statistical relationships as well as 2-Dimensional and 3-Dimensional mappings of the data and corresponding relationships.


I listed the global significance within the ‘Research Question’ section of this post; that is, the examination of human behavior within an area that contains the oldest evidence for human occupancy in North America. In terms of this course, the significance of my proposed procedure is found within the emerging theme of 3-Dimensional representations in the archaeological sciences. Space has always been on the mind of archaeologists, but only recently have there been the capabilities for a push in the direction of 3-Dimensional and even 4-Dimensional understandings of human behavior. This project will not only solidify an understanding in the perception of N-Dimensional space, but it will help to pave a path towards a more standardized approach in identifying past human behavior through time and space. Additionally, the way in which archaeological sites are excavated in the future will be forever changed based upon the intricacy that is required for producing the models that I seek to examine.


I would say that I am proficient at using both ArcGIS and ArcScene. I have used these two together for the past couple of years within an academic setting for the production of predictive maps, simulations, and a plethora of other academic inquiries. Additionally, I have some experience playing around with LiDAR data within these systems. I have only used ModelBuilder within ArcGIS a couple of times and would say I am a novice at this function as well as at GIS programming in Python. I have coding experience in Java and R but very little in Python. I would say that I am proficient at utilizing the tools, understanding, and implementing ideas within R, though the code may not be as efficient as it could be. I do not have much experience with image processing or with other software systems that could be of use during this project.


Baxter, M. J. “Spatial k-means clustering in archaeology–variations on a theme.” Academia (2015).

Davis, Loren G., David B. Madsen, Lorena Becerra-Valdivia, Thomas Higham, David A. Sisson, Sarah M. Skinner, Daniel Stueber et al. “Late Upper Paleolithic occupation at Cooper’s Ferry, Idaho, USA, ~ 16,000 years ago.” Science 365, no. 6456 (2019): 891-897.

Ducke, Benjamin. “Spatial cluster detection in archaeology: Current theory and practice.” Mathematics and Archaeology (2015): 352-368.

Matt Barker Spatial Problem

Description of research question

How is the density of woody debris related to window size via resolution? I will have window sizes at various spatial scales, i.e. 1x1m, 3x3m, etc., and I will manipulate the resolution of my associated classified raster (native resolution is 0.0588 m).

Description of dataset

We acquired multispectral imagery from three UAS flights on 9/23/2019 between approximately 12:00 and 13:45 PDT. Imagery is also available from an August 2019 flight to conduct potential temporal analysis. Resolution (EO bands) is 5.9 cm/pix. Phase I of the South Fork McKenzie Floodplain Enhancement Project is approximately 150 acres. I have produced a shapefile of classified woody debris using a random forest supervised classification (kappa ~ 0.75).


I hypothesize wood density will decrease with coarser resolution because I will lose detail of smaller diameter wood. 

  1. I expect the spatial pattern of wood density to be clustered. There will be areas of moving water that have relatively low wood density, and still areas where wood will be deposited with high density.
  2. I anticipate this clustered pattern of wood density due to fluvial patterns of the river.


I would like to either develop a tool in ArcGIS Pro or write a script in R that adjusts resolution of the raster and sliding window. In this way, I will be able to simplify wood density outputs by only adjusting resolution and window size.
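The two adjustable pieces described above, raster resolution and sliding window size, can be sketched in a few lines. This is only an illustration on a tiny made-up binary raster (1 = wood, 0 = not wood); the majority rule used when coarsening and the function names are assumptions, and a real workflow would operate on the classified raster with a raster library.

```python
def coarsen(grid, factor):
    """Aggregate a binary raster by `factor`.

    A coarse cell is labelled wood (1) when at least half of the fine
    cells it covers are wood (an assumed majority rule).
    """
    rows, cols = len(grid) // factor, len(grid[0]) // factor
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block = [grid[r * factor + i][c * factor + j]
                     for i in range(factor) for j in range(factor)]
            row.append(1 if 2 * sum(block) >= len(block) else 0)
        out.append(row)
    return out


def window_density(grid, size):
    """Proportion of wood cells in every size x size window that fits in the grid."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows - size + 1):
        row = []
        for c in range(cols - size + 1):
            total = sum(grid[r + i][c + j]
                        for i in range(size) for j in range(size))
            row.append(total / (size * size))
        out.append(row)
    return out


# Hypothetical 4 x 4 classified raster: one log jam and one small isolated piece
fine = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
coarse = coarsen(fine, 2)

fine_density = window_density(fine, 4)[0][0]      # whole-scene density, fine cells
coarse_density = window_density(coarse, 2)[0][0]  # whole-scene density, coarse cells
```

In this toy case the isolated single-cell piece of wood disappears after coarsening, so the scene-wide density drops, which matches the direction of the hypothesis about losing small-diameter wood at coarser resolution.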


I want to produce a map of wood density throughout the South Fork McKenzie site at various resolutions. Additionally, I would like to analyze spatial statistics of woody debris distribution.


Woody debris is a critical component of forest ecosystems and essential for wildlife habitat, especially for fish (Howson et al. 2012). Current methods for detecting and quantifying woody debris rely on ground surveys that are expensive, time-consuming, and potentially dangerous when working around moving water. Development of remote sensing methods using relatively low-cost unmanned aircraft promises a safer, more efficient alternative compared to current ground surveys.

Level of preparation

  1. ArcInfo – No experience
  2. Modelbuilder – Novice: some experience in GEOG 561
  3. Python – Working knowledge: I took the Python GIS programming course (GEOG 562), but I am a little rusty.
  4. R – Proficient: This is the software I use most frequently for statistical analysis and have done some spatial analysis with it as well.
  5. Trimble eCognition – Novice: I have some experience, but I may have difficulty accessing software due to COVID-19 outbreak.
  6. Agisoft Metashape – Working knowledge: I use this software frequently to produce orthomosaics and point clouds from UAS imagery.

Howson, T.J., Robson, B.J., Matthews, T.G., and Mitchell, B.D. 2012. Size and quantity of woody debris affects fish assemblages in a sediment-disturbed lowland river. Ecological Engineering 40: 144-152.

Dust and Marine Productivity

1. Research Description

My current research is focused on assessing the source of Fe in the North Atlantic Ocean by using HYSPLIT back trajectory modeling. For this class, I am going to focus on the relationship of dust deposition to marine productivity in the ocean. The dust source will be derived from HYSPLIT back trajectory modeling from NOAA, and the marine productivity will be assessed from BATS (Bermuda Atlantic Time-series Study) discrete bottle data in the ocean near Bermuda, using the Particulate Organic Phosphorus (POP) and Primary Production (PP) components.

The challenge is the time series and the depth of the dataset: the three datasets each have a unique time interval, and for some data there is no correlation between depth and the values of POP and PP, even though POP and PP at depths shallower than 200 meters are expected to be affected by dust deposition.

Fig 1a. The spatial pattern of POP and PP
Fig 1b. Depth pattern of POP and PP

From fig 1a and 1b, we can see the spatial pattern of the POP and PP locations; hence this study will focus on that hotspot location to generate the HYSPLIT back trajectory data.

2. Dataset

1.  POP and PP data from BATS bottle.

This dataset is derived from BATS bottle data. The temporal resolution is daily, and only the appropriate data are used for this analysis. These are in-situ data. Figure 2 describes the availability of data based on the date of the year.

Figure 2. Date of year POP and PP Data in Bermuda Ocean 2012-2016

2. HYSPLIT data are vector data that will show us the source of the dust that affects marine productivity in the ocean near Bermuda. The HYSPLIT trajectories are at a height of 500 meters above ground level (AGL). The back trajectories will be generated based on the cluster coordinates in fig 1 and the dates in fig 2. Temporal resolution is hourly.

Fig 3. Hysplit back trajectory data with end point on May 29, 2015

3. Hypotheses

1. The spatial pattern of POP and PP is clustered or has a hotspot, and marine productivity is highest from spring through fall.

2. Atmospheric transport and deposition of dust-associated iron (Fe) shifts the balance between Fe and Pi (phosphorus) sufficiency and limitation, and thus affects the marine ecosystem (Letelier et al, 2019). The near-crustal dust from the Sahara desert has been reported to be largely insoluble and to have a less significant effect on North Atlantic Ocean productivity compared to the anthropogenic dust from North America, which is soluble (Conway et al, 2019). The general hypothesis of this study is that the HYSPLIT back trajectories will show that the marine productivity in the ocean near Bermuda is more likely driven by anthropogenic dust from North America.

4. Methodology Approach

1. Eliminate missing and bad data of POP and PP by using Matlab; 2. Time series analysis to see the temporal and spatial pattern; 3. Clip the HYSPLIT data by the result of step 2; 4. Analyze the source of dust that affects marine productivity by regression; 5. Seasonal time series analysis using Matlab.

5. Outcome

The outcome of this study is the map of dust trajectory and the evaluation of the seasonal dynamic of marine productivity in the Bermuda ocean.

6. The significance

This study will help us to quantify the role of ocean-atmosphere coupling on the air-sea exchange of carbon export to the ocean interior. Carbon export includes both inorganic and organic carbon (OC), and for the latter, both particulate and dissolved phases. Of particular interest, in light of clear evidence for ocean acidification, this study will help us to understand how our behavior contributes to the production of the anthropogenic aerosol. This anthropogenic aerosol will affect marine productivity, particularly the pelagic ecosystem.

7. Level of proficiency

I am proficient with ArcGIS Desktop because I have been using it since my undergraduate studies. However, I have never used ArcGIS Pro. My R knowledge is limited to statistical analysis, not spatial statistics. I do not know how to connect R and ArcGIS, so I will not use R in this study. I used Python as an undergraduate, but I never went deep into it and no longer use it. I took the Digital Image Processing (GEOG 581) class and am very confident with Google Earth Engine coding. I also use Matlab for geoshow and spatial analysis.

Greenland Outlet Glacier Dynamics

1. My primary research question is: How is the spatial pattern of Greenland outlet glacier extent (or, more specifically, their rate of growth/retreat) related to the spatial pattern of local glacier velocity via the mechanisms of glacier mass balance and basal slip?

Put simply: Are the glaciers that are retreating faster also physically flowing faster, or not?

2. I have 2 primary datasets for this project. The first covers what I consider to be my response variable: the rate at which Greenland outlet glacier termini are moving (either advancing or retreating). This variable will be constructed from annual polyline shapefiles sitting on the visible edge of the glacier, over ~4 different years. The spatial pattern of these termini lies along the outer edge of Greenland’s geographic border, because that is where outlet glaciers terminate.

  1. Data: This dataset consists of a glacier ID consistent across every year (I will likely use 2015, 2016, 2017, possibly 2001). The dataset was produced via image processing of data from satellite sensors indicated in the link (there are several). Each shapefile consists of a vector polyline that represents the terminus extent of the glacier in a given year (winter max extent).


My second dataset covers what I consider to be my influential variable: glacier velocity. This dataset is a raster that covers all of Greenland. There are clearly dendritic hotspots along the outer edge of the island, where the ice moves the fastest. The center of the island, where most of the ice accumulates, has very little horizontal movement (only a few meters/year).

  1.  Data: This larger dataset represents annual ice velocity magnitudes across the surface of Greenland at 200 m or 500 m spatial resolution and annual temporal resolution. Data is also available for separate x and y components of velocity, as well as their errors. This raster file covers the entire Greenland ice sheet, including all outlet glaciers mentioned above.


3. My initial hypothesis is that statistically, faster flowing glaciers will tend to retreat faster. My explanation would be that vulnerable glaciers that are melting typically develop wet, slippery basal conditions. The slippage along the ground also allows the glacier to move faster. These same melting glaciers (with a negative mass balance) also typically experience a retreat of their termini, because they are melting and calving away. This is why I would generally expect a positive correlation between a glacier’s velocity and the distance or area of retreat.

However, among healthy, dry-based glaciers, the relationship might be totally different. In these cases, glaciers with the highest accumulation (aka most snowfall) flow the fastest because they have so much mass to move to maintain equilibrium. This provides an alternate hypothesis that the fastest glaciers are the ones with the highest accumulation, regardless of terminus behavior.

As far as geospatial distribution goes, I’d expect rapidly retreating glaciers to have a clustered geospatial distribution. There are probably some areas where most of the glaciers are retreating, and a few areas where the majority of the glaciers are stable or growing. That hypothesis is just based on the observation that entire ice sheets do not melt at the same rate (for example, the Antarctic Peninsula is retreating far faster than the East Antarctic Ice Sheet (EAIS)). Rather, ice sheets move/deform based on their own individual dynamics and local climate, which includes variables like temperature, precipitation, wind, cloudiness, and perhaps most importantly, ocean conditions. Considering the southeast portion of Greenland is subjected to the “rapidly warming and weakening” Gulf Stream, I’d predict that region of Greenland to be the most unstable and to display the greatest reduction in glacier extent.

4. I suppose Exercise 1 for me will focus on my response variable: the termini positions. I will need to develop a method to calculate the distance between 2 (jagged) polylines, and then determine whether that magnitude is considered an advance or a retreat… over multiple different years (with multiple years I can also determine if the rate of retreat is speeding up or slowing down). I can then create “points” out of these measurements, and determine if points of similar magnitude are clustered or evenly distributed, etc.
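One possible way to measure the distance between two jagged polylines is sketched below: take every vertex of one terminus line, find its shortest distance to the other line, and average the results. This is only one of several reasonable definitions (a box or centerline method would be alternatives), the coordinates are invented, and the result is unsigned, so deciding whether it counts as an advance or a retreat would still need extra information such as the flow direction.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def mean_terminus_distance(line_a, line_b):
    """Mean over vertices of line_a of the nearest distance to line_b."""
    nearest = [
        min(point_segment_distance(p, line_b[i], line_b[i + 1])
            for i in range(len(line_b) - 1))
        for p in line_a
    ]
    return sum(nearest) / len(nearest)


# Hypothetical terminus positions (meters, projected coordinates)
terminus_2015 = [(0.0, 0.0), (10.0, 0.0)]
terminus_2017 = [(0.0, 100.0), (5.0, 100.0), (10.0, 100.0)]

displacement = mean_terminus_distance(terminus_2015, terminus_2017)
rate_per_year = displacement / (2017 - 2015)
```

Note that this measure is not symmetric: swapping the two lines can give a slightly different value when the termini have very different shapes, so one direction (or the average of both) would have to be chosen consistently.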

For the second step, I am envisioning working with the influential variable: glacier velocity. I think it would be interesting to create a hotspot analysis of this map, as well as create a map displaying vector directions and magnitudes of ice movement, to visualize directionality. I also think I will need to find a way to vectorize/ section off individual glaciers from the raster, possibly using a fixed buffer from the termini data. I’m also debating bringing in some mass balance data (gravitational mass anomaly), to see how glacier mass balance correlates with velocity.

For the final step, I’d like to connect the 2 datasets to interpret their relationship. I’d like to run a couple of statistical tools (a correlation or regression, plus Global Moran’s I or something similar to check for spatial autocorrelation) to see quantitatively how distance retreated compares to glacier velocity. Finally, since I can’t expect the entire region to behave similarly, I’d like to cut Greenland into 6-8 sections and see how each behaves, to compare and contrast.

5. The expected outcomes are pretty diverse. The big thing I’m after is a statistical relationship between these variables, which is really just a number (R-squared). However, there are several maps I am interested in making, including those depicting calculated rates of retreat, one depicting vector directions and magnitudes of ice movement, and a gridded map of Greenland depicting the R-squared between my variables in each region. While the answer to the question is a statistical relationship, there are several informative maps to be made along the way.

6. I think the scientific significance of this is mainly pointed towards making accurate predictions of sea level rise. Developing such predictions involves taking into account several not-so-obvious glacial mechanisms, and identifying which glaciers are most at risk. If ice velocity gives some indication towards which glaciers might contribute to sea level in the near future, then it could make the difficult job of modeling sea level a bit easier. It is also information that could be useful when put towards ecology and a handful of local communities. Data on recently uncovered land could help explain ecological shifts and invasive or unfamiliar species.

7. Proficiency:

-I have a working knowledge of ArcGIS Pro, based on the GIS I, GIS II and GIS III classes in the certificate program. This is more a textbook knowledge rather than applied experience.

-I have no experience whatsoever with R. I am not sure I wish to learn it, but if this class convinces me it is critical to a solid geospatial repertoire, I will concede.

-I have a working knowledge of Python, and I can use online resources to get most things done.

-I’ve used QGIS, ENVI, and MATLAB, but am not particularly well versed in any of them.

Agricultural Intensification and Population Decline in the Great Plains

My Research Question:

How are changing farm sizes and population correlated? There have been trends of agricultural intensification in Nebraska, Kansas, Oklahoma, and Texas. I am curious to see if the shift in the distribution of farm sizes, from many smaller farms toward fewer, larger farms, has an effect on population sizes.


In order to address this question, I will be looking at the United States Census for population data at the county level in 2010 and 2018. The 2018 data are estimates since the census is only collected every ten years. The United States Department of Agriculture (USDA) National Agricultural Statistics Service (NASS) runs a Census of Agriculture every five years, which collects information about farms throughout the country. I will use 2012 and 2017 Ag Census data at the county level. The Ag Census reports the number of farms in a handful of size ranges as well as the area of farms within those size ranges. They also report income from farm production, and whether the farms are owned by individuals, partnerships, or corporations, which I might include in the study to provide more context. I will include county-level data from the US Census and Agricultural Census for all of Nebraska, Kansas, Oklahoma, and Texas.

Figure 1: Sample map from the interactive Ag. Census viewer of increases and decreases in the number of farms with 2,000 acres or more between 2012 and 2017. This is the largest farm size category in the data.


In the Agricultural Census data, I expect to see an increase in farms with larger sizes and a decrease in farms with smaller sizes. I do not know enough about the region to predict the pattern of the increase. Yan and Roy (2016) calculated field size with remote sensing, and from their maps it appears that the western sides of these states have much larger farms in data from 2010. I predict that in the time frame I am studying the range of large farms will extend more broadly across the Midwest.

In the US Census data, I expect to see slight decreases of populations in rural counties but increases of populations in counties that contain cities and those immediately surrounding cities. McGranahan and Beale (2002) found that was the case in the 1990s because of lack of income opportunities in rural areas, forcing many to move into urban and suburban counties in search of work.

I hypothesize that there will be overlap in the counties that have increasing numbers of large farms and decreasing populations. I hypothesize that this pattern occurs because individuals who farm for large corporations have funds available to buy large plots of land, and then are able to manage farms with higher economic efficiency and therefore sell their products for cheaper prices. The smaller farms in the area are then priced out of the market, can’t afford to continue farming, and leave for urban areas in search of other sources of income. This then takes away business from the local small businesses such as doctors’ offices, post offices, and other local stores.


I would like to explore using Python to analyze the patterns and do some statistical analysis with this data. I plan on exploring statistical analysis with spatial data and calculating whether any correlation between increases in farm size and decreases in population is statistically significant. I will also have to aggregate the farm size data from categories of farm size into a general increase or decrease, or potentially do separate analyses for the different groupings. I may have to expand my analysis into R to do some of the statistical computations.
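A minimal sketch of the aggregation and correlation steps described above, in plain Python. The size-category labels, the choice of which categories count as “large”, and the county values are all hypothetical; the Pearson coefficient shown here also ignores spatial autocorrelation between neighboring counties, so it is a starting point rather than a full spatial analysis.

```python
import math


def large_farm_share(counts_by_category, large_categories):
    """Share of a county's farms that fall in the categories treated as large."""
    total = sum(counts_by_category.values())
    large = sum(counts_by_category[c] for c in large_categories)
    return large / total


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical farm counts by size category for one county in two census years
county_2012 = {"small": 60, "medium": 30, "large": 10}
county_2017 = {"small": 50, "medium": 30, "large": 20}
share_change = (large_farm_share(county_2017, ["large"])
                - large_farm_share(county_2012, ["large"]))

# Hypothetical per-county changes: large-farm share vs. population (percent)
share_changes = [0.10, 0.05, 0.00, -0.02]
population_changes = [-4.0, -2.5, 0.5, 1.0]
r = pearson_r(share_changes, population_changes)
```

A negative `r` in this toy data corresponds to the hypothesized overlap between counties gaining large farms and counties losing population.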

Expected Outcome:

I expect to produce maps with the patterns that I find in both datasets and between the two. I will also produce statistics on the relationships between the patterns. I will also likely produce some maps with contextual information about agriculture, economics, and population.


This research is important to policymakers because it is important to understand the impact of changing industry on people’s living and economic situations.  Municipalities are typically concerned with their economies and population growth. If there is a clear correlation between a change in an industry and a change in population and available business, it might be reason for more research or a change in policy. This correlation might also have implications for soil health and the general environmental sustainability of agriculture.


I have many years of experience using ArcGIS (both in and out of school) and will likely use it for the final cartographic steps in making my maps.

I have a little bit of experience in both Python and R, which I will use for my statistical analyses. I don’t think I will be using remote sensing in this project, but I am proficient in JavaScript for Google Earth Engine if I need it.


McGranahan, David A. & Beale, Calvin L., (2002). “Understanding Rural Population Loss,” Rural America/ Rural Development Perspectives, United States Department of Agriculture, Economic Research Service, vol. 17(4), December.

Yan, L., & Roy, D. P. (2016). Conterminous United States crop field size quantification from multi-temporal Landsat data. Remote Sensing of Environment, 172, 67–86.

U.S. Flood Hazard Population Dynamics

Stefan Rose – MS Geography

1. Description of Research Question

Research Question: How is the temporal pattern of population flow by county related to the temporal pattern of flood insurance claims by county via risk compensation?

Flooding is widely regarded as the costliest and most prevalent natural hazard around the world and within the United States (Miller et al., 2008; Kousky, 2018). On average, flooding in the United States causes approximately $7.96 billion in damages per year (NWS, 2018). Both the magnitude and frequency of floods are expected to increase due to climate change (IPCC 2007, 2012) along with population growth and increased economic assets in coastal zones (Jongman et al., 2012). My analysis seeks to contribute to a body of natural hazards and risk research by seeking to understand the influence of flood insurance on population mobility over time. I use “risk compensation” as the operative term to describe the flood insurance mechanism for economically valuing the risk of its participants by their asset exposure and steps toward hazard mitigation.

2. Data Description

Claims data from the National Flood Insurance Program (NFIP), which is operated through the Federal Emergency Management Agency (FEMA), include the dollar amount paid, the date of loss for the filing, and the zip code and county for the claim. These data span 1977 to 2018. From the dataset I am able to aggregate the number of claims filed for a given county annually. I can create a threshold for flood occurrence versus no flood occurrence by the annual number of claims through the creation of a binary dummy variable (i.e. 0=no flood occurrence, 1=flood occurrence). My analysis will be focused only on the counties with active insurance claims in the continental United States. See below for a figure of the number of insurance policies in force by state.

Number of NFIP policies in the United States as of 2017
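The aggregation and dummy-variable step described in the data description above could be sketched as follows. The record layout (county label and year of loss), the county labels, and the threshold of 2 claims are illustrative assumptions only; the actual threshold has not been chosen here.

```python
from collections import Counter


def annual_claim_counts(claims):
    """Count claims per (county, year) from an iterable of (county, year) records."""
    return Counter(claims)


def flood_dummy(counts, threshold):
    """1 = flood occurrence (claims >= threshold), 0 = no flood occurrence."""
    return {key: 1 if n >= threshold else 0 for key, n in counts.items()}


# Hypothetical claim records: (county label, year of loss)
claims = [
    ("A", 2005), ("A", 2005), ("A", 2005),
    ("B", 2005),
    ("A", 2006),
]

counts = annual_claim_counts(claims)
dummy = flood_dummy(counts, threshold=2)
```

County-years with no claims at all never appear in `counts`, so in a balanced panel they would need to be added explicitly with a value of 0.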

Internal Revenue Service (IRS) migration data show the number of tax filings by location at the county level from 1990-2010. From this data I can determine origins and destinations of tax filers over time, which can be represented by in-flows (in-migrants), out-flows (out-migrants), and net-flows (net migration). I will only include population data for the counties for which I have a match for flood insurance claims. See below for a sample county (Beaufort County, North Carolina) data screenshot that includes these population metrics.

Sample data screenshot from RStudio
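For reference, the three population metrics described above relate as net-flow = in-flow minus out-flow. A minimal sketch of deriving them from origin-destination records is shown here; the record layout (origin, destination, number of filers) and the county labels are hypothetical and do not reflect the actual IRS file format.

```python
from collections import defaultdict


def county_flows(records):
    """Summarize in-flows, out-flows, and net-flows per county.

    records: iterable of (origin_county, destination_county, filers).
    """
    flows = defaultdict(lambda: {"in": 0, "out": 0})
    for origin, destination, filers in records:
        flows[origin]["out"] += filers
        flows[destination]["in"] += filers
    return {
        county: {"in": f["in"], "out": f["out"], "net": f["in"] - f["out"]}
        for county, f in flows.items()
    }


# Hypothetical origin-destination records for one year
records = [("A", "B", 10), ("B", "A", 4), ("C", "A", 3)]
flows = county_flows(records)
```

Within a closed set of counties the net-flows sum to zero, which can serve as a quick consistency check on the aggregation.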

3. Hypothesis

I expect to see a statistically significant relationship between total population movement (in-flow or out-flow) and flood claims in counties with high volumes of claims. I also expect to see a significant relationship between high monetary amounts paid out from claims and lowered population out-flows.

4. Approaches

I have created a large panel dataset from which I will perform multivariate regressions to gauge statistical significance in a relationship between population and insurance claims. I will seek to test that all assumptions of my regression model are appropriately met (normality, no multicollinearity, homoscedasticity, etc.).

5. Expected Outcomes

I hope to demonstrate a statistical relationship between these variables and to visualize my results through plots, charts, and maps.

6. Significance of Research

My analysis is relevant both to policymakers and to the larger body of natural hazards and risk research. As stated above, flooding and flood risk are an increasingly costly issue that requires interdisciplinary responses. My research can help inform these varied response efforts.

7. Level of Preparation

I have significant experience using R to process and transform datasets as well as for statistical analysis and graphing. I also have experience using Stata for statistical tests and creating result outputs. I have extensive experience using ArcGIS and moderate experience with Google Earth Engine. I have some experience with Python and will be taking a Python workshop at the start of this quarter to advance those skills further. I hope to use R for data transformation as well as mapping and result communication. I will use Stata in conjunction with R for data analysis and statistics.

8. References

Field, Christopher B., et al. “IPCC, 2012: Managing the risks of extreme events and disasters to advance climate change adaptation. A special report of Working Groups I and II of the Intergovernmental Panel on Climate Change.” Cambridge University Press, Cambridge, UK, and New York, NY, USA 30.11 (2012): 7575-7613.

IPCC, Climate Change. “The physical science basis. Contribution of working group I to the fourth assessment report of the Intergovernmental Panel on Climate Change.” Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA 996 (2007): 2007.

Jongman, B., Ward, P. J., & Aerts, J. C. J. H. (2012). Global exposure to river and coastal flooding: Long term trends and changes. Global Environmental Change, 22(4), 823–835.

Kousky, C. (2018). Financing Flood Losses: A Discussion of the National Flood Insurance Program. Risk Management and Insurance Review, 21(1), 11–32.

Miller, S., Muir-Wood, R., & Boissonnade, A. (2008). An exploration of trends in normalized weather-related catastrophe losses. Climate extremes and society, 12, 225-247.

A Century of Oregon Salt Marsh Expansion and Contraction (Prologue)

Fig 1. Small portion (to conserve file size) of Nehalem Bay salt marsh edge from 1939. Note the logs along the channel edge likely related to log drives and splash damming.

Previous research suggests that most Oregon salt marshes, with the exception of those in the Salmon River Estuary and Alsea Bay, have survived 20th-century relative sea level rise by accumulating sediment at a pace similar to or exceeding the rate of sea level rise (Peck et al. 2020). Additionally, though we predict that salt marsh growth can only occur under rising sea level and is limited by the rate of relative sea level rise, Nehalem Bay salt marshes are growing much faster than the local pace of sea level rise, and salt marshes in the Coquille River estuary are growing vertically despite experiencing relative sea level fall. Are these patterns also reflected in the horizontal growth of the salt marshes?

For this course, I would like to georeference and digitize historical aerial photographs from at least two Oregon estuaries – Nehalem Bay and Alsea Bay – and estimate the error associated with my digitization. I would then like to compute changes in vegetated salt marsh extent from 1939 to the present and combine this with vertical accretion rates measured using excess 210Pb from sediment cores. Ideally, I would be able to create a model of 3D change for the last ~80 years. If this method works, I will expand my research to two additional estuaries – likely the Salmon River Estuary and Coquille River Estuary.
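Once a marsh outline has been digitized as a polygon, the change in vegetated extent between two photographs is a difference of polygon areas. The sketch below uses the shoelace formula in plain Python and assumes vertex coordinates in a projected system with units of metres. The two rectangles are hypothetical outlines, not real marsh boundaries, and in practice a GIS would do this calculation.

```python
def polygon_area(vertices):
    """Area of a simple polygon from ordered (x, y) vertices, via the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical digitized outlines (metres) from an earlier and a later photograph.
marsh_early = [(0, 0), (100, 0), (100, 50), (0, 50)]
marsh_late = [(0, 0), (100, 0), (100, 60), (0, 60)]

area_early = polygon_area(marsh_early)
area_late = polygon_area(marsh_late)
area_change = area_late - area_early  # positive = net horizontal growth
```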

My main dataset will be historical aerial photographs, taken roughly every decade from 1939 to the 1990s, that span four Oregon estuaries: Nehalem Bay, Salmon River Estuary, Alsea Bay, and the Coquille River Estuary. Image resolution varies considerably, with the 1939 photographs typically being the highest quality (I estimate ~0.5 m; see Fig 1). My sediment core dates typically extend back ~120 ± 20 y and are also roughly decadal. I have between ~8 and 14 cores per estuary. I only plan on analyzing the parts of the salt marsh from which I was able to collect sediment cores.

H1: Based on the historical aerial photograph analysis, the Alsea Bay “least disturbed” salt marshes of interest will exhibit net erosion while the Nehalem Bay “least disturbed” salt marshes of interest will exhibit net sediment accumulation similar to the patterns observed in the vertical sediment accumulation data.

H2: Time periods of horizontal growth of the salt marsh will correlate with time periods of vertical growth measured within sediment core data previously collected in Alsea Bay and Nehalem Bay.   

H3: Nehalem Bay will exhibit more periods of growth, observed in both horizontal and vertical data, than Alsea Bay despite experiencing similar rates of relative sea level rise.

H4: Periods of growth, observed in both horizontal and vertical data, will correspond with periods of intensive land use (especially logging), with wildfire (especially the Tillamook Burns), and with large flood events (especially the 1996 500-y return interval flood). Differences in the intensity of logging and wildfire between the two watersheds may explain the apparent drowning of Alsea Bay even though it experiences 20th-century relative sea level rise similar to that of Nehalem Bay.

To test these hypotheses, I know I will be doing a lot of georeferencing, digitizing, and associated error analysis, but I’m uncertain what to call the rest of what I’ve described – i.e., creation of a 3D model of change. To my knowledge, many studies have applied similar methods to Landsat images (e.g., Miller et al. 2017), and these may also be applicable to panchromatic historical aerial photographs. The methods of others who have used historical aerial photographs in their analysis seem promising; however, these often rely on ground-truthing (e.g., Ballanti et al. 2017), which is beyond the scope of this study. Other promising methods I will investigate over the coming days/weeks are those by Goodwin & Mudd (2020).
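One common way to summarize georeferencing error, which I am assuming here rather than committing to, is the root-mean-square error (RMSE) of control-point residuals: the distance between where each control point lands after the transformation and where it should be. The coordinates below are hypothetical and in metres.

```python
import math


def georef_rmse(control_pairs):
    """RMSE of control points given ((x_fit, y_fit), (x_ref, y_ref)) pairs."""
    squared = [
        (xf - xr) ** 2 + (yf - yr) ** 2
        for (xf, yf), (xr, yr) in control_pairs
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical control points: transformed position vs. reference position.
pairs = [
    ((103.0, 204.0), (100.0, 200.0)),  # residual of 5 m
    ((300.0, 405.0), (300.0, 400.0)),  # residual of 5 m
]
rmse = georef_rmse(pairs)
```

The RMSE can then be compared against the image resolution to judge whether a mapped change in the marsh edge is larger than the positional uncertainty.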

Ultimately, I would like to publish this project as a manuscript. The publication would include numerous figures I would like to produce in this class including a map of hotspots of salt marsh growth or erosion. I also envision a lot of figures comparing horizontal and vertical changes within the estuaries.

I believe this project has both local significance, specifically for Oregon coastal community members, and global significance, specifically for the intertidal ecogeomorphologist community. Salt marshes provide numerous important ecosystem services, including habitat for economically and culturally important fish and shellfish; flood protection; filtration of nutrients, pollutants, and sediment; recreation; and carbon burial. However, salt marshes are threatened by rising sea level and reduced fluvial suspended sediment loads. Despite their importance and threatened status, we are still unsure of the mechanisms by which they grow or erode. Theoretically, we expect that as relative sea level rises, both accommodation space and hydroperiod increase, causing salt marshes to grow at a similar pace; however, despite no sediment limitation, some Oregon estuaries appear to be drowning (Salmon River and Alsea) while others are growing faster than we would expect given relative sea level rise (Nehalem and Coquille). It is likely that fluvial sediment supply – both magnitude and timing – plays a role in salt marsh growth, but the relationship remains unclear. By comparing growth patterns to the histories of landscape change and climate oscillations within the associated watersheds, perhaps we can elucidate these linkages.

Below is my level of preparation:

  • ArcGIS Pro: working knowledge
  • Modelbuilder and/or GIS programming in Python: none
  • R: novice
  • Image processing: novice/working knowledge in Fiji/ImageJ
  • Other relevant software: MATLAB – working knowledge


Ballanti, L., Byrd, K. B., Woo, I., & Ellings, C. (2017). Remote sensing for wetland mapping and historical change detection at the Nisqually River Delta. Sustainability, 9(11), 1919.

Goodwin, G. C., & Mudd, S. M. (2020). Detecting the Morphology of Prograding and Retreating Marsh Margins—Example of a Mega-Tidal Bay. Remote Sensing, 12(1), 13.

Miller, G. J., Morris, J. T., & Wang, C. (2017). Mapping salt marsh dieback and condition in South Carolina’s North Inlet-Winyah Bay National Estuarine Research Reserve using remote sensing. AIMS Environmental Science, 4(5), 677-689.

Peck, E. K., Wheatcroft, R. A., & Brophy, L. S. (2020). Controls on sediment accretion and blue carbon burial in tidal saline wetlands: Insights from the Oregon coast, USA. Journal of Geophysical Research: Biogeosciences, 125(2), e2019JG005464.

Total Organic Carbon and Benthic Deposit Feeders

1. A description of the research question that you are exploring.

For my thesis, I am researching the relative contribution of discarded bait from the commercial Dungeness crab fishery, wild benthic prey, and cannibalism to the diets of Dungeness crab. The commercial fishery for Dungeness is exceptionally sustainable and productive: yield has increased over time despite increased fishing effort. I hypothesize that fishermen are sustaining crab populations by throwing used bait overboard – “farming” crab unintentionally. Benthic box core sampling off the Oregon coast has revealed low abundance of benthic organisms, a natural prey source for the crabs. These findings further support the hypothesis that the crabs are largely living off bait. 

In order to understand the feeding ecology of Dungeness crab, it’s important to have a solid grasp of the wild prey available to them. For this class, I will investigate the relationship between total organic carbon on the seafloor and the abundance and spatial distribution of benthic deposit feeders. I also hope to explore seasonal variability in the spatial distribution of total organic carbon levels and prey abundance. I hypothesize that total organic carbon levels and benthic prey abundance will be higher near designated crab collection sites during the peak winter crabbing season and will be more dispersed in the summer during the off-season. 

2. A description of the dataset you will be analyzing, including the spatial and temporal resolution and extent.

The dataset I’ll be exploring comes from benthic box core sampling conducted between Fort Bragg, CA, and Grays Harbor, WA from 2010 to 2016. The box cores were taken at depths between 60 and 525 m. The spatial distribution of the samples varies, and I have not yet determined the range of that variation.

3. Hypotheses: predict the kinds of patterns you expect to see in your data, and the processes that produce or respond to these patterns.

I hypothesize that the spatial distribution of benthic deposit feeders will be clustered around areas with high levels of total organic carbon (TOC) because deposit feeders eat organic detritus from the sea floor. Additionally, I hypothesize that clusters of high TOC will be evident in the winter during the fishing season, but TOC will be more dispersed during the summer and fall. Similarly, I hypothesize that the spatial distribution of benthic deposit feeders will be more clustered during the winter and more dispersed during the summer. 
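One possible way to put a number on "clustered" versus "dispersed", offered here only as an assumption since I have not chosen a method, is the nearest-neighbor ratio: the mean observed nearest-neighbor distance divided by the distance expected for a random pattern of the same density. Values below 1 suggest clustering and values above 1 suggest dispersion. The sketch treats a set of locations (for example, stations with high TOC) as a point pattern, ignores edge effects and the sampling design, and uses made-up coordinates.

```python
import math


def nearest_neighbor_ratio(points, area):
    """Observed mean nearest-neighbor distance over the random-pattern expectation."""
    n = len(points)
    nearest = []
    for i, (xi, yi) in enumerate(points):
        nearest.append(
            min(
                math.hypot(xi - xj, yi - yj)
                for j, (xj, yj) in enumerate(points)
                if j != i
            )
        )
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / area)  # expectation under complete spatial randomness
    return observed / expected


study_area = 100.0 * 100.0  # hypothetical 100 x 100 unit study area

# Hypothetical winter pattern: points bunched in one corner.
winter = [(0, 0), (1, 0), (0, 1), (1, 1)]
# Hypothetical summer pattern: points spread evenly.
summer = [(25, 25), (75, 25), (25, 75), (75, 75)]

winter_ratio = nearest_neighbor_ratio(winter, study_area)
summer_ratio = nearest_neighbor_ratio(summer, study_area)
```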

4. Approaches: describe the kinds of analyses you ideally would like to undertake and learn about this term, using your data.

I don’t know what types of analyses I would like to undertake – I hope this class will help me figure that out!

5. Expected outcome: what do you want to produce — maps? statistical relationships? other?

I want to produce maps of the seasonal spatial distribution of TOC and benthic deposit feeders.  I also hope to determine the statistical relationship between amount of TOC and abundance & spatial density of benthic deposit feeders. 
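As a first look at the statistical relationship between TOC and deposit-feeder abundance, a Pearson correlation coefficient could be computed across box core samples. This is a hedged sketch only: the choice of Pearson correlation is an assumption, and the paired values below are invented rather than drawn from the dataset.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired measurements, one pair per box core sample.
toc_percent = [0.5, 1.0, 1.5, 2.0]
feeder_abundance = [12, 18, 33, 37]

r = pearson_r(toc_percent, feeder_abundance)
```

A value of `r` near 1 would be consistent with deposit feeders being more abundant where TOC is higher, though it would not by itself account for spatial autocorrelation between nearby samples.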

6. Significance: How is your spatial problem important to science? to resource managers?

As mentioned above, the commercial Dungeness crab fishery has been historically sustainable and productive despite increased fishing effort. However, the fishery currently faces closures from whale entanglement in crabbing gear, domoic acid outbreaks, and now the COVID-19 pandemic. If my hypothesis that fishermen are sustaining the crab population through discarded bait is correct, then reducing effort could cause Dungeness crab populations to decline. The work I do for this class will help me to understand the impact of seasonal TOC levels on benthic prey availability for the crabs.   

7. Your level of preparation: how much experience do you have with (a) Arc-Info, (b) Modelbuilder and/or GIS programming in Python, (c) R, (d) image processing, (e) other relevant software?

  • ArcInfo: working knowledge
  • Modelbuilder/Python: novice
  • R: working knowledge
  • Image processing: working knowledge
  • Working knowledge of Google Earth Engine, ENVI, and ArcGIS Pro. Proficient in ArcMap.