While I had hoped that this summer would be full of trips to sunny, salty, sea lion-filled Newport to mentor ODFW’s Summer Scholars, unfortunately everyone is still working remotely. I have heard from my Newport-based coworkers, though, that this pandemic is not stopping the hordes of tourists from flocking to the coast for celebrations such as the recent 4th of July.
One of the main projects that I’ve been working on this summer while stuck in Bend (there are worse places to be stuck!) is understanding whether marine reserves have influenced socioeconomic conditions in communities located near them. To investigate this, I first had to gather information on the socioeconomic conditions of coastal communities over time. I used the Census Bureau’s American Community Survey 5-year estimates from 2010 to 2018. While accessing data prior to 2010 would be ideal, the first 5-year estimate summary tables were only released in 2010, so we work with what we’ve got.
After collecting all of these data, the exploratory analyses began, as did the true test of what I can remember from those statistics courses long ago and how far my R coding skills can take me. These exploratory analyses include tests and visualizations such as correlation plots, non-metric multidimensional scaling plots, bubble plots, vector analyses, principal component analyses, and PERMANOVAs; the list goes on. When working with complicated multivariate data, I have learned that exploration is key to understanding what is really shaping your data.
I have also been trying to figure out what my control and treatment communities should be. With this first approach, I am considering treatment communities as those that are located less than 15 km from a marine reserve. Control communities are therefore all coastal communities located more than 15 km from a marine reserve. Since the marine reserves were phased in over time, I have three separate treatment groups. The 2012 group includes communities located near Redfish Rocks Marine Reserve and Otter Rock Marine Reserve, the 2014 group includes those near Cape Perpetua Marine Reserve and Cascade Head Marine Reserve, and the 2016 group includes those near Cape Falcon Marine Reserve. While this approach is a good first step, I will likely need to consider whether other groupings or controls would be more appropriate. One method I am currently researching is creating a synthetic control by weighting non-treatment communities based on socioeconomic similarities to treatment communities prior to marine reserve implementation. But I won’t get too into the weeds with that statistical discussion here for everyone’s sake!
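For anyone curious what that first-pass grouping rule might look like in code, here is a minimal sketch in Python (not the R code used for the actual analysis). It makes several assumptions that are not from the project itself: each reserve is reduced to a single point, the coordinates are rough illustrative placeholders rather than surveyed locations, distance is great-circle distance, and a community within 15 km of more than one reserve is assigned to the earliest group. The function names `haversine_km` and `assign_group` are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
THRESHOLD_KM = 15.0  # treatment cutoff described in the post

# Reserve name -> (latitude, longitude, implementation year).
# Years follow the post; the coordinates are illustrative placeholders only.
RESERVES = {
    "Redfish Rocks": (42.70, -124.47, 2012),
    "Otter Rock": (44.75, -124.07, 2012),
    "Cape Perpetua": (44.25, -124.11, 2014),
    "Cascade Head": (45.05, -124.01, 2014),
    "Cape Falcon": (45.77, -123.98, 2016),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def assign_group(lat, lon):
    """Return 'treatment_<year>' if any reserve is within the cutoff, else 'control'.

    Assumption: when several reserves are within range, the earliest
    implementation year decides the group.
    """
    years = [
        year
        for r_lat, r_lon, year in RESERVES.values()
        if haversine_km(lat, lon, r_lat, r_lon) < THRESHOLD_KM
    ]
    return f"treatment_{min(years)}" if years else "control"


# Hypothetical communities (made-up points, not real census places).
communities = {
    "Community A": (44.76, -124.06),
    "Community B": (43.50, -124.20),
    "Community C": (45.80, -123.96),
}

for name, (lat, lon) in communities.items():
    print(name, assign_group(lat, lon))
```

In practice, measuring distance to the reserve boundary rather than to a single point would probably be more defensible, and it could change which communities fall inside the 15 km cutoff.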
As I’ve been working with these census data, I’ve been thinking about the unfortunate timing of the decennial census this year. The Census Bureau conducts a survey every ten years with the goal of obtaining a comprehensive snapshot of households in the United States. Unfortunately, the census this year coincided with a massive pandemic that has led to significant economic loss and unemployment, along with the consequences that follow that loss. When future researchers use the decennial census to look at change over time, they are going to see data from 2020 that are not representative of the previous ten years, which will likely impact their analyses. I’m assuming that this data issue will lead to many footnotes in future papers. Luckily, I will only be using data through 2019 (once they are made available), since the 2020 data will not be made available until after the marine reserve synthesis report is due.
The pandemic ramifications are widespread, indeed! Do you have any good stats or R resources to share? Several summer scholars are testing the waters of running statistics in R and might be grateful for any tips, tricks, or info that you can share. Good luck with these messy (real!) datasets!!
One of my best stats resources is a helpful and willing co-worker. I’ve found that there are no written resources better than a knowledgeable person when you’re first figuring out what to do with your data. Once you have a general idea of the exploratory and statistical tests to pursue, then Google becomes your best friend. As long as you know the right terms, you can type any stats or coding related question into Google and typically find the answer with enough sleuthing.