Spreadsheets, ArcGIS, and Programming! Oh My!

By Morgan O’Rourke-Liggett, Master’s Student, Oregon State University, Department of Fisheries, Wildlife, and Conservation Sciences, Geospatial Ecology of Marine Megafauna Lab

Avid readers of the GEMM Lab blog and other scientists are familiar with the incredible amounts of data collected in the field and the informative figures displayed in our publications and posters. Some of the more time-consuming and tedious work hardly gets talked about because it’s the in-between stage of science and other fields. For this blog, I am highlighting some of the behind-the-scenes work that is the subject of my capstone project within the GRANITE project.

For those unfamiliar with the GRANITE project, this multifaceted and non-invasive research project evaluates how gray whales respond to chronic ambient and acute noise to inform regulatory decisions on noise thresholds (Figure 1). This project generates considerable data, often stored in separate Excel files. This doesn’t cause an immediate problem, but ongoing research projects like GRANITE and other long-term monitoring programs often need to refer back to this data, and when it is scattered across separate, long Excel files, certain forms of analysis become difficult and time-consuming. Bringing it all together requires considerable attention to detail, persistence, and acceptance of monotony. Today’s blog will dive into the not-so-glamorous side of science…data management and standardization!

Figure 1. Infographic for the GRANITE project. Credit: Carrie Ekeroth

Of the plethora of data collected from the GRANITE project, I work with the GPS trackline data from the R/V Ruby, environmental data recorded on the boat, gray whale sightings data, and survey summaries for each field day. These come to me as individual yearly spreadsheets, ranging from thirty entries to several thousand. The first goal with this data is to create a standardized survey effort conditions table. The second goal is to determine the survey distance from the trackline, using the visibility for each segment, and calculate the actual area surveyed for the segment and day. This blog doesn’t go into how the area is calculated, but all of these steps lay the foundation for that calculation.

The first step requires a quick run-through of the sighting data to ensure all survey dates fall within the designated survey area by examining the sighting code. After the date is a three-letter code representing the starting location for the survey, such as npo for Newport and dep for Depoe Bay. Any entries whose code doesn’t match the designated codes for the survey extent are hidden so they are not used in the new table. From there, filling in the table begins (Figure 2).
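For readers who like to see a step in code, here is a minimal Python sketch of that filtering pass. It is only an illustration, not the spreadsheet workflow I actually used: the row layout and the field names `date` and `location_code` are assumptions, and only the two codes named above are listed, although the real survey extent likely includes more.

```python
# Location codes named in this post; the full list for the survey extent
# is an assumption and would need to be filled in.
VALID_LOCATION_CODES = {"npo", "dep"}


def keep_in_extent(rows, valid_codes=VALID_LOCATION_CODES):
    """Return only the rows whose location code belongs to the survey extent."""
    return [
        row for row in rows
        if row["location_code"].strip().lower() in valid_codes
    ]


# Hypothetical rows; "xyz" is a made-up code standing in for an
# out-of-extent survey.
sightings = [
    {"date": "190829", "location_code": "npo"},
    {"date": "190829", "location_code": "xyz"},
    {"date": "190722", "location_code": "DEP"},
]

kept = keep_in_extent(sightings)
print([row["location_code"] for row in kept])
```

Normalizing case and whitespace before the comparison is a small guard against the inconsistencies that creep into hand-entered spreadsheets.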

Figure 2. A blank survey effort conditions table with each category listed at the top in bold.

Segments for each survey day were determined based on when the trackline data changed from transit to the sighting code (e.g., 190829_1 for August 29th, 2019, sighting 1). Transit indicated the research vessel was traveling along the coast, and crew members were surveying the area for whales. Each survey day’s GPS trackline and segment information were copied and saved into separate Excel workbook files. An R script then converted the coordinates in those files into NAD 1983 UTM Zone 10N northing and easting coordinates.
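The segmenting rule above can be sketched in a few lines of Python. This is one possible reading of the rule, not the exact procedure used for GRANITE: it assumes each trackline row carries a single label that is either `transit` or a sighting code, and it starts a new segment every time that label changes. The function names are hypothetical.

```python
from datetime import datetime
from itertools import groupby


def parse_sighting_code(code):
    """Split a code like '190829_1' into (survey date, sighting number)."""
    day_part, number_part = code.split("_")
    return datetime.strptime(day_part, "%y%m%d").date(), int(number_part)


def split_into_segments(labels):
    """Group consecutive trackline rows that share the same label.

    Returns (label, first_row_index, last_row_index) tuples.
    """
    segments = []
    start = 0
    for label, run in groupby(labels):
        length = sum(1 for _ in run)
        segments.append((label, start, start + length - 1))
        start += length
    return segments


# Hypothetical label column from one day's trackline
labels = ["transit", "transit", "190829_1", "190829_1", "transit", "190829_2"]
segments = split_into_segments(labels)

for label, first, last in segments:
    print(label, first, last)

print(parse_sighting_code("190829_1"))
```

Keeping the row indices for each segment makes it easy to pull the start and end times and coordinates for that segment later.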

Those segments are uploaded into an ArcGIS database and mapped using the same UTM projection. The northing and easting points are imported into ArcGIS Pro as XY tables. Using various geoprocessing and editing tools, each segmented trackline for the day is created, and each line is split wherever the trackline overlapped itself or made a U-shape that caused the observation area to overlap. This splitting ensures the visibility buffer accounts for the overlap (Figure 3).

Figure 3. Segment 3 from 7/22/2019 with the visibility of 3 km portrayed as buffers. There is more than one buffer because the trackline was split to account for the overlapping of the survey area. This approach accounts for the fact that the area where all three buffers overlap was surveyed 3 times.
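Because UTM eastings and northings are in meters, the length of a segment can be approximated outside of ArcGIS by adding up straight-line distances between consecutive points. The Python sketch below does that and then estimates the area of a visibility buffer around the segment. It is a rough illustration only and not the ArcGIS workflow described above: the coordinates are made up, and the area formula is exact only for a perfectly straight line, so it overestimates wherever the track bends.

```python
import math


def track_length_m(points):
    """Sum straight-line distances between consecutive (easting, northing) points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def buffer_area_km2(length_m, visibility_km):
    """Area of a round-ended buffer around a straight line of the given length.

    A rectangle of width 2 * visibility along the line, plus two half-circles
    at the ends. For a bending track this is an upper bound, not the true area.
    """
    length_km = length_m / 1000.0
    return 2 * visibility_km * length_km + math.pi * visibility_km ** 2


# Hypothetical UTM Zone 10N coordinates in meters (not real GRANITE data)
points = [(415000.0, 4940000.0), (415000.0, 4943000.0), (419000.0, 4943000.0)]

length = track_length_m(points)                     # 3000 m + 4000 m
area = buffer_area_km2(length, visibility_km=3.0)   # 3 km visibility, as in Figure 3

print(f"{length:.0f} m, about {area:.1f} km^2")
```

A quick estimate like this is mostly useful as a sanity check against the buffer areas that come out of the GIS.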

Once the segment lines are created in ArcGIS, the survey area map (Figure 4) is used alongside the ArcGIS display to determine the start and end locations. An essential part of the standardization process is using the annotated locations in Figure 4 instead of the names on the basemap for the location start and endpoints. This consistency with the survey area map is both for tracking the locations through time and for the crew on the research vessel to recognize the locations. This step also assists with interpreting the survey notes for conditions at the different segments. The start and end times and the start and end latitude and longitude are taken from the trackline data.

Figure 4. Map of the survey area with annotated locations (Created by L. Torres, GEMM Lab)

The sighting data includes the number of whales sighted, Beaufort Sea State, and swell height for the locations where whales were spotted. The environmental data from the sighting data is used as a guide when filling in the rest of the values along the trackline. When data, such as wind speed, swell height, or survey condition, is not explicitly given, matrices developed in collaboration with Dr. Leigh Torres are used to fill in the gaps. These matrices and protocols for filling in the final conditions log are important tools for standardizing the environmental and condition data.

The final product for the survey conditions table is the output of all the code and matrices (Figure 5). The creation of this table will allow for accurate calculation of survey effort on each day, month, and year of the GRANITE project. This effort data is critical to evaluate trends in whale distribution, habitat use, and exposure to disturbances or threats.
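As a rough idea of how a finished table can be rolled up by day, month, and year, here is a short Python sketch. It uses time on effort as a stand-in for survey effort, which is an assumption made purely for illustration; the field names `start` and `end` and the example rows are hypothetical and do not come from the GRANITE log.

```python
from collections import defaultdict
from datetime import datetime


def summarize_effort(segments):
    """Total on-effort minutes per day, per month, and per year."""
    totals = {
        "day": defaultdict(float),
        "month": defaultdict(float),
        "year": defaultdict(float),
    }
    for seg in segments:
        start = datetime.fromisoformat(seg["start"])
        end = datetime.fromisoformat(seg["end"])
        minutes = (end - start).total_seconds() / 60
        totals["day"][start.strftime("%Y-%m-%d")] += minutes
        totals["month"][start.strftime("%Y-%m")] += minutes
        totals["year"][start.strftime("%Y")] += minutes
    return totals


# Hypothetical segment rows with start and end times from the trackline
segments = [
    {"start": "2019-07-22T08:00", "end": "2019-07-22T09:30"},
    {"start": "2019-07-22T10:00", "end": "2019-07-22T10:45"},
    {"start": "2019-08-29T07:15", "end": "2019-08-29T08:15"},
]

totals = summarize_effort(segments)
print(dict(totals["day"]))
print(dict(totals["month"]))
print(dict(totals["year"]))
```

The same grouping works for any per-segment quantity, such as the area surveyed, once it is a column in the table.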

Figure 5. A snippet of the completed 2019 season effort condition log.

The process of completing the table can be a very monotonous task, and there are several chances for the data to get misplaced or missed entirely. Attention to detail is a critical aspect of this project. Standardizing the GRANITE data is essential because it allows for consistency over the years and across platforms. In describing this aspect of my project, I mentioned three different computer programs using the same data. This behind-the-scenes work of creating and maintaining data standardization is critical for all projects, especially long-term research such as the GRANITE project.



Marine Mammal Observing: Standardization is key

By: Alexa Kownacki, Ph.D. Student, OSU Department of Fisheries and Wildlife, Geospatial Ecology of Marine Megafauna Lab

For the past two years, I’ve had the opportunity to be the marine mammal observer aboard the NOAA ship Bell M. Shimada for 10 days in May. Both trips covered transects in the Northern California Current Ecosystem during the same time of year, but things looked very different from my chair on the fly bridge. This trip, in particular, highlighted the importance of standardization, seeing as it was the second replicate of the same area. Other scientists and crew members repeatedly asked me the same questions, which made me realize just how important it is to have standards in scientific practices and to communicate them.

Northern right whale dolphin porpoising out of the water beside the ship while in transit. May 2019. Image source: Alexa Kownacki

The questions:

  1. What do you actually do here and why are you doing it?
  2. Is this year the same as last year in terms of weather, sightings, and transect locations?
  3. Did you expect to see greater or fewer sightings (number and diversity)?
  4. What is this Beaufort Sea State scale that you keep referring to?

All of these are important scientific questions that influence our hypothesis-testing research, survey methods, expected results, and potential conclusions. Although the entire science party aboard the ship conducted marine science, we all had our own specialties and sometimes only knew the basics, if that, about what the other person was doing. It became a perfect opportunity to share our science and standards across similar, but different fields.

Now, to answer those questions:

  1. a) What do you actually do here and b) why are you doing it?

a) As the only marine mammal observer, I stand watch during favorable weather conditions while the ship is in transit, scanning from 0 to 90 degrees off the starboard side (from the front of the ship to a right angle towards the right side when facing forwards). Meanwhile, an iPad application called SeaScribe records the ship’s exact location every 15 seconds, even when no animal is sighted. This process allows for the collection of absence data, that is, data when no animals are present. The SeaScribe program records the survey lines, along with manual inputs that I add, including weather and observer information. When I spot a marine mammal, I immediately mark an exact location on a handheld GPS, use my binoculars to identify the species, and add information to the sighting on the SeaScribe program, such as species, distance to the sighted animal(s), the degree (angle) to the sighting, number of animals in a group, behavior, and direction if traveling.

b) Marine mammal observing serves many different purposes. In this case, observing collects information about what species are where at what time. By piggy-backing on these large-scale, offshore oceanographic NOAA surveys, we have the unique opportunity to survey along standardized transect lines during different times of the year. From replicate survey data, we can start to form an idea of which species use which areas and what oceanographic conditions may impact species distributions. Currently there is not much consistent marine mammal data collected over these offshore areas between Northern California and Washington State, so our work is aiming to fill this knowledge gap.
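To make the idea of absence data from part (a) concrete, here is a small Python sketch. It assumes a position fix every 15 seconds, as described above, and marks each fix interval as presence if a sighting was logged during it and absence otherwise. This is not SeaScribe's export format or API; the function name and the inputs are hypothetical.

```python
FIX_INTERVAL_S = 15  # seconds between position fixes, as described above


def label_fixes(n_fixes, sighting_times_s):
    """Mark each fix interval True (presence) or False (absence).

    sighting_times_s holds sighting times in seconds since the start of effort.
    """
    present = [False] * n_fixes
    for t in sighting_times_s:
        index = int(t // FIX_INTERVAL_S)
        if 0 <= index < n_fixes:
            present[index] = True
    return present


# Hypothetical two minutes of effort with sightings at 20 s and 95 s
flags = label_fixes(n_fixes=8, sighting_times_s=[20, 95])
effort_minutes = len(flags) * FIX_INTERVAL_S / 60

print(flags)
print(f"{effort_minutes} min on effort, {flags.count(False)} absence intervals")
```

The point is that every interval without a sighting is still a data point, which is what makes the regular position logging so valuable.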

Alexa observing on the R/V Shimada in May 2019, all bundled up. Image Source: Alexa Kownacki

  4. What is this Beaufort Sea State scale that you keep referring to?

Great question! It took me a while to realize that this standard tool for estimating wind speeds and sea conditions is not commonly recognized, even among other sea-goers. The Beaufort Sea State, or BSS, is an empirical scale that ranges from 0 to 12, with 0 being no wind and calm seas and 12 being hurricane-force winds with 45+ ft seas. It is frequently referenced by scientists in oceanography, marine science, and climate science as a universally understood metric. The BSS was created in 1805 by Francis Beaufort, a hydrographer in the Royal Navy, to standardize weather conditions across the fleet of vessels. By the mid-1850s, the BSS was standardized for non-naval use on sailing vessels, and in 1916 it was expanded to include information specific to the seas and not the sails [1]. We in the marine mammal observation field constantly collect BSS information while on survey to measure the quality of survey conditions that may impact our observations. BSS data allows us to measure the extent of our survey range, both in the distance at which we are likely to sight animals and in the likelihood of sighting anything at all. Therefore, the BSS gives us an important indication of how much absence data we have collected, in addition to presence data.

A description of the Beaufort Sea State Scale. Image source: National Weather Service.
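For anyone who wants to turn a wind speed into a Beaufort number, here is a minimal Python sketch. The thresholds are the commonly published wind speed ranges in knots for each Beaufort number; the function name is my own, and in the field the BSS is judged from the look of the sea surface as well as the wind, so treat this as an approximation.

```python
from bisect import bisect_right

# Lowest wind speed in knots for Beaufort numbers 1 through 12.
# Anything under 1 knot is Beaufort 0 (calm).
LOWER_BOUNDS_KNOTS = [1, 4, 7, 11, 17, 22, 28, 34, 41, 48, 56, 64]


def beaufort_from_knots(wind_knots):
    """Return the Beaufort number (0 to 12) for a wind speed in knots."""
    if wind_knots < 0:
        raise ValueError("wind speed cannot be negative")
    return bisect_right(LOWER_BOUNDS_KNOTS, wind_knots)


for knots in (0, 5, 17, 21, 22, 64):
    print(knots, "knots is Beaufort", beaufort_from_knots(knots))
```

For example, any wind from 17 to 21 knots comes back as a 5, which is about where sighting conditions start to get difficult.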


  2. Is this year the same as last year in terms of weather, sightings, and transect locations?

The short answer is no. Observed differences in marine mammal sightings in terms of both species diversity and number of animals between years can be normal. There are many potential explanatory variables, including differences in currents, upwelling strength, El Niño index levels, water temperatures, or, as was obvious in this case, sighting conditions. The weather in May 2019 varied greatly from that in May 2018. Last year, I observed nearly every day because the Beaufort Sea State (BSS) was frequently less than a four. However, this year, more often than not, the BSS was greater than or equal to five. A BSS of 5 equates to approximately 17-21 knots of breeze with 6-foot waves, and the water appears to have many “white horses” or pronounced white caps with sea spray. Additionally, mechanical issues with winches delayed and altered our transect locations. Therefore, although multiple transects from May 2018 were also surveyed during May 2019, there were a few lines that do not have data for both cruises.

May 2018 with a BSS 1

May 2019 with a BSS 6






  3. Did you expect to see greater or fewer sightings (number and diversity)?

Knowing that I had less favorable sighting conditions and less observing effort this year, it is not surprising that I observed fewer marine mammals in total count and in species diversity. Even less surprising is that on the day with the best weather, where the BSS was less than a five, I recorded the most sightings with the highest species count. May 2018 felt a bit like a tropical vacation because we had surprisingly sunny days with mild winds, whereas during May 2019 we had some rough seas with gale-force winds. Additionally, as an observer, I need to remove as much bias as possible. So, yes, I had hoped to see beaked whales or orca like I did in May 2018, but I was still pleasantly surprised when I spotted fin whales feeding in May 2019.

| Marine Mammal Species | Number of Sightings, May 2018 | Number of Sightings, May 2019 |
|---|---|---|
| Humpback whale | 31 | 6 |
| Northern right whale dolphin | 1 | 2 |
| Pacific white-sided dolphin | 3 | 6 |
| UNID beaked whale | 1 | 0 |
| Cuvier’s beaked whale | 1 | 0 |
| Gray whale | 4 | 1 |
| Minke whale | 1 | 1 |
| Fin whale | 4 | 1 |
| Blue whale | 1 | 0 |
| Transient killer whale | 1 | 0 |
| Dall’s porpoise | 2 | 0 |
| Northern fur seal | 1 | 0 |
| California sea lion | 0 | 1 |
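The year-to-year comparison can be checked directly from the table above. The short Python sketch below adds up the sightings for each cruise and counts how many categories had at least one sighting. Note that the unidentified (UNID) beaked whale row is counted as its own category here, which is a simplification rather than a true species count.

```python
# Sightings per category as (May 2018, May 2019), copied from the table above
sightings = {
    "Humpback whale": (31, 6),
    "Northern right whale dolphin": (1, 2),
    "Pacific white-sided dolphin": (3, 6),
    "UNID beaked whale": (1, 0),
    "Cuvier's beaked whale": (1, 0),
    "Gray whale": (4, 1),
    "Minke whale": (1, 1),
    "Fin whale": (4, 1),
    "Blue whale": (1, 0),
    "Transient killer whale": (1, 0),
    "Dall's porpoise": (2, 0),
    "Northern fur seal": (1, 0),
    "California sea lion": (0, 1),
}

# Total sightings for each cruise
totals = tuple(sum(counts[i] for counts in sightings.values()) for i in (0, 1))

# Number of categories with at least one sighting on each cruise
richness = tuple(
    sum(1 for counts in sightings.values() if counts[i] > 0) for i in (0, 1)
)

print("Total sightings (2018, 2019):", totals)       # (51, 18)
print("Categories sighted (2018, 2019):", richness)  # (12, 7)
```

These totals are raw counts and are not corrected for the difference in observing effort between the two cruises, which is exactly why standardized effort data matters.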

Pacific white-sided dolphin. Image source: Alexa Kownacki

Standardization is a common theme. Observing between years on standard transects, at set speeds, in different conditions using standardized tools is critical to collecting high quality data that is comparable across different periods. Scientists constantly think about quality control. We look for trends and patterns, similarities and differences, but none of those could be understood without having standard metrics.

The entire science party aboard the R/V Shimada in May 2019, including a marine mammal scientist, phytoplankton scientists, zooplankton scientists, fisheries scientists, and oceanographers. Image Source: Alexa Kownacki

Literature Cited:

[1] Oliver, John E. (2005). Encyclopedia of World Climatology. Springer.