Data Wrangling to Assess Data Availability: A Data Detective at Work

By Alexa Kownacki, Ph.D. Student, OSU Department of Fisheries and Wildlife, Geospatial Ecology of Marine Megafauna Lab

Data wrangling, in my own loose definition, is the necessary combination of both data selection and data collection. Wrangling your data requires accessing then assessing your data. Data collection is just what it sounds like: gathering all data points necessary for your project. Data selection is the process of cleaning and trimming data for final analyses; it is a whole new can of worms that requires decision-making and critical thinking. During this process of data wrangling, I discovered there are two major avenues to obtain data: 1) you collect it, which frequently requires an exorbitant amount of time in the field, in the lab, and/or behind a computer, or 2) other people have already collected it, and through collaboration you put it to good use (often a different use than its initial intent). The latter approach may result in the collection of so much data that you must decide which data should be included to answer your hypotheses. This process of data wrangling is the hurdle I am facing at this moment. I feel like I am a data detective.

Data wrangling illustrated by members of the R-programming community. (Image source: R-bloggers.com)

My project focuses on assessing the health conditions of the two ecotypes of bottlenose dolphins in the waters between Ensenada, Baja California, Mexico and San Francisco, California, USA from 1981 to 2015. During the government shutdown, much of my data was inaccessible, seeing as it was in the possession of my collaborators at federal agencies. However, now that the shutdown is over, my data is flowing in, and my questions are piling up. I can now begin to look at where these animals have been sighted over the past decades, which ecotypes have higher contaminant levels in their blubber, which animals have higher stress levels and whether these are related to geospatial location, where animals are more susceptible to human disturbance, whether sex plays a role in stress or contaminant load levels, which environmental variables influence stress levels and contaminant levels, and more!

Alexa, alongside collaborators, photographing transiting bottlenose dolphins along the coastline near Santa Barbara, CA in 2015 as part of the data collection process. (Image source: Nick Kellar).

Over the last two weeks, I was emailed three separate Excel spreadsheets representing three datasets that contain partially overlapping data. If Microsoft Access is foreign to you, I would compare this dilemma to a very confusing exam question of “matching the word with the definition”, except with the words being in a different language from the definitions. If you have used Microsoft Access databases, you probably know the system of querying and matching data in different databases. Well, imagine trying to do this with Excel spreadsheets because the databases are not linked. Now you can see why I need to take a data management course and start using platforms other than Excel to manage my data.

A visual interpretation of trying to combine datasets being like matching the English definition to the Spanish translation. (Image source: Enchanted Learning)

In the first dataset, there are 6,136 sightings of common bottlenose dolphins (Tursiops truncatus) documented in my study area. Some years have no sightings, some years have fewer than 100 sightings, and other years have over 500 sightings. In another dataset, a genetics database that can provide the sex of the animal, there are 398 bottlenose dolphin biopsy samples collected between 1992 and 2016. The final dataset contains records of 774 bottlenose dolphin biopsy samples collected between 1993 and 2018 that could be tested for hormone and/or contaminant levels. Some of these samples have identification numbers that can be matched to the other dataset. Within these cross-reference matches there are conflicting data in terms of the amount of tissue remaining for analyses. Sorting these conflicts out will involve more digging from my end and additional communication with collaborators: data wrangling at its best. Circling back to what I mentioned in the beginning of this post, this data was collected by other people over decades and the collection methods were not standardized for my project. I benefit from years of data collection by other scientists and I am grateful for all of their hard work. However, now my hard work begins.
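To make the cross-referencing step concrete, here is a minimal sketch in Python of matching two sample tables on a shared identification number and flagging disagreements. The field names (`sample_id`, `sex`, `tissue_mg`) and every value are invented for illustration; they are not taken from my collaborators' spreadsheets.

```python
# Hypothetical records: the field names and values are invented for illustration.
genetics = [
    {"sample_id": "TT-001", "sex": "F", "tissue_mg": 40.0},
    {"sample_id": "TT-002", "sex": "M", "tissue_mg": 25.0},
]
hormones = [
    {"sample_id": "TT-001", "tissue_mg": 10.0},
    {"sample_id": "TT-002", "tissue_mg": 25.0},
    {"sample_id": "TT-003", "tissue_mg": 15.0},
]


def match_by_id(left, right, key="sample_id"):
    """Pair up records that share an ID and list the IDs found on one side only.

    Assumes each ID appears at most once per dataset; a repeated ID would
    silently keep only the last record.
    """
    left_by_id = {row[key]: row for row in left}
    right_by_id = {row[key]: row for row in right}
    shared = sorted(set(left_by_id) & set(right_by_id))
    pairs = [(left_by_id[i], right_by_id[i]) for i in shared]
    only_left = sorted(set(left_by_id) - set(right_by_id))
    only_right = sorted(set(right_by_id) - set(left_by_id))
    return pairs, only_left, only_right


def find_conflicts(pairs, field, key="sample_id"):
    """Return the IDs whose two records disagree on the given field."""
    return [a[key] for a, b in pairs if a[field] != b[field]]


pairs, only_genetics, only_hormones = match_by_id(genetics, hormones)
conflicts = find_conflicts(pairs, "tissue_mg")
print("matched:", len(pairs), "conflicts:", conflicts, "unmatched:", only_hormones)
```

The list of conflicting IDs is the part that still needs a human: each one is a question to send back to a collaborator.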

The cutest part of data wrangling: finding adorable images of bottlenose dolphins, photographed during a coastal survey. (Image source: Alexa Kownacki).

There is also a large amount of data that I downloaded from federally-maintained websites. For example, dolphin sighting data from research cruises are available for public access from the OBIS (Ocean Biogeographic Information System) Sea Map website. It boasts 5,927,551 records from 1,096 data sets containing information on 711 species with the help of 410 collaborators. This website is incredible: it allows you to search through different data criteria, download the data in a variety of formats, and explore an interactive map of the data. You can explore this at your leisure, but I want to point out the sheer amount of data. In my case, the OBIS Sea Map website is only one major platform that contains many sources of data that have already been collected, not specifically for me or my project, but that I will use. As a follow-up to using data collected by other scientists, it is critical to give credit where credit is due. One of the benefits of using this website is that it provides information about how to properly credit the collaborators when downloading data. See below for an example:

Example citation for a dataset (Dataset ID: 1201):

Lockhart, G.G., DiGiovanni Jr., R.A., DePerte, A.M. 2014. Virginia and Maryland Sea Turtle Research and Conservation Initiative Aerial Survey Sightings, May 2011 through July 2013. Downloaded from OBIS-SEAMAP (http://seamap.env.duke.edu/dataset/1201) on xxxx-xx-xx.

Citation for OBIS-SEAMAP:

Halpin, P.N., A.J. Read, E. Fujioka, B.D. Best, B. Donnelly, L.J. Hazen, C. Kot, K. Urian, E. LaBrecque, A. Dimatteo, J. Cleary, C. Good, L.B. Crowder, and K.D. Hyrenbach. 2009. OBIS-SEAMAP: The world data center for marine mammal, sea bird, and sea turtle distributions. Oceanography 22(2):104-115

Another federally-maintained data source that boasts more data than I can quantify is the well-known ERDDAP website. After a few Google searches, I finally discovered that the acronym stands for Environmental Research Division’s Data Access Program. Essentially, this is the holy grail of environmental data for marine scientists. I have downloaded so much data from this website that Excel cannot open the csv files. Here is yet another reason why young scientists, like myself, need to transition out of using Excel and into data management systems that are developed to handle large-scale datasets. The downloads cover everything from daily sea surface temperatures collected at every one-degree line of latitude and longitude from 1981-2015 over my entire study site to Ekman transport levels taken every six hours on every longitudinal degree line over my study area. I will add some environmental variables to species distribution models to see which account for the largest amount of variability in my data. The next step in data selection begins with statistics. It is important to find out whether there are highly correlated environmental factors prior to modeling data. Learn more about fitting cetacean data to models here.
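As a rough illustration of that screening step, the sketch below computes Pearson correlations between every pair of environmental variables and flags the pairs that are strongly correlated. The variable names, the values, and the 0.7 cutoff are all assumptions made for the example (0.7 is a common rule of thumb, not a threshold from my project).

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlated_pairs(variables, threshold=0.7):
    """Return (name_a, name_b, r) for every pair with |r| at or above threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(variables.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Made-up values purely for illustration.
env = {
    "sst_c": [12.0, 13.5, 15.0, 16.5, 18.0],
    "upwelling_index": [80.0, 65.0, 50.0, 35.0, 20.0],
    "salinity_psu": [33.4, 33.6, 33.3, 33.7, 33.5],
}

flagged = correlated_pairs(env)
for name_a, name_b, r in flagged:
    print(f"{name_a} and {name_b} are strongly correlated (r = {r:.2f})")
```

When two variables are flagged, one option is to keep only the one that is easier to interpret biologically before fitting a model.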

The ERDDAP website combined all of the average Sea Surface Temperatures collected daily from 1981-2018 over my study site into a graphical display of monthly composites. (Image Source: ERDDAP)

As you can imagine, this amount of data from many sources and collaborators is equal parts daunting and exhilarating. Before I even begin the process of determining the spatial and temporal spread of dolphin sightings data, I have to identify which data points have sex identified from either hormone levels or genetics, which data points have contaminant levels already quantified, which samples still have tissue available for additional testing, and so on. Once I have cleaned up the datasets, I will import the data into the R programming environment. Then I can visualize my data in plots, charts, and graphs; this will help me identify outliers and potential challenges with my data, and, hopefully, start to see answers to my focal questions. Only then can I dive into the deep and exciting waters of species distribution modeling and more advanced statistical analyses. This is data wrangling and I am the data detective.

What people may think a ‘data detective’ looks like, when, in reality, it is a person sitting at a computer. (Image source: Elder Research)

Like the well-known phrase, “With great power comes great responsibility”, I believe that with great data, comes great responsibility, because data is power. It is up to me as the scientist to decide which data is most powerful at answering my questions.

Data is information. Information is knowledge. Knowledge is power. (Image source: thedatachick.com)

 

Over the Ocean and Under the Bridges: STEM Cruise on the R/V Oceanus

By Alexa Kownacki, Ph.D. Student, OSU Department of Fisheries and Wildlife, Geospatial Ecology of Marine Megafauna Lab

From September 22nd through 30th, the GEMM Lab participated in a STEM research cruise aboard the R/V Oceanus, Oregon State University’s (OSU) largest research vessel, which served as a fully-functioning, floating, research laboratory and field station. The STEM cruise focused on integrating science, technology, engineering and mathematics (STEM) into hands-on teaching experiences alongside professionals in the marine sciences. The official science crew consisted of high school teachers and students, community college students, and Oregon State University graduate students and professors. As with a usual research cruise, there was ample set-up, data collection, data entry, experimentation, successes, and failures. And because everyone in the science party actively participated in the research process, everyone also experienced these successes, failures, and moments of inspiration.

The science party enjoying the sunset from the aft deck with the Astoria-Megler bridge in the background. (Image source: Alexa Kownacki)

Dr. Leigh Torres, Dr. Rachael Orben, and I were all primarily stationed on the flybridge (one deck above the bridge), fully exposed to the elements, at the highest possible location on the ship for best viewing. We scanned the seas in hopes of spotting a blow, a splash, or any sign of a marine mammal or seabird. Beside us, students and teachers donned binoculars and positioned themselves around the mast, with Leigh and me each taking a 90-degree swath from the mast, either to starboard or to port. For those who had not been part of marine mammal observations previously, it was a crash course into the peaks and troughs of both the waves and the sightings. We emphasized the importance of absence data: knowledge of what is not “there” is as important as knowledge of what is. Fortunately, Leigh chose a course that proved to have surprisingly excellent environmental conditions and amazing sightings. Therefore, we collected a large amount of presence data: data collected when marine mammals or seabirds are present.

High school student, Chris Quashnick Holloway, records a seabird sighting for observer, Dr. Rachael Orben. (Image source: Alexa Kownacki).

When someone sighted a whale that surfaced regularly, we assessed the conditions: the sea state, the animal’s behavior, the wind conditions, etc. If we deemed them “good to fly”, our licensed drone pilot and Orange Coast Community College student, Jason, prepared his Phantom 4 drone. While he and Leigh set up drone operations, the other science team members and I maintained a visual on the whale and stayed in constant communication with the bridge via radio. When the drone was ready, and the bridge gave the “all clear”, Jason launched his drone from the aft deck. Then, someone tossed an unassuming, meter-long wood plank overboard, keeping it attached to the ship with a line. This wood board serves as a calibration tool; the drone flies over it at varying heights as determined by its built-in altimeter. Later, we analyze how many pixels one meter occupies at different heights and can thereby determine the body length of the whale from still images by converting pixel length to a metric unit.
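The pixel-to-meter conversion can be sketched in a few lines of Python. This is a simplified illustration, not our actual analysis workflow: the altitudes and pixel counts are invented, and it assumes the whale image was taken at one of the altitudes where the board was measured.

```python
BOARD_LENGTH_M = 1.0  # the calibration plank is one meter long

# Hypothetical calibration: how many pixels the 1 m board spans at each altitude.
board_pixels_by_altitude_m = {20: 150.0, 30: 100.0, 40: 75.0}


def pixels_per_meter(altitude_m):
    """Image scale at a calibrated altitude, in pixels per meter."""
    if altitude_m not in board_pixels_by_altitude_m:
        raise ValueError(f"no calibration flight at {altitude_m} m")
    return board_pixels_by_altitude_m[altitude_m] / BOARD_LENGTH_M


def length_in_meters(length_px, altitude_m):
    """Convert a length measured in pixels to meters using the board calibration."""
    return length_px / pixels_per_meter(altitude_m)


# A whale spanning 1200 pixels in a still taken from 30 m (invented numbers).
whale_length_m = length_in_meters(1200.0, 30)
print(f"Estimated body length: {whale_length_m:.1f} m")
```

The higher the drone flies, the fewer pixels the board occupies, which is why the board has to be measured at several heights.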

High school student, Alishia Keller, uses binoculars to observe a whale, while PhD student, Alexa Kownacki, radios updates on the whale’s location to the bridge and the aft deck. (Image source: Tracy Crews)

Finally, when the drone is calibrated, I radio the most recent location of our animal. For example, “Blow at 9 o’clock, 250 meters away”. Then, the bridge and I constantly adjust the ship’s speed and location. If the whale “flukes” (dives and exposes the ventral side of its tail) and later resurfaces 500 meters away at our 10 o’clock, I might radio to the bridge to “turn 60 degrees to port and increase speed to 5 knots”. (See the Hidden Math Lesson below.) Jason then positions the drone over the whale, adjusting the camera angle as necessary and recording high-quality video footage for later analysis. The aerial viewpoint provides major advantages. Whales usually expose about 10 percent of their body above the water’s surface. However, with an aerial vantage point, we can see more of the whale and its surroundings. From here, we can observe behaviors that are otherwise obscured (Torres et al. 2018), and record footage that helps quantify body condition (i.e. lengths and girths). Before the batteries run low, Jason returns the drone to the aft deck, the vessel comes to an idle, and Leigh catches the drone. Throughout these operations, those of us on the flybridge photograph flukes for identification and document any behaviors we observe. Later, we match the whale we sighted to the whale that the drone flew over, and then to prior sightings of this same individual, adding information like body condition or the presence of a calf. I like to think of it as whale detective work. Moreover, it is a team effort; everyone has a critical role in the mission. When it’s all said and done, this noninvasive approach provides life history context to the health and behaviors of the animal.

Drone pilot, Jason Miranda, flying his drone using his handheld ground station on the aft deck. (Photo source: Tracy Crews)

Hidden Math Lesson: The location of 10 o’clock and 60 degrees to port refer to the exact same direction. The bow of the ship is our 12 o’clock with the stern at our 6 o’clock; you always orient yourself in this manner when giving directions. The same goes for a compass measurement in degrees when relating the direction to the boat: the bow is 360/0 degrees. The angle between two consecutive numbers on a clock is 360 degrees divided by 12 “hour” markers = 30 degrees. Therefore, 10 o’clock, which is 2 “hours” to the left of the bow, is 0 degrees - (2 × 30 degrees) = -60 degrees. A negative angle between 0 and -180 degrees refers to the port side (left).
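For anyone who prefers code to clock faces, the same conversion can be written as a short Python function. The function names are my own; the sign convention (negative for port, positive for starboard) follows the lesson above.

```python
def clock_to_relative_bearing(hour):
    """Convert a clock position (1-12) to a signed angle from the bow in degrees.

    Negative values are to port (left), positive values are to starboard (right).
    """
    if not 1 <= hour <= 12:
        raise ValueError("clock position must be between 1 and 12")
    degrees = (hour % 12) * 30  # each "hour" marker is 360 / 12 = 30 degrees
    return degrees - 360 if degrees > 180 else degrees


def describe(hour):
    """Phrase the clock position the way it might be radioed to the bridge."""
    angle = clock_to_relative_bearing(hour)
    if angle == 0:
        return "dead ahead"
    if angle == 180:
        return "dead astern"
    side = "port" if angle < 0 else "starboard"
    return f"{abs(angle)} degrees to {side}"


print(describe(10))
```

Running `describe(10)` gives "60 degrees to port", matching the radio call in the story.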

Killer whale traveling northbound.

Our trip was chock-full of science and graced with cooperative weather conditions. There were more highlights than I could list in a single sitting. We towed zooplankton nets under the night sky while eating ice cream bars; we sang together at sunset and watched an atmospheric phenomenon, the green flash; we witnessed a humpback lunge-feeding beside the ship’s bow; and we saw a sperm whale traveling across calm seas.

Sperm whale surfacing before a long dive.

On this cruise, our lab focused on the marine mammal observations, which proved excellent. In only four days of surveying, we had 43 marine mammal sightings containing 362 individuals representing 9 species (see Figure 1). As you can see from Figure 2, we traveled over shallow, coastal, and deep waters in both Washington and Oregon before heading inland to Portland, OR. Because we ventured to areas with different bathymetric and oceanographic conditions, we increased our likelihood of seeing a higher diversity of species than we would if we stayed in a single depth or area.

Humpback whale lunge feeding off the bow.

Species                        Number of sightings   Total number of individuals
Humpback whale                 22                    40
Pacific white-sided dolphin    3                     249
Northern right whale dolphin   1                     9
Killer whale                   1                     3
Dall’s porpoise                5                     49
Sperm whale                    1                     1
Gray whale                     1                     1
Harbor seal                    1                     1
California sea lion            8                     9
Total                          43                    362

Figure 1. Summary table of all species sightings during cruise while the science team observed from the flybridge.

Pacific white-sided dolphins swimming towards the vessel.

Figure 2. Map with inset displaying study area and sightings observed by species during the cruise, made in ArcMap. (Image source: Alexa Kownacki).

Even after two days of STEM outreach events in Portland, we were excited to incorporate more science. For the transit from Portland, OR to Newport, OR, the entire science team consisted of two people: me and Jason. But even with poor weather conditions, we still used science to answer questions and help us along our journey, only with different goals than on our main leg. With the help of the marine technician, we set up a camera on the bow of the ship, facing aft to watch the vessel maneuver through the famous Portland bridges.

Video 1. Time-lapse footage of the R/V Oceanus maneuvering the Portland Bridges from a GoPro. Compiled by Alexa Kownacki, assisted by Jason Miranda and Kristin Beem.

Prior to crossing the Columbia River bar and re-entering the Pacific Ocean, the R/V Oceanus maneuvered up the picturesque Columbia River. We used our geospatial skills to locate our fellow science team member and high school student, Chris, who was located on land. We tracked each other using GPS technology in our cell phones, until the ship got close enough to use natural landmarks as reference points, and finally we could use our binoculars to see Chris shining a light from shore. As the ship powered forward and passed under the famous Astoria-Megler bridge that connects Oregon to Washington, Chris drove over it; he directed us “100 degrees to port”. And, thanks to clear directions, bright visual aids, and spatiotemporal analysis, we managed to find our team member waving from shore. This is only one of many examples that show how in a few days at sea, students utilized new skills, such as marine mammal observational techniques, and honed them for additional applications.

On the bow, Alexa and Jason use binoculars to find Chris, over 4 miles away, on the Washington side of the Columbia River. (Image source: Kristin Beem)

Great science is the result of teamwork, passion, and ingenuity. Working alongside students, teachers, and other, more-experienced scientists, provided everyone with opportunities to learn from each other. We created great science because we asked questions, we passed on our knowledge to the next person, and we did so with enthusiasm.

High school students, Jason and Chris, alongside Dr. Leigh Torres, all try to get a glimpse at the zooplankton under Dr. Kim Bernard’s microscope. (Image source: Tracy Crews).

Check out other blog posts written by the science team about the trip here.

Big Data: Big possibilities with bigger challenges

By Alexa Kownacki, Ph.D. Student, OSU Department of Fisheries and Wildlife, Geospatial Ecology of Marine Megafauna Lab

Did you know that Excel has a maximum number of rows? I do. During Winter Term for my GIS project, I was using Excel to merge oceanographic data, from a publicly-available data source website, and Excel continuously quit. Naturally, I assumed I had caused some sort of computer error. [As an aside, I’ve concluded that most problems related to technology are human error-based.] Therefore, I tried reformatting the data, restarting my computer, the program, etc. Nothing. Then, thanks to the magic of Google, I discovered that Excel allows no more than 1,048,576 rows by 16,384 columns. ONLY 1.05 million rows?! The oceanography data was more than 3 million rows—and that’s with me eliminating data points. This is what happens when we’re dealing with big data.
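In hindsight, a quick check before opening the file would have saved me a few restarts. Below is a minimal Python sketch that counts the rows of a CSV one line at a time (so the whole file never has to fit in a spreadsheet) and compares the count to Excel's row limit. The column names and values in the tiny sample are invented; a real download would be opened with `open(path, newline="")` instead of the in-memory stand-in used here.

```python
import csv
import io

EXCEL_MAX_ROWS = 1_048_576  # Excel worksheet row limit, header row included


def count_rows(handle):
    """Count CSV rows by streaming through the file one row at a time."""
    return sum(1 for _ in csv.reader(handle))


def fits_in_excel(n_rows):
    """True if a file with this many rows can be opened in a single worksheet."""
    return n_rows <= EXCEL_MAX_ROWS


# In-memory stand-in for a downloaded file (invented columns and values).
sample = io.StringIO(
    "time,latitude,longitude,sst\n"
    "1981-09-01,32.5,-117.5,19.8\n"
    "1981-09-02,32.5,-117.5,19.6\n"
)

n_rows = count_rows(sample)
print(n_rows, "rows; fits in Excel:", fits_in_excel(n_rows))
print("3 million rows fit in Excel:", fits_in_excel(3_000_000))
```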

According to the Merriam-Webster dictionary, big data is an accumulation of data that is too large and complex for processing by traditional database management tools (www.merriam-webster.com). However, there are articles, like this one from Forbes, that discuss the ongoing debate of how to define “big data”. According to the article, there are 12 major definitions; so, I’ll let you decide what you qualify as “big data”. Either way, I think that when Excel reaches its maximum row capacity, I’m working with big data.

Collecting oceanography data aboard the R/V Shimada. Photo source: Alexa K.

Here’s the thing: the oceanography data that I referred to was just a snippet of my data. Technically, it’s not even MY data; it’s data I accessed from NOAA’s ERDDAP website that had been consistently observed for the time frame of my dolphin data points. You may recall my blog about maps and geospatial analysis that highlights some of the reasons these variables, such as temperature and salinity, are important. However, what I didn’t previously mention was that I spent weeks working on editing this NOAA data. My project on common bottlenose dolphins overlays environmental variables to better understand dolphin population health off of California. These variables should have similar spatiotemporal attributes as the dolphin data I’m working with, which has a time series beginning in the 1980s. Without taking out a calculator, I still know that equates to a lot of data. Great data: data that will let me answer interesting, pertinent questions. But, big data nonetheless.

This is a screenshot of what the oceanography data looked like when I downloaded it to Excel. This format repeats for nearly 3 million rows.

Excel Screen Shot. Image source: Alexa K.

I showed this Excel spreadsheet to my GIS professor, and his response was something akin to “holy smokes”, with a few more expletives and a look of horror. It was not the sheer number of rows that shocked him; it was the data format. Nowadays, nearly everyone works with big data. It’s par for the course. However, the way data are formatted is the major split between what I’ll call “easy” data and “hard” data. The oceanography data could have been “easy” data. It could have had many variables listed in columns. Instead, this data alternated between rows with variable headings and columns with variable headings, for millions of cells. And, as described earlier, this is only one example of big data and its challenges.

Data does not always come in a form with text and numbers; sometimes it appears as media such as photographs, videos, and audio files. Big data just got a whole lot bigger. While I was working as a scientist at NOAA’s Southwest Fisheries Science Center, one project brought in over 80 terabytes of raw data per year. The project centered on the eastern North Pacific gray whale population, and, more specifically, its migration. Scientists have observed the gray whale migration annually since 1994 from Piedras Blancas Light Station for the northbound migration, and 2 out of every 5 years from Granite Canyon Field Station (GCFS) for the southbound migration. One of my roles was to ground-truth software that would help transition from humans as observers to computers as observers. One avenue we assessed was to compare how well a computer “counted” whales compared to people. For this question, three infrared cameras at the GCFS recorded during the same time span that human observers were counting the migratory whales. Next, scientists, such as myself, would transfer those video files, upwards of 80 TB, from the hard drives to Synology boxes and to a different facility miles away. Synology boxes store arrays of hard drives that can be accessed remotely. To review: three locations, each with 80 TB of the same raw data. Once the data was saved in triplicate, I could run a computer program to detect whales. In summary, three months of recorded infrared video files require upwards of 240 TB of storage before processing. This is big data.

Scientists on an observation shift at Granite Canyon Field Station in Northern California. Photo source: Alexa K.
Alexa and another NOAA scientist watching for gray whales at Piedras Blancas Light Station. Photo source: Alexa K.

In the GEMM Laboratory, we have so many sources of data that I did not bother trying to count. I’m entering my second year of the Ph.D. program and I already have a hard drive of data that I’ve backed up in three different locations. It’s no longer a matter of “if” you work with big data, it’s “how”. How will you format the data? How will you store the data? How will you maintain back-ups of the data? How will you share this data with collaborators/funders/the public?

The wonderful aspect to big data is in the name: big and data. The scientific community can answer more, in-depth, challenging questions because of access to data and more of it. Data is often the limiting factor in what researchers can do because increased sample size allows more questions to be asked and greater confidence in results. That, and funding of course. It’s the reason why when you see GEMM Lab members in the field, we’re not only using drones to capture aerial images of whales, we’re taking fecal, biopsy, and phytoplankton samples. We’re recording the location, temperature, water conditions, wind conditions, cloud cover, date/time, water depth, and so much more. Because all of this data will help us and help other scientists answer critical questions. Thus, to my fellow scientists, I feel your pain and I applaud you, because I too know that the challenges that come with big data are worth it. And, to the non-scientists out there, hopefully this gives you some insight as to why we scientists ask for external hard drives as gifts.

Leila launching the drone to collect aerial images of gray whales to measure body condition. Photo source: Alexa K.
Using the theodolite to collect tracking data on the Pacific Coast Feeding Group in Port Orford, OR. Photo source: Alexa K.

References:

https://support.office.com/en-us/article/excel-specifications-and-limits-1672b34d-7043-467e-8e27-269d656771c3

https://www.merriam-webster.com/dictionary/big%20data

The Recipe for a “Perfect” Marine Mammal and Seabird Cruise

By Alexa Kownacki, Ph.D. Student, OSU Department of Fisheries and Wildlife, Geospatial Ecology of Marine Megafauna Lab

Science—and fieldwork in particular—is known for its failures. There are websites, blogs, and Twitter pages dedicated to them. This is why, when things go according to plan, I rejoice. When they go even better than expected, I practically tear up from amazement. There is no perfect recipe for a great marine mammal and seabird research cruise, but I would suggest that one would look like this:

 A Great Marine Mammal and Seabird Research Cruise Recipe:

  • A heavy pour of fantastic weather
    • Light on the wind and seas
    • Light on the glare
  • Equal parts amazing crew and good communication
  • A splash of positivity
  • A dash of luck
  • A pinch of delicious food
  • Heaps of marine mammal and seabird sightings
  • Heat to approximately 55-80 degrees F and transit for 10 days along transects at 10-12 knots

The end of another beautiful day at sea on the R/V Shimada. Image source: Alexa K.

The Northern California Current Ecosystem (NCCE) is a highly productive area that is home to a wide variety of cetacean species. Many cetaceans are indicator species of ecosystem health as they consume large quantities of prey from different levels in trophic webs and inhabit diverse areas, from deep-diving beaked whales to gray whales traveling thousands of miles along the eastern North Pacific Ocean. Because cetacean surveys are a predominant survey method in large bodies of water, they can be extremely costly. One alternative to dedicated cetacean surveys is using other research vessels as research platforms, in which case effort becomes transect-based and opportunistic, with less flexibility to deviate from predetermined transects. This decreases expenses, creates collaborative research opportunities, and reduces interference in animal behavior as the animals are never pursued. Observing animals from large, motorized research vessels (>100 ft) at a steady, significant speed (>10 kts) provides a baseline for future, joint research efforts. The NCCE is regularly surveyed by government agencies and institutions on transects that have been repeated nearly every season for decades. This historical data provides critical context for environmental and oceanographic dynamics that impact large ecosystems with commercial and recreational implications.

My research cruise took place aboard the 208.5-foot R/V Bell M. Shimada in the first two weeks of May. The cruise was designated for monitoring the NCCE with the additional position of a marine mammal observer. The established guidelines did not allow for deviation from the predetermined transects. Therefore, mammals were surveyed along preset transects. The ship left port in San Francisco, CA and traveled as far north as Cape Meares, OR. The transects ranged from one nautical mile from shore to two hundred miles offshore. Observations occurred during “on effort”, which was defined as when the ship was in transit and moving at a speed above 8 knots, dependent upon sea state and visibility. All observations took place on the flybridge during conducive weather conditions and in the bridge (one deck below the flybridge) when excessive precipitation was present. The starboard forward quarter, from zero to ninety degrees relative to the ship’s direction (with the bow at zero degrees), was surveyed. Both naked eye and 7×50 binoculars were used, with binoculars in use at least 30 percent of the time. To decrease observer fatigue, which could result in fewer detected sightings, the observer (me) rotated on a cycle of 40 minutes “on effort” and 20 minutes “off effort” during long transits (>90 minutes).

Alexa on-effort using binoculars to estimate the distance and bearing of a marine mammal sighted off the starboard bow. Image source: Alexa K.

Data was collected using modifications to the SEEbird Wincruz computer program on a ruggedized laptop with a GPS unit attached. At the beginning of each day and upon changes in conditions, the ship’s heading, weather conditions, visibility, cloud cover, swell height, swell direction, and Beaufort sea state (BSS) were recorded. Once the BSS or visibility was worse than a “5” (1 is “perfect” and 5 is “very poor”), observations ceased until there was improvement in weather. When a marine mammal was sighted, the latitude and longitude were recorded with the exact time stamp. Then, I noted how the animal was sighted (either with binoculars or naked eye) and what cue was originally noticed (blow, splash, bird, etc.). The bearing and distance were noted using binoculars. The animal was assigned one of three generalized behavior categories: traveling, feeding, or milling. A sighting was defined as any single marine mammal or group of animals. Therefore, a single sighting would have the species and the best, high, and low estimates for group size.
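To show what one of these records might look like outside of the survey software, here is a hypothetical Python sketch of a single sighting. This is not the SEEbird Wincruz format; the field names, units, and example values are my own assumptions, chosen to mirror the description above.

```python
from dataclasses import dataclass
from datetime import datetime

BEHAVIORS = {"traveling", "feeding", "milling"}
METHODS = {"binoculars", "naked eye"}


@dataclass
class Sighting:
    """One marine mammal sighting (hypothetical layout, not the Wincruz format)."""

    timestamp: datetime
    latitude: float
    longitude: float
    species: str
    method: str         # how the animal was sighted
    cue: str            # what was noticed first: blow, splash, bird, etc.
    bearing_deg: float  # relative to the bow, 0 to 90 for the starboard forward quarter
    distance_m: float
    behavior: str
    best: int           # best estimate of group size
    low: int
    high: int

    def __post_init__(self):
        # Basic sanity checks that catch common data-entry slips.
        if self.method not in METHODS:
            raise ValueError(f"unknown sighting method: {self.method}")
        if self.behavior not in BEHAVIORS:
            raise ValueError(f"unknown behavior: {self.behavior}")
        if not self.low <= self.best <= self.high:
            raise ValueError("group size estimates must satisfy low <= best <= high")


# Invented example values.
example = Sighting(
    timestamp=datetime(2018, 5, 10, 14, 32, 5),
    latitude=44.6,
    longitude=-124.5,
    species="humpback whale",
    method="naked eye",
    cue="blow",
    bearing_deg=35.0,
    distance_m=800.0,
    behavior="traveling",
    best=2,
    low=2,
    high=3,
)
print(example.species, example.best)
```

Validating each record as it is created is one way to catch a mistyped behavior or a swapped low and high estimate while the sighting is still fresh in memory.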

By my definitions, I had the research cruise of my dreams. There were moments when I imagined people joining this trip as a vacation. I *almost* felt guilty. Then, I remembered that after watching water for almost 14 hours (thanks to the amazing weather conditions), I worked on data and reports and class work until midnight. That’s the part that no one talks about: the data. Fieldwork is about collecting data. It’s both what I live for and what makes me nervous. The amount of time, effort, and money that is poured into fieldwork is enormous. The acquisition of the data is not as simple as it seems. When I briefly described my position on this research cruise to friends, they interpreted it to be something akin to whale-watching. To some extent, this is true. But largely, it’s grueling hours that leave you fatigued. The differences between fieldwork and what I’ll refer to as “everything else” AKA data analysis, proposal writing, manuscript writing, literature reviewing, lab work, and classwork, are the unbroken smile, the vaguely tanned skin, the hours of laughter, the sea spray, and the magical moments that reassure me that I’ve chosen the correct career path.

Alexa photographing a gray whale at sunset near Newport, OR. Image source: Alexa K.

This cruise was the second leg of the Northern California Current Ecosystem (NCCE) survey, and I was the sole Marine Mammal and Seabird Observer, a coveted position. Every morning, I would wake up at 0530hrs, grab some breakfast, and climb to the highest deck: the fly-bridge. Akin to being on the top of the world, the fly-bridge has the best views for the widest span. From 0600hrs to 2000hrs I sat, stood, or danced in a one-meter by one-meter corner of the fly-bridge and surveyed. This visual is why people think I’m whale watching. In reality, I am constantly busy. Nonetheless, I had weather and seas that scientists dream about, and for 10 days! To contrast my luck, you can read Florence’s blog about her cruise. On these same transects, in February, Florence experienced 20-foot seas and heavy rain with very few marine mammal sightings; of those, the only cetaceans she observed were gray whales close to shore. That starkly contrasts with my 10 cetacean species with upwards of 45 sightings and my 20-minute hammock power naps on the fly-bridge under the warm sun.

Pacific white-sided dolphins traveling nearby. Image source: Alexa K.

Marine mammal sightings from this cruise included 10 cetacean species (Pacific white-sided dolphin, Dall’s porpoise, unidentified beaked whale, Cuvier’s beaked whale, gray whale, minke whale, fin whale, northern right whale dolphin, blue whale, humpback whale, and transient killer whale) and one pinniped species: northern fur seal. What better way to illustrate these sightings than with a map? We are a geospatial lab after all.

Cetacean Sightings on the NCCE Cruise in May 2018. Image source: Alexa K.

This map is the result of data collection. However, it does not capture everything that was observed: sea state, weather, ocean conditions, bathymetry, nutrient levels, etc. There are many variables that can be added to maps–like this one (thanks to my GIS classes I can start adding layers!)–that can provide a better understanding of the ecosystem, predator-prey dynamics, animal behavior, and population health.

The catch from a bottom trawl at a station with some fish and a lot of pyrosomes (pink tube-like creatures). Image source: Alexa K.

Being a Ph.D. student can be physically and mentally demanding. So, when I was offered the opportunity to hone my data collection skills, I leapt at it. I’m happiest in the field: the wind in my face, the sunshine on my back, surrounded by cetaceans, and filled with the knowledge that I’m following my passion and that this data is contributing to the greater scientific community.

Humpback whale photographed traveling southbound. Image source: Alexa K.

The Land of Maps and Charts: Geospatial Ecology

By Alexa Kownacki, Ph.D. Student, OSU Department of Fisheries and Wildlife, Geospatial Ecology of Marine Megafauna Lab

I love maps. I love charts. As a random bit of trivia, there is a difference between a map and a chart. A map is a visual representation of land that may include details like topography, whereas a chart refers to nautical information such as water depth, shoreline, tides, and obstructions.

Map of San Diego, CA, USA. (Source: San Diego Metropolitan Transit System)
Chart of San Diego, CA, USA. (Source: NOAA)

I have an intense affinity for visually displaying information. When I was a child, my dad traveled constantly, from Barrow, Alaska to Istanbul, Turkey. Immediately upon his return, I would grab our standing globe from the dining room and our stack of atlases from the coffee table. I would sit at the kitchen table, enthralled by the stories of his travels. Yet, a story was only great when I could picture it for myself. (I should remind you, this was the early 1990s; Google Maps wasn’t a thing.) Our kitchen table transformed into a scene from Master and Commander, except that instead of nautical charts and compasses, we had an atlas the size of an overgrown toddler and salt and pepper shakers to pinpoint locations. I now had the world at my fingertips. My dad would show me the paths he took from our home to his various destinations and tell me about the topography, the demographics, the population, the terrain type: all attribute features that could be included in modern-day geographic information systems (GIS).

Uncle Brian showing Alexa where they were on a map of Maui, Hawaii, USA. (Photo: Susan K. circa 1995)

As I got older, the kitchen table slowly began to resemble what I imagine the set from Master and Commander actually looked like: nautical charts, tide tables, and wind predictions were piled high, and the salt and pepper shakers were replaced with pencil marks indicating potential routes for us to travel via sailboat. The two of us were in our element, surrounded by visual and graphical representations of geographic and spatial information: maps. To put my map attraction in even more context, this is a scientist who grew up playing “Take-Off”, a board game that was “designed to teach geography” and involved flying your fleet of planes across a Mercator projection-style mapboard. Now, it’s no wonder that I’m a graduate student in a lab that focuses on the geospatial aspects of ecology.

A precocious 3-year-old Alexa, sitting with the airplane pilot asking him a long list of travel-related questions (and taking his captain’s hat). Photo: Susan K.

So why and how did geospatial ecology become a field, and a predominant one at that? It wasn’t that one day a lightbulb went off and a statistician decided to draw out the results. It was a progression, built upon for thousands of years. There are maps dating back to 2300 B.C. on Babylonian clay tablets (The British Museum), and yet, some of the maps we make today require highly sophisticated technology. Geospatial analysis is dynamic. It’s evolving. Today I’m using ArcGIS software to interpolate mass amounts of publicly-available sea surface temperature satellite data from 1981 to 2015, which I will overlay with a layer of bottlenose dolphin sightings during the same time period for comparison. Tomorrow, there might be a new version of software that allows me to animate these data. Heck, it might already exist and I’m not aware of it. This growth is the beauty of this field. Geospatial ecology is made for us cartophiles (map-lovers) who study the interdependency of biological systems where location and distance between things matter.
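To make the interpolate-then-overlay idea concrete, here is a minimal sketch in Python. It is not the ArcGIS workflow itself: it swaps in simple inverse-distance weighting (one common interpolation technique), and the function names and all temperature values are made up for illustration.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in decimal degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def idw_sst(lat, lon, samples, power=2.0):
    """Estimate sea surface temperature at (lat, lon) by inverse-distance weighting.

    samples is a list of (lat, lon, sst) tuples; closer samples get larger weights.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for s_lat, s_lon, sst in samples:
        d = haversine_km(lat, lon, s_lat, s_lon)
        if d == 0.0:
            return sst  # the query point sits exactly on a sample
        w = 1.0 / d ** power
        weighted_sum += w * sst
        weight_total += w
    if weight_total == 0.0:
        raise ValueError("at least one sample is required")
    return weighted_sum / weight_total


# Made-up temperature samples (degrees C) and one made-up sighting location.
sst_samples = [(32.0, -117.5, 16.0), (32.0, -117.0, 18.0), (33.0, -117.5, 15.0)]
sighting = (32.5, -117.3)
sst_at_sighting = idw_sst(sighting[0], sighting[1], sst_samples)
```

Running `idw_sst` for every sighting attaches a temperature estimate to each point, which is the tabular equivalent of laying the sightings layer over the interpolated temperature layer.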

Alexa’s grandmother showing Alexa (a very young cartographer) how to color in the lines. Source: Susan K. circa 1994

In a broader context, geospatial ecology communicates our science to all of you. If I posted a bunch of statistical outputs in text or even table form, your eyes might glaze over…and so might mine. But, if I displayed that same underlying data and results on a beautiful map with color-coded symbology, a legend, a compass rose, and a scale bar, you might have this great “ah-ha!” moment. That is my goal. That is what geospatial ecology is to me. It’s a way to SHOW my science, rather than TELL it.

Would you like to see this over and over again…?

A VERY small glimpse into the enormous amount of data that went into this map. This screenshot gave me one point of temperature data for a single location for a single day…Source: Alexa K.

Or see this once…?

Map made in ArcGIS of coastal common bottlenose dolphin sightings from 1981 to 1989 with a layer of average sea surface temperatures interpolated across those same years. A picture really is worth a thousand words…or at least a thousand data points…Source: Alexa K.

For many, maps are visually easy to interpret, allowing quick message communication. Yet, there are many different learning styles. From my personal story, I think it’s relatively obvious that I’m, at least partially, a visual learner. When I was in primary school, I would read the directions thoroughly, but only truly absorb the material once the teacher showed me an example. Set up an experiment? Sure, I’ll read the lab report, but I’m going to refer to the diagrams of the set-up constantly. To this day, I always ask for an example. Teach me a new game? Let’s play the first round and then I’ll pick it up. It’s how I learned to sail. My dad described every part of the sailboat in detail and all I heard were words. Then, my dad showed me how to sail, and it came naturally. It’s only as an adult that I know what “that blue line thingy” is called. Geospatial ecology is how I SEE my research. It makes sense to me. And, hopefully, it makes sense to some of you!

Alexa’s dad teaching her how to sail. (Source: Susan K. circa 2000)
Alexa’s first solo sailboat race in Coronado, San Diego, CA. Notice: Alexa’s dad pushing the bow off the dock and the look on Alexa’s face. (Source: Susan K. circa 2000)
Alexa mapping data using ArcGIS in the Oregon State University Library. (Source: Alexa K., circa a few minutes prior to posting)

I strongly believe a meaningful career allows you to highlight your passions and personal strengths. For me, that means photography, all things nautical, the great outdoors, wildlife conservation, and maps/charts. If I converted that into an equation, I think this is a likely result:

Photography + Nautical + Outdoors + Wildlife Conservation + Maps/Charts = Geospatial Ecology of Marine Megafauna

Or, better yet:

📸 + ⚓ + 🏞 + 🐋 + 🗺 =  GEMM Lab

This lab was my solution all along. As part of my research on common bottlenose dolphins, I work on a small inflatable boat off the coast of California (nautical ✅, outdoors ✅), photograph their dorsal fins (photography ✅), and communicate my data using informative maps that will hopefully bring positive change to the marine environment (maps/charts ✅, wildlife conservation ✅). Geospatial ecology allows me to participate in research that I deeply enjoy and that, hopefully, will make the world a little bit of a better place. Oh, and make maps.

Alexa in the field, putting all those years of sailing and chart-reading to use! (Source: Leila L.)