From Bytes to Behaviors: How AI is Used to Study Whales

By Natalie Chazal, PhD student, OSU Department of Fisheries, Wildlife, & Conservation Sciences, Geospatial Ecology of Marine Megafauna Lab

In today’s media, artificial intelligence, or AI, has captured headlines that can stir up strong emotions and opinions. From promises of seemingly impossible breakthroughs to warnings of job displacement and ethical dilemmas, there is a lot of discourse surrounding AI. 

But what actually is artificial intelligence? The term artificial intelligence (or AI) was defined as “the science and engineering of making intelligent machines” and can generally describe a suite of methods used to simulate human information processing. 

AI actually began in the 1950s with puzzle-solving robots and networks that identified shapes. But because the computational power required to run these complex networks was too high and funding was cut, there was an “AI winter” in the following decades. In the 1990s, renewed interest in AI, advances in machine learning algorithms, and improved computational power led to a boom in progress. The 2010s saw a resurgence of deep learning (a subfield of AI), driven by the availability of large datasets and improvements in optimization algorithms. Currently, AI is being used in extremely diverse ways because of its ability to handle large quantities of unstructured data.

Figure 1. An intuitive visualization of the nested relationship between AI, machine learning, and deep learning as subdomains (Rubbens et al. 2023)

To place AI in a better context, we should clarify some of the buzzwords I’ve mentioned: artificial intelligence (AI), machine learning, and deep learning. There are a few schools of thought, but one that is generally accepted is that AI is a broad category of methods and techniques for building systems that mimic human intelligence. Machine learning falls under this AI category, but rather than using explicitly programmed rules to make decisions, we “train” these systems so that they essentially learn from the data that we provide. Lastly, deep learning falls under machine learning because it uses the same principle of “learning” from data, specifically to build neural networks.
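
To make the difference between explicitly programmed rules and “training” concrete, here is a minimal Python sketch. Everything in it is invented for illustration: the call durations, the labels, the hand-picked 2.0 second cutoff, and the function names are all assumptions, not data or code from any study mentioned in this post. The first function is a rule a person wrote down; the second “learns” its cutoff from labeled examples.

```python
from typing import List, Tuple

def rule_based(duration_s: float) -> bool:
    """Explicitly programmed rule: a person chose the 2.0 s cutoff by hand."""
    return duration_s > 2.0

def train_threshold(examples: List[Tuple[float, bool]]) -> float:
    """'Train' a one-parameter model: pick the cutoff that makes the
    fewest mistakes on the labeled examples."""
    values = sorted(x for x, _ in examples)
    # Candidate cutoffs are the midpoints between neighbouring values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(cutoff: float) -> int:
        return sum((x > cutoff) != label for x, label in examples)

    return min(candidates, key=errors)

# Hypothetical labeled data: (sound duration in seconds, is it a whale call?)
labeled = [(0.4, False), (0.9, False), (1.1, False),
           (1.6, True), (2.3, True), (3.0, True)]

learned_cutoff = train_threshold(labeled)

def learned_rule(duration_s: float) -> bool:
    """Same form of rule, but the cutoff came from the data."""
    return duration_s > learned_cutoff

print("learned cutoff:", learned_cutoff)
print("hand rule on 1.6 s:", rule_based(1.6), "| learned rule:", learned_rule(1.6))
```

With these made-up examples, the hand-written rule misses the 1.6 second call, while the learned cutoff lands between the longest non-call and the shortest call. Real machine learning models have many more parameters, but the idea of fitting them to labeled data is the same.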

While AI is generally rooted in computer science, statistics provides the foundation for AI techniques. In particular, statistical learning is a combined field that adopts machine learning methods for more statistics-based settings. Trevor Hastie, a leader in statistical learning, defines the field as “a set of tools for modeling and understanding complex datasets” (Hastie et al. 2009); it is used to explore patterns in data, but within a statistical framework.

Continuously improving methods like statistical learning and AI provide us with very powerful tools to collect data, automate processing, handle large datasets, and understand complex processes. 

How do marine mammal ecologists employ AI?

Even on small scales, marine mammal research often involves vast amounts of data collected from tons of different sources, including drone and satellite imagery, acoustic recordings, boat surveys, buoys, and many more. New deep learning tools, such as neural networks, can perform tasks with remarkable precision and speed that we traditionally had to do painstakingly by hand. For example, researchers spend hours poring over thousands of drone images and videos to understand the behavior and health of whales. In the GEMM Lab, postdoc KC Bierlich is leading the development of AI models to automatically measure important whale metrics from the images. These advancements streamline the process of understanding whale ecology and make it easier to identify stressors that may be affecting these animals.

For photographic analyses, we can leverage Convolutional Neural Networks for tasks like feature extraction, where we can automatically get morphological measurements like body length and body area indices from drone imagery to understand the health of whales. This can provide valuable insight into the stressors placed on these animals. 

We can also identify whale species from boat and aerial imagery (Patton et al. 2023). Projects like Flukebook and Happywhale have even been able to identify individual humpback whales with techniques like this one. 

Figure 2. Flukebook neural networks can use the edges of flukes to identify individuals by mapping marks to a library of known individuals (Flukebook)

AI also excels at prediction, especially with non-linear responses. Ecology is filled with thresholds, stepwise changes, and chaos that may not be captured by linear models. But being able to predict these responses is particularly important when we want to look at how whale populations respond to different facets of their environment. Ensemble machine learning algorithms like Random Forests or Gradient Boosting Machines are commonly used to model species-habitat relationships and can predict how whale distributions will change in response to changes in things like sea surface temperature or ocean currents (Viquerat et al. 2022).
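
To illustrate why a threshold response trips up a linear model, here is a small, hand-rolled Python sketch of gradient boosting with one-split trees (“stumps”), compared against a straight-line fit. This is a teaching toy under stated assumptions, not the method used in Viquerat et al. (2022) and not a Random Forest: the temperatures, the 14 degree step, the presence values, and all function names are made up, and real analyses would normally rely on an established library rather than code like this.

```python
from typing import Callable, List, Tuple

def fit_line(xs: List[float], ys: List[float]) -> Callable[[float], float]:
    """Ordinary least-squares straight line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def fit_stump(xs: List[float], residuals: List[float]) -> Tuple[float, float, float]:
    """Find the single split that best explains the residuals (squared error)."""
    best = None
    cut_points = sorted(set(xs))
    for a, b in zip(cut_points, cut_points[1:]):
        split = (a + b) / 2
        left = [r for x, r in zip(xs, residuals) if x <= split]
        right = [r for x, r in zip(xs, residuals) if x > split]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, split, left_mean, right_mean)
    return best[1], best[2], best[3]

def fit_boosted_stumps(xs: List[float], ys: List[float],
                       n_rounds: int = 50,
                       learning_rate: float = 0.3) -> Callable[[float], float]:
    """Gradient boosting for squared error: each stump fits what is left over."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        split, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((split, left_mean, right_mean))
        preds = [p + learning_rate * (left_mean if x <= split else right_mean)
                 for x, p in zip(xs, preds)]

    def predict(x: float) -> float:
        return base + learning_rate * sum(
            lm if x <= s else rm for s, lm, rm in stumps)
    return predict

def mse(model: Callable[[float], float], xs: List[float], ys: List[float]) -> float:
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical data: presence jumps once sea surface temperature reaches 14 degrees.
sst = [10 + 0.5 * i for i in range(17)]            # 10.0, 10.5, ..., 18.0
presence = [0.1 if t < 14.0 else 0.9 for t in sst]

linear = fit_line(sst, presence)
boosted = fit_boosted_stumps(sst, presence)
linear_mse = mse(linear, sst, presence)
boosted_mse = mse(boosted, sst, presence)
print("linear MSE:", round(linear_mse, 4), "| boosted MSE:", boosted_mse)
```

The straight line has to smear the jump across the whole temperature range, while the boosted stumps can place a split right at the step. This toy has no noise and no held-out data, so it says nothing about overfitting, which matters a great deal in real species-habitat models.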

Even spatial data, which can be tricky to work with analytically, can be used in a machine learning framework. Data from satellite and acoustic tags can be analyzed with hidden Markov models and Gaussian mixture models. The results could potentially reveal diving behaviors, habitat preferences, and migration corridors, and aid in marine spatial planning (Quick et al. 2017; Lennox et al. 2019).
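
As a rough sketch of how a hidden Markov model turns tag data into behaviors, the Python below decodes the most likely sequence of hidden states from a series of observed depth classes using the Viterbi algorithm. The two states, the three depth classes, and every probability are assumptions invented for this example; they are not taken from Quick et al. (2017) or any other study, where such parameters would be estimated from the data.

```python
import math
from typing import Dict, List

# Hypothetical two-state model; all numbers below are made up.
STATES = ["resting", "foraging"]
START = {"resting": 0.6, "foraging": 0.4}
TRANS = {  # probability of moving from one state to the next
    "resting": {"resting": 0.8, "foraging": 0.2},
    "foraging": {"resting": 0.3, "foraging": 0.7},
}
EMIT = {  # probability of each observed depth class given the state
    "resting": {"shallow": 0.7, "mid": 0.25, "deep": 0.05},
    "foraging": {"shallow": 0.1, "mid": 0.3, "deep": 0.6},
}

def viterbi(observations: List[str]) -> List[str]:
    """Most likely hidden state sequence, computed with log probabilities."""
    score: Dict[str, float] = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
        for s in STATES
    }
    back_pointers: List[Dict[str, str]] = []
    for obs in observations[1:]:
        prev = score
        score, pointers = {}, {}
        for s in STATES:
            # Best previous state to have come from when arriving in s.
            best_prev = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            score[s] = (prev[best_prev] + math.log(TRANS[best_prev][s])
                        + math.log(EMIT[s][obs]))
            pointers[s] = best_prev
        back_pointers.append(pointers)
    # Trace back from the best final state.
    path = [max(STATES, key=lambda s: score[s])]
    for pointers in reversed(back_pointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

dive_classes = ["shallow", "shallow", "deep", "deep", "deep", "shallow"]
decoded = viterbi(dive_classes)
print(decoded)
```

For this made-up record, the decoded sequence switches to foraging for the run of deep observations and back to resting at the end. The point is only that the model labels behavior from a pattern over time rather than from each observation in isolation.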

While all of these projects and methods are very exciting, AI is not a panacea. We have to take into account the amount of data that AI models rely on. Some of these methods require high-resolution data, and without enough of it to train the models, results can be biased or predictions inaccurate. Data deficiency can be especially problematic for rare, elusive, and quiet animals. Methods that use complex architectures and non-linear transformations are often viewed as a “black box” and can be difficult to interpret at first. However, some methods can retrace the steps of the model and create a pathway for understanding the results, which helps with interpretability. AI also requires supervision. While AI methods can operate autonomously, oversight and evaluation are always necessary to validate their reliability in a given application. Lastly, there are also concerns about the use of AI (particularly Large Language Models) in scientific writing, but that’s a whole separate beast.

With careful consideration, AI can be a powerful method for addressing the unique and challenging problems in marine mammal research. 

Using AI to find dinner

Last fall, I wrote a blog post to introduce my project, which involves looking at echograms from the past 8 years of GRANITE effort to characterize prey availability within our study region of the Oregon coast. To automate the process of finding zooplankton swarms in 8 years of echosounder data, I’m planning to use deep learning methods to look for structures in our echograms that look like mysid swarms. Instead of reviewing over 500 hours of echosounder data to manually identify mysid swarms (which may produce biased or inaccurate results from human error), I can apply AI methods to process the echogram data with speed and consistent rules. I’ll specifically be using image segmentation, which can fall under any of the AI, machine learning, or deep learning umbrellas depending on the specific algorithms used.
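
To give a feel for what image segmentation produces, here is a deliberately simple Python sketch that sits at the rule-based end of that spectrum: it thresholds a tiny grid of backscatter values and groups neighbouring strong cells into regions. This is not the deep learning approach planned for the project. The grid values, the -70 dB threshold, the minimum region size, and the function name are all assumptions chosen for illustration.

```python
from typing import List, Tuple

Cell = Tuple[int, int]

def segment_swarms(echogram: List[List[float]],
                   threshold_db: float,
                   min_cells: int = 3) -> List[List[Cell]]:
    """Group cells at or above threshold_db into 4-connected regions and
    keep only regions with at least min_cells cells."""
    n_rows, n_cols = len(echogram), len(echogram[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    regions: List[List[Cell]] = []
    for r in range(n_rows):
        for c in range(n_cols):
            if seen[r][c] or echogram[r][c] < threshold_db:
                continue
            # Flood fill outward from this strong cell.
            stack, region = [(r, c)], []
            seen[r][c] = True
            while stack:
                i, j = stack.pop()
                region.append((i, j))
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < n_rows and 0 <= nj < n_cols
                            and not seen[ni][nj]
                            and echogram[ni][nj] >= threshold_db):
                        seen[ni][nj] = True
                        stack.append((ni, nj))
            if len(region) >= min_cells:
                regions.append(sorted(region))
    return regions

# Hypothetical echogram: rows are depth bins, columns are pings, values in dB.
toy_echogram = [
    [-90, -88, -91, -89, -90, -92],
    [-89, -62, -60, -90, -91, -90],
    [-91, -61, -58, -63, -90, -89],
    [-90, -90, -89, -91, -64, -90],
    [-88, -91, -90, -90, -90, -91],
]

swarms = segment_swarms(toy_echogram, threshold_db=-70)
for region in swarms:
    print("candidate swarm cells:", region)
```

In this toy grid the five strong cells that touch each other come back as one candidate swarm, and the single isolated strong cell is discarded as too small. A deep learning segmentation model aims for the same kind of output (which pixels belong to a swarm) but learns what a swarm looks like from labeled examples instead of relying on a fixed threshold.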

Another way AI can come into my project is after I gather the mysid swarm data from the image segmentation. While the exact structure of the resulting relative zooplankton abundance data will influence how I can use it, I could combine these prey data at a given place and time with a suite of environmental parameters to make predictions about the health and behavior of PCFG gray whales. This type of analysis could involve models that fall within AI and machine learning, similar to the Boosted Regression Trees used by GEMM Lab postdoc Dawn Barlow. Barlow et al. (2020) used Boosted Regression Trees to test the predictive relationships between oceanographic variables, relative krill abundance, and blue whale presence. Building on that work, Barlow et al. (2022) developed a forecasting model based on these relationships to predict where blue whales will be in New Zealand’s South Taranaki Bight (read more about this conservation tool here!).

Hopefully by now you’ve gained a better sense of what AI actually is and its application in marine mammal ecology. AI is a powerful tool and has its value, but is not always a substitute for more established methods. By carefully integrating AI methodologies with other techniques, we can leverage the strengths of both and enhance existing approaches. The GEMM Lab aims to use AI methods to observe and understand the intricacies of whale ecology more accurately and efficiently to ultimately support effective conservation strategies.


  1. Rubbens, P., Brodie, S., Cordier, T., Destro Barcellos, D., Devos, P., Fernandes-Salvador, J.A., Fincham, J.I., Gomes, A., Handegard, N.O., Howell, K., Jamet, C., Kartveit, K.H., Moustahfid, H., Parcerisas, C., Politikos, D., Sauzède, R., Sokolova, M., Uusitalo, L., Van den Bulcke, L., van Helmond, A.T.M., Watson, J.T., Welch, H., Beltran-Perez, O., Chaffron, S., Greenberg, D.S., Kühn, B., Kiko, R., Lo, M., Lopes, R.M., Möller, K.O., Michaels, W., Pala, A., Romagnan, J.-B., Schuchert, P., Seydi, V., Villasante, S., Malde, K., Irisson, J.-O., 2023. Machine learning in marine ecology: an overview of techniques and applications. ICES Journal of Marine Science 80, 1829–1853.
  2. Hastie, T., Tibshirani, R., Friedman, J., 2009. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. Springer, New York.
  3. Slonimer, A.L., Dosso, S.E., Albu, A.B., Cote, M., Marques, T.P., Rezvanifar, A., Ersahin, K., Mudge, T., Gauthier, S., 2023. Classification of Herring, Salmon, and Bubbles in Multifrequency Echograms Using U-Net Neural Networks. IEEE Journal of Oceanic Engineering 48, 1236–1254.
  4. Viquerat, S., Waluda, C.M., Kennedy, A.S., Jackson, J.A., Hevia, M., Carroll, E.L., Buss, D.L., Burkhardt, E., Thain, S., Smith, P., Secchi, E.R., Santora, J.A., Reiss, C., Lindstrøm, U., Krafft, B.A., Gittins, G., Dalla Rosa, L., Biuw, M., Herr, H., 2022. Identifying seasonal distribution patterns of fin whales across the Scotia Sea and the Antarctic Peninsula region using a novel approach combining habitat suitability models and ensemble learning methods. Frontiers in Marine Science 9.
  5. Patton, P.T., Cheeseman, T., Abe, K., Yamaguchi, T., Reade, W., Southerland, K., Howard, A., Oleson, E.M., Allen, J.B., Ashe, E., Athayde, A., Baird, R.W., Basran, C., Cabrera, E., Calambokidis, J., Cardoso, J., Carroll, E.L., Cesario, A., Cheney, B.J., Corsi, E., Currie, J., Durban, J.W., Falcone, E.A., Fearnbach, H., Flynn, K., Franklin, T., Franklin, W., Galletti Vernazzani, B., Genov, T., Hill, M., Johnston, D.R., Keene, E.L., Mahaffy, S.D., McGuire, T.L., McPherson, L., Meyer, C., Michaud, R., Miliou, A., Orbach, D.N., Pearson, H.C., Rasmussen, M.H., Rayment, W.J., Rinaldi, C., Rinaldi, R., Siciliano, S., Stack, S., Tintore, B., Torres, L.G., Towers, J.R., Trotter, C., Tyson Moore, R., Weir, C.R., Wellard, R., Wells, R., Yano, K.M., Zaeschmar, J.R., Bejder, L., 2023. A deep learning approach to photo–identification demonstrates high performance on two dozen cetacean species. Methods in Ecology and Evolution 14, 2611–2625.
  6. Quick, N.J., Isojunno, S., Sadykova, D., Bowers, M., Nowacek, D.P., Read, A.J., 2017. Hidden Markov models reveal complexity in the diving behaviour of short-finned pilot whales. Sci Rep 7, 45765.
  7. Lennox, R.J., Engler-Palma, C., Kowarski, K., Filous, A., Whitlock, R., Cooke, S.J., Auger-Méthé, M., 2019. Optimizing marine spatial plans with animal tracking data. Can. J. Fish. Aquat. Sci. 76, 497–509.
  8. Barlow, D.R., Bernard, K.S., Escobar-Flores, P., Palacios, D.M., Torres, L.G., 2020. Links in the trophic chain: modeling functional relationships between in situ oceanography, krill, and blue whale distribution under different oceanographic regimes. Marine Ecology Progress Series 642, 207–225.
