By Alexa Kownacki, Ph.D. Student, OSU Department of Fisheries and Wildlife, Geospatial Ecology of Marine Megafauna Lab
It all started with a paper. On Halloween, I sat at my desk, searching for papers that could answer my questions about bottlenose dolphin metabolism, and realized I had forgotten to check my email earlier. In my inbox, there was a new message with an attachment from Dr. Leigh Torres to the GEMM Lab members, saying this was a “must-read” article. The suggested paper, Martin A. Schwartz’s 2008 essay “The importance of stupidity in scientific research”, published in the Journal of Cell Science, highlighted universal themes across science. In a single, powerful page, Schwartz captured my feelings, and those of many scientists: the feeling of being stupid.
For the next few minutes, I stood at the printer and absorbed the article, while commenting out loud, “YES!”, “So true!”, and “This person can see into my soul”. Meanwhile, colleagues entered my office to see me, dressed in my Halloween costume as “Amazon’s Alexa”, talking aloud to myself. Coincidentally, I was feeling pretty stupid at that moment after just returning from a weekly meeting, where everyone asked me questions that I clearly did not have the answers to (all because of my costume). This paper seemed too relevant; the timing was uncanny. In the past few weeks, I have been writing my PhD research proposal (a requirement for our department) and my goodness, have I felt stupid. The proposal outlines my dissertation objectives, puts my work into context, and provides background research on common bottlenose dolphin health. There is so much to know that I don’t know!
When I read Schwartz’s 2008 paper, there were a few takeaway messages that stood out:
People take different paths. One path is not necessarily right or wrong, simply different. I compared that to how I split my time between OSU and San Diego, CA. Spending half of the year away from my lab and my department is incredibly challenging; I constantly feel behind and I miss the support that physically being with other students provides. However, I recognize the opportunities I have in San Diego, where I work directly with collaborators who teach and challenge me in new ways that bring new skills and perspective.
Drawing upon experts, albeit intimidating, is beneficial for scientific consulting as well as for our mental health; no one person knows everything. That statement can bring us together because when people work together, everyone benefits. I am also reminded that we are our own harshest critics; sometimes our colleagues are the best champions of our own successes. It is also why historical articles are foundational. In the hunt for the newest technology and the latest and greatest in research, it is important to acknowledge the basis for discoveries. My data begin in 1981, when the first of many researchers began surveying the California coastline for common bottlenose dolphins. Geographic information systems (GIS) were different back then. The data require conversions and investigative work. I had to learn how the data were collected and how to interpret that information. Therefore, it should be no surprise that I cite literature from the 1970s, such as “Results of attempts to tag Atlantic Bottlenose dolphins, (Tursiops truncatus)” by Irvine and Wells. Although published in 1972, the questions the authors tried to answer are very similar to what I am looking at now: how are site fidelity and home ranges impacted by natural and anthropogenic processes? While Irvine and Wells used large bolt tags to identify individuals, my project utilizes much less invasive techniques (photo-identification and blubber biopsies) to track animals, their health, and their exposures to contaminants.
Struggling is part of the solution. Science is about discovery and without the feeling of stupidity, discovery would not be possible. Feeling stupid is the first step in the discovery process: the spark that fuels wanting to explore the unknown. Feeling stupid can lead to the feeling of accomplishment when we find answers to those very questions that made us feel stupid. Part of being a student and a scientist is identifying those weaknesses and not letting them stop me. Pausing, reflecting, course correcting, and researching are all productive in the end, but stopping is not. Coursework is the easy part of a PhD. The hard part is constantly diving deeper into the great unknown that is research. The great unknown is simultaneously alluring and frightening. Still, it must be faced head on. Schwartz describes “productive stupidity [as] being ignorant by choice.” I picture this as essentially blindly walking into the future with confidence. Although a bit of an oxymoron, it underscores the importance of perseverance and conviction in the midst of uncertainty.
Now I think back to my childhood when stupid was one of the forbidden “s-words” and I question whether society had it all wrong. Maybe we should teach children to acknowledge ignorance and pursue the unknown. Stupid is a feeling, not a character flaw. Stupidity is important in science and in life. Fascination and emotional desires to discover new things are healthy. Next time you feel stupid, try running with it, because more often than not, you will learn something.
Solène Derville, Entropie Lab, French National Institute for Sustainable Development (IRD – UMR Entropie), Nouméa, New Caledonia
Ph.D. student under the co-supervision of Dr. Leigh Torres
Species Distribution Models (SDMs), also referred to as ecological niche models, may be defined as “a model that relates species distribution data (occurrence or abundance at known locations) with information on the environmental and/or spatial characteristics of those locations” (Elith & Leathwick, 2009). In the last couple of decades, SDMs have become an indispensable part of the ecologists’ and conservationists’ toolbox. What scientist has not dreamed of being able to summarize a species’ environmental requirements and predict where and when it will occur, all in one tiny statistical model? It sounds like magic… but the short acronym “SDM” is the pretty front window of an intricate and gigantic research field that may extend way beyond the skills of a typical ecologist (even more so for a graduate student like myself).
As part of my PhD thesis about the spatial ecology of humpback whales in New Caledonia, South Pacific, I was planning on producing a model to predict their distribution in the region and help spatial planning within the Natural Park of the Coral Sea. An innocent and seemingly perfectly feasible plan for a second-year PhD student. To conduct this task, I had at my disposal more than 1,000 sightings recorded during dedicated surveys at sea conducted over 14 years. These numbers seem quite sufficient, considering the rarity of cetaceans and the technical challenges of studying them at sea. And there was more! The NGO Opération Cétacés also recorded over 600 sightings reported by the general public in the same time period and deployed more than 40 satellite tracking tags to follow individual whale movements. In a field where it is so hard to acquire data, it felt like I had to use it all, though I was not sure how to combine all these types of data, with their respective biases, scales and assumptions.
One important thing about SDMs to remember: it is like the cracker section in a US grocery shop, there is sooooo much choice! As I reviewed the possibilities and tested various modeling approaches on my data, I realized that this study might be a good opportunity to contribute to the SDM field by conducting a comparison of various algorithms using cetacean occurrence data from multiple sources. The results of this work were just published in Diversity and Distributions:
Derville S, Torres LG, Iovan C, Garrigue C. (2018) Finding the right fit: Comparative cetacean distribution models using multiple data sources and statistical approaches. Divers Distrib. 2018;00:1–17. https://doi.org/10.1111/ddi.12782
If you are a newcomer to the SDM world, and specifically its application to the marine environment, I hope you find this interesting. If you are a seasoned SDM user, I would be very grateful to read your thoughts in the comment section! Feel free to disagree!
So what is the take-home message from this work?
There is no such thing as a “best model”; it all depends on what you want your model to be good at (the descriptive vs predictive dichotomy), and what criteria you use to define the quality of your models.
The predictive vs descriptive goal of the model: This is a tricky choice to make, yet it should be clearly identified upfront. Most times, I feel like we want our models to be decently good at both tasks… It is a risky approach to blindly follow the predictions of a complex model without questioning the meaning of the ecological relationships it fitted. On the other hand, conservation applications of models often require the production of predicted maps of species’ probability of presence or habitat suitability.
The criteria for model selection: How could we imagine that the complexity of animal behavior could be summarized in a single metric, such as the famous Akaike Information Criterion (AIC) or the Area Under the ROC Curve (AUC)? My study, and that of others (e.g. Elith & Graham, 2009), emphasize the importance of looking at multiple aspects of model outputs: raw performance through various evaluation metrics (e.g. see AUCdiff; Warren & Seifert, 2010), contribution of the variables to the model, shape of the fitted relationships through Partial Dependence Plots (PDP; Friedman, 2001), and maps of predicted habitat suitability and associated error. Spread all these lines of evidence in front of you, summarize all the metrics, add a touch of critical ecological thinking to decide on the best approach for your modeling question, and Abracadabra! You end up a bit lost in a pile of folders… But at least you assessed the quality of your work from every angle!
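To make the evaluation metrics a little more concrete, here is a minimal Python sketch (not the R code from the paper's supplementary information). It computes AUC as the probability that a randomly chosen presence gets a higher predicted suitability than a randomly chosen absence. AUCdiff is assumed here to mean training AUC minus test AUC, so that a large positive value hints at overfitting. All scores are invented for illustration.

```python
def auc(scores_presence, scores_absence):
    """AUC as the share of presence/absence pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for p in scores_presence:
        for a in scores_absence:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(scores_presence) * len(scores_absence))


# Hypothetical predicted suitability scores (invented numbers).
train_presence = [0.9, 0.8, 0.7, 0.6]
train_absence = [0.1, 0.2, 0.3, 0.4]
test_presence = [0.9, 0.4, 0.7, 0.3]
test_absence = [0.5, 0.2, 0.6, 0.1]

auc_train = auc(train_presence, train_absence)
auc_test = auc(test_presence, test_absence)

# Assumed definition: the gap between training and test performance.
auc_diff = auc_train - auc_test

print(f"train AUC={auc_train:.2f}, test AUC={auc_test:.2f}, AUCdiff={auc_diff:.2f}")
```

In this toy case the model separates the training data perfectly but does noticeably worse on the held-out data, which is exactly the kind of gap that a single training metric would hide.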
Cetacean SDMs often serve a conservation goal. Hence, their capacity to predict to areas / times that were not recorded in the data (which is often scarce) is paramount. This extrapolation performance may be restricted when the model relationships are overfitted, which is when you make your model fit the data so closely that you are unknowingly modeling noise rather than a real trend. Using cross-validation is a good method to prevent overfitting from happening (for a thorough review: Roberts et al., 2017). Also, my study underlines that certain algorithms inherently have a tendency to overfit. We found that Generalized Additive Models (GAMs) and MAXENT provided a valuable complexity trade-off to promote the best predictive performance, while minimizing overfitting. In the case of GAMs, I would like to point out the excellent documentation that exists on their use (Wood, 2017), and specifically their application to cetacean spatial ecology (Mannocci, Roberts, Miller, & Halpin, 2017; Miller, Burt, Rexstad, & Thomas, 2013; Redfern et al., 2017).
Citizen science is a promising tool to describe cetacean habitat. Indeed, we found that models of habitat suitability based on citizen science largely converged with those based on our research surveys. The main issue encountered when modeling this type of data is the absence of “effort”. Basically, we know where people observed whales, but we do not know where they haven’t… or at least not with the accuracy obtained from research survey data. However, with some information about our citizen scientists and a little deduction, there is actually a lot you can infer about opportunistic data. For instance, in New Caledonia most of the sightings were reported by professional whale-watching operators or by the general public during fishing/diving/boating day trips. Hence, citizen scientists rarely stray far from harbors and spend most of their time in the sheltered waters of the New Caledonian lagoon. This reasoning provides the sort of information that we integrated in our modeling approach to account for spatial sampling bias of citizen science data and improve the model’s predictive performance.
Many more technical aspects of SDM are brushed over in this paper (for detailed and annotated R code of the modeling approaches, see the supplementary information of our paper). There are a few that are not central to the paper, but that I think are worth sharing:
Collinearity of predictors: Have you ever found that the significance of your predictors completely changed every time you removed a variable? I have progressively come to discover how unstable a model can be because of predictor collinearity (and the uneasy feeling that comes with it …). My new motto is to ALWAYS check cross-correlation between my predictors, and do it THOROUGHLY. A few aspects that may make a big difference in the estimation of collinearity patterns are to: (1) calculate Pearson vs Spearman coefficients, (2) check correlations between the values recorded at the presence points vs over the whole study area, and (3) assess the correlations between raw environmental variables vs between transformed variables (log-transformed, etc.). Though selecting variables with Pearson coefficients < 0.7 is usually a good rule (Dormann et al., 2013), I would worry about anything above 0.5, or at least keep it in mind during model interpretation.
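A rough Python sketch of this pairwise check is given below. It uses only the standard library and is not the R code from the paper. The predictor names and values are hypothetical, and `flag_collinear` is a helper invented for this example. It computes both Pearson and Spearman coefficients for each pair and flags a pair when either one exceeds the chosen threshold.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """Average ranks (1-based), so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def flag_collinear(predictors, threshold=0.7):
    """Return (name_a, name_b, pearson_r, spearman_rho) for pairs above threshold."""
    flagged = []
    for a, b in combinations(sorted(predictors), 2):
        r = pearson(predictors[a], predictors[b])
        rho = spearman(predictors[a], predictors[b])
        if max(abs(r), abs(rho)) > threshold:
            flagged.append((a, b, round(r, 2), round(rho, 2)))
    return flagged


# Hypothetical predictor values at six locations (invented numbers).
predictors = {
    "depth": [10, 20, 30, 40, 50, 60],
    "dist_to_coast": [1, 4, 9, 16, 25, 36],  # monotonic but non-linear in depth
    "chlorophyll": [1, 2, 2, 3, 4, 2],
    "sst": [26.1, 24.8, 26.5, 25.0, 26.3, 24.9],
}

strict = flag_collinear(predictors, threshold=0.7)    # the usual rule of thumb
cautious = flag_collinear(predictors, threshold=0.5)  # the more worried version

for pair in cautious:
    print(pair)
```

Running the same function on the values at presence points only, or on log-transformed variables, would cover points (2) and (3) above. With these toy numbers the cautious threshold flags more pairs than the strict one.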
Cross-validation: If removing 10% of my dataset greatly impacts the model results, I feel like cross-validation is critical. The concept is based on a simple question: if I had sampled a given population/phenomenon/system slightly differently, would I have come to the same conclusion? Cross-validation comes in many different methods, but the basic concept is to run the same model several times (the number of times may depend on the size of your data set, the hierarchical structure of your data, the computation power of your computer, etc.) over different chunks of your data. Model performance metrics (e.g., AUC) and outputs (e.g., partial dependence plots) are then summarized over the many runs, using mean/median and standard deviation/quantiles. It is up to you how to pick these chunks, but before doing this at random I highly recommend reading Roberts et al. (2017).
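The basic loop can be sketched in a few lines of Python. This is a deliberately naive illustration and not the approach used in the paper: the data are invented, the “model” is a toy depth threshold, and the folds are dealt at random, which is exactly the choice that Roberts et al. (2017) recommend thinking twice about for spatially structured data.

```python
import random
from statistics import mean, stdev


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_threshold(depths, labels):
    """Toy 'model': midpoint between the mean depth of presences and of absences."""
    presences = [d for d, y in zip(depths, labels) if y == 1]
    absences = [d for d, y in zip(depths, labels) if y == 0]
    return (mean(presences) + mean(absences)) / 2


def accuracy(threshold, depths, labels):
    """Share of points classified correctly when 'shallower than threshold' means presence."""
    hits = sum((d < threshold) == (y == 1) for d, y in zip(depths, labels))
    return hits / len(labels)


def cross_validate(depths, labels, k=5, seed=0):
    """Fit on k-1 folds, evaluate on the held-out fold, and repeat for every fold."""
    scores = []
    for fold in k_fold_indices(len(labels), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(labels)) if i not in held_out]
        threshold = fit_threshold([depths[i] for i in train],
                                  [labels[i] for i in train])
        scores.append(accuracy(threshold,
                               [depths[i] for i in fold],
                               [labels[i] for i in fold]))
    return scores


# Hypothetical data: depth (m) at 10 presence and 10 absence points (invented).
depths = [5, 12, 8, 20, 15, 30, 9, 18, 25, 11,
          40, 55, 22, 60, 35, 48, 70, 28, 52, 45]
labels = [1] * 10 + [0] * 10

scores = cross_validate(depths, labels, k=5)
print(f"accuracy per fold: {scores}")
print(f"mean = {mean(scores):.2f}, sd = {stdev(scores):.2f}")
```

The summary at the end (mean and standard deviation across folds) is the part that matters: a large spread between folds is a warning that the conclusions depend on which chunk of data happened to be left out.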
The evil of the R²: I am probably not the first student to feel like what I have learned in my statistics classes at school is in practice, at best, not very useful, and at worst, dangerously misleading. Of course, I do understand that we must start somewhere, and that learning the basics of inferential statistics is a necessary step to, one day, be able to answer your own research questions. Yet, I feel like I have been carrying the “weight of the R²” for far too long before actually realizing that this metric of model performance (R² among others) is simply not enough to trust my results. You might think that your model is robust because among the 1000 alternative models you tested, it is the one with the “best” performance (deviance explained, AIC, you name it), but the model with the best R² will not always be the most ecologically meaningful one, or the most practical for spatial management perspectives. Overfitting is like a sword of Damocles hanging over you every time you create a statistical model. Altogether, I sometimes trust my supervisor’s expertise and my own judgment more than an R².
A few good websites/presentations that have helped me through my SDM journey:
Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carré, G., … Lautenbach, S. (2013). Collinearity: A review of methods to deal with it and a simulation study evaluating their performance. Ecography, 36(1), 27–46. https://doi.org/10.1111/j.1600-0587.2012.07348.x
Elith, J., & Graham, C. H. (2009). Do they? How do they? WHY do they differ? On finding reasons for differing performances of species distribution models. Ecography, 32(1), 66–77. https://doi.org/10.1111/j.1600-0587.2008.05505.x
Elith, J., & Leathwick, J. R. (2009). Species Distribution Models: Ecological Explanation and Prediction Across Space and Time. Annual Review of Ecology, Evolution, and Systematics, 40(1), 677–697. https://doi.org/10.1146/annurev.ecolsys.110308.120159
Friedman, J. H. (2001). Greedy Function Approximation: A gradient boosting machine. The Annals of Statistics, 29(5), 1189–1232. Retrieved from http://www.jstor.org/stable/2699986
Mannocci, L., Roberts, J. J., Miller, D. L., & Halpin, P. N. (2017). Extrapolating cetacean densities to quantitatively assess human impacts on populations in the high seas. Conservation Biology, 31(3), 601–614. https://doi.org/10.1111/cobi.12856
Miller, D. L., Burt, M. L., Rexstad, E. A., & Thomas, L. (2013). Spatial models for distance sampling data: Recent developments and future directions. Methods in Ecology and Evolution, 4(11), 1001–1010. https://doi.org/10.1111/2041-210X.12105
Redfern, J. V., Moore, T. J., Fiedler, P. C., de Vos, A., Brownell, R. L., Forney, K. A., … Ballance, L. T. (2017). Predicting cetacean distributions in data-poor marine ecosystems. Diversity and Distributions, 23(4), 394–408. https://doi.org/10.1111/ddi.12537
Roberts, D. R., Bahn, V., Ciuti, S., Boyce, M. S., Elith, J., Guillera-Arroita, G., … Dormann, C. F. (2017). Cross-validation strategies for data with temporal, spatial, hierarchical or phylogenetic structure. Ecography, 0, 1–17. https://doi.org/10.1111/ecog.02881
Warren, D. L., & Seifert, S. N. (2010). Ecological niche modeling in Maxent: the importance of model complexity and the performance of model selection criteria. Ecological Applications, 21(2), 335–342. https://doi.org/10.1890/10-1171.1
Wood, S. N. (2017). Generalized additive models: an introduction with R (2nd ed.). CRC Press.
By Alexa Kownacki, Ph.D. Student, OSU Department of Fisheries and Wildlife, Geospatial Ecology of Marine Megafauna Lab
I love maps. I love charts. As a random bit of trivia, there is a difference between a map and a chart. A map is a visual representation of land that may include details like topography, whereas a chart refers to nautical information such as water depth, shoreline, tides, and obstructions.
I have an intense affinity for visually displaying information. When I was a child, my dad traveled constantly, from Barrow, Alaska to Istanbul, Turkey. Immediately upon his return, I would grab our standing globe from the dining room and our stack of atlases from the coffee table. I would sit at the kitchen table, enthralled by the stories of his travels. Yet, a story was only great when I could picture it for myself. (I should remind you, this was the early 1990s; Google Maps wasn’t a thing.) Our kitchen table transformed into a scene from Master and Commander, except that instead of nautical charts and compasses, we had an atlas the size of an overgrown toddler and salt and pepper shakers to pinpoint locations. I now had the world at my fingertips. My dad would show me the paths he took from our home to his various destinations and tell me about the topography, the demographics, the population, the terrain type: all attribute features that could be included in modern-day geographic information systems (GIS).
As I got older, the kitchen table slowly began to resemble what I imagine the set from Master and Commander actually looked like; nautical charts, tide tables, and wind predictions were piled high and the salt and pepper shakers were replaced with pencil marks indicating potential routes for us to travel via sailboat. The two of us were in our element. Surrounded by visual and graphical representations of geographic and spatial information: maps. To put my map attraction in even more context, this is a scientist who grew up playing “Take-Off”, a board game that was “designed to teach geography” and involved flying your fleet of planes across a Mercator projection-style mapboard. Now, it’s no wonder that I’m a graduate student in a lab that focuses on the geospatial aspects of ecology.
So why and how did geospatial ecology become a field, and a predominant one at that? It wasn’t that one day a lightbulb went off and a statistician decided to draw out the results. It was a progression, built upon for thousands of years. There are maps dating back to 2300 B.C. on Babylonian clay tablets (The British Museum), and yet, some of the maps we make today require highly sophisticated technology. Geospatial analysis is dynamic. It’s evolving. Today I’m using ArcGIS software to interpolate massive amounts of publicly available sea surface temperature satellite data from 1981-2015, which I will overlay with a layer of bottlenose dolphin sightings during the same time period for comparison. Tomorrow, there might be a new version of software that allows me to animate these data. Heck, it might already exist and I’m not aware of it. This growth is the beauty of this field. Geospatial ecology is made for us cartophiles (map-lovers) who study the interdependency of biological systems where location and distance between things matter.
In a broader context, geospatial ecology communicates our science to all of you. If I posted a bunch of statistical outputs in text or even table form, your eyes might glaze over…and so might mine. But, if I displayed that same underlying data and results on a beautiful map with color-coded symbology, a legend, a compass rose, and a scale bar, you might have this great “ah-ha!” moment. That is my goal. That is what geospatial ecology is to me. It’s a way to SHOW my science, rather than TELL it.
Would you like to see this over and over again…?
Or see this once…?
For many, maps are visually easy to interpret, allowing quick message communication. Yet, there are many different learning styles. From my personal story, I think it’s relatively obvious that I’m, at least partially, a visual learner. When I was in primary school, I would read the directions thoroughly, but only truly absorb the material once the teacher showed me an example. Set up an experiment? Sure, I’ll read the lab report, but I’m going to refer to the diagrams of the set-up constantly. To this day, I always ask for an example. Teach me a new game? Let’s play the first round and then I’ll pick it up. It’s how I learned to sail. My dad described every part of the sailboat in detail and all I heard was words. Then, my dad showed me how to sail, and it came naturally. It’s only as an adult that I know what “that blue line thingy” is called. Geospatial ecology is how I SEE my research. It makes sense to me. And, hopefully, it makes sense to some of you!
I strongly believe a meaningful career allows you to highlight your passions and personal strengths. For me, that means photography, all things nautical, the great outdoors, wildlife conservation, and maps/charts. If I converted that into an equation, I think this is a likely result:
This lab was my solution all along. As part of my research on common bottlenose dolphins, I work on a small inflatable boat off the coast of California (nautical ✅, outdoors ✅), photograph their dorsal fins (photography ✅), and communicate my data using informative maps that will hopefully bring positive change to the marine environment (maps/charts ✅, wildlife conservation ✅). Geospatial ecology allows me to participate in research that I deeply enjoy and that, hopefully, will make the world a little bit of a better place. Oh, and make maps.
By Alexa Kownacki, Ph.D. Student, OSU Department of Fisheries and Wildlife, Geospatial Ecology of Marine Megafauna Lab
This was the very first lecture slide in my population dynamics course at UC Davis. Population dynamics was infamous in our department for being an ultimate rite of passage due to its notoriously challenging curriculum. So, when Professor Lou Botsford pointed to his slide, all 120 of us Wildlife, Fish, and Conservation Biology majors didn’t know how to react. Finally, he announced, “This [pointing to the slide] is all of you”. The class laughed. Lou smirked. Lou knew.
Lou knew that there is more truth to this meme than words could express. I can’t tell you how many times friends and acquaintances have asked me if I was going to be a park ranger. Incredibly, not all (or even most) wildlife biologists are park rangers. I’m sure that at one point, my parents had hoped I’d be holding a tiger cub as part of a conservation project; that has never happened. Society may think that all wildlife biologists want to walk in the footsteps of the famous Steve Irwin and say things like “Crikey!”, but I can’t remember the last time I uttered that exclamation with the exception of doing a Steve Irwin impression. Hollywood may think we hug trees (and, don’t get me wrong, I love a good tie-dyed shirt), but most of us believe in the principles of conservation and wise use, a.k.a. we know that some trees must be cut down to support our needs. Helicoptering into a remote location to dart and take samples from wild bear populations…HA. Good one. I tell myself this is what I do sometimes, and then the chopper crashes and I wake up from my dream. But, actually, a scientist staring at a computer with stacks of papers spread across every surface is me and almost every wildlife biologist that I know.
There is an illusion that wildlife biologists are constantly in the field doing all the cool, science-y, outdoors-y things while being followed by a National Geographic photojournalist. Well, let me break it to you, we’re not. Yes, we do have some incredible opportunities. For example, I happen to know that one lab member (eh-hem, Todd), has gotten up close and personal with wild polar bear cubs in the Arctic, and that all of us have taken part in some work that is worthy of a cover image on NatGeo. We love that stuff. For many of us, it’s those few, memorable moments when we are out in the field, wearing pants that we haven’t washed in days, and we finally see our study species AND gather the necessary data, that the stars align. Those are the shining lights in a dark sea of papers, grant-writing, teaching, data management, data analysis, and coding. I’m not saying that we don’t find our desk work enjoyable; we jump for joy when our R script finally runs and we do a little dance when our paper is accepted and we definitely shed a tear of relief when funding comes through (or maybe that’s just me).
What I’m trying to get at is that we accepted our fates as the “scientists in front of computers surrounded by papers” long ago and we embrace it. It’s been almost five years since I was a senior in undergrad and saw this meme for the first time. Five years ago, I wanted to be that scientist surrounded by papers, because I knew that’s where the difference is made. Most people have heard the quote by Mahatma Gandhi, “Be the change that you wish to see in the world.” In my mind, it is that scientist, combing through relevant, peer-reviewed scientific papers while writing a compelling and well-researched article, who has the potential to make positive changes. For me, that scientist at the desk is being the change that he/she wishes to see in the world.
One of my favorite people to colloquially reference in the wildlife biology field is Milton Love, a research biologist at the University of California Santa Barbara, because he tells it how it is. In his oh-so-true-it-hurts website, he has a page titled, “So You Want To Be A Marine Biologist?” that highlights what he refers to as, “Three really, really bad reasons to want to be a marine biologist” and “Two really, really good reasons to want to be a marine biologist”. I HIGHLY suggest you read them verbatim on his site, whether you think you want to be a marine biologist or not because they’re downright hilarious. However, I will paraphrase if you just can’t be bothered to open up a new tab and go down a laugh-filled wormhole.
Really, Really Bad Reasons to Want to be a Marine Biologist:
To talk to dolphins. Hint: They don’t want to talk to you…and you probably like your face.
You like Jacques Cousteau. Hint: I like cheese…doesn’t mean I want to be cheese.
To make lots of money. Hint: Lack thereof.
Really, Really Good Reasons to Want to be a Marine Biologist:
Work attire/attitude. Hint: Dress for the job you want finally translates to board shorts and tank tops.
You like it. *BINGO*
In summary, as wildlife or marine biologists we’ve taken a vow of poverty, and in doing so, we’ve committed ourselves to fulfilling lives with incredible experiences and being the change we wish to see in the world. To those of you who want to pursue a career in wildlife or marine biology—even after reading this—then do it. And to those who don’t, hopefully you have a better understanding of why wearing jeans is our version of “business formal”.