Demystifying AI: a brief overview of Image-Pre-Processing and a Machine Learning Workflow

Celest Sorrentino, MSc student, OSU Dept of Fisheries, Wildlife and Conservation Sciences, GEMM Lab

The first memory I have of A.I. (Artificial Intelligence) stems from one of my favorite movies growing up: I, Robot (2004). Setting aside the sci-fi thriller plot, the notion of a machine integrating into normal everyday life to perform human tasks, such as conversing and automating labor, sparks intrigue. In 2014, my own realization that sci-fi fantasy had turned into reality began with Tesla’s advertisements for self-driving cars. But how does one go from a standard tool, like a vehicle, to an automated machine?

Fig 1: Tesla Self-Driving car, image by Bloomberg.com

For my first thesis chapter, I am applying a machine learning model to our lab’s drone video dataset to understand whale mother-calf associations, continuing my previous internship from 2022. A.I. has absolutely skyrocketed in marine science, and hundreds of papers have confirmed the advantage of using machine learning models, such as in species abundance estimates (Boulent et al. 2023), whale morphometrics (Bierlich et al. 2024), and even animal tracking (Pereira et al. 2022). Specifically, Dr. KC Bierlich recently led a publication on an incredible A.I. model that can extract still images from drone footage to be subsequently used for body morphometric analysis. Earlier this year, my lab mate Nat wrote an insightful blog introducing the history of A.I. and how she uses it for image segmentation to quantify mysid swarms. For those of us who study animal behavior and use video-based tools for observation, A.I. is a sweet treat we’ve been craving to speed up and improve our analyses. But where do we start?

With a Venn Diagram and definitions of course!

Figure 2: Venn diagram demonstrating the relationships of four subsets of A.I. (Machine Learning, Deep Learning, Computer Vision, and Natural Language Processing) and how they relate to one another.

Good terms to know:

Artificial Intelligence: a machine/model built to mimic human intelligence.

Machine Learning: a subset of A.I. that uses statistical algorithms to recognize patterns and form predictions, usually requiring human intervention for correction.

Deep-learning: a specific form of machine learning meant to mimic human neural networks through artificial neural networks (ANNs), recognizing hierarchical patterns with minimal to no human intervention for correction.

Computer Vision: a type of machine learning that enables a machine/model to gather and retain information from images, video, etc.

Natural Language Processing: a subset of machine learning that enables a machine/model to identify, understand, and create text and speech.

(Still a bit confused? A great example of the difference between machine learning and deep-learning can be found here)

So, you have a dataset, what’s the pipeline?

Figure 3: How to go from your research question and dataset to using an A.I. model.

First, we must consider what type of data we have and what our question is. In fact, you might find these two questions are complementary: what type of questions does our dataset inspire, and/or what type of dataset is needed to answer our question?

Responses to these questions can guide whether A.I. is beneficial to invest in and which type to pursue. In my case, we have an imagery dataset (i.e., drone videos) and our question explores the relationship of mom-calf proximity as an indicator of calf-independence. Therefore, a model that employs Computer Vision is a sensible decision because we need a model that extracts information from imagery. From that decision, I then selected SLEAP A.I. as the deep-learning model I’ll use to identify and track animals in video (Pereira et al 2022).

Figure 4: A broad schematic of the workflow utilizing a computer vision* model. As detailed above, a computer vision model is a machine learning model that uses images/videos as a dataset to retain information.

Why is image pre-processing important?

Image pre-processing is an essential step in “cleaning” the imagery data into a functional and insightful format for the machine learning model to extract information. Although tedious to some, I find this to be an exciting yet challenging step that pushes me to reframe my own perspective into another’s, a skill I believe all researchers share.

A few methods for image/video preprocessing include resizing, grayscaling, noise reduction, normalization, binarization, and contrast enhancement. I found the following definitions and Python code by Maahi Patel (on Medium) to be incredibly concise and helpful.

• Resizing: Resizing images to a uniform size is important for machine learning algorithms to function properly. We can use OpenCV’s resize() method to resize images.
• Grayscaling: Converting color images to grayscale can simplify your image data and reduce computational needs for some algorithms. The cvtColor() method can be used to convert RGB to grayscale.
• Noise reduction: Smoothing, blurring, and filtering techniques can be applied to remove unwanted noise from images. The GaussianBlur() and medianBlur() methods are commonly used for this.
• Normalization: Normalization adjusts the intensity values of pixels to a desired range, often between 0 and 1. This step can improve the performance of machine learning models. OpenCV’s normalize() method can be used for this.
• Binarization: Binarization converts grayscale images to black and white by thresholding. The threshold() method is used to binarize images in OpenCV.
• Contrast enhancement: The contrast of images can be adjusted using histogram equalization. The equalizeHist() method enhances the contrast of images.

When deciding which of these techniques is best to apply to a dataset, it can be useful to think ahead about how you ultimately intend to deploy the model.

Image/Video Pre-Processing Re-framing

Notice the deliberate selection of the word “mimic” in the above definition of A.I. Living in an expeditiously tech-hungry world, it is easy to lose sight of A.I. as a mimicry of human intelligence, not a replica. However, our own human intelligence derives from years of experience and exposure, constantly evolving; a machine learning model** does not have this same basis. As children, we began with phonetics, which led to simple words, then strings of sentences, and ultimately conversations. In a sense, you might consider these steps “training” as we gained more exposure to “data.” Therefore, when approaching image pre-processing for the initial training dataset of an A.I. model, it’s integral to see each image through the lens of a computer, not a human researcher, reminding ourselves with each image: What is and isn’t necessary in this image? What is extra “noise”? Do all the features within this image contribute to getting closer to my question?

Model Workflow: What’s Next?

Now that we have our question, model, and “cleaned” dataset, the next steps are: (II) Image/Video Processing, (III) Labeling, (IV) Model Training, (V) Model Predictions, and (VI) Model Corrections, which lead us to the ultimate step of (VII) A.I. Model Deployment. Labeling is the act of annotating images/videos with classifications the annotator (me or you) deems important for the model to recognize. Model Training, Model Predictions, and Model Corrections can be considered an integrated part of the workflow broken down into steps. Model Training takes place once all labeling is complete, teaching the model to perform its assigned task (i.e., object detection, image segmentation, pose estimation, etc.). After training, we provide the model with new data to test its performance, entering the stage of Model Predictions. Once predictions have been made, the annotator reviews these attempts and corrects any misidentifications or mistakes, leading to another round of Model Training. Finally, once satisfied with the model’s performance, Model Deployment begins, integrating the model into a “real-world” application.
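To make the cyclical nature of steps III through VI concrete, here is a toy sketch of the label–train–predict–correct loop. Every function here is a hypothetical stand-in (none of this is SLEAP’s actual API); the point is only the shape of the cycle, in which the annotator’s corrections feed the next training round.

```python
# Hypothetical stand-ins for the workflow stages described above.

def train(labels):
    # (IV) Model Training: this toy "model" just memorizes the labeled classes.
    return {"known": set(labels)}

def predict(model, frames):
    # (V) Model Predictions: guess a label for each new frame.
    return [(f, "whale" if "whale" in f else "unknown") for f in frames]

def correct(predictions, truth):
    # (VI) Model Corrections: the annotator fixes misidentifications,
    # producing new labels that feed the next training round.
    return [(f, truth.get(f, guess)) for f, guess in predictions]

labels = ["whale", "calf"]                          # (III) initial labeling
model = train(labels)                               # first training round
frames = ["whale_01", "kelp_01"]                    # new, unseen data
preds = predict(model, frames)                      # the model's attempts
fixed = correct(preds, {"kelp_01": "kelp"})         # annotator review
model = train(labels + [lbl for _, lbl in fixed])   # retrain with corrections
```

In a real pipeline, each round of corrections shrinks the amount of manual labeling needed, which is what makes human-in-the-loop approaches (e.g., Boulent et al. 2023) efficient.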

In the ceaselessly advancing field of A.I., it can sometimes feel like the learning never ends. However, I encourage you to welcome the uncharted territory with a curious mind. Just like in any field of science, errors can happen, but with the right amount of persistence, so can success. I hope this blog has served as a step toward understanding A.I. as an asset and how you can utilize it too!


**Granted you are using a machine learning model that is not a foundation model. A foundation model is one that has been pre-trained on a large, diverse dataset, which one can use as a basis (or foundation) to perform specialized tasks (e.g., OpenAI’s ChatGPT).

References:

Bierlich, K. C., Karki, S., Bird, C. N., Fern, A., & Torres, L. G. (2024). Automated body length and body condition measurements of whales from drone videos for rapid assessment of population health. Marine Mammal Science, 40(4). https://doi.org/10.1111/mms.13137

Boulent, J., Charry, B., Kennedy, M. M., Tissier, E., Fan, R., Marcoux, M., Watt, C. A., & Gagné-Turcotte, A. (2023). Scaling whale monitoring using deep learning: A human-in-the-loop solution for analyzing aerial datasets. Frontiers in Marine Science, 10. https://doi.org/10.3389/fmars.2023.1099479

Deep Learning vs Machine Learning: The Ultimate Battle. (2022, May 2). https://www.turing.com/kb/ultimate-battle-between-deep-learning-and-machine-learning

Jain, P. (2024, November 28). Breakdown: Simplify AI, ML, NLP, deep learning, Computer vision. Medium. https://medium.com/@jainpalak9509/breakdown-simplify-ai-ml-nlp-deep-learning-computer-vision-c76cd982f1e4

Pereira, T. D., Tabris, N., Matsliah, A., et al. (2022). SLEAP: A deep learning system for multi-animal pose tracking. Nature Methods, 19, 486–495. https://doi.org/10.1038/s41592-022-01426-1

Patel, M. (2023, October 23). The Complete Guide to Image Preprocessing Techniques in Python. Medium. https://medium.com/@maahip1304/the-complete-guide-to-image-preprocessing-techniques-in-python-dca30804550c

IBM Data and AI Team. (2024, November 25). AI vs. machine learning vs. deep learning vs. neural networks. IBM Think. https://www.ibm.com/think/topics/ai-vs-machine-learning-vs-deep-learning-vs-neural-networks

The Tesla Advantage: 1.3 Billion Miles of Data. (2016). Bloomberg.Com. https://www.bloomberg.com/news/articles/2016-12-20/the-tesla-advantage-1-3-billion-miles-of-data?embedded-checkout=true

Wolfewicz, A. (2024, September 30). Deep Learning vs. Machine Learning – What’s The Difference? https://levity.ai/blog/difference-machine-learning-deep-learning

Two Leaders Wearing Two Hats: A wrap-up of the 2024 TOPAZ/JASPER Field Season

Celest Sorrentino, incoming master’s student, OSU Dept of Fisheries, Wildlife and Conservation Sciences, GEMM Lab

Allison Dawn, PhD student, Clemson University Dept of Forestry and Environmental Conservation, GEMM Lab Alum

Allison:

Celest and I were co-leaders this year, so it only feels fitting to co-write our wrap-up blog for the 2024 field season.

This was my first year training the project leader while also leading the field team. I have to say that I think I learned as much as Celest did throughout this process! This hand-off requires the two team leaders to get comfortable wearing two different hats. For me, this meant not only making sure the whole team grasped every aspect of the project within the two training weeks, but also ensuring Celest knew the reasoning behind those decisions AND got to exercise her own decision-making muscles amid the many moving parts that comprise a field season: shifts in weather, team needs, and of course the dynamics of shared space at a field site with many other teams. With the limited hours of any given day, this is no small task for either of us, and it requires foresight to know where to fit these opportunities for the leader-in-training into our day-to-day tasks.

During this summer, I certainly gained even more respect for how Lisa Hildebrand juggled “Team Heck Yeah” in 2021 while she trained me as leader. Lisa made sure to take me aside in the afternoon to let me in on her thought process before the next day’s work. I brought this model forward for Team Protein this year, with the added bonus that Celest and I got to room together. By the end of each day, our brains would be buzzing with final thoughts, concerns, and excitement. I will treasure many memories from this season, including our end-of-day debriefs before bed. Overall, it was an incredibly special process to slowly pass the reins to Celest. I leave this project knowing it is ready for its new era, as Celest is full of positive energy, enthusiasm, and, most importantly, just as much passion for this project as the preceding leaders.

Fig. 1: Two leaders wearing two (massive) hats. Field season means you have to be adaptable, flexible, and make the most out of any situation, including sometimes having to move your own bed! We had a blast using our muscles for this; we are Team Protein after all!

Celest:

As I sit down in the field station classroom to write this blog, I realize I am sitting in the same seat where, just 12 hours ago, a room full of community members laughed and shared delicious blueberry crumble with each other.

We kicked off the morning of our final day together with a Team Protein high-powered breakfast in Bandon to get some delicious fuel and let the giggles out before our presentation. When Dr. Torres arrived, the team got a chance to reflect on the field season and share ideas for next season. Finally, the moment we had all been waiting for: at 5 PM, Team Protein wrapped up our 2024 field season with our traditional Community Presentation.

Fig 2: Team breakfast at SunnySide Cafe in Bandon, which has delicious GF/DF options.

Within a month and a half, I transitioned from learning alongside each of the interns at the start of the season, knowing only the basics of TOPAZ/JASPER, to eventually leading the team for the final stretch. The learning spurts were rapid and challenging, but I attribute my gained confidence to observing Allison lead. To say I learned from Allison only the nitty-gritty whats and whys of TOPAZ/JASPER would not suffice; in truth, I observed the qualities needed to empower a team for six weeks. I truly admired the genuine, magnetic connection she established with each intern, and I hope to bring forth the same in seasons to come.

Witnessing each intern (myself included!) go from completely new at the start of the season to explaining the significance of each task with ease by the very end was unlike anything else. Presenting our field season recap to the Port Orford community side-by-side with Sophia, Eden, Oceana, and Allison gave me an incredible sense of pride, and I am thrilled for the second TOPAZ/JASPER Decadal party in 2034, when we can uncover where this internship has taken us all.

…Until next season (:

Fig 3: Team Protein all together at the start of the season.

Fig 4: Team Protein all smiles after wrapping up the season with the Community Presentation.

Fig 5: Our season by the numbers for the 2024 TOPAZ/JASPER season!

Speeding Up, Slowing Down, and Choosing My Fig

Celest Sorrentino, incoming master’s student, OSU Dept of Fisheries, Wildlife, and Conservation Sciences, GEMM Lab

It’s late June, a week before I head back to the West Coast, and I’m working one of my last shifts as a server in New York. Summer had just arrived and the humidity was just getting started, but the sun brought a liveliness to the air that was contagious. Our regulars had traded the city heat for beaches in the Hamptons, so I stood by the door, watching the flow of hundreds upon hundreds of people fill the streets of Manhattan. My manager and I always chatted to pass the time between rushes, and he began to ask me how I felt about moving across the country and starting my master’s program so soon.

“I am so excited!” I beamed, “Also a bit nervous–”

“Nervous? Why? Are you nervous you’ll become the person you’re meant to be?”

As a first-generation Hispanic student, I found solace in working in hospitality. Working in a restaurant for four years was a means to support myself through an undergraduate degree, but I’d be lying if I said I didn’t also love it. I found joy in orchestrating a unique experience for strangers, who brought their own stories to share, each day bestowing opportunities for new friendships or new lessons. The industry requires you to be quick on your feet (never mess with a hungry person’s cacio e pepe), exude finesse, and stay continuously alert to your clients’ needs and desires, all while maintaining a specific ambiance.

So why leave to start my master’s degree?

Fig 1: Me as a server with one of my regulars before his trip to Italy. You can never go wrong with Italian!

For anyone I have not yet had the pleasure to meet, my name is Celest Sorrentino, an incoming master’s student in the GEMM Lab this fall. I am currently writing to you from the Port Orford Field Station, located along the charming south coast of Oregon. Although I am new to the south coast, my relationship with the GEMM Lab is not; it has been warmly cultivated ever since I first stepped onto the third floor of the Gladys Valley Building as an NSF REU intern just two summers ago. I have gravitated back to the GEMM Lab every summer since: last summer as a research technician and this summer as a co-lead for the TOPAZ/JASPER Project, a program I will continue to spearhead for the next two summers. (The GEMM Lab and me, we just have something; what can I say?)

At the risk of cementing “cornball” to my identity, pursuing a life in whale research has been my dream ever since I was a little girl. As I grew older, I found an inclination toward education, in particular a specific joy that could only be found when teaching others, whether that meant teaching the difference between “bottom-up” and “top-down” trophic cascades to my peers in college, teaching my 11-year-old sister how to do fun braids for middle school, or teaching a room full of researchers how I used SLEAP A.I. to track gray whale mother-calf pairs in drone footage.

Onboarding to the TOPAZ/JASPER project meant entering a new world, one that required me to quickly learn the ins and outs of the program and eventually take the reins of responsibility for the team, all within a month and a half. As the TOPAZ/JASPER 2024 team (aka Team Protein!) and I approach our fifth week of the field season, to say we have learned “so much” is an understatement.

Our morning data collection commences at 6:30 AM, with each of us alternating daily between the cliff team and kayak team. 

For the kayak team, it’s imperative to assemble all supplies swiftly, given that we’re in a race against time to outrun the inevitable windy or foggy conditions. However, diligence is required; if you forget your paddles back at the lab or run out of charged batteries, that’s less time on the water to collect data and more time for the weather to gain on you. We speed up against the weather, but also slow down for the details.

Fig 2: Throwback to our first kayak training day with Oceana (left), Sophia (middle), and Eden (right).

For the cliff team, we have joined forces with time. At some point within the last few weeks, each of us on the cliff has had to uncover the dexterity to become a true marine mammal observer (for five or six hours straight). Here we survey for any slight shift in a sea of blue that could indicate the presence of a whale, and once we do… it’s go time. Once a whale blows, miles offshore, the person manning the theodolite has just a few seconds to find and focus the reticle before the blow dissipates into the wind. If they miss it… it’s one less coordinate on that whale’s track. We speed up against the whale’s blow, but also slow down for the details.

Fig 3: Cliff team tracking a whale out by Mill Rocks!

I have found that the pattern of speeding up and slowing down has parallels outside of fieldwork as well. In Port Orford specifically, slowing down has felt just as invigorating as the first breath one takes out of the water. For instance, the daily choice to squeeze five scientists into the world’s slowest elevator down to the lab every morning may not be practical in everyday life, but the extra minute looking at each other’s sleepy faces sets the foundation for our “go” mode. We also sit down after a day of fieldwork, as a team, eating our fifth version of pasta and meatballs while we continue our Hunger Games movie marathon from the night prior. And we chose our “off-day” to stroll among nature’s gentle giants, experiencing together the awe of the redwoods.

Fig 4A & 4B: (A) Team Protein (Sophia, Oceana, Allison, Eden, and I) on our slow morning elevator ride down to the lab. (B) Sophia hugging a tree at the Redwoods!

When my manager asked the above question, I couldn’t help but think upon an excerpt, popularly known as “The Fig Tree” by Sylvia Plath.

Fig 5: The Fig Tree excerpt by Sylvia Plath. Picture credits to @samefacecollective on Instagram.

For my fig tree, I imagine it as grandiose as those redwood trees. What makes each of us choose one fig over another is highly variable, as are our figs of possibility themselves, some of which we can’t quite make out yet. At some point in my life, the fig of owning a restaurant in the Big Apple popped up. But in that moment with my manager, I imagined my oldest fig, with little Celest sitting on the living room floor watching ocean documentaries and wanting nothing more than to conduct whale research, now winking at me as I start my master’s within the GEMM Lab. Your figs might be different from mine, but what I believe we share is the alternating pace toward our fig. At times we need to speed up, while at other times we just need to slow down.

Then there’s that sweet spot in between where we can experience both, just as I have as part of the 2024 TOPAZ/JASPER team.

Fig 6A and 6B: (A) My sister and I excited to go see some dolphins for the first time! (~2008). (B) Taking undergraduate graduation pics with my favorite whale plushy! (2023)

Fig 7: Team Protein takes on the Port Orford Minimal Carnival, with lots of much-needed boogieing after finishing fieldwork!