The life of a PhD student

I was starting to get a little jealous of Becky and Stephanie writing all the recent posts. So I’ve decided to make my own return to the blogosphere, today, by posting my answers to a few questions sent to me as part of a high school assignment. I enjoyed taking the time to answer them, and think they fairly succinctly summarize what life is really like for me as a grad student. (hint: it’s not all splashing around in the tropics or making grand discoveries).  So, here they are, with some minor additions and corrections:

What does a day of work look like for you?

There isn’t really any ‘average’ day for me – during different times of the year or different stages of a project, I have very different tasks to do. I would guesstimate that I spend 40% of the year analyzing data, 20% doing lab work, 15% preparing for fieldwork, 15% doing fieldwork, and 10% doing other tasks.

Data analysis is huge for me because I work with many terabytes of genetic sequence information, and have to learn and write code in multiple programming languages in order to handle it all. Sifting through that data and finding interesting patterns takes a lot of time! So on those days (which include today), I generally wake up, check the status of scripts that were running overnight, have breakfast, and then go to my office for the rest of the day to do statistics and scripting work on my computer. Veteran readers of this blog might remember the last few posts of mine, where I detailed some light data-wrangling and mapping tasks that were occupying me at the time. Most of my computer work is slightly more dry than even that… But to me, the results of this work can be the most exciting part of the job! And getting scripts to run smoothly and efficiently is an extremely satisfying experience! 

I suppose an average day for me looks something like this.

Before I can do data analysis, though, I have to collect samples and generate the data. Prior to traveling and getting into the water for sample collection, I have to do a lot of paperwork and preparation: gathering supplies, applying for permits and travel visas, arranging housing and transportation, establishing on-site emergency procedures, brushing up on my SCUBA and first-aid skills, and researching what kinds of corals I expect to find and collect at each location. This is by far my least favorite part of the job, and is a major reason that I’m feeling a bit burnt out from fieldwork lately! Paperwork and bureaucracy were not what I signed up for! But following the law and being safe are extremely important, so I spend a lot of time trying to make sure I do everything right when I travel.

Acquiring permits is my favorite part of the job… Wait, no, that’s not quite right…

Of course, the most fun portion of my work is the fieldwork itself! There’s no point in describing that here when you can instead watch this video: https://youtu.be/whQTKjexHCw. Again, veteran readers of this blog will also know what fieldwork is like for me from my previous posts and photos.

Ahh, that’s right, diving is the most fun part of my job!

After collecting samples in the field, I have to process those samples in the lab here in Oregon. I often say my lab work consists mostly of moving clear liquids from one tube to another – extracting, cleaning, aliquoting, and amplifying DNA, enzymes, and other colorless chemicals. It’s the kind of work that becomes rather mindless once you’re experienced, and can sometimes serve as a good time to get lost in thought. Ultimately, the result of all my liquid-mixing gets placed in a DNA sequencing machine, which spits out the aforementioned terabytes of data for analysis.

Moving liquids around in prettily-colored tubes isn’t so bad, either, though…

Why did you choose to become a marine biologist?

I had a pretty good idea that I wanted to be a scientist of some sort for as long as I can remember. I’ve also always been in love with tropical ecosystems, for some reason that I can’t explain. I think I decided more specifically on biology gradually throughout high school, during which I had a couple of important experiences. I remember in Sophomore biology class being fascinated by the workings of the cell; how diverse proteins are and how they act like such perfect miniature robots. My AP Biology course during my Junior year was exciting and incredible to me for all sorts of reasons. And between my Junior and Senior years, I worked as an assistant in a pathogen genetics laboratory here at OSU, which kind of sealed the deal for me. Choosing marine biology was a somewhat spontaneous decision that I made when I was applying for college… I don’t actually know why, but at some point during that process, I just decided that that was the program I was looking for. It probably had something to do with having had 5 fish tanks as a kid, playing water polo and swim team, and having loved learning to SCUBA dive on a family vacation to Hawaii. I just loved being in the water and seeing the beautiful and strange animals that inhabit it. Then, after I had started studying it, I realized just how amazing life in the ocean really is, and there really wasn’t any going back!

What are challenges you face in your studies/at work?

I think right now my biggest challenge is communicating my work to my fellow scientists and the public. That communication is really the most important part of a scientist’s work; we are paid by publicly-funded grants so that we can help everyone gain a better understanding of the world around us. Our primary mode of communication is the publication of peer-reviewed manuscripts, and right now, I have a lot of discoveries that I need to share but am struggling to write about. Part of that struggle is due to perfectionism: nothing in science is ever 100% proven, and I always want to find better evidence and consider every possible alternative before declaring to the world in a publication that something is true. But at some point, I will need to write things up according to my current understanding, while simply acknowledging that parts of that understanding are bound to change.

This blog is another important form of communication for me, because I get to speak more freely to the public. But communicating our science to the public is also a major challenge. A lot of people want to live vicariously and hear about the fun parts of my job, but if that’s all I ever talk about, other people start to wonder whether my job is worth the tax money. So I try hard in my posts to blend fun stories with useful educational material, and it’s not at all easy for me.

Do you work more in a lab setting or out in the ocean?

As per question 1, the majority of my work is not out on the ocean. But a significant fraction of it has been! As my project progresses, I will be spending less time on the water, but I’m really not complaining at this point. It’s a lot of fun, but it’s exhausting and disruptive to my career and personal life. 

What is your favorite part about being a marine biologist?

I really like to think that my work can somehow contribute to a better understanding of the world. Coral reefs in particular are in a real mess at the moment, facing threats from pollution, overfishing, climate change, etc., and I want to do what I can to learn about them, and maybe even help them, before they’re gone. Of course, experiencing them in-person during fieldwork is also amazing!

If you are more curious, check out a summary of my work here: http://oregonstate.edu/microbiology/vegathurberlab/global-coral-microbiome-project, and also the videos that we’ve been producing during that project, here: http://marinestudies.oregonstate.edu/global-coral-microbiome-project/.

Thanks for the questions, Sophia!

Lab Accomplishments: Aliens, Predators, and Brains

I’m a little tardy in writing this, but our lab has a few pieces of stellar news from the last couple of weeks. First off, the most exciting:

Photo from Stephanie Rosales

Meet the man formerly known as Mr. Rory Welsh. He will now be referred to as Dr. Rory Welsh. Or, more likely, still just Rory. This guy is one of the most humble and most awesome guys around. Since he successfully defended his PhD dissertation last week, he is now also formally recognized as an expert in our field, and the foremost expert in his particular corner of it. I know I speak for our whole lab and many others when I say congratulations – you deserve it.

In the course of our tenure as PhD students, we must take classes, teach classes, perform research, and share that research through a number of public presentations. And, most importantly, we must make some verifiable contribution to the collective knowledge of our field. Which brings me to the other fun lab news. While preparing for his dissertation defense, Rory wrapped up a couple of projects and wrote multiple papers. One was accepted to the influential ISME Journal and became available online just before his defense. Another (which was co-authored by a certain blogging scientist we all know…) was recently submitted to the open-access journal PeerJ and is undergoing the review process. Though it hasn’t yet been accepted, its pre-print also became available online last week. Both of these papers deal with the fascinating ecology of a particular coral-associated bacterial predator called Halobacteriovorax. I could tell you more about them, but I think it’d be best to hear that story straight from the Doctor’s mouth. Rory will tell you about them soon!

Last week also saw the publication of yet another paper from the lab! Stephanie, who has previously written a post for the blog, had her paper published on the metagenomics of seal brains! It’s available now at another open-access journal, PLOS ONE. Stephanie is also working on a blog post talking about that paper.

Whew! The rest of the lab’s been quite prolific. I definitely feel like I need to step up my game…

Mapmaking: Part 3

In the first two parts of this series, I introduced Lightroom, the Lightroom plugins LR/Transporter and FTP Publisher, and the programming languages AWK and R. With those tools, I organized my photos and got some of their metadata into a format that I can easily manipulate with R code.

After getting the photo information organized, I had a few more pieces of metadata to get together. In particular, I wanted to organize the map based on the taxonomy of the corals, and I wanted to include some information about the site of collection that wasn’t included in my sample metadata file. We are keeping this information in separate files, for a couple of reasons. Over the course of the project, multiple people have collected replicates of the same species of coral in different locations. Every time we collect a coral, we need to fill in a line of data in the sample metadata table. Right now, we have 57 columns in that table, meaning we have to manually fill in 57 pieces of information for each sample. On a whirlwind trip where we collect 50 samples, that adds up quickly to 2850 values, or 2850 opportunities to make a typo or some other error.

If any two columns in our table are highly repetitive and are dependent on each other, we should be able to allow the computer to fill one in based on the other. For example, we could create seven columns in the sample metadata file that detail each sample’s species, genus, family, order, phylogenetic clade, NCBI taxonomy ID number, and perhaps some published physiological data. However, all of these pieces of information are dependent on the first value: the species of coral sampled. If we collect the same species, say, Porites lobata, 25 times throughout the project, all the information associated with that species is going to be repeated again and again in our metadata sheet. However, if instead we create a single column in our sample metadata table for the species ID, we can then create a separate table for all the other information, with only one row per species. We cut down on the amount of manual data entry we have to do by 144 values for that species alone!* Not only does that save time; it helps to avoid errors. The same general principle applies to each site we’ve visited: certain values are consistent and prone to repetition and error, such as various scales of geographical information, measurements of water temperature and visibility, and locally relevant collaborators. So we created another table for ‘sites’. **
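The principle is easy to sketch in code (a toy example with made-up sample names, not our real 57-column tables; as the footnote below mentions, simple R or Python code both handle this kind of merging just fine):

```python
# Toy 'species' lookup table: per-species facts are entered exactly once.
species_table = {
    'Porites lobata':   {'genus': 'Porites',  'family': 'Poritidae',   'clade': 'III'},
    'Acropora palmata': {'genus': 'Acropora', 'family': 'Acroporidae', 'clade': 'VI'},
}

# Toy sample rows (hypothetical IDs): each stores only the species name,
# not the repeated taxonomic information.
samples = [
    {'sample_name': 'sample_001', 'genus_species': 'Porites lobata'},
    {'sample_name': 'sample_002', 'genus_species': 'Porites lobata'},
    {'sample_name': 'sample_003', 'genus_species': 'Acropora palmata'},
]

# The 'merge': the computer expands every sample row with its species' values.
expanded = [{**row, **species_table[row['genus_species']]} for row in samples]
print(expanded[0]['family'])  # Poritidae
```

However many times a species is collected, its genus, family, and clade are typed once and filled in by the computer, which is the whole point.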

Excerpt from 'species' metadata table

genus_species | genus | species | family | clade | TAXON_ID | NCBI_blast_name
Tubastrea coccinea | Tubastrea | coccinea | Dendrophyllidae | II | 46700 | stony corals
Turbinaria reniformis | Turbinaria | reniformis | Dendrophyllidae | II | 1381352 | stony corals
Porites astreoides | Porites | astreoides | Poritidae | III | 104758 | stony corals
Acropora palmata | Acropora | palmata | Acroporidae | VI | 6131 | stony corals
Pavona maldivensis | Pavona | maldivensis | Agaricidae | VII | 1387077 | stony corals
Herpolitha limax | Herpolitha | limax | Fungiidae | XI | 371667 | stony corals
Diploastrea heliopora | Diploastrea | heliopora | Diploastreidae | XV | 214969 | stony corals
Symphyllia erythraea | Symphyllia | erythraea | Lobophyllidae | XIX | 1328287 | stony corals
Heliopora coerulea | Heliopora | coerulea | Helioporaceae | Outgroup | 86515 | blue corals
Stylaster roseous | Stylaster | roseous | Stylasteridae | Outgroup | 520406 | stony corals
Excerpt from 'sites' metadata table

reef_name | date | reef_type | site_name | country | collected_by | relevant_collaborators | visibility
Big Vickie | 20140728 | Midshelf inshore reef | Lizard Island | Australia | Ryan McMinds | David Bourne, Katia Nicolet, Kathy Morrow, and many others at JCU, AIMS, and LIRS | 12
Horseshoe | 20140731 | Midshelf inshore reef | Lizard Island | Australia | Ryan McMinds | David Bourne, Katia Nicolet, Kathy Morrow, and many others at JCU, AIMS, and LIRS | 15
Al Fahal | 20150311 | Offshore reef | KAUST House Reefs | Saudi Arabia | Ryan McMinds, Jesse Zaneveld | Chris Voolstra, Maren Ziegler, Anna Roik, and many others at KAUST | Unknown
Far Flats | 20150630 | Fringing Reef | Lord Howe Island | Australia | Joe Pollock |  | 15
Raffles Lighthouse | 20150723 | Inshore Reef | Singapore | Singapore | Jesse Zaneveld, Monica Medina | Danwei Huang | 4.5
Trou d'Eau | 20150817 | Lagoon Patch Reef | Reunion West | France | Ryan McMinds, Amelia Foster, Jerome Payet | Le Club de Plongee Suwan Macha, Jean-Pascal Quod | 10
LTER_1_Fringing | 20151109 | Fringing Reef | Moorea | French Polynesia | Ryan McMinds, Becky Vega Thurber | the Burkepile Lab | >35

Thus, after loading and processing the sample and photo metadata files as in the last post, I needed to load these two extra files and merge them with our sample table. This is almost trivial, using commands that are essentially in English:

sites <- read.table('sites_metadata_file.txt',header=T,sep='\t',quote="\"")
data <- merge(samples,sites)
species_data <- read.table('species_metadata_file.txt',header=T,sep='\t',quote="\"")
data <- merge(data,species_data)

And we now have a fully expanded table.

A couple of commands are needed to account for empty values that are awaiting completion when we get the time:

data$relevant_collaborators[is.na(data$relevant_collaborators)] <- 'many collaborators'
data$photo_name[is.na(data$photo_name)] <- 'no_image'

These commands subset the table to just rows that had empty values for collaborators and photos, and assign to the subset a consistent and useful value. Empty collaborator cells aren’t accurate – we’ve gotten lots of help everywhere we’ve gone, and just haven’t pulled all the information from all the teams together yet! As for samples without images, I created a default image with the filename ‘no_image.jpg’ and uploaded it to the server as a stand-in.

Default image shown when a sample has no pictures.

Now I need to introduce the R package that I used to build my map: Leaflet for R. Leaflet is actually an extensive Javascript package, but the R wrapper makes it convenient to integrate my data. The package allows considerable control of the map within R, but the final product can be saved as an HTML file that sources the online Javascript libraries. Once it’s created, I just upload it to our webpage and direct you there!

Note that although I usually use R from the Terminal, it’s very convenient to use the application RStudio with this package, because you can see the product progress as it’s built, and then easily export it at the end.

To make my map more interesting, I took advantage of the fact that each marker on the Leaflet map can have a popup with its own arbitrary HTML-coded content. Thus, for each sample I integrated all my selected metadata into an organized graphical format. The potential uses for this are exciting to me; it means I could put more markers on the map, with tables, charts, interactive media, or lots of other things that can be specified with HTML. For now, though, I decided I wanted the popups to look like this, with just some organized text, links, and a photo:



So, I wrote the HTML and then used R’s paste0() function to plug in the sample-specific data in between HTML strings.

data$html <- paste0('<div style="width:300px; overflow:auto;">',
'<div width="100%" style="clear:both;">',
'<p>',
'<a href="https://www.flickr.com/search/?text=GCMP%20AND%20',data$genus_species,'" target="_blank">',data$genus_species,'</a>: ',
'<a href="https://www.flickr.com/search/?text=',gsub('.','',data$sample_name,fixed=T),'" target="_blank">',data$sample_name,'</a>',
'</p>',
'</div>',
'<div width="100%" style="float:left;clear:both;">',
'<img src="http://files.cgrb.oregonstate.edu/Thurber_Lab/GCMP/photos/sample_photos/processed/small/',data$photo_title,'.jpg" width="50%" style="float:left;">',
'<div width="50%" style="float:left; margin-left:10px; max-width:140px;">',
'Site: <a href="https://www.flickr.com/search/?text=GCMP%20AND%20',data$reef_name,'" target="_blank">',data$reef_name,'</a>',
'<p>Date: <a href="https://www.flickr.com/search/?text=GCMP%20AND%20',data$date,'" target="_blank">',data$date,'</a></p>',
'<p>Country: <a href="https://www.flickr.com/search/?text=GCMP%20AND%20',data$country,'" target="_blank">',data$country,'</a></p>',
'</div>',
'</div>',
'<div width="100%" style="float:left;">',
'<p>',
'Collected by <a href="https://www.flickr.com/search/?text=GCMP%20AND%20(',gsub(', ','%20OR%20',data$collected_by,fixed=T),')" target="_blank">',data$collected_by,'</a>',
' with the help of ',data$relevant_collaborators,'.',
'</p>',
'</div>',
'<div style="clear:both;"></div>',
'</div>')

Yeesh! I hate HTML. It definitely makes it uglier having to build the code within an R function, but hey, it works. If you want, we can go over that rat’s nest in more detail another time, but for now, the basics: I’ve created another column in our sample metadata table (data$html) that contains a unique string of HTML code on each row. First, I create a container for the top line of the popup, which holds the species name and sample name, stitched together into links to their photos on Flickr. Next, I paste together a source call to the sample’s photo on our server. Then I create a container with metadata information (and links to all photos associated with that metadata on Flickr), which sits next to the image. And finally, I stitch together some text and links to acknowledge the people who worked to collect that particular sample. Looking at that code right now, I’m marveling at how much nicer it looks now that I’ve cleaned it up for presentation…

And now that I’ve gotten all the metadata together and prepared the popups, the only thing left to do is create the map itself. However, I’ll leave that for just one more post in the series.


*math not thoroughly verified.

**edit: My father points out that we are essentially building a relational database of our metadata. In fact, I did initially intend to do that explicitly by loading these separate tables into a MySQL database. For now, however, our data isn’t all that complex or extensive, and separate tables that can be merged with simple R or Python code are working just fine. I’m sure someday we will return to a discussion of databases, but that day is not today.

Mapmaking: Part 2

No, you didn’t miss Mapmaking: Part 1. Before getting interrupted by last-minute extra fieldwork with the Waitt Foundation (which was awesome!), I gave an intro to photo management in Lightroom. Today I’ll expand on that, beginning a series of posts explaining how I created this map. On the way, I’ll introduce a little bit of…

*shudder*

coding.

Some really ugly code that I once wrote.

If you’ve been following my blog just to look at pretty beach pictures, I apologize. But I encourage you to keep reading. If any of the code makes you go cross-eyed, don’t worry; it does the same to me. I would love to field some questions in the comment section to make things clearer.

So. I have all of my photos keyworded to oblivion, and those keywords include sample IDs. How did I get them into my map? First, I needed to make sure I could link a given sample with its photos programmatically. I have a machine-readable metadata table that stores all our sample information, which we’ll be using later for data analysis. Metadata just refers to ‘extra’ information about the samples, and by machine-readable, I mean it’s stored in a format that is easy to parse with code. I used this table to build the map because it specifies GPS coordinates and provides things like the site name to fill in the pop-ups. But I didn’t have any photo filenames in this table, because it’s easier to organize the photos by tagging them with their sample IDs, like I explained last post. I simply needed to extract sample IDs from the photos’ keywords and add their filenames to my sample metadata table. And not by hand.

Excerpt from sample metadata table

sample_name | reef_name | date | time | genus_species | latitude | longitude
E1.3.Por.loba.1.20140724 | Lagoon entrance | 20140724 | 11:23 | Porites lobata | -14.689414 | 145.468137
E1.19.Sym.sp.1.20140724 | Lagoon entrance | 20140724 | 11:26 | Symphyllia sp | -14.689414 | 145.468137
E1.6.Acr.sp.1.20140726 | Trawler | 20140726 | 10:35 | Acropora sp | -14.683931 | 145.466483
E1.15.Dip.heli.1.20140726 | Trawler | 20140726 | 10:38 | Diploastrea heliopora | -14.683931 | 145.466483
E1.3.Por.loba.1.20140726 | Trawler | 20140726 | 10:41 | Porites lobata | -14.683931 | 145.466483

A popup from the map on our webpage, displaying the sample ID, selected metadata information, and a photo.

To get started, I installed a Lightroom plugin called LR/Transporter. This plugin contains many functions for programmatically messing with photo metadata. Using it, I created a ‘title’ for all of my photos with a sequence of numbers in the order that they were taken. The first sample photo from the project was one that Katia took while I was working in Australia, and it’s now called ‘GCMP_sample_photo_1’. Katia and I also took 17 other photos that contained this same sample, incrementing up to ‘GCMP_sample_photo_18’. The last photo I have from the project is one from my last trip, to Mo’orea, and it now has the title ‘GCMP_sample_photo_3893’.

Then, I exported small versions of all my photos to a publicly accessible internet server that our lab uses for data. I did this with another Lightroom plugin called FTP Publisher, from the same company that made LR/Transporter. Each photo was uploaded to a specific folder and given a filename based on its new arbitrary title. Thus my first photo, GCMP_sample_photo_1, is now easily located at:

http://files.cgrb.oregonstate.edu/Thurber_Lab/GCMP/photos/sample_photos/processed/small/GCMP_sample_photo_1.jpg

Next, I used LR/Transporter to export a machine-readable file where the first item in every line is the new title of the photo, and the second item is a comma-separated list of all the photo’s keywords, which include sample IDs.

Excerpt from Lightroom photo metadata table

GCMP_sample_photo_1 | E1.3.Por.loba.1.20140724, Fieldwork, GCMP Sample, ID by Ryan McMinds, Lagoon Entrance, Pacific Ocean
GCMP_sample_photo_2 | E1.3.Por.loba.1.20140724, Fieldwork, GCMP Sample, ID by Ryan McMinds, Lagoon Entrance, Pacific Ocean, Ryan McMinds
GCMP_sample_photo_124 | 20140807, E1.5.Gal.astr.1.20140807, GCMP Sample, ID by Ryan McMinds, Pacific Ocean, Trawler Reef
GCMP_sample_photo_1051 | Al Fahal, E4.3.Por.lute.1.20150311, GCMP Sample, ID by Ryan McMinds, KAUST, Red Sea
GCMP_sample_photo_3893 | E13.Out.Mil.plat.1.20151111, GCMP Sample, Mo'orea

Now comes the fun part.

To associate each sample with a URL for one of its photos, I needed to search for its ID in the photo keywords and retrieve the corresponding photo titles, then paste one of these titles to the end of the server URL. The only way I know to do this automatically is by coding, or maybe in Excel if I were a wizard. I’ve learned how to code almost 100% through Google searches and trial-and-error, so when I write something, it’s a mashing-together of what I’ve learned so far, and it’s made for results, not beauty. The first programming language I learned that was good for parsing tables was AWK, because I do a lot of work in the shell on the Mac terminal. I thus tackled my problem with that language first, in an excellent example of an inefficient method to get results:

while read -r line; do
search=$(awk '{print $1}' <<< $line)
awk -v search=$search 'BEGIN {list=""}
$0 ~ search && list != "" {list = list","$1}
$0 ~ search && list == "" {list = $1}
END {print search"\t"list}' photo-metadata-file.txt
done < sample-metadata-file.txt > output-file.txt

Ew.

I’ve been issuing my AWK commands from within the shell, which is a completely separate programming language. For the life of me, I couldn’t remember how to use AWK to read two separate files simultaneously while I was writing this code. I know I’ve done it before, but I couldn’t find any old scripts with examples, and rather than re-learn the efficient, correct way, I mashed together commands from two different languages. I then decided I needed to go back and do it the right way, so I rewrote the code entirely in AWK. That code snippet isn’t very long, but it took a lot of re-learning for me to figure it out. So it was about a week or so before I realized that since my map-making had to occur in yet another language (called R), it was ridiculous for me to be messing with AWK in the first place…
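For the curious: the ‘efficient, correct way’ to read two files in one AWK program is the NR==FNR idiom. My rewrite was presumably something like this sketch (toy stand-in files here rather than our real metadata, and substring matching instead of regex so the dots in sample IDs stay literal):

```shell
# Create tiny stand-in files in the same shape as the real ones:
# photo file is tab-separated (title, then comma-separated keywords);
# sample file has the sample ID as its first field.
printf 'GCMP_sample_photo_1\tE1.3.Por.loba.1.20140724, Fieldwork\n'    >  photo-metadata-file.txt
printf 'GCMP_sample_photo_2\tE1.3.Por.loba.1.20140724, Ryan McMinds\n' >> photo-metadata-file.txt
printf 'E1.3.Por.loba.1.20140724\nE1.19.Sym.sp.1.20140724\n' > sample-metadata-file.txt

awk -F'\t' '
    # While reading the FIRST file, the overall record number (NR) equals
    # the per-file record number (FNR), so we just cache those lines.
    NR == FNR { photos[$1] = $2; next }
    # Every later record is from the SECOND file: scan the cache for the ID.
    {
        list = ""
        for (title in photos)
            if (index(photos[title], $1))        # substring match on keywords
                list = (list == "" ? title : list "," title)
        print $1 "\t" list
    }
' photo-metadata-file.txt sample-metadata-file.txt
```

One program, one language, no shell loop re-scanning the photo file for every sample.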

So I came to my senses and started over.

In R, I simply import the two tables, like so:

samples <- read.table('sample-metadata-file.txt',header=T,sep='\t',fill=T,quote="\"")
photo_data <- read.table('photo-metadata-file.txt',header=F,sep='\t',quote="\"")

Then use a similar process as in AWK to create a new column of photo titles in the sample metadata table (this time I simply add the first photo instead of the whole list):

samples$photo_name <- as.character(sapply(samples$sample_name, function(x) { photo_data[grep(x,photo_data[,2])[1],1] }))
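That one-liner is dense, so here is the same logic sketched in plain Python (toy rows stand in for the real files): for each sample ID, keep the title of the first photo whose keyword list contains it, using a substring match just like grep() does in the R version.

```python
# Toy photo metadata in the same shape as the excerpt above:
# (title, comma-separated keyword string).
photo_rows = [
    ('GCMP_sample_photo_1',  'E1.3.Por.loba.1.20140724, Fieldwork, GCMP Sample'),
    ('GCMP_sample_photo_2',  'E1.3.Por.loba.1.20140724, Fieldwork, Ryan McMinds'),
    ('GCMP_sample_photo_17', 'E1.19.Sym.sp.1.20140724, GCMP Sample'),
]

def first_photo_title(sample_name):
    """Title of the first photo whose keywords mention the sample."""
    for title, keywords in photo_rows:
        if sample_name in keywords:   # substring match; keep only the first hit
            return title
    return 'no_image'                 # stand-in used for photo-less samples

print(first_photo_title('E1.19.Sym.sp.1.20140724'))  # GCMP_sample_photo_17
```

Run over every row of the sample table, this fills in the new photo_name column.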

And now, I have a single table that tells me the coordinates, metadata, and photo titles of each sample. With this, I can make the map, with one point drawn for each line in the table. I’ll continue explaining this process in another post.

Excerpt from sample metadata table

sample_name | reef_name | date | time | genus_species | latitude | longitude | photo_title
E1.3.Por.loba.1.20140724 | Lagoon entrance | 20140724 | 11:23 | Porites lobata | -14.689414 | 145.468137 | GCMP_sample_photo_1
E1.19.Sym.sp.1.20140724 | Lagoon entrance | 20140724 | 11:26 | Symphyllia sp | -14.689414 | 145.468137 | GCMP_sample_photo_17
E1.6.Acr.sp.1.20140726 | Trawler | 20140726 | 10:35 | Acropora sp | -14.683931 | 145.466483 | GCMP_sample_photo_37
E1.15.Dip.heli.1.20140726 | Trawler | 20140726 | 10:38 | Diploastrea heliopora | -14.683931 | 145.466483 | GCMP_sample_photo_37
E1.3.Por.loba.1.20140726 | Trawler | 20140726 | 10:41 | Porites lobata | -14.683931 | 145.466483 | GCMP_sample_photo_40

By the way, I am working on translating my blog into Spanish and French, to make it more accessible and just to help myself learn. If you’d like to help me, you can find the active translation of this post and others on Duolingo. Thank you!

Frequent Flier

Well, I just hit 50,000 miles that I’ve flown for this project. Since March. And to think, I laughed when Becky ‘warned’ me that the job would require a lot of travel…

Oh, btw, I’m in Montserrat. The volcanic island in the Caribbean that inspired Jimmy Buffett’s timeless classic ‘Volcano’. YouTube it.

Hopped on another little plane and landed on another little island.

I’m helping the Waitt Institute out with some ecological surveys, and in return, I hope to be able to get some samples for my project. But I found out about this just 7 days ago and am only just settling in to my accommodations. Such fun!

Because why not

Antigua

Photo management

First off, go play with this interactive map of our sampling locations on our project homepage, because I’ve been working on it for the last week and I’m very proud of it :).

Now, I have a confession to make.

Despite the singular focus of my prior blog posts, my work is not entirely composed of swimming around in the tropics. In fact, most months of the year, you can find me right here, bathing instead in the light of my computer screen.

I’ve been meaning to write more posts while stateside, but the subject matter is a bit more difficult to ‘spice up’. So I’ve put it off. Today, however, I think I’ve got an interesting topic that will begin a new theme of posts about the most interesting and time-consuming part of my job: computer work.

Since we returned from Reunion a couple of weeks ago, I’ve spent a considerable amount of time preparing the photos and data from our trips so that they are organized, useful, and publicly accessible. So far, the team has collected over 3,000 photos of more than 550 coral samples. Keeping these organized can become very difficult as we progress, so I’ve been working with a variety of tools to make it easier. When we’re in the field, we take tons of photos of each individual coral, from closeups that show small morphological details, to wide-angle photos that we can use later to determine the surroundings of the coral. We also take photos of the reef, photos of each other, and photos of that awesome creature that I’ve never seen before and it’s so close and so colorful and sooo cool and look at it feeding, it’s waving its antennae around and catching things and it’s so awesome!!

Seriously, this mantis shrimp was freaking cool

At the end of the day, I have hundreds of photos. Some are pretty, some need post-processing work to become pretty, some are definitely not pretty but can be used as data, and some might be usable as data with some post-processing of their own. Each photo might have one or multiple samples in it, or could be a great example of a particular disease, or maybe it just has one of us making a funny face. To be useful, I need a way to find these photos again, somewhere in the midst of the 47,000 other photos on my hard drive (seriously).

Ummm… data?

The primary tool I use to manage the mess is Adobe Lightroom. Lightroom enables me to process my photos in bulk and add keywords to the photos so I can easily search for them later. When I import all the photos from a particular dive, for instance, I have Lightroom automatically add the GPS coordinates for the dive and keywords for the site name, project, photographer, etc. Then I go through the photos and add keywords to each one that include sample identification codes and everything interesting in the picture, like fish, diseases, or divers. Now, there are two very neat aspects of Lightroom keywords that I take advantage of. The first is that you can establish keyword synonyms, so that every time you tag a photo with one word, its synonyms will automatically also be attached. I can tag a photo with ‘lionfish’, and that’s all well and good. But later, I might be thinking all sciency and want to find all my photos with ‘Pterois radiata’ in them. If I have previously told Lightroom that the scientific name and common name are synonyms, my search will find exactly what I need.

But what if I want to find all photos of fish that belong to Scorpaeniformes (the group that includes both lionfish and stonefish)? The second handy aspect of Lightroom keywords comes in here: they can be placed in a hierarchy. I’ve placed the keyword ‘Pterois radiata’ within ‘Pterois’, within ‘Scorpaeniformes’, so every time I tag a photo with the simple term ‘lionfish’, it’s also tagged with its higher-level taxonomic groupings. For our samples, I even put the sample ID keyword within its corresponding species. In fact, I’ve set up an entire taxonomic tree of organism names within my keywords, so every time I tag a simple sample ID, the photo is made searchable with terms corresponding to all the different levels of the tree of life. It’s awwwesommmmeee.

Manual keywords (5): E10.17.Cyp.sera.1.20150628, North Bay, Octopus, Photo by Joe Pollock, GCMP Sample
Resulting keywords (29): Animal, Anthozoan, Australia, Cephalopoda, Cnidaria, Cnidarian, Cyphastrea, Cyphastrea serailia, E10.17.Cyp.sera.1.20150628, GCMP, GCMP Sample, Hard coral, Hexacorallian, Indo-Pacific, LH_282, Lord Howe Island, Merulinidae, Metazoan, Mollusc, North Bay, Octopus, Pacific Ocean, Photo by Joe Pollock, Protostome, Robust, Scleractinian, Stony Coral, XVII, AU
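The expansion from 5 manual keywords to 29 is just this kind of tree-walking. Here’s a toy sketch of the mechanism in Python; the `PARENT` map and its levels are invented for illustration (and abbreviated compared to my real keyword tree), not anything Lightroom exposes.

```python
# Illustrative sketch of hierarchical keywords: tagging a leaf keyword
# also attaches every ancestor keyword up the (hand-built) tree.
PARENT = {
    "Pterois radiata": "Pterois",
    "Pterois": "Scorpaeniformes",
    "Scorpaeniformes": "Fish",
}

def with_ancestors(tag):
    """Return the tag followed by all of its ancestors, root last."""
    tags = [tag]
    while tag in PARENT:
        tag = PARENT[tag]
        tags.append(tag)
    return tags

expanded = with_ancestors("Pterois radiata")
# expanded == ['Pterois radiata', 'Pterois', 'Scorpaeniformes', 'Fish']
```

Tag once at the leaf, search at any level of the tree.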

The next stage of photo management for me is post-processing. I am nowhere close to an expert photographer or image editor, but I’m learning. It’s still amazing to me how much a photo can be improved with a couple of quick adjustments of exposure and levels. Most of the time, photos seem to come ‘off the camera’ with a washed-out and low-contrast look. Underwater photos always have their colors messed up. When we take photos of samples, we generally put a standard color card and CoralWatch Coral Health Chart in the frame so that we can make the right adjustments later. Fixing the color and exposure doesn’t just make the photos prettier, it can help us to understand the corals. It’s tough to spot patches of disease or the presence of bleaching when the whole photo is various dark shades of green. The best thing about Lightroom (at least compared to Photoshop and a number of other image editing programs)* is the ability to make adjustments in bulk. Often, a particular series of photos was all taken in very similar conditions. Say, all the photos from a single dive, where we were at 30 ft with a particular amount of visibility and cloud cover. I can play around with just one of the photos, getting the adjustments just right, then simply copy those adjustments and paste them to the rest of the photos from the dive. Voila! Hundreds of photos edited.
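Conceptually, the copy-and-paste workflow looks something like the sketch below. The setting names (`exposure`, `contrast`, `white_balance`) and file names are hypothetical stand-ins, not Lightroom’s actual develop-setting fields.

```python
# Illustrative sketch of batch editing: tune one reference photo from a dive,
# then apply the same develop settings to every other photo from that dive.
reference_settings = {"exposure": 0.8, "contrast": 15, "white_balance": 5200}

# Stand-ins for the rest of the photos from the same dive.
dive_photos = [{"file": f"dive12_{i:03}.jpg"} for i in range(1, 4)]

for photo in dive_photos:
    # Paste the reference adjustments onto each photo in the batch.
    photo.update(reference_settings)
```

Because the edits are non-destructive metadata rather than changes to the pixels, applying them to hundreds of photos is nearly instant.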

Before adjustments

After adjustments

Aaaand before

Aaaannd after

Once I’ve got the photos edited and organized, I can do fun things with them, like export them to Flickr for your browsing pleasure, or embed them in the map you explored at the beginning of the post. But explaining that is for another day…

*A note about software. The next-best photo software I’ve used is Google’s free (free!) Picasa. Picasa will also allow you to batch-edit photos, and had facial recognition long before Lightroom. iPhoto also has these features. But as far as I know, the keywording in Picasa and iPhoto doesn’t support hierarchies or synonyms.

Merci Beaucoup!

Although we generally like to post all the fun details of our project, doing fieldwork internationally is hard. Mountains of paperwork and preparation go into our trips (much of it often stressful and last-minute), and when we arrive, we generally don’t know the local corals very well, don’t know the language as well as we think we do, and don’t know the area at all. We’re learning as we go about all the best ways to make our trips go smoothly.

But for now, as I sit in the Paris airport on my way home, I’d like to give a shout-out to all the people who have helped make this particular trip happen. One of the first contacts Jerome made on the island was with Le Club de Plongee Suwan Macha – an organization of SCUBA divers that works like a co-op, buying and maintaining resources that are shared by members at a very affordable price. This system worked great for us as a way to get many customized dives in and seems like an awesome set-up for scientific diving in general. We even borrowed a few tanks of air for some of our ‘labwork’, unrelated to diving. After we joined the club, the acting president, Pierre Grisoni, volunteered his time to drive the boat and refill tanks for us for all the dives we did on the West coast of the island. These dives were essential to our collections and formed the core of our trip! Merci beaucoup à Pierre and the rest of the club!

Thanks, Pierre and Suwan Macha!

Another important contact was Dr. Jean-Pascal Quod, president of Reef Check France and manager of Pareto Ecoconsult. Jean-Pascal and the diving club SUBEST were instrumental in our collections on the East side of the island, and showed us some really great reefs over there.

Perhaps the most important local entity was The Natural Marine Reserve of La Réunion (RNMR), which provided us with local collections permits and prepared our CITES export permits. Dealing with this paperwork is often the most difficult part of our work, and being able to work with the local management authority is essential to our project.

Many other people have been helpful on this particular trip. For starters, I bummed a ride to and from the Portland airport with my parents, which was excellent. I also left my car with them and got lots of other help from them before leaving. I believe Amelia’s mother also took her and Jerome to the airport, after quickly sewing together my BCD weight pocket for me. Ummm, awesome!! Then there’s Jerome’s mom, who on multiple occasions hosted us all for outstanding dinners while we were in Reunion. Everything’s easier in life with parents like these!

Les parents McMinds: merci for all you do

We also met many of Jerome’s friends and family while there, and a number of them provided us with delicious food, too. Thank you to all of you for showing us your island and making the trip great!

Since we first started planning the trip, there has been one person who made the right contacts, spoke the right language, and put in a lot of effort to get all the permitting and paperwork done on the French end of things: our postdoc Dr. Jerome Payet. In addition to pre-trip organization, he also acted as our guide, facilitator, translator, and co-director throughout the trip. I’ve worked with Jerome a lot in the last couple years, and he has been an integral part of the lab for a bit longer than me, but working on this particular project was generous of him. This trip came at a special time for Jerome, as well, since he is now moving on to work with a different lab at OSU. The work he put into it is thus very much appreciated. Thank you – we will miss you!!

Au revoir, Jerome