{"id":762,"date":"2016-05-03T06:40:58","date_gmt":"2016-05-03T06:40:58","guid":{"rendered":"http:\/\/blogs.oregonstate.edu\/gemmlab\/?p=762"},"modified":"2016-05-03T15:49:20","modified_gmt":"2016-05-03T15:49:20","slug":"grad-school-headaches","status":"publish","type":"post","link":"https:\/\/blogs.oregonstate.edu\/gemmlab\/2016\/05\/03\/grad-school-headaches\/","title":{"rendered":"Grad School Headaches"},"content":{"rendered":"<p>By Florence Sullivan, MSc student GEMM lab<\/p>\n<p>Over the past few months I have been slowly (and I do mean SLOWLY \u2013 I don\u2019t believe I\u2019ve struggled this much with learning a new skill in a long, long time) learning how to work in \u201c<a href=\"https:\/\/www.r-project.org\/\">R<\/a>\u201d.\u00a0 For those unfamiliar with why a simple letter might cause me so much trouble, <a href=\"https:\/\/en.wikipedia.org\/wiki\/R_%28programming_language%29\">R<\/a> is a programming language and free software environment suitable for statistical computing and graphing.<\/p>\n<p>My goal lately has been to interpolate my whale tracklines (i.e. smooth out the gaps where we missed a whale\u2019s surfacing by inserting artificial locations).\u00a0 In order to do this I needed to know (1) How long does a gap between fixes need to be to identify a missed surfacing? (2) How many artificial points should be used to fill a given gap?<\/p>\n<p>The best way to answer these queries was to look at a distribution of all of the time steps between fixes.\u00a0 I started by importing my dataset \u2013 the latitude and longitude, date, time, and unique whale identifier for each point (over 5000 of them) we recorded last summer. 
I converted the locations into x &amp; y coordinates, adjusted the date and time stamp into the proper format, and used the package <a href=\"https:\/\/cran.r-project.org\/web\/packages\/adehabitatLT\/vignettes\/adehabitatLT.pdf\">adehabitatLT<\/a>\u00a0 to calculate the time difference between consecutive fixes.\u00a0 A package known as <a href=\"http:\/\/www.statmethods.net\/advgraphs\/ggplot2.html\">ggplot2<\/a> was useful for creating exploratory histograms \u2013 but my data was incredibly skewed (Fig. 1)! It appeared that the majority of our fixes happened less than a minute apart from each other. When you recall that gray whales typically take 3-4 short breaths at the surface between dives, this starts to make a lot of sense, but we had anticipated a bimodal distribution with two peaks: one for the quick surfacings, and one for the surfacings between 4-5 minute dives. Where was this second peak?<\/p>\n<figure id=\"attachment_764\" aria-describedby=\"caption-attachment-764\" style=\"width: 771px\" class=\"wp-caption aligncenter\"><a href=\"http:\/\/blogs.oregonstate.edu\/gemmlab\/files\/2016\/05\/hist-of-dt.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-764 size-full\" src=\"http:\/\/blogs.oregonstate.edu\/gemmlab\/files\/2016\/05\/hist-of-dt.png\" alt=\"Histogram of the difference in time (in seconds) between whale fixes. \" width=\"771\" height=\"536\" srcset=\"https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/2115\/files\/2016\/05\/hist-of-dt.png 771w, https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/2115\/files\/2016\/05\/hist-of-dt-300x209.png 300w\" sizes=\"auto, (max-width: 771px) 100vw, 771px\" \/><\/a><figcaption id=\"caption-attachment-764\" class=\"wp-caption-text\">Fig. 1. 
\u00a0Histogram of the difference in time (in seconds on x-axis) between whale fixes.<\/figcaption><\/figure>\n<p>Sometimes, taking the logarithm of the variable on one of your axes can help tease out more patterns in your data \u00a0&#8211; particularly in a heavily skewed distribution like Fig. 1. When I log-transformed the time interval data, our expected\u00a0bimodal distribution pattern became\u00a0evident (Fig. 2). And, when I back-calculate from the center of the two peaks, we see that the first peak occurs at less than 20 seconds (e^2.5 \u2248 12 secs), representing the short, shallow blow intervals, or interventilation dives, and that the second peak of dives spans ~2.5 minutes to \u00a0~5 minutes (e^4.9 = 134 secs, e^5.7 = 298 secs). Reassuringly, these dive intervals are in agreement with the findings of Stelle et al. (2008), who described the mean interval between blows as 15.4 \u00b1 4.73 seconds, and overall dives ranging from 8 seconds to 11 minutes.<\/p>\n<figure id=\"attachment_766\" aria-describedby=\"caption-attachment-766\" style=\"width: 616px\" class=\"wp-caption aligncenter\"><a href=\"http:\/\/blogs.oregonstate.edu\/gemmlab\/files\/2016\/05\/log-histogram-dt-everything.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-766 size-full\" src=\"http:\/\/blogs.oregonstate.edu\/gemmlab\/files\/2016\/05\/log-histogram-dt-everything.png\" alt=\"Fig. 2. Histogram of the log of time difference between whale fixes. \" width=\"616\" height=\"409\" srcset=\"https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/2115\/files\/2016\/05\/log-histogram-dt-everything.png 616w, https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/2115\/files\/2016\/05\/log-histogram-dt-everything-300x199.png 300w\" sizes=\"auto, (max-width: 616px) 100vw, 616px\" \/><\/a><figcaption id=\"caption-attachment-766\" class=\"wp-caption-text\">Fig. 2. 
Histogram of the log of time difference between whale fixes.<\/figcaption><\/figure>\n<p>So, now that we know what the typical dive patterns in this dataset are, the trick was to write code that would look through each trackline and identify gaps of greater than 5 minutes.\u00a0 Then, the code calculates how many artificial points to create to fill the gap, and where to put them.<\/p>\n<figure id=\"attachment_765\" aria-describedby=\"caption-attachment-765\" style=\"width: 390px\" class=\"wp-caption aligncenter\"><a href=\"http:\/\/blogs.oregonstate.edu\/gemmlab\/files\/2016\/05\/interpolation-check.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-765 size-full\" src=\"http:\/\/blogs.oregonstate.edu\/gemmlab\/files\/2016\/05\/interpolation-check.png\" alt=\"Fig. 3. A check in my code to make sure the artificial points are being plotted correctly. The blue points are the originals, and the red ones are new. \" width=\"390\" height=\"335\" srcset=\"https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/2115\/files\/2016\/05\/interpolation-check.png 390w, https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/2115\/files\/2016\/05\/interpolation-check-300x258.png 300w\" sizes=\"auto, (max-width: 390px) 100vw, 390px\" \/><\/a><figcaption id=\"caption-attachment-765\" class=\"wp-caption-text\">Fig. 3. A check in my code to make sure the artificial points are being plotted correctly. 
The blue points are the originals, and the red ones are new.<\/figcaption><\/figure>\n<p>One of the most frustrating parts of this adventure for me has been understanding the syntax of the R language.\u00a0 I know what calculations or comparisons I want to make with my dataset, but translating my thoughts into syntax for the computer to understand has not been easy.\u00a0 I keep running into error messages such as:<\/p>\n<p><strong>Error in match.names(clabs, names(xi)) :<\/strong><\/p>\n<p><strong>\u00a0 names do not match previous names<\/strong><\/p>\n<p>Solution: \u00a0I had to go line by line and verify that every single variable name matched, and it turned out that a capital letter in the wrong place was throwing the error!<\/p>\n<p><strong>Error in as.POSIXct.default(time1) :<\/strong><\/p>\n<p><strong>\u00a0 do not know how to convert &#8216;time1&#8217; to class \u201cPOSIXct\u201d<\/strong><\/p>\n<p>Solution: a weird case where the data was in the correct time format but was not being recognized, so I had to re-import the dataset from a different file format.<\/p>\n<p><strong>Error in data.frame(Whale.ID = Whale.ID, Site = Site, Latitude = Latitude,\u00a0 : \u00a0\u00a0arguments imply differing number of rows: 0, 2, 1<\/strong><\/p>\n<p>Solution: HELP! Yet to be solved\u2026.<\/p>\n<p>Is it any wonder that when a friend asks how I am doing, my answer is \u201cR is kicking my butt!\u201d?<\/p>\n<p>Science is a collaborative effort, where we build on the work of researchers who came before us. Rachael, a wonderful post-doc in the GEMM Lab, had already tackled this time-based interpolation problem earlier in the year working with albatross tracks. She graciously allowed me to build on her previous R code and tweak it for my own purposes. 
Two weeks ago, I was proud because I thought I had the code working \u2013 all that I needed to do was adjust the time interval we were looking for, and I could be off to the rest of my analysis!\u00a0 However, this weekend, the code has decided it doesn\u2019t work with any interval except 6 minutes, and I am lost.<\/p>\n<p>Many of the difficulties encountered when coding can be fixed by judicious use of <a href=\"https:\/\/www.google.com\/webhp?sourceid=chrome-instant&amp;ion=1&amp;espv=2&amp;ie=UTF-8#q=Error+in+na.fail.default:+missing+values+in+object\">Google<\/a>, <a href=\"http:\/\/stackoverflow.com\/\">Stack Overflow<\/a>, and the <a href=\"https:\/\/cran.r-project.org\/web\/packages\/adehabitatLT\/vignettes\/adehabitatLT.pdf\">CRAN repository<\/a>.<\/p>\n<p>But sometimes, when you\u2019ve been staring at the problem for hours, what you really need is a little praise for trying your best. So, if you are an R user, go download this package: <a href=\"https:\/\/cran.r-project.org\/web\/packages\/praise\/praise.pdf\">praise<\/a>, load the library, and type praise() into your console. You won\u2019t regret it (see Fig. 4).<\/p>\n<figure id=\"attachment_767\" aria-describedby=\"caption-attachment-767\" style=\"width: 660px\" class=\"wp-caption aligncenter\"><a href=\"http:\/\/blogs.oregonstate.edu\/gemmlab\/files\/2016\/05\/Screenshot-74.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-767 size-large\" src=\"http:\/\/blogs.oregonstate.edu\/gemmlab\/files\/2016\/05\/Screenshot-74-1024x576.png\" alt=\"Screenshot (74)\" width=\"660\" height=\"371\" \/><\/a><figcaption id=\"caption-attachment-767\" class=\"wp-caption-text\">Fig. 4. A little compliment goes a long way to solving a headache.<\/figcaption><\/figure>\n<p>Thank you to Rachael who created the code in the first place, thanks to Solene who helped me troubleshoot, and thanks to Amanda for moral support. 
Go GEMM Lab!<\/p>\n<p><em>Why do pirates have a hard time learning the alphabet?\u00a0 It\u2019s not because they love aaaR so much, it\u2019s because they get stuck at \u201cc\u201d!<\/em><\/p>\n<p>Stelle, L. L., W. M. Megill, and M. R. Kinzel. 2008. Activity budget and diving behavior of gray whales (Eschrichtius robustus) in feeding grounds off coastal British Columbia. Marine mammal science <strong>24<\/strong>:462-478.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>By Florence Sullivan, MSc student GEMM lab Over the past few months I have been slowly (and I do mean SLOWLY \u2013 I don\u2019t believe I\u2019ve struggled this much with learning a new skill in a long, long time) learning how to work in \u201cR\u201d.\u00a0 For those unfamiliar with why a simple letter might cause &hellip; <a href=\"https:\/\/blogs.oregonstate.edu\/gemmlab\/2016\/05\/03\/grad-school-headaches\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Grad School Headaches<\/span><\/a><\/p>\n","protected":false},"author":6597,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[636310],"tags":[1667,712846,634945,513,148762,712761,5],"class_list":["post-762","post","type-post","status-publish","format-standard","hentry","category-gray-whale-foraging-ecology-and-vessel-disturbance","tag-data-analysis","tag-florence-sullivan","tag-gray-whales","tag-marine-mammals","tag-oregon-coast","tag-r","tag-science"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"post_mailing_queue_ids":[],"_links":{"self":[{"href":"https:\/\/blogs.
oregonstate.edu\/gemmlab\/wp-json\/wp\/v2\/posts\/762","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.oregonstate.edu\/gemmlab\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.oregonstate.edu\/gemmlab\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.oregonstate.edu\/gemmlab\/wp-json\/wp\/v2\/users\/6597"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.oregonstate.edu\/gemmlab\/wp-json\/wp\/v2\/comments?post=762"}],"version-history":[{"count":4,"href":"https:\/\/blogs.oregonstate.edu\/gemmlab\/wp-json\/wp\/v2\/posts\/762\/revisions"}],"predecessor-version":[{"id":771,"href":"https:\/\/blogs.oregonstate.edu\/gemmlab\/wp-json\/wp\/v2\/posts\/762\/revisions\/771"}],"wp:attachment":[{"href":"https:\/\/blogs.oregonstate.edu\/gemmlab\/wp-json\/wp\/v2\/media?parent=762"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.oregonstate.edu\/gemmlab\/wp-json\/wp\/v2\/categories?post=762"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.oregonstate.edu\/gemmlab\/wp-json\/wp\/v2\/tags?post=762"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}