*Research Question*

My research focuses on the ecology of blue whales (*Balaenoptera musculus brevicauda*) in the South Taranaki Bight region (STB) of New Zealand (Figure 1). Despite the recent documentation of a foraging ground in the STB (Torres et al. 2015), blue whale distribution remains poorly understood in the Southern Hemisphere. The STB is New Zealand’s most industrially active marine region, and is the site of active oil and gas extraction and exploration, busy shipping traffic, and proposed seabed mining (Torres 2013). This potential space-use conflict between endangered whales and industry warrants further investigation into the spatial and temporal extent of blue whale habitat in the region. My goal for this term was to begin investigating the relationship between blue whales and their environment. Specifically, the question I am asking here is:

*Can the number of blue whales present in an area be predicted by remotely-sensed sea surface temperature and chlorophyll-a concentration?*

*Dataset*

For these analyses, I used blue whale sighting data from our 2017 survey of the STB. Between 2 and 21 February, we conducted 1,678 km of survey effort. During this time, we recorded 32 blue whale sighting events comprising 68 individual whales.

In addition to the blue whale sighting data from our survey, I downloaded two rasters containing average sea-surface temperature (SST) and chlorophyll-*a* (chl-*a*) concentration for the month of February 2017 from NASA’s Moderate Resolution Imaging Spectroradiometer (MODIS Aqua) website (https://modis.gsfc.nasa.gov/data/). These layers use a 4 km × 4 km grid cell size.

*Hypotheses*

The STB region is characterized by a wind-driven upwelling system which originates off Kahurangi Point and curves around Farewell Spit and into the bight (Shirtcliffe et al. 1990). Previous work has found that blue whales associate with cold, well-mixed, productive waters (Croll et al. 1998). Additionally, previous work on cetacean habitat modeling has found that animals tend to associate with oceanic fronts and eddies, where water masses of different temperatures meet and mix and appear to aggregate prey (Becker et al. 2010, Cotte et al. 2011). Therefore, I hypothesize that the number of blue whales present within a given spatial area will show a negative relationship with temperature, and positive relationships with chl-*a* concentration and with the local deviation in temperature.

*Approaches*

For my final analysis for this course, I was able to build a model which included both blue whale presence *and* absence data. First, I loaded the two raster layers in ArcMap and overlaid our research vessel’s trackline as well as the blue whale sighting locations (Figure 2).

I obtained remotely-sensed SST and chl-*a* values for every 4 km × 4 km grid cell that our research vessel transited through, using the “extract by mask” tool in ArcMap to extract only the portion of the rasters for which we have actual observations. Subsequently, I used the “raster to point” tool to create a layer with a single point value at the center of each cell we transited through (Figure 3). I then used the “near” tool to sum the number of blue whales sighted in each grid cell and extract that value to the points containing the oceanographic information.
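The per-cell tally described above can be illustrated outside ArcMap. The following is a minimal Python sketch, not the workflow actually used: it assumes a regular grid in projected coordinates, and the function names, grid origin, and coordinates are all hypothetical. On a regular grid, assigning a sighting to the cell that contains it is equivalent to assigning it to the nearest cell center. Cells that were transited but had no sightings are kept with a count of zero, which is what supplies the absence data.

```python
def cell_index(x, y, x_min, y_max, cell_size):
    """Return (row, col) of the grid cell containing point (x, y).

    Rows count downward from the top edge (y_max), as in most rasters.
    """
    col = int((x - x_min) // cell_size)
    row = int((y_max - y) // cell_size)
    return row, col


def whales_per_cell(track_points, sightings, x_min, y_max, cell_size):
    """Build {cell: whale count} for every cell the vessel transited.

    track_points: iterable of (x, y) positions along the trackline
    sightings:    iterable of (x, y, group_size) sighting records
    Transited cells with no sightings keep a count of 0 (absences).
    """
    counts = {
        cell_index(x, y, x_min, y_max, cell_size): 0 for x, y in track_points
    }
    for x, y, group_size in sightings:
        cell = cell_index(x, y, x_min, y_max, cell_size)
        counts[cell] = counts.get(cell, 0) + group_size
    return counts


# Hypothetical projected coordinates in km, made up for illustration only.
track = [(1.0, 9.0), (5.0, 9.0), (9.0, 9.0), (9.0, 5.0)]
sightings = [(5.5, 8.5, 2), (6.0, 9.5, 1)]
counts = whales_per_cell(track, sightings, x_min=0.0, y_max=12.0, cell_size=4.0)
print(counts)
```

In this toy example both sightings fall in the same cell, so their group sizes are summed, and the other three transited cells remain as zeros.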

Because there may not be a simple linear relationship between blue whale presence and SST or chl-*a*, as the work I did for tutorials 1 and 2 suggests, I wanted to explore what other environmental features can be gleaned from these remotely-sensed raster datasets. To assess, through a coarse proxy, whether fronts influence blue whale distribution, I computed the standard deviation of both SST and chl-*a* for each cell along the ship’s trackline. To do this, I used the “focal statistics” tool within the neighborhood toolbox in ArcMap, which calculated the standard deviation for each cell across the surrounding 5×5 cell area. Since each grid cell is 4 km on a side, each standard deviation value summarizes a 20 km × 20 km area. The results of this calculation for SST are shown in Figure 4.
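To make the moving-window calculation concrete, here is a minimal Python sketch of a 5×5 focal standard deviation. It is an illustration rather than a reproduction of the ArcMap tool: the function name `focal_std` is hypothetical, the use of the population standard deviation is an assumption, and so is the handling of edges and NoData cells (only valid cells inside the window are used). The SST values are made up.

```python
from statistics import pstdev


def focal_std(grid, size=5):
    """Standard deviation of each cell's size x size neighbourhood.

    grid: list of equal-length rows; None marks NoData (e.g. land or cloud).
    Cells near the edge use whatever part of the window falls inside the grid.
    Assumption: population standard deviation (pstdev).
    """
    half = size // 2
    n_rows, n_cols = len(grid), len(grid[0])
    out = [[None] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - half), min(n_rows, r + half + 1))
                for cc in range(max(0, c - half), min(n_cols, c + half + 1))
                if grid[rr][cc] is not None
            ]
            if window:
                out[r][c] = pstdev(window)
    return out


# Made-up SST values (deg C): uniform water on the left, a sharp front on the right.
sst = [[18.0, 18.0, 18.0, 18.0, 18.0, 15.0, 15.0] for _ in range(5)]
sd_sst = focal_std(sst, size=5)
print(sd_sst[2][2], sd_sst[2][4])
```

The first printed value comes from a window of uniform water and is zero, while the second comes from a window that straddles the temperature step and is larger. This is why the standard deviation can serve as a coarse proxy for fronts.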

I then used the “extract values to points” tool to add sd(SST) and sd(chl-*a*) to the points from the ship’s trackline. The resulting attribute table contained the geographic location, number of whales, SST, chl-*a*, sd(SST), and sd(chl-*a*) for every grid cell that our ship transited through. I exported this attribute table as a .csv file so that it could be read into R for further statistical analysis.

Finally, I built a generalized additive model (GAM) to assess the effect of these remotely-sensed oceanographic variables on the number of blue whales present in an area, which can be written as:

**Number of blue whales ~ Chl-a + SST + sd(Chl-a) + sd(SST)**
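Written out under the usual GAM conventions, the formula above corresponds to something like the following. This is a sketch under stated assumptions: a Poisson response with a log link (the link function is not stated in the text), and a smooth function $f_j$ for each predictor. The symbols $N_i$, $\mu_i$, $\beta_0$, and $f_j$ are notation introduced here, with $i$ indexing the grid cells along the trackline.

```latex
\begin{aligned}
N_i &\sim \mathrm{Poisson}(\mu_i), \\
\log \mu_i &= \beta_0
  + f_1\big(\text{Chl-}a_i\big)
  + f_2\big(\mathrm{SST}_i\big)
  + f_3\big(\mathrm{sd}(\text{Chl-}a)_i\big)
  + f_4\big(\mathrm{sd}(\mathrm{SST})_i\big)
\end{aligned}
```

Here $N_i$ is the number of blue whales counted in cell $i$ and $\mu_i$ is its expected value. If every $f_j$ were constrained to be a straight line, the model would reduce to a Poisson regression with linear terms.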

I used a GAM rather than a general linear model (GLM) because my assessments from previous exercises indicated clearly that the relationship is not linear. I chose a Poisson distribution because my response variable is a count containing many values of zero and one, which skews its distribution. The R script I used to run the model is copied below:

*Results*

The results of the GAM are shown in Table 1. Both SST and sd(SST) were significant predictors of the number of blue whales present. Neither chl-*a* nor sd(chl-*a*) had a significant relationship with the number of blue whales present. Of the two temperature variables, sd(SST) was a stronger predictor (p = 0.0129) than SST alone (p = 0.0434).

**Table 1.** Output from the generalized additive model. Statistically significant values are denoted by an *.

| Parameter | DF | Chi-square | p-value |
|---|---|---|---|
| Chl-a | 1 | 0.336 | 0.5624 |
| SST | 1 | 4.078 | 0.0434* |
| sd(Chl-a) | 1 | 0.696 | 0.4040 |
| sd(SST) | 1 | 6.182 | 0.0129* |

Overall, only 4.32% of the deviance in the data could be explained by the model.
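For readers unfamiliar with this statistic, deviance explained is conventionally computed as one minus the ratio of the model's residual deviance to the deviance of a null (intercept-only) model. The Python sketch below illustrates that calculation for a Poisson response. It is not the code that produced the 4.32% figure; the function names are hypothetical and the counts and fitted values are made up.

```python
import math


def poisson_deviance(observed, fitted):
    """Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu)).

    The y * log(y / mu) term is taken as 0 when y = 0.
    """
    total = 0.0
    for y, mu in zip(observed, fitted):
        term = y * math.log(y / mu) if y > 0 else 0.0
        total += term - (y - mu)
    return 2.0 * total


def percent_deviance_explained(observed, fitted):
    """100 * (1 - residual deviance / null deviance).

    The null model predicts the overall mean count for every cell.
    """
    mean_count = sum(observed) / len(observed)
    null_dev = poisson_deviance(observed, [mean_count] * len(observed))
    resid_dev = poisson_deviance(observed, fitted)
    return 100.0 * (1.0 - resid_dev / null_dev)


# Made-up whale counts per cell and made-up fitted means, for illustration only.
counts = [0, 0, 1, 0, 3, 0, 2, 0]
fitted = [0.5, 0.6, 0.8, 0.7, 1.0, 0.6, 1.0, 0.8]
print(round(percent_deviance_explained(counts, fitted), 2))
```

A model that predicts the mean count everywhere scores 0%, and a model that reproduces every count exactly scores 100%, so a value near 4% indicates that the predictors account for only a small share of the variation in whale numbers.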

*Significance*

Overall, the model I built here explains little of the pattern that may exist in this dataset. That is not overly surprising to me, as I am working with remotely-sensed data averaged over a month, and I know from personal experience that conditions in this area fluctuate over time. Moving forward on this project, I will incorporate *in situ* measurements of SST and fluorescence (a measure of chl-*a* concentration), as well as prey density as measured by acoustic backscatter strength. The work I did this term was a valuable exercise in working with these concepts, and I believe that my model will improve with the inclusion of finer-scale predictor variables.

I believe that the results presented here do have biological relevance, and follow what is currently established by the literature. Changes in temperature may not have a direct effect on blue whale distribution; however, temperature can be used as a proxy for other oceanographic phenomena, such as fronts, which may be of greater direct significance in defining blue whale habitat.

This habitat work is of particular interest as relatively little is known about blue whale distribution and habitat use in the Southern Hemisphere, including in New Zealand. The high concentration of industrial activity in the STB region increases the importance of knowing when and where blue whales are most likely to be present so that management strategies can be established accordingly.

*Statistical Approaches Learned*

I started the term off by using a hotspot analysis in exercise 1. For exercise 2, I used a geographically weighted regression (GWR). I think that the GWR is a useful and powerful tool; however, I don’t think my dataset is big enough for it to be especially useful, at least in the way that I was applying it. For exercise 3, I used a GLM and practiced moving between ArcMap and R. For the final piece of this project I performed further analyses in both ArcMap and R, ultimately settling on the GAM presented here. There were numerous other trials of various tools that did not make it into any exercises or tutorials, but were useful practice nonetheless.

One thing I think I could have spent some more time on this term is investigating spatial autocorrelation between my variables of interest, through approaches such as correlograms or variograms. Perhaps this is something I will spend a bit more time on as I move forward with my analysis.

I can certainly say that my abilities and my confidence in my use of both Arc and R have improved over the course of this term. While I was working with a dataset which is minimal compared to what I hope to include moving forward, I think that I am now prepared with several tools which will be useful when I incorporate more data. Finally, I found it tremendously helpful to discuss approaches with and receive feedback from my classmates!

*Response to Comments from Previous Tutorials*

In my tutorial 1, I used a geographically weighted regression to examine the relationship between chl-*a* concentration and blue whale group size at each blue whale sighting location. What I found was that the regression model could not fit a linear relationship between the two. Based on my findings and feedback I received, I used a non-linear model for my final analysis. One piece of feedback I received was a question about whether the behavior of the animals may affect their distribution and habitat use. I think that this is an interesting question, but that since I am just using one year of data for this analysis my sample size for different behaviors may not be large enough to draw meaningful conclusions. However, this is something I may pursue when I include more years of survey data in my analysis.

For my tutorial 2, I used a GLM to evaluate chl-*a* and SST as predictors of blue whale group size at blue whale sighting locations. It was suggested that I use a Poisson distribution in my regression model, which I did for the final analysis. For tutorial 2, I was not yet able to extract all of the absence values and so I was advised that my results would likely be more meaningful with the inclusion of points without whales. I was able to do this for the final analysis presented here, and the significance and interpretability of my results did improve. Additionally, I got a suggestion to use principal component analysis in order to decide which oceanographic variables to retain in my model. I think that this approach will be very useful moving forward once I have processed additional data and have several more variables to include.

**References**

Becker EA, Forney KA, Ferguson MC, Foley DG, Smith RC, Barlow J, Redfern JV (2010) Comparing California Current cetacean-habitat models developed using in situ and remotely sensed sea surface temperature data. Mar Ecol Prog Ser 413:163–183

Cotte C, d’Ovidio F, Chaigneau A, Lévy M, Taupier-Letage I, Mate B, Guinet C (2011) Scale-dependent interactions of Mediterranean whales with marine dynamics. Limnol Oceanogr 56:219–232

Croll DA, Tershy BR, Hewitt RP, Demer DA, Fiedler PC, Smith SE, Armstrong W, Popp JM, Kiekhefer T, Lopez VR, Urban J, Gendron D (1998) An integrated approach to the foraging ecology of marine birds and mammals. Deep Res II 45:1353–+

Shirtcliffe TGL, Moore MI, Cole AG, Viner AB, Baldwin R, Chapman B (1990) Dynamics of the Cape Farewell upwelling plume, New Zealand. New Zeal J Mar Freshw Res 24:555–568

Torres LG (2013) Evidence for an unrecognised blue whale foraging ground in New Zealand. New Zeal J Mar Freshw Res 47:235–248

Torres LG, Gill PC, Graham B, Steel D, Hamner RM, Baker S, Constantine R, Escobar-Flores P, Sutton P, Bury S, Bott N, Pinkerton M (2015) Population, habitat and prey characteristics of blue whales foraging in the South Taranaki Bight, New Zealand.