Background and Research Question
Mineral dust is the most important external source of phosphorus (P), a key nutrient controlling phytoplankton productivity and carbon uptake, to the offshore ocean (Stockdale et al., 2016). Paytan and McLaughlin (2007) emphasized that atmospheric P can be the major external supply to the offshore ocean, particularly in oligotrophic areas of the open ocean and in P-limited areas such as the Bermuda Ocean. The most important source of atmospheric P is desert dust, which has been estimated to supply 83% (1.15 Tg⋅a−1) of the total global sources of atmospheric P (Mahowald et al., 2008). Of that dust, it is estimated that 10% is leachable P (Stockdale et al., 2016). In addition, Saharan dust supplies a significant fraction of the P budget of the highly weathered soils of America’s tropical forests and of the oligotrophic waters of the Atlantic Ocean, increasing the fertility of these ecosystems (Gross et al., 2015).
This background is my starting point for analyzing the correlation between Particulate Organic Phosphorus (POP) and Primary Production (PP) in the Bermuda Ocean. In exercise 2 I tried to answer this question, but the result was not what I expected. In exercise 3 I explored my raw data much further and realized that the way I divide the data strongly affects the result. I also dug into time series, regression, and the auto-correlation function in R.
Because I misinterpreted my data in exercise 2, I added one more variable in exercise 3 to broaden my analysis and support my result. Previous studies show that most phosphorus (P) and iron (Fe) are present as minerals that are not immediately soluble in water and hence not bioavailable (Eijsink et al., 2000; Shi et al., 2012). Eijsink et al. (2000) also stated that P and Fe, if deposited to the surface ocean, may pass through the photic zone with no effect on primary productivity, owing to their high settling velocity and low solubility. The photic zone has relatively low nutrient concentrations, so phytoplankton there do not receive enough nutrients. Moreover, several factors contribute to primary production: physical factors (temperature, hydrostatic pressure, turbulent mixing), chemical factors (oxygen and trace elements), and biological factors. Hence, I added temperature as a variable in exercise 3.
In exercise 3, I would like to find out which factor contributes to PP in the Bermuda Ocean, and what the time cycle of the three variables (PP, POP, and temperature) looks like.
Dr. Julia helped me pre-process the data in Excel. For further analysis, I used three tools in R:
- Time series function: This function plots the time series trend for each variable.
Ex code: plot.ts(POPR['PP'], main="PP depth 0-7 meter", ylab="PP")
- Linear regression model: This function helps identify the relationship between two variables. In this exercise, the dependent variable is Primary Production and the independent variables are POP and Temperature.
Ex code: POPR.lm <- lm(PP ~ POP, data = POPR) and summary(POPR.lm). To plot the result as a scatter plot with a fitted line, I used the ggplot function.
Ex code: ggplot(POPR, aes(x=POP, y=PP))+ geom_point() + geom_smooth(method=lm,se=FALSE)
- Auto-correlation function: This function is used to find repeating patterns in the data. Specifically, the autocorrelation function gives the correlation between points separated by various time lags. The correlation at each lag ranges from +1 to -1, where +1 is perfectly related and -1 is inversely related. In the ACF chart, the dashed lines mark the boundary beyond which a correlation is significant.
Ex code: acf(POPR['PP'], lag.max = 51, type = "correlation", plot = TRUE, na.action = na.contiguous, demean = TRUE)
In this code, lag.max can be set to the number of data points to see the coefficient at every lag, or to NULL to use the default. For this exercise I set lag.max according to the number of data points.
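The regression and ACF tools above can be tried on made-up data before running them on the real series. The sketch below is only an illustration: the synthetic POPR data frame, its values, and the 12-step cycle are assumptions, not the actual data. It also shows how the dashed significance lines in an ACF chart can be reproduced by hand, assuming the usual approximate 95% bound of 1.96 / sqrt(n).

```r
# Synthetic stand-in for the POPR data frame; all values are invented for illustration.
set.seed(1)
n <- 51
step <- seq_len(n)
POP <- 0.02 + 0.01 * cos(2 * pi * step / 12) + rnorm(n, sd = 0.003)
PP <- 2 + 150 * POP + rnorm(n, sd = 0.5)
POPR <- data.frame(PP = PP, POP = POP)

# Time series plot (only drawn in an interactive session).
if (interactive()) plot.ts(POPR$PP, main = "PP (synthetic)", ylab = "PP")

# Linear regression with PP as the dependent variable.
POPR.lm <- lm(PP ~ POP, data = POPR)
r_squared <- summary(POPR.lm)$r.squared

# Autocorrelation without plotting, plus the approximate 95% bound
# that acf() draws as dashed lines.
pp_acf <- acf(POPR$PP, lag.max = 20, plot = FALSE)
sig_bound <- qnorm(0.975) / sqrt(pp_acf$n.used)
lags <- as.vector(pp_acf$lag)
coefs <- as.vector(pp_acf$acf)
significant_lags <- lags[lags > 0 & abs(coefs) > sig_bound]
```

Here significant_lags lists the lags whose bars would cross the dashed lines in the chart, which makes it easier to compare variables without reading the plot by eye.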
Steps of Analysis
1. Divided the data based on the depth categories:
2. Regression analysis of PP vs POP and PP vs temperature. Since the PP and temperature data are only available for the 0-8 meter depth, the regression is only conducted for this depth category. Before performing the regression, I removed four outliers:
3. Assessing the temporal pattern of PP, temperature, and POP using the auto-correlation function and the time series function. Because of missing data for POP and temperature at 0-7 meter depth, the dates 2014/03/06, 2015/02/04, and 2015/12/13 are removed for POP, and the dates 2014/03/06 and 2014/12/11 are removed for temperature.
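The first and third steps above (grouping by depth and dropping dates with missing values) were done in Excel, but they could also be sketched in R. In the example below the sample rows, the depth breaks, and the class labels are all hypothetical and only stand in for the real depth categories.

```r
# Hypothetical samples; dates, depths, and POP values are invented.
samples <- data.frame(
  date = as.Date(c("2014-03-06", "2014-03-06", "2014-04-10", "2014-04-10")),
  depth_m = c(3, 15, 5, 120),
  POP = c(NA, 0.021, 0.018, 0.009)
)

# Hypothetical depth breaks (meters) and labels; replace with the real categories.
depth_breaks <- c(0, 7, 22, 100, 200)
depth_labels <- c("0-7 m", "7-22 m", "22-100 m", "100-200 m")
samples$depth_class <- cut(samples$depth_m, breaks = depth_breaks,
                           labels = depth_labels, include.lowest = TRUE)

# Keep the shallowest class, then drop the dates where POP is missing.
surface <- samples[samples$depth_class == "0-7 m", ]
surface_complete <- surface[!is.na(surface$POP), ]
```

Doing the split in code keeps a record of exactly which breaks and which dates were used, which matters here because the way the data are divided changes the result.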
From figure 1, overall POP has a positive correlation with PP, and temperature has a negative correlation with PP (for PP vs POP the R-squared is 0.3064, and for PP vs temperature the coefficient is -0.183). From this analysis I can assume that POP is a factor that contributes to Primary Production in the Bermuda Ocean. To see more detail, I also ran the regression per month over the 5-year period. I did not perform the regression for January due to a lack of data. From figure 2, we can see that in February and from April to December, the pattern of the PP vs POP correlation is similar to the pattern of PP vs temperature, with the highest value in May and the lowest in December and February. The largest difference occurs in March, where the value is +0.98 for PP vs POP and -0.99 for PP vs temperature.
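The per-month comparison described above can be sketched as a correlation computed separately for each calendar month. This is a minimal illustration on synthetic data: the obs data frame, the helper monthly_cor, and the min_n threshold are assumptions introduced here, not the code used for figure 2.

```r
# Synthetic monthly observations from June 2012 to December 2016 (invented values).
set.seed(42)
dates <- seq(as.Date("2012-06-01"), by = "month", length.out = 55)
month_num <- as.integer(format(dates, "%m"))
obs <- data.frame(
  date = dates,
  POP = runif(55, min = 0.01, max = 0.03),
  temperature = 23 + 4 * sin(2 * pi * (month_num - 4) / 12) + rnorm(55, sd = 0.3)
)
obs$PP <- 1 + 100 * obs$POP + rnorm(55, sd = 0.3)
obs$month <- format(obs$date, "%m")

# Hypothetical helper: Pearson correlation of two columns within each calendar month.
# Months with fewer than min_n observations return NA instead of a coefficient.
monthly_cor <- function(data, x, y, min_n = 3) {
  sapply(split(data, data$month), function(d) {
    if (nrow(d) < min_n) return(NA_real_)
    cor(d[[x]], d[[y]])
  })
}

r_pop <- monthly_cor(obs, "PP", "POP")
r_temp <- monthly_cor(obs, "PP", "temperature")
```

With only four or five points per month, coefficients close to +1 or -1 can appear by chance, so monthly values are best read together with the number of observations behind them.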
2. Auto-correlation and Time Series of PP, POP and Temperature at depth 0-7 meter
From figure 3, the temporal patterns of PP and POP are similar: the highest peak is at the beginning of every year, except in 2012 and 2016. This is because the data start in June in 2012 and in March in 2016. The pattern is inverted for temperature: at the beginning of the year the temperature is very low, and it is high from June to July. To confirm this pattern, we can look at the autocorrelation (ACF) chart, where temperature is significantly related to time. The vertical lines cross the horizontal dashed lines, which means there is a repeating cycle in time for temperature.
PP and POP also show a repeating temporal pattern, with the highest values in January-March. The difference from temperature is that the temperature cycle repeats with nearly the same values, which is not the case for POP and PP. There is a big gap between the POP and PP values in January and February of 2013 and those in January and February of other years. I think that is why there is only one significant correlation line at the beginning of the ACF chart for POP and PP.
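The idea that one unusually large peak can hide a repeating cycle in the ACF can be checked on synthetic data. The sketch below compares a stable seasonal series with a series that has the same 12-step cycle but one much larger early peak. Both series, the peak scaling, and the helper count_significant are invented for illustration and are not the observed data.

```r
set.seed(7)
n <- 51
step <- seq_len(n)

# Stable cycle: similar values every year (temperature-like).
stable <- 23 + 4 * sin(2 * pi * step / 12) + rnorm(n, sd = 0.2)

# Same 12-step cycle, but the peaks in the first year are five times larger
# than in later years (a stand-in for the large early values of POP and PP).
peak_scale <- ifelse(step <= 12, 5, 1)
spiky <- peak_scale * pmax(0, cos(2 * pi * step / 12))^4 + rnorm(n, sd = 0.1)

# Hypothetical helper: number of lags (excluding lag 0) whose autocorrelation
# exceeds the approximate 95% bound drawn as dashed lines by acf().
count_significant <- function(x, lag.max = 24) {
  a <- acf(x, lag.max = lag.max, plot = FALSE)
  bound <- qnorm(0.975) / sqrt(a$n.used)
  sum(abs(as.vector(a$acf)[-1]) > bound)
}

n_stable <- count_significant(stable)
n_spiky <- count_significant(spiky)
```

Comparing n_stable with n_spiky shows how much the count of significant lags depends on the size of the values, not only on whether a cycle exists.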
3. Auto-correlation and Time Series of POP by depth categories
From figure 4 we can see that the temporal pattern for lags 0-5 is similar for all depth categories, whereas over all lags only POP 1 to POP 3 (depth 0-22 meter) are similar. This suggests that deposited POP can reach 22 meters depth at the same time.
For POP 4 to POP 6, there is no repetitive pattern in either the ACF chart or the time series chart. However, a repetitive pattern appears from POP 7 to POP 10, that is, from 98-164 meter depth.
Judging by the values of POP 11 (depth 197-200 meter), only a small amount of POP reaches this depth.
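The comparison between depth categories above was made by eye from the charts. One possible way to put a number on it, shown here only as a sketch, is to compute the ACF for each depth column and then correlate the ACF profiles with each other. The pop_by_depth data frame, its column names, and its values are synthetic assumptions.

```r
set.seed(3)
n <- 51
step <- seq_len(n)
seasonal <- cos(2 * pi * step / 12)

# Synthetic POP series for three hypothetical depth classes:
# two with a shared seasonal cycle and one with no cycle.
pop_by_depth <- data.frame(
  POP1 = 0.020 + 0.010 * seasonal + rnorm(n, sd = 0.002),
  POP2 = 0.018 + 0.008 * seasonal + rnorm(n, sd = 0.002),
  POP5 = 0.010 + rnorm(n, sd = 0.002)
)

# One column of autocorrelation coefficients (lags 0 to 12) per depth class.
acf_profiles <- sapply(pop_by_depth, function(x) {
  as.vector(acf(x, lag.max = 12, plot = FALSE)$acf)
})

# Pairwise similarity of the ACF profiles between depth classes.
profile_similarity <- cor(acf_profiles)
```

A high value in profile_similarity for two depth classes means their autocorrelation charts have a similar shape, which is the same judgment made visually from figure 4.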
Critique of the method – what was useful, what was not?
Overall, exercise 3 made me realize that the way we process the data affects the result. In exercise 2 I oversimplified the process, so I had difficulty interpreting my data and the result was not what I expected.
In exercise 3 I learned a lot about the auto-correlation function and the time series function in R. In my opinion, ACF is very useful for stable data like temperature, where the repeating pattern also repeats in value. The significance of the ACF depends on the values, so in the temperature ACF chart (figure 3) most lags show a significant correlation over the time period (both positive and negative), which does not happen for the POP and PP data.
POP and PP have a pattern, but their values vary over the time period. Hence, although we can see the pattern in the time series and ACF charts, the significance only appears at the beginning, in the first months of 2013. In my opinion, if we want to assess the significance of a cycle, ACF is not really suitable for unstable data like POP and PP, where there is a big gap in values over the time period. However, if we just want to see the pattern, we can rely on both the time series and ACF functions.
Eijsink, L. M., Krom, M. D., & Herut, B. (2000). Speciation and burial flux of phosphorus in the surface sediments of the eastern Mediterranean. American Journal of Science, 300(6), 483–503. https://doi.org/10.2475/ajs.300.6.483
Gross, A., Goren, T., Pio, C., Cardoso, J., Tirosh, O., Todd, M. C., Rosenfeld, D., Weiner, T., Custódio, D., & Angert, A. (2015). Variability in Sources and Concentrations of Saharan Dust Phosphorus over the Atlantic Ocean. Environmental Science & Technology Letters, 2(2), 31–37. https://doi.org/10.1021/ez500399z
Mahowald, N., Jickells, T. D., Baker, A. R., Artaxo, P., Benitez-Nelson, C. R., Bergametti, G., Bond, T. C., Chen, Y., Cohen, D. D., Herut, B., Kubilay, N., Losno, R., Luo, C., Maenhaut, W., McGee, K. A., Okin, G. S., Siefert, R. L., & Tsukuda, S. (2008). Global distribution of atmospheric phosphorus sources, concentrations and deposition rates, and anthropogenic impacts. Global Biogeochemical Cycles, 22(4). https://doi.org/10.1029/2008GB003240
Paytan, A., & McLaughlin, K. (2007). The Oceanic Phosphorus Cycle. Chemical Reviews, 107(2), 563–576. https://doi.org/10.1021/cr0503613
Shi, Z., Krom, M. D., Jickells, T. D., Bonneville, S., Carslaw, K. S., Mihalopoulos, N., Baker, A. R., & Benning, L. G. (2012). Impacts on iron solubility in the mineral dust by processes in the source region and the atmosphere: A review. Aeolian Research, 5, 21–42. https://doi.org/10.1016/j.aeolia.2012.03.001
Stockdale, A., Krom, M. D., Mortimer, R. J. G., Benning, L. G., Carslaw, K. S., Herbert, R. J., Shi, Z., Myriokefalitakis, S., Kanakidou, M., & Nenes, A. (2016). Understanding the nature of atmospheric acid processing of mineral dusts in supplying bioavailable phosphorus to the oceans. Proceedings of the National Academy of Sciences, 113(51), 14639. https://doi.org/10.1073/pnas.1608136113