- The research question that you asked (provide one question for each exercise).
Exercise 1: “How is Mean Sea Level Atmospheric Pressure distributed across the time period of 1980 to 2020?”
Exercise 2: “Is there a global relationship between Mean Sea Level Pressure and Geomagnetic Intensity (year 2020 specifically)?”
Exercise 3: “Is there a significant difference in the global distribution of the residual values for Mean Sea Level Pressure from Geomagnetic Intensity for the 2010-2020 period?”
- A description of the dataset you examined, with spatial and temporal resolution and extent.
Mean Sea Level Pressure data was retrieved from the Copernicus website from NOAA as a global .grib dataset with a 0.25° × 0.25° resolution for 1980 to 2020.
Global Geomagnetic data (specifically, Geomagnetic Intensity) was retrieved from the NOAA IGRF2015 model as a .csv file with a 0.5° × 0.5° resolution for the same period.
Both variables are gridded sets of values of a single quantity; in essence, one is meteorological data and the other is geological data.
- Hypotheses: predictions of patterns and processes you looked for.
My hypothesis was that higher Magnetic Intensity would result in higher Mean Sea Level Pressure by attracting heavy air particles closer to the area.
- Approaches: analysis approaches you used.
I used hotspot analysis and Moran’s I autocorrelation for the first exercise.
I used Cross-Correlation for my second exercise.
Finally, for the third exercise, I used the “Agreement/difference between two (raster or vector) layers” method, though I had to use several additional tools in place of the “Confusion matrix”, because the data was too large to be processed in ArcGIS.
- Results: what did you produce — maps? statistical relationships? other? Present the key, important results you created.
Several maps and statistical relationships were produced to describe the relationship between Mean Sea Level Pressure and Geomagnetic Intensity.
First, I used the ArcGIS geoprocessing tool called Spatial Autocorrelation (Global Moran’s I) to assess whether the data is dispersed or clustered.
I expected it to show a clustered distribution with a Moran’s Index of 0.99 and a p-value of 0, because the data is a 0.25° × 0.25° grid.
The results were as expected.
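To illustrate why a smooth gridded field scores as strongly clustered, here is a minimal sketch of the global Moran's I statistic on a small grid. It is not the ArcGIS implementation: it assumes rook (edge-sharing) neighbours with binary weights, and the function name `morans_i` and both toy grids are invented for illustration.

```python
def morans_i(grid):
    """Global Moran's I for a 2-D list of numbers.

    Assumptions (for illustration only): rook neighbours (cells that
    share an edge) and binary weights, w_ij = 1 for neighbours.
    """
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(v for row in grid for v in row) / n
    dev = [[v - mean for v in row] for row in grid]

    cross = 0.0  # sum of w_ij * dev_i * dev_j over all ordered neighbour pairs
    w_sum = 0    # sum of all weights w_ij
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross += dev[r][c] * dev[rr][cc]
                    w_sum += 1

    sum_sq = sum(d * d for row in dev for d in row)
    return (n / w_sum) * (cross / sum_sq)


# Toy grid 1: a smooth gradient, where every cell resembles its neighbours.
smooth = [[r + c for c in range(6)] for r in range(6)]
# Toy grid 2: a checkerboard, where every cell is the opposite of its neighbours.
checker = [[(r + c) % 2 for c in range(6)] for r in range(6)]

print("smooth gradient:", morans_i(smooth))
print("checkerboard:   ", morans_i(checker))
```

The smooth gradient gives a strongly positive index and the checkerboard a strongly negative one. A pressure field sampled on a fine grid behaves like the first case, since neighbouring cells hold nearly the same value.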
In order to define future areas of interest, a hotspot analysis was conducted.
I expected it to show a few spots of aggressive fluctuation in atmospheric pressure.
The results were quite interesting and, in some cases, unexpected. As presumed, most of the ocean and sea surface didn’t show any significant fluctuations. However, in general, the equatorial area was the only region without any hotspots. The analysis showed areas of interest near Chile and around some mountain formations across Africa. Most of the map was covered by a large hot spot area at 99% confidence in both hemispheres.
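For readers unfamiliar with what a hotspot score measures, the sketch below computes a Getis-Ord Gi* z-score per cell on a toy grid. It is only an approximation of the idea behind the ArcGIS tool: the 3×3 window (each cell plus its queen neighbours), the binary weights, the function name `getis_ord_gi_star`, and the toy field are all assumptions made for illustration.

```python
import math


def getis_ord_gi_star(grid):
    """Gi* z-score for every cell of a 2-D list of numbers.

    Assumptions (for illustration only): the neighbourhood is the 3x3
    window around each cell, including the cell itself, with binary
    weights. The grid must not be constant (standard deviation > 0).
    """
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum(v * v for v in values) / n - mean ** 2)

    scores = []
    for r in range(rows):
        score_row = []
        for c in range(cols):
            local_sum = 0.0
            w = 0  # number of cells in the window; windows shrink at the edges
            for rr in range(max(0, r - 1), min(rows, r + 2)):
                for cc in range(max(0, c - 1), min(cols, c + 2)):
                    local_sum += grid[rr][cc]
                    w += 1
            # With binary weights, sum(w_ij) and sum(w_ij^2) both equal w.
            denom = std * math.sqrt((n * w - w * w) / (n - 1))
            score_row.append((local_sum - mean * w) / denom)
        scores.append(score_row)
    return scores


# Toy field: zeros everywhere except a 3x3 block of high values in the centre.
field = [[0.0] * 7 for _ in range(7)]
for r in range(2, 5):
    for c in range(2, 5):
        field[r][c] = 10.0

z = getis_ord_gi_star(field)
print("centre z-score:", round(z[3][3], 2))
print("corner z-score:", round(z[0][0], 2))
```

A cell counts as a hot spot at 99% confidence when its z-score exceeds roughly 2.58. In this toy field the centre of the high block passes that threshold and the corner does not.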
For the second exercise, I initially thought of using Geographically Weighted Regression. However, I was only able to perform it later, when comparing two time periods, and it was calculated with a lot of errors. Therefore, I decided to use Cross-Correlation, since it would give me more valuable information.
I expected it to plot a map in the form of a raster to assess areas with high correlation and, therefore, the possible relationship between Mean Sea Level Pressure and Geomagnetic Intensity (year 2020 specifically). However, it was plotted as a feature layer, which I had to transform into a raster later (both for a more visually comprehensive picture and to be able to perform raster analysis).
The results show that there are areas with a strong relationship between Mean Sea Level Pressure and Geomagnetic Intensity. These areas are mostly around the Arctic and Antarctic, where the relationship is negative, along with several other areas:
1) Eastern part of Asia
2) European area
3) The middle part of the Pacific Ocean near the coast of North America
4) Southern part of South America
The coefficient of determination R^2 is equal to 0.04, which tells us that the relationship, though weak overall, is still present. However, certain areas still have a high correlation. The graphs make it evident that further research is needed to look for a possible lag in the relationship.
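As a reference for reading an R^2 value like the one above, here is a minimal sketch of a Pearson correlation and its square. It assumes the R^2 comes from a simple linear fit, in which case R^2 is the square of Pearson's r, so an R^2 of 0.04 corresponds to |r| = 0.2. The sample values and the name `pearson_r` are invented for illustration and are not the report's data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical paired samples, one pair per grid cell (not real data).
intensity = [1.0, 2.0, 3.0, 4.0, 5.0]
pressure = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson_r(intensity, pressure)
r_squared = r ** 2
print("r =", round(r, 3), " R^2 =", round(r_squared, 3))
```

Because a single global value averages over every cell, a weak global R^2 can coexist with strongly correlated regions. Running the same function on subsets of cells is one way to look at those regions separately.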
Lastly, for the third exercise, I plotted maps of the Standard Deviation from predicted values to compare the temporal difference within the assessed relationship.
I transformed both of my feature layers of Standard Deviation for 2010 and 2020 into rasters, using the “Kriging” tool.
Then I combined both layers to look for any differences/similarities between the two rasters.
After that, I used the “Change Detection” tool to specifically show how the two rasters differ from each other.
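The comparison of the two rasters can be sketched as a cell-by-cell difference with a threshold. This is a simplified stand-in for the “Change Detection” tool, not a description of what it does internally: the threshold, the function name `change_map`, and the small example rasters are all hypothetical.

```python
def change_map(before, after, threshold):
    """Classify each cell as -1 (decrease), 0 (stable) or +1 (increase).

    A cell is 'stable' when the absolute difference between the two
    rasters is at most `threshold` (a hypothetical cut-off).
    """
    result = []
    for row_before, row_after in zip(before, after):
        out_row = []
        for b, a in zip(row_before, row_after):
            diff = a - b
            if diff > threshold:
                out_row.append(1)
            elif diff < -threshold:
                out_row.append(-1)
            else:
                out_row.append(0)
        result.append(out_row)
    return result


# Hypothetical standardized-residual rasters for two years (not real data).
raster_2010 = [[0.1, -0.5], [1.2, 0.0]]
raster_2020 = [[0.2, 0.9], [0.1, 0.05]]

changes = change_map(raster_2010, raster_2020, threshold=0.5)
cells = [v for row in changes for v in row]
stable_fraction = cells.count(0) / len(cells)
print(changes, stable_fraction)
```

Cells classified as 0 play the role of the uncoloured, stable regions of a change raster.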
I also plotted two graphs in order to look into the comparison between the Standard Residual and Normal Distribution for each time period.
My initial hypothesis was accurate to some extent. Judging by the graphs of Standard Residual vs Normal Value distribution, the overall global trend is similar across the given time periods. However, the strength of the relationship between Geomagnetic Intensity and Mean Sea Level Pressure differs depending on the region. In the Change Detection raster, the most stable regions are those with no color. Since I am currently looking into Tropical Cyclone distribution as a side project, this tells me that the tropical region near North America will be a perfect fit for further assessment.
The differences within certain regions might be caused by constant changes in the magnetic field. Therefore, the overall process of change would probably be due to different polarization of the Earth and certain areas being more magnetized than others. Another assumption I considered earlier is that such a change in distribution could be due to anthropogenic influence in specific areas, such as Asia and North America, though this is just speculation.
- What did you learn from each of the analyses you conducted (i.e., from each exercise)?
Exercise 1: I learned that Moran’s Spatial Autocorrelation analysis makes little sense with grid data, because the cells themselves are already clustered. It can only show whether there are any missing spots in the data itself. A hotspot analysis, however, is much more representative of the situation, though its output is similar to the plotted map itself.
Exercise 2: I finally learned how to properly calculate correlation indices, though it took me a lot of steps. Most importantly, I learned that, apparently, Geographically Weighted Regression doesn’t work with that amount of data in ArcGIS and would require a separate program written for it. Additionally, I learned how to manually set the same resolution for two different datasets, since ArcGIS wouldn’t properly change the resolution with the geoprocessing tools it has.
Exercise 3: I learned that the best way to look for differences in distribution would be a hotspot analysis. Though, additionally, contour maps could be projected to see the movement of the variables. Furthermore, I learned that the temporal scale is very important, and future assessments should include a much larger period. In this exercise, I have also learned about Kriging, which I used to plot my rasters. It was a very useful tool to implement since my data was more visually presentable and didn’t have any missing data within it.
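One simple way to bring a 0.25° grid to a 0.5° grid, as needed for Exercise 2, is to average each 2×2 block of cells. The sketch below shows that idea only; it is not necessarily the manual procedure used in this project. It assumes that the fine cells nest exactly inside the coarse cells, and the function name `block_mean` is invented for illustration.

```python
def block_mean(grid, factor=2):
    """Coarsen a 2-D list by averaging non-overlapping factor x factor blocks.

    Assumption: the grid dimensions are exact multiples of `factor` and
    the fine cells nest inside the coarse cells (e.g. 0.25 deg -> 0.5 deg).
    """
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")

    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [
                grid[rr][cc]
                for rr in range(r, r + factor)
                for cc in range(c, c + factor)
            ]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


# Toy 4x4 "fine" grid holding the values 0..15, row by row.
fine = [[4 * r + c for c in range(4)] for r in range(4)]
coarse = block_mean(fine, factor=2)
print(coarse)
```

Averaging suits a smooth field such as pressure. For real data the cell-centre versus cell-corner convention of each dataset would need to be checked first, since a half-cell offset breaks the nesting assumption.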
- Significance. How are these results important to science? to resource managers?
I believe that this can be important to hydrometeorology: if the relationship is proven, forecasting several weather variables through geomagnetism (or including it in the existing forecasting models) could potentially extend the forecast period or improve forecast quality. However, originally, I was looking into this relationship to learn whether it’s possible to predict the formation of Tropical Cyclones.
For resource managers, the results could make it possible to direct resources to specific areas of need. For example, if we know that there’s a high magnetic field in a specific area of the Pacific Ocean, we will know to look for cyclogenesis there and add geomagnetism to track its probable route. Therefore, resource managers would know where to spend their resources as a means of mitigating the damage caused by cyclones.
- Software learning. Your learning: what did you learn about software (a) Arc-Info, (b) GIS programming in Python, (c) programming in R, (d) Modelbuilder in Arc, or (e) other?
Throughout these exercises, I’ve used several different tools. However, not all of them were eventually used for the final assessment.
Excel: Terrible when the data is too big. It’s faster and more efficient to use Python, R, or SQL (which I might be learning next).
ArcGIS: I explored more tools that I can use in my future assessments. However, some tools do not work well, and it seems that Esri updates its software very rarely. I had a talk with one of the GIS specialists when applying for an internship. He said that Esri packages are very limited in their functionality and that, for this reason, they primarily use QGIS. Therefore, I might switch to QGIS, since it is open source, free, and will be easy to learn after using ArcGIS.
R: Initially, I thought that it would be useful to revisit R and start learning it again. Seeing how everyone uses it made me think that it had improved. However, while trying to complete this assessment in R, I remembered why I quit coding in it and switched to Python. I know that R has a reputation for being a science- and data-oriented programming language, but it seems to be less intuitive than Python.
Python: I had never worked with GIS packages in Python before. I wasn’t able to use them to their fullest, but I learned a package that was new to me, pandas. Previously, I used several other packages to perform the same analysis. Now it takes me less time and effort to perform data analysis, especially with big data. Another important thing for me was that I was able to return to programming, which got me into learning and exploring new packages.
- Statistics learning. What did you learn about statistics, including (a) hotspot, (b) spatial autocorrelation (including correlogram, wavelet, Fourier transform/spectral analysis), (c) cross-correlation/regression (cross-correlation, geographically weighted regression [GWR], regression trees, boosted regression trees), (d) multivariate methods (e.g., PCA, multiple component analysis), (e) other techniques (change detection/confusion matrices, other)?
Hotspots: A perfect tool to use with grid data. It makes it easier to assess certain points of interest visually. Being able to combine two hotspot layers also makes it easier to identify changes over, for example, a temporal scale.
Spatial Autocorrelation: This doesn’t provide much information with grid data, except for, perhaps, some errors or missing information within the dataset itself. However, it could be a useful tool for other types of data.
Cross-correlation/GWR: Personally, I love this method, since it was able to provide specific relationship variables for my datasets. It takes a lot of time and requires a lot of preparation if the datasets are different (which was my case), though it gives very representative and informative results.
Agreement/difference between two (raster or vector) layers: I mostly used this term to describe how I combined two raster layers of hotspots to visualize changes. However, I used additional statistics, such as Standard Residual distributions, to check whether the pattern differs between time periods.
- Evolving question. How did the results of each analysis lead you to change/refine your question? Write out the original question you stated at the beginning of the class, and restate the question(s) you now plan to address.
My initial question was: “How are the Atmospheric variables that describe the weather related to Geomagnetic properties of the Earth through the mechanism of heavy particles magnetization?”
After assessing the results of my analysis, I decided to be more specific in my variables and time periods. I wasn’t able to perform an analysis on a large temporal scale, so I shortened it considerably. I also wasn’t able to find data on some of the variables I initially wanted to look at (Kpa/AA).
Consequently, my question changed into the following: “Is there a relationship between Mean Sea Level Pressure and Geomagnetic Intensity for the year 2020?”
- Future techniques. What techniques would you like to explore to answer your research questions in the future?
I would love to explore Geographically Weighted Regression more in the future. Currently, I am not sure whether this is an ArcGIS problem, or if I am doing something wrong, but I wasn’t able to perform it within the program without errors. The map I was able to get in ArcGIS resulted in a feature layer with a lot of errors in the form of a checkerboard (it was basically a grid, where the values would be good inside the grids themselves, but the “lines” would be error data).
After dividing my global data into separate areas, I would also like to work with a confusion matrix, since it seems that it won’t compute with such a huge amount of data. This may force me to explore QGIS’s open-source code.
I would also like to explore more plotting and GIS related packages in Python since it seems that this topic is of interest to me. Furthermore, I want to know if it is possible to implement this knowledge into AI programming.