GEOG 566

Advanced spatial statistics and GIScience

April 7, 2017

Comparison of near-shore and on-shore predictions of tsunami inundation

Filed under: Final Project, My Spatial Problem 2017 @ 1:06 pm

Research Question

A tsunami is a set of ocean waves caused by any large, abrupt disturbance of the sea surface. As evidenced by the 2004 Indian Ocean and 2011 Japan tsunamis, local tsunamis can destroy coastal communities in a matter of minutes. Tsunamis rank high on the scale of natural disasters. Since 1850 alone, tsunamis have been responsible for the loss of over 420,000 lives and billions of dollars of damage to coastal structures and habitats [1]. Predicting when and where the next tsunami will strike is currently impossible. Given a tsunami source, however, arrival time and impact can be predicted with modeling and measurement technology. These predictions are vital for coastal communities to make the necessary preparations to mitigate tsunami damages.

My research involves developing a methodology for making fast near-shore tsunami predictions based on arbitrarily defined off-shore conditions. The purpose of this methodology is to provide ensemble predictions of near-shore inundation that quantify the uncertainty of tsunami predictions. While novel, the methodology has the limitation of predicting inundation only at near-shore locations, not on-shore locations. The question thus becomes whether we can apply the near-shore uncertainty estimates to on-shore locations.

For this class project, I focused on comparing different near-shore locations to various on-shore locations. I identified what I defined as “major inundation events” and examined whether each of these events corresponded to a significantly large wave in the near-shore. Thus my research question is:

“Do variations in the near-shore inundation time series correspond to variations in on-shore inundation?”

or more specifically for this case:

“Do predictions for major flooding events on-shore during tsunami inundation correspond to predictions of large waves near-shore?” 

Data Set

I used near-shore and on-shore tsunami simulation data that predicted sea surface elevation and fluid velocities for a stretch of coast in southern California near Port Hueneme. There were 27 possible data sets, each with a different initial tsunami source. For this study I focused on the scenario in which a large tsunami was generated near Alaska. The simulation data are produced by the Method of Splitting Tsunami (MOST) model, the standard model used at the NOAA Center for Tsunami Research (NCTR) [2]. The data cover a duration of 10 hours (600 min) and a geographical area bounded by the coordinates (-119.2469462, 34.1384259) and (-119.1949874, 34.2000000). There are 3600 time steps for each set of data on a 562 by 665 grid. The spacing between the longitudes and latitudes is uneven but can be projected onto an even grid. A time series of sea surface elevation and fluid velocities is available for each geographical point (see Figure 1).

Figure 1 – Simulated tsunami inundation data using MOST v4 at Port Hueneme, CA for a simulated Cascadia source. Image is taken from the Performance Based Tsunami Engineering (PBTE) data explorer.

In addition to the tsunami data, I also have the topographical/bathymetric data for the region (which does not change in time) with the same spatial resolution as the gridded data. This data is presented in Figure 2.

Figure 2 – Topographic/bathymetric data for Port Hueneme, CA in geographic coordinates (left) and UTM projected coordinates (right).


Hypotheses

To strengthen my research, it would be ideal if every major flooding event corresponded to a large near-shore wave, such that I could extrapolate uncertainty predictions from the near-shore to the on-shore. Thus I phrase my hypothesis like so:

“Each prediction for large on-shore flooding events relates to a near-shore large wave prediction.” 


Approaches

Many approaches were used to analyze this data. Most of them, unfortunately, failed to produce any meaningful results. At first, I attempted to analyze all of the spatial data at a given time step at once, treating the data essentially as raster layers. For the raster layers I attempted to calculate autocorrelation using Moran’s I and cross-correlations between the inundation data and the topographic data. In both cases, the procedures failed to produce meaningful results.

After analyzing raster layers failed, I determined that extracting time series was the way to go. I selected points of interest along the coast near Port Hueneme, California. These points were chosen arbitrarily but were reasonably well distributed along the coastline and fell within the domain of the dataset. Figure 3 shows the locations of the points of interest (POIs) along the coast of Port Hueneme, CA. A total of 12 points were selected, each meant to represent a building or port location. Their names and geographical coordinates were stored in a .csv file as shown in Figure 4.

Figure 3 – POIs where time series of inundation were extracted along coast near Port Hueneme, CA

Figure 4 – CSV file storing locations and their geographic coordinates

The .csv file was read into R using the “read.csv” function, and the resulting table was converted to a data frame using the procedure highlighted in “Tutorial 2: Automated Autocorrelation of Tsunami Inundation Time Series at Various Points of Interest.”
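The same POI lookup can be sketched in Python with the standard library. The column names and the coordinate values below are illustrative placeholders, not the actual contents of the file in Figure 4:

```python
import csv
import io

# Hypothetical contents mirroring the structure of the POI file
# (names and coordinates here are made-up examples).
poi_csv = """Name,Longitude,Latitude
Port Hueneme Jetty,-119.2126,34.1445
Port Hueneme,-119.2073,34.1478
Beachcomber Tavern,-119.2043,34.1496
"""

# Build a name -> (lon, lat) lookup, like the R data frame in the tutorial.
pois = {}
for row in csv.DictReader(io.StringIO(poi_csv)):
    pois[row["Name"]] = (float(row["Longitude"]), float(row["Latitude"]))
```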

After the time series of inundation were extracted for each point of interest, autocorrelations and cross-correlations between the different time series were calculated. Unfortunately, these forms of analysis also failed to produce meaningful results. The data mostly showed that autocorrelation was present, but this was not useful for determining similarity or difference in the variations of the different time series. Differencing between time series was also applied but was found to be inappropriate for the data.
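For reference, the quantities computed here are standard sample statistics. A minimal Python sketch of the autocorrelation and lagged cross-correlation (the analysis itself was done in R; these helper functions are illustrations, not the code used):

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation of a series for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[: len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])

def ccf(x, y, lag):
    """Cross-correlation between two equal-length series at a single lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    y = np.asarray(y, dtype=float) - np.mean(y)
    n = len(x)
    if lag >= 0:
        num = np.dot(x[: n - lag], y[lag:])
    else:
        num = np.dot(x[-lag:], y[: n + lag])
    return num / (n * x.std() * y.std())
```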

All of these failed attempts led to a less polished approach. For each time series, the 5th and 95th percentiles of inundation were calculated and plotted on the time series plot, as shown in Figure 5.

Figure 5 – Time series of inundation at each location with 5th and 95th percentiles of inundation levels plotted as the blue and red dashed lines respectively.

Here I define a significantly large wave as any wave at the first 3 locations (the near-shore locations) whose positive maximum exceeds the 95th percentile, i.e., the red dashed line. For the 9 on-shore locations, flood events were defined as the inundation peaks in the time series, and significant flood events as those with heights exceeding the 95th percentile. In each case, the timings of the big waves or inundation events were tabulated, and two events were defined as related if they occurred within 10 minutes of one another. This 10-minute window was derived from the rough 10-minute periodicity of the waves in the near-shore time series. Additionally, the distance between each pair of points was calculated using a plane approximation, since the locations are within 10 km of one another.
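The event-detection and matching rules described above can be sketched as follows. This is a minimal illustration under the stated definitions (95th-percentile threshold, local peaks, ±10-minute window); the function names are assumptions, not the actual analysis code:

```python
import numpy as np

def event_times(series, times, pctl=95):
    """Return the times of local peaks that exceed the given percentile of the series."""
    s = np.asarray(series, dtype=float)
    t = np.asarray(times, dtype=float)
    thresh = np.percentile(s, pctl)
    # A peak is an interior sample strictly larger than both neighbours.
    peaks = (s[1:-1] > s[:-2]) & (s[1:-1] > s[2:]) & (s[1:-1] > thresh)
    return t[1:-1][peaks]

def related(onshore_times, nearshore_times, window=10.0):
    """For each on-shore flood event, is there a near-shore wave within +/- window minutes?"""
    return [any(abs(t - w) <= window for w in nearshore_times)
            for t in onshore_times]
```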


Results

Table 1 shows the timings of significant waves for near-shore locations and significant inundation events for on-shore locations. Based on these definitions, it was found that every inundation event on land corresponded to some large near-shore wave. The reverse is not true, however: not every large wave in the near-shore corresponded to a large flooding event. Notice that almost every big flooding event corresponded to large waves at all 3 near-shore locations: Port Hueneme Jetty, Channel Islands Harbor Jetty, and Port Hueneme. The exceptions are the red and light blue flooding events. The red event was observed at the Beachcomber Tavern, Manda’s Deli, and the Naval Construction Battalion Center, yet it registered as a significant wave only at the Port Hueneme locations, and the light blue event was significant only at the center of Port Hueneme and not at the jetty.

Table 1 – Timings (in minutes) for significant waves or inundation events. Cells highlighted with the same color indicate related events.

Using the distances between points calculated and shown in Table 2, I hypothesize that a certain maximum distance should be used when matching variations from near-shore to on-shore. However, the Port Hueneme Jetty was actually closer to the Beachcomber Tavern than the center of Port Hueneme was. The Channel Islands Harbor Jetty was much further away, and it may be pertinent not to compare locations that are more than 1 kilometer from one another.

Table 2 – Distances between near-shore locations and on-shore locations. Distance values in kilometers.
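The plane approximation used for these distances can be sketched as an equirectangular projection about the mean latitude, which is accurate to well under a percent at the sub-10 km scales involved. The function name is an illustrative assumption:

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def plane_distance_km(lon1, lat1, lon2, lat2):
    """Flat-plane (equirectangular) distance, adequate for points within ~10 km."""
    lat0 = math.radians((lat1 + lat2) / 2.0)
    dx = math.radians(lon2 - lon1) * math.cos(lat0) * EARTH_RADIUS_KM
    dy = math.radians(lat2 - lat1) * EARTH_RADIUS_KM
    return math.hypot(dx, dy)
```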


Significance

Tsunami inundation data have historically been very difficult to measure in the field due to the inherent dangers of having people or instrumentation in the inundation zone. This fact, combined with the relative infrequency of tsunami events, forces managers and disaster planners to rely on prediction data from tsunami models [3]. The problem, however, is that these models are deterministic and do not give a good idea of the uncertainty of their predictions. This can be problematic when relying on these predictions to perform risk analyses.


The tool being developed in my other research is limited to predicting near-shore variation in tsunami inundation, not on-shore variation. Preliminary results from this study suggest that most variations in on-shore inundation can be captured by simply looking at large wave events in the near-shore, which adds to the usefulness of the aforementioned methodology beyond the scope of this class. However, one must be careful when selecting which near-shore points to use for extrapolating variability to on-shore locations. This area will require much further study before uncertainty predictions are actually extrapolated.

Software Learning

Before this class began I had no experience in R or Python. Throughout this class I have had the opportunity to learn a great deal of R and some Python. Overall I think it has been a useful and rewarding experience.

Statistics Learning

I was able to pick up some statistical techniques regarding autocorrelation, cross-correlation, and wavelet analysis in a general sense, despite their lack of meaning for my overall project. The main thing I learned, however, was that working with big data is difficult and that most traditional data analysis techniques appear to fail when confronted with such a large data set. The experience of having to poke around and decompose the data into usable forms was very educational for me, but I still have a long way to go before being comfortable with big data analysis.

Response to Commentary

Most of the commentary I received concerned the techniques of autocorrelation, cross-correlation, and wavelet analysis. In the end, however, I found that none of these techniques were useful for what I was trying to determine, so unfortunately I could not apply any of the suggestions or comments mentioned.

There was, however, one question in the comments: “Finally, I was wondering if the areas with highest inundation are also the most vulnerable?” This is a good question, because high inundation would usually indicate a dangerous situation, but vulnerability is not so simply defined. Vulnerability can really be thought of as risk times exposure, and thus an area is only vulnerable if there are people or structures of interest where inundation is high. Simply analyzing predictions of tsunami inundation is unfortunately not enough to answer these questions and determine vulnerability. Assessing both the natural and human systems will be required to perform a vulnerability analysis, and that is beyond the scope of this study.


[1] NOAA. Web. Retrieved 20 March 2017.

[2] Titov, V. V., and Synolakis, C. E.: Numerical modeling of tidal wave runup, Journal of Waterway, Port, Coastal, and Ocean Engineering, 124(4), 157-171, 1998.

[3] Wegscheider, S., Post, J., Zosseder, K., Mück, M., Strunz, G., Riedlinger, T., Muhari, A., and Anwar, H. Z.: Generating tsunami risk knowledge at community level as a base for planning and implementation of risk reduction strategies, Nat. Hazards Earth Syst. Sci., 11, 249-258, doi:10.5194/nhess-11-249-2011, 2011.
