The Science Behind Sofar Ocean Marine Weather Forecasts

By: Tim Janssen

To improve ocean insights and forecasts, Sofar operates the largest privately owned ocean weather sensor network in the world (see figure 1), currently consisting of nearly 200 real-time weather sensors in the Pacific. Each sensor measures wave conditions, surface winds, and drift currents, and transmits the data to the Sofar cloud in real time. We assimilate the data in our wave forecast models to create the highest-fidelity wave forecasts in the world. Compared to NOAA models, the Sofar data network reduces model RMS errors by more than 20% across the board (see figure 3) and by more than 50% in high-energy swell forecasts (see figure 2). The best part? All our real-time sensor data and enhanced model forecasts are publicly available at weather.sofarocean.com.

On December 27th, 2019, an energetic storm system developed in the western Pacific about 1000 km east of the Japanese coast (figure 1). The extreme surface winds generated 30m waves, taller than a 7-story building. These energetic waves radiated across the northern Pacific basin, first approaching Hawaii and then making their way to the west coast of the United States.

Figure 1: The map shows part of the Sofar sensor network with a wave height overlay for December 27th (Sofar model). A large storm event is driving extreme waves in the northwest Pacific. The pentagon-shaped icons indicate locations of live Spotter weather buoys currently deployed in the Pacific. The yellow Spotter icons indicate sensors that are used in the data assimilation; the red-circled Spotter icons represent Spotters that were not included in the assimilation (see figure 3). We compare wave height time series at Spotter 0323 and NDBC Station 51000 (figure 2), both of which are indicated on the map.

Predicting the size and arrival times of these waves at sites around the Pacific is important for coastal protection (e.g. inundation of islands), safety at sea, and offshore operations. The forecast centers of the National Oceanic and Atmospheric Administration (NOAA) use statistical wave models (WaveWatchIII) to do this. For these types of energetic events, the predictions are often inaccurate: arrival times can be many hours too early (or late), and wave heights can be off by 100% or more. To improve this, Sofar ingests all available data from its network into its own version of a global wave forecast model (see figure 1). This data-driven model generally performs much better than NOAA's model, and for this specific event RMS error reductions of more than 50% were achieved (see figure 2). To be clear, the Sofar global wave model is the exact same model as used by NOAA, in every detail. The difference is entirely due to the data.

Figure 2: Comparison of 1-day forecasts to wave height observations at two stations for the arrival of waves radiated by the December 27th storm event (see figure 1). Left panel: Spotter-0323. Right panel: NDBC station 51000. Both sites show about a 50% reduction in RMS error. This implies that the data assimilation adds considerable skill to the 24-hr wave forecasts. Refinements in the assimilation strategy will further improve this.


And here’s why this is important. 

With over 70% of our planet covered in water, the surface of the ocean is one of the most important regions of our world. Nearly all economic activity on our oceans takes place near the surface (e.g. maritime shipping, fisheries, aquaculture, marine renewable energy, recreation) and the interface between ocean and atmosphere is critical in prediction of global weather and climate. In short, ocean weather matters. Yet, weirdly enough, we also know almost nothing about it.

Buoy networks that provide accurate data are invariably anchored near the coast in water depths typically less than a few hundred meters. Away from the coast, in deeper water, extensive buoy networks are not economically feasible. For weather information in the open ocean we rely on a combination of visual observations made by ship crews and satellite-based proxy measurements. These sources have limited accuracy and are available at irregular space and time intervals. In most places and most of the time, we have absolutely no information about real-time ocean weather conditions. This complete lack of information affects safety at sea, and severely limits our ability to predict and forecast weather events developing and traveling across the ocean.

State-of-the-art weather models developed and used by world-leading operational forecast centers, such as NOAA in the United States and the European Centre for Medium-Range Weather Forecasts (ECMWF) in Europe, are numerical computer models that use the laws of Newtonian mechanics and thermodynamics to propagate weather information forward in space and time. These models have gotten progressively better since the 1960s with the rapid development of the modern computer and computational techniques. In fact, our numerical weather models are now so good that the skill of our forecasts is entirely limited by the lack of ocean weather data. After all, even if a weather model were perfect (and by definition, models are never perfect representations of reality), its predictions can only ever be as good as our knowledge of the current conditions. Any error in the ocean weather now-state will be propagated forward in our forecast models. The old adage “garbage in, garbage out” holds true. Although all of this speaks to how much progress we have made in numerical weather models, it also illustrates the next barrier: the oceanic data gap. And that is not necessarily an easier problem to solve.

The ocean is a harsh environment. Most of it is too remote for wide-band communications, and it is filled with stuff that modern electronics do not get along with (salt water, violent motions, storms, breaking waves). And then there is the size. With a surface area approaching 140 million sq miles we need to get away from single ‘exquisite’ sensor systems that can provide extreme accuracy at a single point. Instead we should focus on developing massively distributed networks with low-cost nodes to provide unprecedented sensor density. In this distributed sensing paradigm, a single sensor is a (disposable) node, and the complete array is the instrument.

To develop this paradigm and test it, our team has recently deployed 200 Spotter sensors in the Pacific basin that collect real-time data on waves, wind, surface drift, and water temperature (SST). It is the first distributed sensor network of its kind, and represents the largest and most advanced privately owned ocean data network in the world. And of course we are rapidly expanding this network to provide the first directly accessible global ocean weather sensor network.

Figure 3: Scatter plots of modeled vs observed wave height at randomly selected sites in the northern Pacific for the period October 1, 2019 through December 31, 2019. The randomly selected validation sites were not included in the assimilation (see figure 1, red-circled Spotters). Left panel: Sofar assimilation model. Right panel: NOAA model. The assimilation of the data network into the Sofar model produces a 25% reduction in RMS error across the board, and a 50% reduction in model bias.

To improve weather forecasts, the Sofar team runs its own global numerical weather models. We start with ocean waves, since they are an important part of ocean weather and affect both open ocean activities and coastal dynamics. To do that we run the same model as NOAA (WaveWatchIII + ST4 physics) but we feed our real-time sensor data to the model. This process of adding data to models is called data assimilation. If we had only a few data points in the Pacific, assimilation would be ineffective. However, with the data density we have now, even simple assimilation strategies work much better than they should. To keep things simple we started by deploying a widely used technique called optimal interpolation (OI). Basically, every time we start a forecast loop we adjust the initial condition (the now-state) to better match the observations in the regions where we have sensors. In this way we ‘nudge’ the model to follow the measurements as much as possible, and this information is then propagated forward in time (forecast) by the numerical physics model. It’s a very simple but efficient method to integrate observations in model forecasts.
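
To make the ‘nudging’ step concrete, here is a minimal sketch of a single optimal interpolation update on a one-dimensional grid of wave heights. It is an illustration under stated assumptions, not the Sofar implementation: the function name, the Gaussian shape of the background error covariance, and all numbers (grid spacing, length scale, error standard deviations, the one synthetic observation) are hypothetical choices made for this example.

```python
import numpy as np

def optimal_interpolation(background, obs, obs_idx, grid_x,
                          length_scale, sigma_b, sigma_o):
    """One OI analysis step: x_a = x_b + B H^T (H B H^T + R)^-1 (y - H x_b).

    All arguments are illustrative assumptions:
      background   : model now-state (e.g. wave height in m) on the grid
      obs, obs_idx : observed values and the grid indices where sensors sit
      grid_x       : grid coordinates (km)
      length_scale : assumed correlation length of model errors (km)
      sigma_b/o    : assumed model / observation error std dev (m)
    """
    n = background.size
    # Assumed background error covariance: correlation decays with distance.
    dist = np.abs(grid_x[:, None] - grid_x[None, :])
    B = sigma_b**2 * np.exp(-0.5 * (dist / length_scale) ** 2)
    # Observation operator: picks the grid points that hold a sensor.
    H = np.zeros((len(obs_idx), n))
    H[np.arange(len(obs_idx)), obs_idx] = 1.0
    # Observation errors assumed uncorrelated between sensors.
    R = sigma_o**2 * np.eye(len(obs_idx))
    # Innovation: mismatch between observations and the model now-state.
    innovation = obs - H @ background
    # Spread the innovation over the grid, weighted by the error covariances.
    increment = B @ H.T @ np.linalg.solve(H @ B @ H.T + R, innovation)
    return background + increment

# Synthetic example: the model says 2.0 m everywhere, one sensor reports 3.0 m.
grid_x = np.arange(0.0, 1100.0, 100.0)      # 11 points, 0 to 1000 km
background = np.full(grid_x.size, 2.0)
analysis = optimal_interpolation(
    background, obs=np.array([3.0]), obs_idx=np.array([5]),
    grid_x=grid_x, length_scale=200.0, sigma_b=0.5, sigma_o=0.1,
)
```

In this toy setup the corrected now-state is pulled most of the way toward the observation at the sensor location, and the correction fades with distance from the sensor. That is why sensor density matters: each observation only improves the model within roughly one correlation length.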

Even with a simple assimilation strategy, the sheer density of our network provides immediate and meaningful improvements to global wave predictions. To quantify our performance over a longer period of time, we compare our model and NOAA’s model to the observations. The data at the sites where we make the comparison are of course not included in the assimilation. And the results are impressive (see figure 3). Compared to the NOAA model the Sofar assimilation model has a 25% reduction in RMS errors across the board, and more than 50% reduction in model bias. Notably, during high-energy events (when it matters most) the differences are typically considerably larger than that. Again, there is nothing special about our model. It’s the data.
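
The two skill metrics quoted above, RMS error and bias, and the percentage reduction between two models can be computed as in the following sketch. The function names and the sample numbers are made up for illustration; they are not Sofar or NOAA data and do not reproduce the results in figure 3.

```python
import numpy as np

def rmse(model, observed):
    """Root-mean-square error between modeled and observed values."""
    diff = np.asarray(model, dtype=float) - np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def bias(model, observed):
    """Mean error (model minus observation); positive means overprediction."""
    diff = np.asarray(model, dtype=float) - np.asarray(observed, dtype=float)
    return float(np.mean(diff))

def percent_reduction(baseline, improved):
    """Reduction in error magnitude relative to the baseline, in percent."""
    return 100.0 * (abs(baseline) - abs(improved)) / abs(baseline)

# Synthetic wave heights (m) at held-out validation sites, illustration only.
observed = [2.0, 3.0, 4.0]
baseline_model = [3.0, 4.0, 5.0]       # stand-in for a model without assimilation
assimilated_model = [2.5, 3.5, 4.5]    # stand-in for a model with assimilation

rmse_reduction = percent_reduction(rmse(baseline_model, observed),
                                   rmse(assimilated_model, observed))
bias_reduction = percent_reduction(bias(baseline_model, observed),
                                   bias(assimilated_model, observed))
```

The important design point is the one made in the text: the observations used for scoring must come from sites that were held out of the assimilation, otherwise the comparison would reward the model for matching data it was already given.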

The Sofar team is committed to providing the world with access to unprecedented ocean data density. The Sofar global ocean weather sensor network is a first step toward that goal: our contribution to reducing the oceanic data gap. To disseminate the data globally, we built a modern API for seamless integration into modeling systems and real-time worldwide access. Moreover, we continue to work with academic research groups worldwide to provide better ocean insights and improve weather and climate models by creating ocean data abundance.


