High-Level Wrapper Function#

The diwasp function is the main entry point for analyzing wave data. It provides a simple interface for processing continuous sensor data over multiple time windows and returns wavespectra-compatible output.

Overview#

from diwasp import diwasp

result = diwasp(
    data,                    # DataFrame or Dataset
    sensor_mapping,          # Map variable names to sensor types
    window_length,           # Analysis window in seconds
    window_overlap,          # Overlap in seconds
    depth,                   # Water depth in meters
    method='imlm',           # Estimation method
)

Input Data Formats#

pandas DataFrame#

For DataFrame input:

Index: Must be a DatetimeIndex
Columns: Sensor variables (pressure, velocity, etc.)

import pandas as pd
from diwasp import diwasp

# Load data with datetime index
df = pd.read_csv('wave_data.csv', index_col='time', parse_dates=True)

# Or create from scratch
time = pd.date_range('2024-01-01', periods=7200, freq='500ms')
df = pd.DataFrame({
    'pressure': pressure_data,
    'u_velocity': u_data,
    'v_velocity': v_data,
}, index=time)

result = diwasp(
    df,
    sensor_mapping={'pressure': 'pres', 'u_velocity': 'velx', 'v_velocity': 'vely'},
    window_length=1800,
    window_overlap=900,
    depth=20.0,
)

xarray Dataset#

For Dataset input:

Time dimension: Must contain datetime values
Data variables: Sensor measurements

import xarray as xr
from diwasp import diwasp

ds = xr.Dataset({
    'pres': (['time'], pressure_data),
    'velx': (['time'], u_data),
    'vely': (['time'], v_data),
}, coords={
    'time': time_values,  # datetime64 array
})

result = diwasp(
    ds,
    sensor_mapping={'pres': 'pres', 'velx': 'velx', 'vely': 'vely'},
    window_length=1800,
    window_overlap=900,
    depth=20.0,
)

Parameters#

Required Parameters#

Parameter	Type	Description
`data`	DataFrame or Dataset	Input sensor data
`sensor_mapping`	dict	Maps variable names to sensor types
`window_length`	float	Analysis window length in seconds
`window_overlap`	float	Overlap between windows in seconds
`depth`	float	Water depth in meters

Sensor Mapping#

The sensor_mapping parameter maps your variable/column names to DIWASP sensor types:

sensor_mapping = {
    'my_pressure_col': 'pres',    # Pressure sensor
    'east_velocity': 'velx',       # East velocity component
    'north_velocity': 'vely',      # North velocity component
}

Available sensor types:

Type	Description
`'elev'`	Surface elevation
`'pres'`	Pressure
`'velx'`	X-component velocity
`'vely'`	Y-component velocity
`'velz'`	Z-component velocity
`'vels'`	Surface vertical velocity
`'accs'`	Surface vertical acceleration
`'slpx'`	X-component surface slope
`'slpy'`	Y-component surface slope
`'accx'`	X-component acceleration
`'accy'`	Y-component acceleration
`'accz'`	Z-component acceleration
`'dspx'`	X displacement
`'dspy'`	Y displacement

Optional Parameters#

Parameter	Type	Default	Description
`method`	str	`'imlm'`	Estimation method: `'dftm'`, `'emlm'`, `'imlm'`, `'emep'`, `'bdm'`
`time_var`	str	`'time'`	Name of time dimension (Dataset only)
`x_var`	str	`'x'`	Name of x-coordinate variable
`y_var`	str	`'y'`	Name of y-coordinate variable
`z_var`	str	`'z'`	Name of z-coordinate variable
`z`	float or dict	None	Sensor z-positions (height above seabed)
`x`	float or dict	None	Sensor x-positions
`y`	float or dict	None	Sensor y-positions
`fs`	float	None	Sampling frequency (inferred if None)
`freqs`	ndarray	None	Output frequency grid
`dirs`	ndarray	None	Output direction grid
`dres`	int	180	Directional resolution
`nfft`	int	None	FFT length
`smooth`	bool	True	Apply spectral smoothing
`verbose`	int	1	Verbosity level (0, 1, 2)

Sensor Positions#

Single Position for All Sensors#

If all sensors are co-located:

result = diwasp(
    df,
    sensor_mapping={'p': 'pres', 'u': 'velx', 'v': 'vely'},
    window_length=1800,
    window_overlap=900,
    depth=20.0,
    z=0.5,  # All sensors 0.5m above seabed
)

Different Positions per Sensor#

For sensors at different locations:

result = diwasp(
    df,
    sensor_mapping={'p': 'pres', 'u': 'velx', 'v': 'vely'},
    window_length=1800,
    window_overlap=900,
    depth=20.0,
    x={'p': 0, 'u': 0, 'v': 0},
    y={'p': 0, 'u': 0, 'v': 0},
    z={'p': 0.5, 'u': 1.0, 'v': 1.0},
)

Positions from Dataset Attributes#

For xarray Datasets, positions can be stored as variable attributes:

ds = xr.Dataset({
    'pres': (['time'], data, {'x': 0, 'y': 0, 'z': 0.5}),
    'velx': (['time'], data, {'x': 0, 'y': 0, 'z': 1.0}),
    'vely': (['time'], data, {'x': 0, 'y': 0, 'z': 1.0}),
})

# Positions are read automatically from attributes
result = diwasp(ds, sensor_mapping={...}, ...)

Output Format#

The function returns a wavespectra-compatible xarray Dataset:

Dimensions#

time: Center time of each analysis window
freq: Frequency bins (Hz)
dir: Direction bins (degrees)

Variables#

Variable	Dimensions	Units	Description
`efth`	(time, freq, dir)	m²/Hz/degree	Spectral energy density
`hsig`	(time)	m	Significant wave height
`tp`	(time)	s	Peak period
`fp`	(time)	Hz	Peak frequency
`dp`	(time)	degree	Peak direction
`dm`	(time)	degree	Mean direction
`spread`	(time)	degree	Directional spread

Example Output Usage#

result = diwasp(df, ...)

# Access spectral data
spectrum = result['efth']  # 3D array (time, freq, dir)

# Access statistics time series
hsig = result['hsig'].values  # 1D array
tp = result['tp'].values

# Plot Hsig time series
import matplotlib.pyplot as plt
result['hsig'].plot()
plt.ylabel('Hsig (m)')
plt.show()

# Save to NetCDF
result.to_netcdf('wave_spectra.nc')

Window Configuration#

Window Length#

The window_length parameter sets the duration of each analysis window in seconds. Longer windows provide better frequency resolution but fewer output time points.

Frequency resolution ≈ 1 / window_length

Typical values:

1800 s (30 min): Standard for ocean waves
600-1200 s: Higher time resolution
3600 s (1 hour): Better frequency resolution

Window Overlap#

The window_overlap parameter controls how much consecutive windows overlap:

0: No overlap, independent windows
window_length/2: 50% overlap (common choice)
window_length * 0.75: 75% overlap for smoother time series

# 30-minute windows with 50% overlap
result = diwasp(
    df,
    sensor_mapping=mapping,
    window_length=1800,
    window_overlap=900,
    depth=20.0,
)

Number of Windows#

The number of output windows is calculated as:

n_windows = 1 + (data_length - window_length) / (window_length - window_overlap)

Sampling Frequency#

The sampling frequency is automatically inferred from the time index. The function will automatically resample data if:

The time index has non-uniform spacing (irregular sampling)
An explicit fs parameter is provided that differs from the inferred frequency

Automatic Resampling#

If resampling is needed, the data is interpolated linearly to a uniform grid:

# Data with irregular sampling will be automatically resampled
result = diwasp(
    df_irregular,  # Non-uniform time spacing
    sensor_mapping=mapping,
    window_length=1800,
    window_overlap=900,
    depth=20.0,
)

Override Sampling Frequency#

To force resampling to a specific frequency:

result = diwasp(
    df,
    sensor_mapping=mapping,
    window_length=1800,
    window_overlap=900,
    depth=20.0,
    fs=2.0,  # Force 2 Hz sampling frequency (triggers resampling if different)
)

Estimation Methods#

See Estimation Methods for detailed information on each method.

Quick guide:

'dftm': Fastest, lowest accuracy
'emlm': Fast, good for narrow spectra
'imlm': Good balance (default)
'emep': Robust, handles noise well
'bdm': Best accuracy, slowest

# Use EMEP for noisy data
result = diwasp(
    df,
    sensor_mapping=mapping,
    window_length=1800,
    window_overlap=900,
    depth=20.0,
    method='emep',
)

Complete Examples#

PUV Sensor Analysis#

import pandas as pd
from diwasp import diwasp

# Load PUV data
df = pd.read_csv('puv_data.csv', index_col='time', parse_dates=True)

# Analyze with IMLM method
result = diwasp(
    df,
    sensor_mapping={
        'pressure': 'pres',
        'u_velocity': 'velx',
        'v_velocity': 'vely',
    },
    window_length=1800,      # 30 minutes
    window_overlap=900,       # 15 minutes overlap
    depth=15.0,              # 15m water depth
    z=0.5,                   # Sensors 0.5m above seabed
    method='imlm',
    dres=180,                # 2-degree directional resolution
    verbose=1,
)

# Print summary statistics
print(f"Number of spectra: {len(result.time)}")
print(f"Hsig range: {result.hsig.min().values:.2f} - {result.hsig.max().values:.2f} m")
print(f"Tp range: {result.tp.min().values:.1f} - {result.tp.max().values:.1f} s")

# Save results
result.to_netcdf('wave_spectra.nc')

Pressure Gauge Array#

import pandas as pd
from diwasp import diwasp

# Load data from 3 pressure gauges
df = pd.read_csv('pressure_array.csv', index_col='time', parse_dates=True)

# Define sensor positions (triangular array)
result = diwasp(
    df,
    sensor_mapping={
        'p1': 'pres',
        'p2': 'pres',
        'p3': 'pres',
    },
    window_length=1800,
    window_overlap=900,
    depth=10.0,
    x={'p1': 0, 'p2': 5, 'p3': -5},
    y={'p1': 0, 'p2': 5, 'p3': 5},
    z=0.0,  # All on seabed
    method='emep',
)

Integration with wavespectra#

from diwasp import diwasp
import wavespectra

# Run DIWASP analysis
result = diwasp(df, sensor_mapping=mapping, ...)

# Use wavespectra methods
hs = result.spec.hs()  # Significant wave height
tp = result.spec.tp()  # Peak period

# Plot using wavespectra
result.spec.plot()

Input Validation#

The function validates inputs and will raise errors for:

Empty Sensor Mapping#

# This will raise ValueError
result = diwasp(df, sensor_mapping={}, ...)

Error: ValueError: sensor_mapping cannot be empty

Invalid Depth#

# This will raise ValueError
result = diwasp(df, sensor_mapping=mapping, depth=-10.0, ...)

Error: ValueError: depth must be positive

Invalid Window Overlap#

# This will raise ValueError
result = diwasp(
    df,
    sensor_mapping=mapping,
    window_length=300,
    window_overlap=300,  # Must be less than window_length
    depth=20.0,
)

Error: ValueError: window_overlap must be less than window_length

Troubleshooting#

“DatetimeIndex” Error#

ValueError: DataFrame index must be DatetimeIndex

Solution: Ensure your DataFrame has a datetime index:

df.index = pd.to_datetime(df.index)
# or
df = pd.read_csv('data.csv', index_col='time', parse_dates=True)

“Window exceeds data length” Error#

Solution: Reduce window length or get more data:

# Check data length
print(f"Data duration: {(df.index[-1] - df.index[0]).total_seconds()} seconds")

# Use shorter windows
result = diwasp(df, ..., window_length=600, ...)  # 10 minutes

Poor Results#

Check sensor positions are correct
Try a more robust method ('emep' or 'bdm')
Verify data quality (no gaps, correct units)
Ensure sufficient sensors for directional resolution

Slow Performance#

Use faster method ('imlm' or 'emlm')
Reduce directional resolution (dres=90)
Limit frequency range with freqs parameter
Reduce nfft for faster FFT computation