Convert raw files

Supported raw file types

Echopype currently supports conversion into netCDF4 or Zarr files from the following raw formats:

  • .raw files generated by Kongsberg-Simrad’s EK60 and EK80 echosounders and Kongsberg’s EA640 echosounder
  • .01A files generated by ASL Environmental Sciences’ AZFP echosounder
  • .ad2cp files generated by Nortek’s Signature series Acoustic Doppler Current Profilers (ADCPs) (beta)

Importing echopype

We encourage importing the echopype package with the alias ep:

import echopype as ep

In the examples below, we import open_raw as follows:

from echopype import open_raw

Conversion operation

File conversion for all supported echosounders is carried out through the single function open_raw, which parses the raw data file and returns a fully parsed EchoData object.

For data files from EK60, EK80, and EA640 echosounders, use the sonar_model parameter to indicate the echosounder type, since the .raw extension alone does not identify which instrument generated the file. In this example, open_raw is used to convert a raw EK80 file and return the EchoData object ed, and to_netcdf then generates a converted netCDF file named FILENAME.nc saved to the directory ./unpacked_files:

ed = open_raw('FILENAME.raw', sonar_model='EK80')  # for EK80 file
ed.to_netcdf(save_path='./unpacked_files')
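
The same EchoData object can also be saved to a Zarr store by calling to_zarr instead:

ed.to_zarr(save_path='./unpacked_files')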

Attention

  • Prior to version 0.5.0, conversion was carried out through the “Convert” interface. This interface is still available but will be deprecated in a future version.
  • Versions of echopype prior to 0.5.0 used raw2nc and raw2zarr to convert to netCDF4 or Zarr files, respectively. These methods have been renamed to to_netcdf and to_zarr, and the old names will be deprecated in a future version.
  • The EchoData class was overhauled in 0.5.0, and the new open_raw function returns a fully parsed EchoData object that can be operated on in memory or exported to a converted file. For more details, see Open converted files.

For data files from the AZFP echosounder, the conversion requires a companion .XML file in addition to the .01A data file; the path to the .XML file is specified with the xml_path parameter:

ed = open_raw('FILENAME.01A', sonar_model='AZFP', xml_path='XMLFILENAME.xml')
ed.to_netcdf(save_path='./unpacked_files')

The .XML file contains metadata needed for unpacking the binary data files. Typically, a single .XML file is associated with all .01A files from the same deployment.

Note

The EchoData instance contains all the data unpacked from the raw file, so it is a good idea to clear it from memory once done with conversion.
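
For example, once the converted file has been written, the object can be released by deleting the reference to it:

ed.to_netcdf(save_path='./unpacked_files')
del ed  # remove the reference so Python can reclaim the memory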

File access

open_raw can also accept paths to files on remote systems such as http (a file on a web server) and cloud object storage such as Amazon Web Services (AWS) S3. This capability is provided by the fsspec package, and all file systems implemented by fsspec are supported; a list of these file systems is available on the fsspec registry documentation.

Attention

fsspec-based access to file locations other than the local file system was introduced in version 0.5.0.

https access

A file on a web server can be accessed by specifying its URL:

raw_file_url = "https://mydomain.com/my/dir/D20170615-T190214.raw"
ed = open_raw(raw_file_url, sonar_model='EK60')

AWS S3 access

Note

These instructions should apply to other object storage providers such as Google Cloud and Azure, but have only been tested on AWS S3.

A file on an AWS S3 “bucket” can be accessed by specifying the S3 path that starts with “s3://” and using the storage_options argument. For a publicly accessible file (“anonymous”) on a bucket called mybucket:

raw_file_s3path = "s3://mybucket/my/dir/D20170615-T190214.raw"
ed = open_raw(
   raw_file_s3path, sonar_model='EK60',
   storage_options={'anon': True}
)

If the file is not publicly accessible, credentials can be specified explicitly through storage_options entries:

ed = open_raw(
   raw_file_s3path, sonar_model='EK60',
   storage_options={'key': 'ACCESSKEY', 'secret': 'SECRETKEY'}
)

or via a profile in the default AWS credentials file (~/.aws/credentials). For a profile named “myprofilename” in that file:

from aiobotocore.session import AioSession
aws_session = AioSession(profile='myprofilename')
ed = open_raw(
   raw_file_s3path, sonar_model='EK60',
   storage_options={'session': aws_session}
)

File export

Converted data are saved to netCDF4 or Zarr files using EchoData.to_netcdf() and EchoData.to_zarr(). Both methods accept optional arguments that control where and how the converted data are written. The examples below apply equally to both methods, except as noted.

To control where the converted files are written, specify a destination folder or file path with the save_path argument of these methods. If save_path is not specified, the converted .nc or .zarr file is saved into a folder called temp_echopype_output under the current execution folder; this folder is created if it doesn't already exist.
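
For example, calling the method without save_path (a sketch assuming the source file was named FILENAME.raw) writes the output to the default folder:

ed.to_netcdf()  # saved as ./temp_echopype_output/FILENAME.nc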

Attention

The use of a default temp_echopype_output folder was introduced in version 0.5.0. In prior versions, the default was to save each converted file into the same folder as the corresponding input file.

Specify metadata attributes

Before calling to_netcdf() or to_zarr(), you can manually set metadata attributes that are not recorded in the raw data files but need to be specified according to the SONAR-netCDF4 convention. These attributes include platform_name, platform_type, platform_code_ICES, and, depending on the sonar model, sometimes water_level. They can be set as follows:

ed.platform.attrs['platform_name'] = 'OOI'
ed.platform.attrs['platform_type'] = 'subsurface mooring'
ed.platform.attrs['platform_code_ICES'] = '3164'   # Platform code for Moorings

The platform_code_ICES attribute can be chosen by referencing the platform code from the ICES SHIPC vocabulary.
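
After setting these attributes on the in-memory EchoData object, export as usual; the attributes are then written out with the Platform group of the converted file:

ed.to_netcdf(save_path='./unpacked_files')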

Save to AWS S3

Note

These instructions should apply to other object storage providers such as Google Cloud and Azure, but have only been tested on AWS S3.

Attention

Saving to S3 was introduced in version 0.5.0.

Converted files can be saved directly into an AWS S3 bucket by specifying storage_options as done with input files (see above, “AWS S3 access”). The example below illustrates a fully remote processing pipeline, reading a raw file from a web server and saving the converted Zarr dataset to S3. Writing netCDF4 to S3 is currently not supported.

raw_file_url = 'http://mydomain.com/from1/file_01.raw'
ed = open_raw(raw_file_url, sonar_model='EK60')
ed.to_zarr(
   overwrite=True,
   save_path='s3://mybucket/converted_file.zarr',
   storage_options={'key': 'ACCESSKEY', 'secret': 'SECRETKEY'}
)

Note

Zarr datasets are automatically chunked, with default chunk sizes of 25000 along the range_bin dimension and 2500 along the ping_time dimension.
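
To inspect the chunking of a saved Zarr dataset, the store can be opened directly with xarray (a sketch; the local path is a placeholder, and the 'Beam' group and backscatter_r variable names assume echopype's SONAR-netCDF4 layout for this version):

import xarray as xr

ds_beam = xr.open_zarr('./unpacked_files/FILENAME.zarr', group='Beam')
print(ds_beam['backscatter_r'].chunks)  # per-dimension chunk sizes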

Non-uniform data

Due to flexibility in echosounder settings, some dimensional parameters can change in the middle of the file. For example:

  • The maximum depth range to which data are collected can change in the middle of an EK60 data file; this often happens when the bottom depth changes.
  • The sampling interval, a temporal resolution that translates directly to range resolution, can also change in the middle of a file.
  • Data from different frequency channels can also be collected with different sampling intervals.

These changes produce different numbers of samples along range (the range_bin dimension in the converted file) across pings or channels, which is incompatible with the goal of saving the data as a multi-dimensional array that can be easily indexed using xarray.

Echopype accommodates these cases by padding the “shorter” pings or channels with NaN to form a regular multi-dimensional array. We use the data compression options in xarray.Dataset.to_netcdf() and xarray.Dataset.to_zarr() to avoid dramatically increasing the output file size due to this padding.
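
As a toy illustration of this padding (not echopype's internal implementation), two pings with different numbers of samples can be combined into one rectangular array by filling the shorter ping with NaN:

import numpy as np

ping1 = np.array([1.0, 2.0, 3.0, 4.0])  # 4 samples along range
ping2 = np.array([1.0, 2.0])            # only 2 samples

n_max = max(len(ping1), len(ping2))
padded = np.full((2, n_max), np.nan)    # shape: (ping_time, range_bin)
padded[0, :len(ping1)] = ping1
padded[1, :len(ping2)] = ping2
# padded can now be wrapped in an xarray.DataArray and indexed along both dimensions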