Convert raw files

Supported raw file types

Echopype currently supports conversion into netCDF4 or Zarr files from the following raw formats:

  • .raw files generated by Kongsberg-Simrad’s EK60, ES70, EK80, and ES80 echosounders and Kongsberg’s EA640 echosounder

  • .01A files generated by ASL Environmental Sciences’ AZFP echosounder

  • .ad2cp files generated by Nortek’s Signature series Acoustic Doppler Current Profilers (ADCPs) (beta)

Importing echopype

We encourage importing the echopype package with the alias ep:

import echopype as ep

In the examples below, we import open_raw as follows:

from echopype import open_raw

Conversion operation

File conversion for different types of echosounders is achieved with the single function open_raw, which parses the raw data and returns a fully parsed EchoData object.

Use the parameter sonar_model to indicate the sonar type:
  • EK60: Kongsberg-Simrad EK60 echosounder

  • ES70: Kongsberg-Simrad ES70 echosounder

  • EK80: Kongsberg-Simrad EK80 echosounder

  • ES80: Kongsberg-Simrad ES80 echosounder

  • EA640: Kongsberg EA640 echosounder

  • AZFP: ASL Environmental Sciences AZFP echosounder

  • AD2CP: Nortek Signature series ADCP (tested with Signature 500 and Signature 1000)

EchoData objects are based on the SONAR-netCDF4 version 1 convention, with some modifications introduced by echopype; see Interoperable data formats for details.

In the following example, open_raw is used to convert a raw EK80 file and return an in-memory EchoData object ed. The to_netcdf method on ed is then used to generate a converted SONAR-netCDF4 version 1 file named FILENAME.nc, saved to the directory path ./unpacked_files:

ed = open_raw('FILENAME.raw', sonar_model='EK80')  # for EK80 file
ed.to_netcdf(save_path='./unpacked_files')
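To save to Zarr instead, use the analogous to_zarr method (described further under File export below):

ed.to_zarr(save_path='./unpacked_files')  # generates FILENAME.zarr instead of FILENAME.nc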

For data files from the AZFP echosounder, the conversion requires an extra .XML file along with the .01A data file, specified using the parameter xml_path:

ed = open_raw('FILENAME.01A', sonar_model='AZFP', xml_path='XMLFILENAME.xml')
ed.to_netcdf(save_path='./unpacked_files')

The .XML file contains metadata needed to unpack the binary data files; typically, a single .XML file is associated with all .01A files from the same deployment.

Note

The EchoData instance contains all the data unpacked from the raw file, so it is a good idea to clear it from memory once conversion is complete.
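For example, in a batch-conversion script (raw_files below is a hypothetical list of .raw paths), the reference can be deleted after writing so the memory can be reclaimed:

for raw_file in raw_files:  # hypothetical list of .raw file paths
    ed = open_raw(raw_file, sonar_model='EK80')
    ed.to_zarr(save_path='./unpacked_files')
    del ed  # release the unpacked data from memory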

Attention

In version 0.6.2 of echopype, we reduced the memory usage of open_raw by allowing users to write variables that may consume a large amount of memory directly into a temporary Zarr store (see #774).

This feature is accessible through open_raw via the arguments use_swap and max_mb, and is only available for the following echosounders: EK60, ES70, EK80, ES80, EA640. See the API reference for usage. This is currently a beta feature that will benefit from user feedback.
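As a minimal sketch, assuming an EK80 file and an illustrative max_mb value (see the API reference for the exact semantics and defaults of both arguments):

ed = open_raw(
   'FILENAME.raw', sonar_model='EK80',
   use_swap=True,  # offload large variables to a temporary zarr store
   max_mb=100      # illustrative size threshold in MB; see API reference
)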

File access

open_raw can also accept paths to files on remote systems, such as HTTP(S) URLs (a file on a web server) and cloud object storage such as Amazon Web Services (AWS) S3. This capability is provided by the fsspec package, and all file systems implemented by fsspec are supported; a list of these file systems is available in the fsspec registry documentation.

https access

A file on a web server can be accessed by specifying its URL:

raw_file_url = "https://mydomain.com/my/dir/D20170615-T190214.raw"
ed = open_raw(raw_file_url, sonar_model='EK60')

AWS S3 access

Note

These instructions should apply to other object storage providers such as Google Cloud and Azure, but have only been tested on AWS S3.

A file on an AWS S3 “bucket” can be accessed by specifying the S3 path that starts with “s3://” and using the storage_options argument. For a publicly accessible file (“anonymous”) on a bucket called mybucket:

raw_file_s3path = "s3://mybucket/my/dir/D20170615-T190214.raw"
ed = open_raw(
   raw_file_s3path, sonar_model='EK60',
   storage_options={'anon': True}
)

If the file is not publicly accessible, the credentials can be specified explicitly through storage_options keywords:

ed = open_raw(
   raw_file_s3path, sonar_model='EK60',
   storage_options={'key': 'ACCESSKEY', 'secret': 'SECRETKEY'}
)

or via a profile in the default AWS credentials file (~/.aws/credentials). For the profile “myprofilename” found in the credentials file (note that aiobotocore is installed as a dependency of echopype):

from aiobotocore.session import AioSession

aws_session = AioSession(profile='myprofilename')
ed = open_raw(
   raw_file_s3path, sonar_model='EK60',
   storage_options={'session': aws_session}
)

File export

Converted data are saved to netCDF4 or Zarr files using EchoData.to_netcdf() and EchoData.to_zarr(). These methods accept optional arguments that control where and how the converted files are written. The examples below apply equally to both methods, except as noted.

A destination folder or file path can be specified with the save_path argument to control where the converted files are written. If save_path is not specified, the converted .nc and .zarr files are saved into the directory ~/.echopype/temp_output, which will be created if it doesn't already exist.
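For example, both calls below write into ./unpacked_files, with the output file name derived from the raw file name; passing overwrite=True (also used in the S3 example below) replaces existing output at the destination:

ed.to_netcdf(save_path='./unpacked_files')                 # writes ./unpacked_files/FILENAME.nc
ed.to_zarr(save_path='./unpacked_files', overwrite=True)   # replaces an existing FILENAME.zarr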

Specify metadata attributes

Before calling to_netcdf() or to_zarr(), you can manually set metadata attributes that are not recorded in the raw data files but need to be specified according to the SONAR-netCDF4 convention. Common attributes typically not found in the raw files include platform_name, platform_type, and platform_code_ICES in the Platform netCDF4 group. These attributes can be set as follows:

ed['Platform']['platform_name'] = 'OOI'
ed['Platform']['platform_type'] = 'subsurface mooring'
ed['Platform']['platform_code_ICES'] = '3164'   # Platform code for Moorings

The platform_code_ICES attribute can be chosen by referencing the platform code from the ICES SHIPC vocabulary.

Save to AWS S3

Note

These instructions should apply to other object storage providers such as Google Cloud and Azure, but have only been tested on AWS S3.

Converted files can be saved directly into an AWS S3 bucket by specifying output_storage_options, which works like storage_options for input files (see “AWS S3 access” above). The example below illustrates a fully remote processing pipeline: reading a raw file from a web server and saving the converted Zarr dataset to S3. (As with storage_options, a profile-based session can also be passed to output_storage_options.) Writing netCDF4 files to S3 is currently not supported.

raw_file_url = 'http://mydomain.com/from1/file_01.raw'
ed = open_raw(raw_file_url, sonar_model='EK60')
ed.to_zarr(
   overwrite=True,
   save_path='s3://mybucket/converted_file.zarr',
   output_storage_options={'key': 'ACCESSKEY', 'secret': 'SECRETKEY'}
)

Note

Zarr datasets will be automatically chunked, with default chunk sizes of 25000 for the range_sample dimension and 2500 for the ping_time dimension.