Converting raw files

Supported raw file types

Echopype supports converting raw instrument-generated data files into netCDF or Zarr from the following echosounders:

.raw files generated by Kongsberg Simrad EK60, ES70, EK80, and ES80 echosounders and Kongsberg EA640 echosounder
.01A files generated by ASL Environmental Sciences AZFP echosounder
.ad2cp files generated by Nortek Signature series Acoustic Doppler Current Profilers (ADCPs) (beta)

Conversion operation

File conversion for different types of echosounders is achieved by using the function open_raw to parse the raw data and create an EchoData object.

Use the parameter sonar_model to indicate the echosounder model:

EK60: Kongsberg Simrad EK60 echosounder
ES70: Kongsberg Simrad ES70 echosounder
EK80: Kongsberg Simrad EK80 echosounder
ES80: Kongsberg Simrad ES80 echosounder
EA640: Kongsberg EA640 echosounder
AZFP: ASL Environmental Sciences AZFP echosounder
AD2CP: Nortek Signature series ADCP (tested with Signature 500 and Signature 1000 files collected in 2021)
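As a minimal sketch of how the file types and sonar_model values listed above pair up, the helper below checks that a file extension is plausible for a given model before open_raw is called. The helper name check_sonar_model and the mapping MODELS_BY_EXTENSION are hypothetical and not part of echopype; they are built only from the lists on this page.

```python
from pathlib import PurePosixPath

# Extension -> sonar_model values, taken from the lists above.
# This mapping and the helper are illustrative; they are not part of echopype.
MODELS_BY_EXTENSION = {
    ".raw": {"EK60", "ES70", "EK80", "ES80", "EA640"},
    ".01a": {"AZFP"},
    ".ad2cp": {"AD2CP"},
}


def check_sonar_model(raw_path: str, sonar_model: str) -> bool:
    """Return True if the file extension is consistent with sonar_model."""
    extension = PurePosixPath(raw_path).suffix.lower()
    return sonar_model.upper() in MODELS_BY_EXTENSION.get(extension, set())
```

For example, check_sonar_model("FILENAME.01A", "AZFP") returns True, while check_sonar_model("FILENAME.01A", "EK60") returns False.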

To convert a raw EK80 file to an in-memory EchoData object:

import echopype as ep  # we encourage importing echopype as ep
ed = ep.open_raw("FILENAME.raw", sonar_model="EK80")  # for EK80 file

For data from the AZFP echosounder, the conversion requires an extra .XML file (specified using xml_path) along with the .01A data file:

ed = ep.open_raw("FILENAME.01A", sonar_model="AZFP", xml_path="XML_FILENAME.xml")  # AZFP data need an XML file
ed.to_netcdf(save_path="./unpacked_files")

The AZFP .XML file contains a lot of metadata needed for unpacking the binary .01A files. Typically a single .XML file is associated with all files from the same deployment.

Tip

The EchoData object contains all the data unpacked from the raw file, so it is a good idea to clear it from memory once done with conversion.
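One way to follow this tip when converting many files is to keep each EchoData object inside a small function and drop it as soon as it has been saved. The sketch below assumes a hypothetical wrapper, convert_and_release, which takes the opening function as an argument (so that ep.open_raw can be passed in); it is not an echopype function.

```python
import gc


def convert_and_release(open_fn, raw_path, save_path, **open_kwargs):
    """Convert one raw file, save it as netCDF, then drop the in-memory object.

    open_fn is expected to behave like ep.open_raw; passing it in is an
    assumption of this sketch, not an echopype convention.
    """
    ed = open_fn(raw_path, **open_kwargs)
    ed.to_netcdf(save_path=save_path)
    del ed          # remove the only reference to the EchoData object
    gc.collect()    # optionally prompt Python to reclaim the memory now
```

It could be called as convert_and_release(ep.open_raw, "FILENAME.raw", "./unpacked_files", sonar_model="EK80").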

Attention

In Echopype v0.6.2 we improved open_raw by allowing users to directly write variables that may consume a large amount of memory into a temporary zarr store (#774).

This feature is accessible through open_raw via arguments use_swap and max_mb and is only available for the following echosounders: EK60, ES70, EK80, ES80, EA640. See the API reference for usage.

Local and remote file access

open_raw can accept paths to files on both local and remote file systems (e.g., an HTTP web server or cloud object storage such as Amazon Web Services (AWS) S3). This capability is provided by the fsspec package, and all file systems implemented by fsspec are supported (see the fsspec documentation for the list).

A file on a web server can be accessed by specifying the file URL:

ed = ep.open_raw(
    "https://mydomain.com/my/dir/D20170615-T190214.raw",  # file on http server
    sonar_model="EK80"
)

For a file in a publicly accessible S3 bucket:

raw_file_s3path = "s3://mybucket/my/dir/D20170615-T190214.raw"
ed = ep.open_raw(
    raw_file_s3path,  # file in S3 bucket
    sonar_model="EK80",
    storage_options={"anon": True}  # publicly accessible file ("anonymous")
)

For a file in a private S3 bucket:

raw_file_s3path = "s3://mybucket/my/dir/D20170615-T190214.raw"
ed = ep.open_raw(
    raw_file_s3path,  # file in S3 bucket
    sonar_model="EK80",
    storage_options={"key": "ACCESSKEY", "secret": "SECRETKEY"}  # access credentials
)

It is often safer to store a credential file so that the access credentials are not supplied directly in scripts or notebooks. For example, for AWS, the default credentials file (~/.aws/credentials) can contain a profile named “myprofilename”, which can then be used with aiobotocore to access data:

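from aiobotocore.session import AioSession  # AioSession lives in aiobotocore.session

aws_session = AioSession(profile="myprofilename")  # profile from ~/.aws/credentials
ed = ep.open_raw(
    raw_file_s3path,  # S3 path defined in the examples above
    sonar_model="EK80",
    storage_options={"session": aws_session}
)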
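Another common way to keep credentials out of scripts is to read them from environment variables. The sketch below assumes the standard AWS variable names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY and a hypothetical helper, s3_storage_options, that falls back to anonymous access when they are not set. It is not part of echopype.

```python
import os


def s3_storage_options(env=None):
    """Build a storage_options dict for S3 from environment variables.

    Hypothetical helper: uses the standard AWS variable names and falls back
    to anonymous access when no credentials are set.
    """
    env = os.environ if env is None else env
    key = env.get("AWS_ACCESS_KEY_ID")
    secret = env.get("AWS_SECRET_ACCESS_KEY")
    if key and secret:
        return {"key": key, "secret": secret}
    return {"anon": True}
```

The returned dictionary can be passed as storage_options to open_raw, in the same way as in the examples above.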

Note

These instructions should apply to other object storage providers such as Google Cloud Platform and Microsoft Azure, but have only been tested on AWS S3.

Saving converted data

The converted EchoData object can be saved to netCDF4 (.nc) or Zarr (.zarr) files using the .to_netcdf or .to_zarr method. The destination folder or file path should be specified with the save_path argument. If left unspecified, the converted files will be saved to ~/.echopype/temp_output. This folder will be created if it does not already exist.

ed.to_netcdf(save_path="./unpacked_files")  # save to FILENAME.nc in the folder unpacked_files
ed.to_zarr(save_path="./unpacked_files/NEW_FILENAME.zarr")  # fully specify filename also works
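The save_path rules described above can be illustrated with a small, hypothetical function that guesses the resulting output path. This is only a sketch of the behavior as described on this page, not echopype's implementation; in particular, treating any save_path that ends in .nc or .zarr as a full file name is an assumption.

```python
from pathlib import Path

DEFAULT_OUTPUT_DIR = Path("~/.echopype/temp_output")  # default folder named above


def expected_output_path(raw_file, save_path=None, extension=".nc"):
    """Guess where a converted file ends up, following the rules described above.

    Illustrative only: treating any save_path ending in .nc or .zarr as a full
    file name is an assumption of this sketch.
    """
    if save_path is None:
        folder = DEFAULT_OUTPUT_DIR.expanduser()
    else:
        target = Path(save_path)
        if target.suffix in {".nc", ".zarr"}:
            return target  # a fully specified file name is used as given
        folder = target
    # Otherwise reuse the raw file's base name with the new extension.
    return folder / (Path(raw_file).stem + extension)
```

With these assumptions, a folder such as "./unpacked_files" yields unpacked_files/FILENAME.nc for FILENAME.raw, while a full file name such as "./unpacked_files/NEW_FILENAME.zarr" is returned unchanged.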

The converted EchoData object can also be saved directly to an AWS S3 bucket by specifying output_storage_options, similar to the storage_options argument in open_raw. The example below illustrates a workflow that reads a raw file from a web server and saves the converted Zarr dataset to S3. Writing netCDF4 to S3 is currently not supported.

ed = ep.open_raw("http://mydomain.com/from1/file_01.raw", sonar_model="EK60")
ed.to_zarr(
    overwrite=True,
    save_path="s3://mybucket/converted_file.zarr",
    output_storage_options={"key": "ACCESSKEY", "secret": "SECRETKEY"}
)

Note

Zarr datasets will be automatically chunked with default chunk sizes of 25000 for range_sample and 2500 for ping_time dimensions.
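To get a feel for what these defaults mean for a given dataset, the number of chunks along each dimension is the dimension length divided by the chunk size, rounded up. The sketch below uses a hypothetical helper, count_chunks, and made-up dimension lengths; it does not call echopype.

```python
import math

# Default chunk sizes quoted in the note above.
DEFAULT_CHUNKS = {"range_sample": 25000, "ping_time": 2500}


def count_chunks(dim_sizes, chunk_sizes=DEFAULT_CHUNKS):
    """Return the number of chunks along each dimension for the given sizes."""
    return {
        dim: math.ceil(size / chunk_sizes[dim])
        for dim, size in dim_sizes.items()
        if dim in chunk_sizes
    }
```

For a hypothetical dataset with 10000 pings and 3000 range samples, count_chunks({"ping_time": 10000, "range_sample": 3000}) gives 4 chunks along ping_time and 1 along range_sample.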

Specify metadata attributes

You can manually set some EchoData metadata attributes specified in the SONAR-netCDF4 convention that are not recorded in the raw instrument-generated files. For example, many Platform variables are not stored in the raw files, including platform_name, platform_type and platform_code_ICES. They can be set by:

ed["Platform"]["platform_name"] = "OOI"
ed["Platform"]["platform_type"] = "subsurface mooring"
ed["Platform"]["platform_code_ICES"] = "3164"   # Platform code for Moorings

platform_code_ICES can be chosen by referencing the ICES SHIPC vocabulary.