Using echopype

Installation

Echopype can be installed from PyPI:

$ pip install echopype

or through conda:

$ conda install -c conda-forge echopype

When creating a conda environment to work with echopype, run

$ conda create -c conda-forge --name echopype python=3.8 --file requirements.txt --file requirements-dev.txt

Echopype requires Python 3.7 or later.

Test files

Echopype uses Git Large File Storage (Git LFS) to store its binary data and test files. Git LFS keeps the GitHub repository small while still providing access to the large files needed for testing. These files are only needed if you plan to work on the code and run the tests locally.

To access the test files, first install Git LFS.

Cloning echopype after installing Git LFS will automatically pull the test data, but if echopype was cloned first, then pull the files from Git LFS by running:

$ git lfs pull

Note

Echopype has recently migrated to Git LFS, which required removing the large datasets from the history. Those who previously forked echopype are advised to delete their fork and create a new one. Otherwise, pulling from the original repository will result in twice the number of commits due to the rewritten history.

File conversion

Supported file types

Echopype currently supports conversion from

  • .raw files generated by Simrad’s EK60 echosounder

  • .raw files generated by Simrad’s EK80 echosounder

  • .01A files generated by ASL Environmental Sciences’ AZFP echosounder

into netCDF (stable) or zarr (beta) files.

Note

EK80 .raw conversion is currently in alpha testing. Please report any bugs by creating issues on GitHub. Pull requests are always welcome!

We are currently addressing the issue with parsing transducer configuration from data collected by the 2-in-1 transducer (see #145 and #146).

We are considering implementing calibration routines for raw beam data from commonly found Acoustic Doppler Current Profilers (ADCPs).

Conversion operation

File conversion for different types of echosounders is achieved by using a single interface through the Convert subpackage.

For data files from the EK60 echosounder, you can do the following in an interactive Python session:

from echopype import Convert
dc = Convert('FILENAME.raw')
dc.raw2nc()

This will generate a FILENAME.nc file in the same directory as the original FILENAME.raw file.

For data files from the EK80 broadband echosounder, use

dc = Convert('FILENAME.raw', model='EK80')

to indicate the echosounder type, since the filename extension .raw does not contain echosounder type information.

For data files from the AZFP echosounder, the conversion requires an extra .XML file along with the .01A data file. The .XML file contains a lot of metadata needed for unpacking the binary data files. Typically one single .XML file is associated with all files from the same deployment.

This can be done by:

from echopype import Convert
dc = Convert('FILENAME.01A', 'XMLFILENAME.xml')
dc.raw2nc()

Before calling raw2nc() to create netCDF4 files, you should first set platform_name, platform_type, and platform_code_ICES, as these values are not recorded in the raw data files but need to be specified according to the netCDF4 convention. These parameters will be saved as empty strings unless you specify them following the example below:

dc.platform_name = 'OOI'
dc.platform_type = 'subsurface mooring'
dc.platform_code_ICES = '3164'   # Platform code for Moorings

The platform_code_ICES attribute can be chosen by referencing the platform code from the ICES SHIPC vocabulary.

Note

  1. For conversion to zarr files, call method .raw2zarr() from the same Convert object as shown above.

  2. The Convert instance contains all the data unpacked from the raw file, so it is a good idea to clear it from memory once done with conversion.

More conversion options

There are optional arguments that you can pass into Convert.raw2nc() that may come in handy.

  • Save converted files into another folder:

    By default, the converted .nc files are saved into the same folder as the input files. This can be changed by setting save_path to the path of a directory.

    raw_file_path = ['./raw_data_files/file_01.raw',   # a list of raw data files
                     './raw_data_files/file_02.raw',
                     ...]
    dc = Convert(raw_file_path)                        # create a Convert object
    dc.raw2nc(save_path='./unpacked_files')            # set the output directory
    

    Each input file will be converted to individual .nc files and stored in the specified directory.

  • Combine multiple raw data files into one .nc file when unpacking:

    raw_file_path = ['./raw_data_files/file_01.raw',   # a list of raw data files
                     './raw_data_files/file_02.raw',
                     ...]
    dc = Convert(raw_file_path)                        # create a Convert object
    dc.raw2nc(combine_opt=True,                        # combine all input files when unpacking
              save_path='./unpacked_files/combined_file.nc')
    

    save_path has to be given explicitly when combining multiple files. If save_path is only a filename instead of a full path, the combined output file will be saved in the same folder as the raw data files.
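The output locations described in the two options above can be summarized in a small sketch. The helper `planned_outputs` is hypothetical and is not part of echopype; it only restates the rules in the text (one .nc per input by default, a single file when combining), and echopype's own path handling may differ in edge cases.

```python
from pathlib import Path


def planned_outputs(raw_files, save_path=None, combine=False):
    """Return the .nc paths implied by the rules in the text (illustrative only)."""
    raw_files = [Path(f) for f in raw_files]
    if combine:
        # Combining requires an explicit save_path.
        if save_path is None:
            raise ValueError("save_path must be given when combining files")
        out = Path(save_path)
        # A bare filename is placed next to the raw data files.
        if out.parent == Path("."):
            out = raw_files[0].parent / out.name
        return [out]
    # One output per input: same stem, .nc extension,
    # in save_path if given, else next to the input file.
    return [
        (Path(save_path) if save_path is not None else f.parent) / (f.stem + ".nc")
        for f in raw_files
    ]


files = ["./raw_data_files/file_01.raw", "./raw_data_files/file_02.raw"]
print(planned_outputs(files, save_path="./unpacked_files"))
print(planned_outputs(files, save_path="combined_file.nc", combine=True))
```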

Non-uniform data

Due to flexibility in echosounder settings, some dimensional parameters can change in the middle of the file. For example:

  • The maximum depth range to which data are collected can change in the middle of a data file in EK60. This happens often when the bottom depth changes.

  • The sampling interval, which translates to temporal resolution, and thus range resolution, can also change in the middle of the file.

  • Data from different frequency channels can also be collected with different sampling intervals.

These changes produce different numbers of samples along range (the range_bin dimension in the converted .nc file), which is incompatible with the goal of saving the data as a multi-dimensional array that can be easily indexed using xarray.

Echopype accommodates these cases in the following two ways:

  1. When there are changes in the range_bin dimension in the middle of a data file, echopype creates a separate file for each consecutive chunk of data with the same number of samples along range and appends _partXX to the converted filename to indicate the existence of such changes. For example, if datafile.raw contains changes in the number of samples along range, the converted output will be datafile_part01.nc, datafile_part02.nc, etc.

  2. When the number of samples along the range_bin dimension differs between frequency channels, echopype pads the shorter channels with NaN to form a multi-dimensional array. We use the data compression option in xarray.to_netcdf() and xarray.to_zarr() to avoid dramatically increasing the output file size due to padding.
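The two strategies above can be illustrated with plain Python lists. This is a minimal sketch of the idea only, not echopype's implementation: the function names are hypothetical, and each ping or channel is represented as a simple list of sample values.

```python
import math
from itertools import groupby


def split_by_sample_count(pings):
    """Group consecutive pings that share the same number of samples."""
    return [list(chunk) for _, chunk in groupby(pings, key=len)]


def part_filenames(stem, n_parts):
    """Name one output per chunk; a single chunk keeps the plain name."""
    if n_parts == 1:
        return [f"{stem}.nc"]
    return [f"{stem}_part{i:02d}.nc" for i in range(1, n_parts + 1)]


def pad_channels(channels):
    """Pad shorter channels with NaN so every channel has the same length."""
    longest = max(len(ch) for ch in channels)
    return [list(ch) + [math.nan] * (longest - len(ch)) for ch in channels]


# Case 1: the number of samples changes partway through the file.
pings = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0]]
chunks = split_by_sample_count(pings)
print(part_filenames("datafile", len(chunks)))
# ['datafile_part01.nc', 'datafile_part02.nc']

# Case 2: channels with different numbers of samples.
print(pad_channels([[1.0, 2.0, 3.0], [4.0]]))
```

Grouping only consecutive pings means that a file switching back to an earlier sample count would still start a new part, which matches the "consecutive chunk" wording above.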

Data processing

Warning

The model subpackage and the data processing interface EchoData have been renamed to process and Process, respectively. Attempts to import echopype.model and use EchoData will still work at the moment but will be deprecated in the future.

Functionality

  • EK60 and AZFP narrowband echosounders:

    • Calibration and echo-integration to obtain volume backscattering strength (Sv) from power data.

    • Simple noise removal by removing data points (set to NaN) below an adaptively estimated noise floor [1].

    • Binning and averaging to obtain mean volume backscattering strength (MVBS) from the calibrated data.

  • EK80 broadband echosounder:

    • Calibration based on pulse compression output in the form of average over frequency.
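As background for the binning and averaging step listed above: Sv is expressed in dB, so a mean is conventionally taken in the linear domain and then converted back to dB. The sketch below shows this for a one-dimensional series; the function names are hypothetical, and the bin definition and the choice to skip NaN values are assumptions that may not match what get_MVBS() does.

```python
import math


def mean_sv_db(sv_db):
    """Average Sv values (dB) in the linear domain; NaN values are skipped."""
    valid = [v for v in sv_db if not math.isnan(v)]
    if not valid:
        return math.nan
    linear_mean = sum(10 ** (v / 10) for v in valid) / len(valid)
    return 10 * math.log10(linear_mean)


def mvbs(sv_db, bin_size):
    """Split a 1-D Sv series into consecutive bins and average each bin."""
    return [
        mean_sv_db(sv_db[i:i + bin_size])
        for i in range(0, len(sv_db), bin_size)
    ]


# The second bin averages -60 dB and -80 dB: the result is dominated by the
# stronger value and is higher than the arithmetic mean of the dB values (-70).
print(mvbs([-70.0, -70.0, -60.0, -80.0], bin_size=2))
```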

The steps for performing these analyses for EK60 and AZFP echosounders are summarized below. More information will be added for the broadband EK80 echosounder as additional functionality is developed.

from echopype import Process
nc_path = './converted_files/convertedfile.nc'  # path to a converted nc file
ed = Process(nc_path)   # create a processing object
ed.calibrate()           # Sv
ed.remove_noise()        # denoise
ed.get_MVBS()            # calculate MVBS

By default, these methods do not save the calculation results to disk. The computation results can be accessed from ed.Sv, ed.Sv_clean and ed.MVBS as xarray Datasets with proper dimension labels.

To save results to disk:

ed.calibrate(save=True)     # output: convertedfile_Sv.nc
ed.remove_noise(save=True)  # output: convertedfile_Sv_clean.nc
ed.get_MVBS(save=True)      # output: convertedfile_MVBS.nc

There are various options to save the results:

# Change the output postfix from _Sv to _Cal: convertedfile_Cal.nc
ed.calibrate(save=True, save_postfix='_Cal')

# Save output to another directory: ./cal_results/convertedfile_Sv.nc
ed.calibrate(save=True, save_path='./cal_results')

# Save output to another directory with an arbitrary name
ed.calibrate(save=True, save_path='./cal_results/somethingnew.nc')
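The naming rules shown in the comments above can be restated as a small sketch. The helper `output_path` is hypothetical and not part of echopype; in particular, treating any save_path with a file extension as a full file name is an assumption made here for illustration.

```python
from pathlib import Path


def output_path(nc_path, postfix="_Sv", save_path=None):
    """Return the output file implied by the examples in the text (illustrative only)."""
    nc_path = Path(nc_path)
    default_name = f"{nc_path.stem}{postfix}.nc"
    if save_path is None:
        # Default: next to the converted file, with the postfix appended.
        return nc_path.parent / default_name
    save_path = Path(save_path)
    if save_path.suffix:
        # Assumption: a path with an extension is an explicit file name.
        # A directory whose name contains a dot would be misread here.
        return save_path
    # Otherwise save_path is a directory and the default name is kept.
    return save_path / default_name


nc_path = "./converted_files/convertedfile.nc"
print(output_path(nc_path))
print(output_path(nc_path, postfix="_Cal"))
print(output_path(nc_path, save_path="./cal_results"))
print(output_path(nc_path, save_path="./cal_results/somethingnew.nc"))
```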

By default, for noise removal and MVBS calculation, echopype tries to load Sv already stored in memory (ed.Sv), or tries to calibrate the raw data to obtain Sv. If ed.Sv is empty (i.e., when the calibration operation has not been performed on the object), echopype will try to load Sv from *_Sv.nc in the directory containing the converted .nc file or from the user-specified path. For example:

  1. Try to do MVBS calculation without having previously calibrated data

    from echopype import Process
    nc_path = './converted_files/convertedfile.nc'  # path to a converted nc file
    ed = Process(nc_path)   # create a processing object
    ed.get_MVBS()  # echopype will call .calibrate() automatically
    
  2. Try to do MVBS calculation with _Sv_clean.nc file previously created in folder ‘another_directory’

    from echopype import Process
    nc_path = './converted_files/convertedfile.nc'  # path to a converted nc file
    ed = Process(nc_path)   # create a data processing object
    ed.get_MVBS(source_path='another_directory', source_postfix='_Sv_clean')
    

Note

Echopype’s data processing functionality is being developed actively. Be sure to check back here often!

Environmental parameters

Environmental parameters, including temperature, salinity and pressure, are critical in biological interpretation of ocean sonar data. They influence

  • Transducer calibration, through seawater absorption. This influence is frequency-dependent, and the higher the frequency the more sensitive the calibration is to the environmental parameters.

  • Sound speed, which impacts the conversion from the temporal resolution (of each data sample) to spatial resolution, i.e., the sonar observation range changes with sound speed.

By default, echopype uses the following for calibration:

  • EK60: Environmental parameters saved with the data files

  • AZFP: salinity = 29.6 PSU, pressure = 60 dbar, and temperature recorded at the instrument

These parameters should be overwritten when they differ from the actual environmental conditions during data collection. To update these parameters, do the following before calling ed.calibrate():

ed.temperature = 8   # temperature in degree Celsius
ed.salinity = 30     # salinity in PSU
ed.pressure = 50     # pressure in dbar
ed.recalculate_environment()  # recalculate related parameters

This will trigger recalculation of all related parameters, including sound speed, seawater absorption, thickness of each sonar sample, and range. The updated values can be retrieved with:

ed.seawater_absorption  # absorption in [dB/m]
ed.sound_speed          # sound speed in [m/s]
ed.sample_thickness     # sample spatial resolution in [m]
ed.range                # range for each sonar sample in [m]
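To see why sound speed affects range, recall that an echo travels to the target and back, so one sample interval corresponds to half the distance sound travels in that time. The sketch below uses this standard relation with hypothetical function names; echopype may apply additional instrument-specific corrections to the range that are not shown here.

```python
def sample_thickness(sound_speed, sample_interval):
    """Spatial extent of one sample in metres: c * dt / 2 (two-way travel)."""
    return sound_speed * sample_interval / 2


def sample_ranges(sound_speed, sample_interval, n_samples):
    """Range of each sample in metres, assuming the first sample is at 0 m."""
    thickness = sample_thickness(sound_speed, sample_interval)
    return [i * thickness for i in range(n_samples)]


# Illustrative values: the same 256-microsecond sample interval
# maps to a different thickness when the sound speed changes.
for c in (1450.0, 1500.0):
    print(c, sample_thickness(c, 256e-6))
```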

For EK60 data, echopype updates the sound speed and seawater absorption using the formulae from Mackenzie (1981) [2] and Ainslie and McColm (1998) [3], respectively.

For AZFP data, echopype updates the sound speed and seawater absorption using the formulae provided by the manufacturer, ASL Environmental Sciences.
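For reference, the Mackenzie (1981) nine-term equation used for the EK60 case can be written out as below. This is a standalone sketch of the published formula, not echopype's code: the function name is hypothetical, and the equation takes depth in metres, whereas echopype is given pressure in dbar (the two are numerically close in the upper ocean, which is an approximation assumed here).

```python
def mackenzie_sound_speed(temperature, salinity, depth):
    """Sound speed in m/s from the Mackenzie (1981) nine-term equation.

    temperature in degrees Celsius, salinity in PSU, depth in metres.
    """
    t, s, d = temperature, salinity, depth
    return (
        1448.96
        + 4.591 * t
        - 5.304e-2 * t ** 2
        + 2.374e-4 * t ** 3
        + 1.340 * (s - 35)
        + 1.630e-2 * d
        + 1.675e-7 * d ** 2
        - 1.025e-2 * t * (s - 35)
        - 7.139e-13 * t * d ** 3
    )


# Values matching the example settings above (8 degC, 30 PSU, about 50 m).
print(mackenzie_sound_speed(8, 30, 50))
```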

Calibration parameters

Calibration here refers to the calibration of transducers on an echosounder, which finds the mapping between the voltage signal recorded by the echosounder and the actual (physical) acoustic pressure received at the transducer. This mapping is critical in deriving biological quantities from acoustic measurements, such as estimating biomass. More detail about the calibration procedure can be found in [4].

By default, echopype uses the calibration parameters stored in the converted files along with the backscatter measurements and other metadata parsed from the raw data file. However, since careful calibration is often done separately from the data collection phase of the field work, accurate calibration parameters are often supplied in the post-processing stage. Currently echopype allows users to overwrite calibration parameters for EK60 data, including sa_correction, equivalent_beam_angle, and gain_correction.

As an example, to reset the equivalent beam angle for 18 kHz only, one can do:

ed.equivalent_beam_angle.loc[dict(frequency=18000)] = -18.02  # set value for 18 kHz only

To set the equivalent beam angle for all channels at once, do:

ed.equivalent_beam_angle = [-17.47, -20.77, -21.13, -20.4 , -30]  # set all channels at once

Make sure you use ed.equivalent_beam_angle.frequency to check the sequence of the frequency channels first, and always double check after setting these parameters!
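One way to reduce the risk of assigning values in the wrong channel order is to key the new values by frequency and build the list from the channel sequence. The helper below is hypothetical and not part of echopype; the frequencies and values are illustrative only, and in practice the channel sequence would be read from ed.equivalent_beam_angle.frequency.

```python
def values_in_channel_order(channel_frequencies, values_by_frequency):
    """Arrange per-frequency values to match the channel sequence."""
    missing = [f for f in channel_frequencies if f not in values_by_frequency]
    if missing:
        raise KeyError(f"no value supplied for frequencies: {missing}")
    return [values_by_frequency[f] for f in channel_frequencies]


# Illustrative channel sequence and values (not from a real calibration).
channel_frequencies = [38000, 18000, 120000]
new_values = {18000: -17.5, 38000: -20.8, 120000: -20.4}

print(values_in_channel_order(channel_frequencies, new_values))
# [-20.8, -17.5, -20.4]
```

The resulting list can then be assigned in one step, as in the all-channels example above.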


[1] De Robertis A, Higginbottom I. (2007) A post-processing technique to estimate the signal-to-noise ratio and remove echosounder background noise. ICES J. Mar. Sci. 64(6): 1282–1291.

[2] Mackenzie K. (1981) Nine-term equation for sound speed in the oceans. J. Acoust. Soc. Am. 70(3): 806–812.

[3] Ainslie MA, McColm JG. (1998) A simplified formula for viscous and chemical absorption in sea water. J. Acoust. Soc. Am. 103(3): 1671–1672.

[4] Demer DA, Berger L, Bernasconi M, Bethke E, Boswell K, Chu D, Domokos R, et al. (2015) Calibration of acoustic instruments. ICES Cooperative Research Report No. 326. 133 pp.