Data processing

Echopype data processing functionalities are organized into subpackages, with expandability and a series of data processing levels in mind. Once the data are converted from raw instrument files to standardized EchoData objects (or stored in .zarr or .nc format) and calibrated, the core input and output of most subsequent functions are generic xarray Datasets. This design allows new processing functions to be added easily without requiring knowledge of specialized objects, except for functions that need access to data stored only in the raw-converted EchoData objects.

The section Data processing functionalities describes the current processing functions and their usage.

The section Additional information for processed data provides details on aspects of processed data that may require further explanation to fully understand their representation and the underlying operations.

Format of processed data

Once raw data (represented by EchoData objects) are calibrated (via compute_Sv), the calibrated data and the outputs of all subsequent processing functions are generic xarray Datasets. We do not currently follow any specific convention for processed data, but we retain provenance information in each dataset, including the data processing levels. Whether and how the data variables used in processing will be stored remains to be determined.
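To illustrate the Dataset-in, Dataset-out design, here is a minimal sketch of what a new processing function could look like. It is not part of echopype: the function name mean_sv_over_pings, the output variable name Sv_mean, and the provenance attribute are hypothetical, and the synthetic Dataset only stands in for the output of compute_Sv. The variable name Sv and the dimension names channel, ping_time, and range_sample are assumptions made for the example.

```python
import numpy as np
import xarray as xr


def mean_sv_over_pings(ds_Sv: xr.Dataset) -> xr.Dataset:
    """Hypothetical processing function: xarray Dataset in, xarray Dataset out.

    Averages Sv over ping_time in the linear domain, then converts back to dB.
    """
    sv_linear = 10 ** (ds_Sv["Sv"] / 10)  # dB -> linear
    mean_db = 10 * np.log10(sv_linear.mean(dim="ping_time"))  # linear -> dB
    ds_out = mean_db.to_dataset(name="Sv_mean")
    # Illustrative provenance note; this attribute name is an assumption,
    # not echopype's actual provenance schema.
    ds_out.attrs["processing_function"] = "mean_sv_over_pings"
    return ds_out


# Synthetic stand-in for a calibrated Sv Dataset (assumed layout):
# 2 channels, 4 pings one second apart, 5 range samples, all at -70 dB.
ping_time = np.datetime64("2024-01-01", "ns") + np.arange(4) * np.timedelta64(1, "s")
ds_Sv = xr.Dataset(
    {"Sv": (("channel", "ping_time", "range_sample"), np.full((2, 4, 5), -70.0))},
    coords={
        "channel": ["ch1", "ch2"],
        "ping_time": ping_time,
        "range_sample": np.arange(5),
    },
)

ds_mean = mean_sv_over_pings(ds_Sv)
print(ds_mean)
```

Because the function only relies on named variables and dimensions in a generic Dataset, it can be applied to data loaded from a .zarr or .nc file in the same way, without touching an EchoData object.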