Geophysics Community

Time series metadata standards

  1. FDSN Station XML Schema - StationXML is an XML representation of metadata that describes the data collected by geophysical instrumentation. It was developed through the International Federation of Digital Seismograph Networks (FDSN) to provide a standardized format for geophysical metadata.
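As a rough illustration of what StationXML metadata looks like in practice, the sketch below reads station coordinates and channel codes from a small fragment using only the Python standard library. The fragment is hypothetical and heavily trimmed (it is not a complete, schema-valid document), and the network code `XX`, station code `DEMO` and the helper `list_stations` are invented for this example; the element names and namespace follow the FDSN StationXML schema as far as we are aware.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed StationXML fragment for illustration only.
# A real document carries many more required and optional elements.
STATIONXML = """<FDSNStationXML xmlns="http://www.fdsn.org/xml/station/1" schemaVersion="1.1">
  <Source>Example</Source>
  <Created>2022-01-01T00:00:00</Created>
  <Network code="XX">
    <Station code="DEMO">
      <Latitude>34.5</Latitude>
      <Longitude>-106.25</Longitude>
      <Elevation>1850.0</Elevation>
      <Channel code="BHZ" locationCode="00">
        <Latitude>34.5</Latitude>
        <Longitude>-106.25</Longitude>
        <Elevation>1850.0</Elevation>
        <Depth>0.0</Depth>
      </Channel>
    </Station>
  </Network>
</FDSNStationXML>"""

# Namespace prefix mapping used in the element paths below.
NS = {"sx": "http://www.fdsn.org/xml/station/1"}


def list_stations(xml_text):
    """Return one dict per station with its coordinates and channel codes."""
    root = ET.fromstring(xml_text)
    rows = []
    for net in root.findall("sx:Network", NS):
        for sta in net.findall("sx:Station", NS):
            rows.append({
                "network": net.get("code"),
                "station": sta.get("code"),
                "latitude": float(sta.findtext("sx:Latitude", namespaces=NS)),
                "longitude": float(sta.findtext("sx:Longitude", namespaces=NS)),
                "elevation_m": float(sta.findtext("sx:Elevation", namespaces=NS)),
                "channels": [ch.get("code") for ch in sta.findall("sx:Channel", NS)],
            })
    return rows


stations = list_stations(STATIONXML)
for row in stations:
    print(row)
```

For real archives, a dedicated seismology library would normally handle validation and instrument response details; the sketch only shows the kind of information the format carries.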

Time series data formats

  1. SEED (2012): “Standard for the Exchange of Earthquake Data” is the international standard format for the exchange of digital seismological data. It was developed by members of the “International Federation of Digital Seismograph Networks” (FDSN) primarily for the exchange of unprocessed seismological data between different institutions and agencies. The format is suitable for continuous, event-based or station-based waveform data from field stations, observatories, networks or arrays.

  2. miniSEED (2012) is the subset of the SEED standard that is used for time series data. Very limited metadata for the time series is included in miniSEED beyond time series identification and simple state-of-health flags. In particular, geographic coordinates, response/scaling information and other information needed to interpret the data values are not included.

  3. ASDF (2016): The Adaptable Seismic Data Format (ASDF) is a modern file format intended for researchers and analysts. It combines the capability to create comprehensive data sets including all necessary meta information with high-performance parallel I/O for the most demanding use cases.

  4. SAC (2013): Seismic Analysis Code (SAC) was developed at Lawrence Livermore National Laboratory and University of California in the early 1980s. SAC's data format is one of the principal data formats used today for storing, transferring and manipulating (earthquake) seismological time series data.

  5. QuakeML (2013) is a data model and XML-based data exchange format for seismology. It was developed by the Swiss Seismological Service, GFZ, USGS, the University of Washington, KNMI and EMSC.

  6. SEG-Y_r2 (2017): The SEG-Y format was developed by the “Society of Exploration Geophysicists” (SEG) as a standard format for the exchange and archiving of geophysical, particularly seismic, data.

  7. PH5 - PH5 is a seismic data format and software suite created for IRIS PASSCAL using HDF5. PH5 can handle controlled source, passive source, onshore-offshore, OBS and mixed source seismic data sets. In PH5, metadata is separated from the time series data and stored in a performance-efficient manner that allows for easy user interaction and output of the metadata in a format appropriate for the data set.

  8. TileDB Embedded (2022): TileDB Embedded was developed by TileDB Inc. and is an open-source storage engine architected around dense and sparse multi-dimensional arrays. It can store and access dense arrays (e.g., images and video), sparse arrays (e.g., LiDAR and genomics), dataframes (any tabular data, as either dense or sparse arrays) and any other data that can be modeled as arrays (e.g., graphs, key-values and ML models). Time-series data can be modeled as a 2D array, either dense with labeled dimensions or sparse with datetime and string dimensions. The TileDB Embedded storage format is particularly optimized for storing and retrieving data on cloud object stores such as AWS S3, Azure Blob Storage and Google Cloud Storage, as well as on any other object store, such as MinIO. All writes handled by TileDB Embedded are immutable and timestamped, which makes it possible to read an array as it existed at an earlier time (time travel) and gives full control over reads and writes for maximum reproducibility.

    The Data Services of IRIS and the Geodetic Data Services of UNAVCO are designing, developing and implementing a cloud-based platform that will provide services for data queries across their internal repositories, allowing researchers to conduct their data processing in the same, or a data-proximate, cloud as the platform (Trabant et al., 2022). This cloud-based platform will adopt cloud-optimized data containers such as TileDB Embedded, which will allow for more efficient processing.
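The sparse time-series layout and the immutable, timestamped writes described for TileDB Embedded can be pictured with a small standard-library sketch. This is not the TileDB API and says nothing about its on-disk format; the class `SparseTimeSeries`, its methods, the integer write stamps and the channel identifier `XX.DEMO.00.BHZ` are all hypothetical names chosen for illustration. Cells are addressed by a datetime dimension and a string dimension, every write is kept as a separate fragment, and a read can be restricted to the writes that existed at a given stamp.

```python
from datetime import datetime, timedelta, timezone


class SparseTimeSeries:
    """Toy append-only store with cells keyed by (sample time, channel id)."""

    def __init__(self):
        # Each entry is (write_stamp, {(sample_time, channel): value}).
        self._fragments = []

    def write(self, write_stamp, cells):
        # A write adds a new fragment; earlier fragments are never modified.
        self._fragments.append((write_stamp, dict(cells)))

    def read(self, channel, start, end, as_of=None):
        """Return sorted (time, value) pairs in [start, end) for one channel.

        If as_of is given, fragments written after that stamp are ignored,
        which mimics reading the array as it existed at an earlier time.
        """
        merged = {}
        for stamp, cells in sorted(self._fragments, key=lambda f: f[0]):
            if as_of is not None and stamp > as_of:
                continue
            for (t, ch), value in cells.items():
                if ch == channel and start <= t < end:
                    merged[t] = value  # a later write overrides an earlier one
        return sorted(merged.items())


# Usage: two samples are written, then one of them is corrected by a later write.
t0 = datetime(2022, 5, 23, tzinfo=timezone.utc)
t1 = t0 + timedelta(seconds=1)
chan = "XX.DEMO.00.BHZ"  # hypothetical channel identifier

store = SparseTimeSeries()
store.write(1, {(t0, chan): 10, (t1, chan): 12})
store.write(2, {(t1, chan): 13})

window_end = t0 + timedelta(seconds=2)
latest = store.read(chan, t0, window_end)
as_first_written = store.read(chan, t0, window_end, as_of=1)
print(latest)
print(as_first_written)
```

Keeping every write as its own fragment is what lets the earlier state be recovered later; a production storage engine would add indexing, compression and consolidation on top of this idea.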


References

Trabant, C., Berglund, H., Carter, J., and Mencin, D.: Developing a Next Generation Platform for Geodetic, Seismological and Other Geophysical Data Sets and Services, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-8905, https://doi.org/10.5194/egusphere-egu22-8905, 2022.
