Identification

Title

Informing the Prediction of Compression Method and Level for Climate Model Data Using Variable Features

Abstract

Increased computing power makes it possible to simulate larger Earth system model ensembles with higher output frequency, finer spatial resolution, and longer simulation lengths. These improvements produce massive datasets that strain institutional storage resources, so various compression methodologies have been studied to address the issue. Lossless compression methods preserve the original data perfectly. However, lossy compression methods, in which part of the original data may not be preserved, are a more promising option because of the higher compression rates they can achieve. Previous work has demonstrated that combining different lossy compression methods and levels produces better results overall, because the choice of method and level can be tailored to the characteristics of each variable. Currently, determining the optimal compression method and level for each variable is computationally expensive because it requires compressing and reconstructing each variable for every possible compression method and level. The optimal combination is then the method and level that yields the highest compression while still satisfying the quality criteria. The goal of this project is to streamline this process by characterizing the variables through features that will be used in a regression model to predict the optimal compression level automatically. We analyze a large ensemble of annual averages of 198 variables from the Community Earth System Model (CESM), with the final goal of informing a multinomial regression model that predicts compression levels for the fpzip compression method. Here we describe and summarize the features, which range from simple statistics to smoothness and clustering indicators, analyze their variability across ensemble members, and give a preliminary evaluation of their correlation with the fpzip compression levels.

Resource type

document

Resource locator

Unique resource identifier

code

http://n2t.net/ark:/85065/d7c82csx

Dataset language

eng

Classification of spatial data and services

Topic category

geoscientificInformation

Keywords

Keyword set

keyword value

Text

originating controlled vocabulary

title

Resource Type

reference date

date type

publication

effective date

2016-01-01T00:00:00Z

Keyword set

keyword value

EARTH SCIENCE SERVICES > DATA MANAGEMENT/DATA HANDLING > DATA COMPRESSION

EARTH SCIENCE SERVICES > MODELS > COUPLED CLIMATE MODELS

originating controlled vocabulary

title

U.S. National Aeronautics and Space Administration Global Change Master Directory

reference date

date type

revision

effective date

2021-09-17

Temporal reference

Dataset reference date

date type

publication

effective date

2017-09-01T00:00:00Z

Constraints related to access and use

Constraint set

Use constraints

Copyright Author(s). This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Limitations on public access

None

Responsible organisations

Responsible party

contact position

OpenSky Support

organisation name

UCAR/NCAR - Library

full postal address

PO Box 3000

Boulder

80307-3000

email address

opensky@ucar.edu

web address

http://opensky.ucar.edu/

name: homepage

responsible party role

pointOfContact

Metadata on metadata

Metadata point of contact

contact position

OpenSky Support

organisation name

UCAR/NCAR - Library

full postal address

PO Box 3000

Boulder

80307-3000

email address

opensky@ucar.edu

web address

http://opensky.ucar.edu/

name: homepage

responsible party role

pointOfContact

Metadata date

2023-08-18T18:06:47.944050

Metadata language

eng; USA