Identification

Title

Refactoring Data-Driven Model Selection Code for Improvements in Interpretability, Generality, and Computational Expense

Abstract

Buchholz et al. used observations of total column carbon monoxide (CO) from the Measurements Of Pollution In The Troposphere (MOPITT) satellite instrument to build a record of monthly anomalies between 2001 and 2016, focusing on seven biomass burning regions in the Southern Hemisphere and tropics. CO anomalies in each region were modeled using indices of influential climate modes. A linear modeling approach was used, in which de-trended, de-seasonalized, regionally aggregated CO measurements were taken as the response variable and climate index anomaly values (at various time lags) were taken as explanatory variables. Initial analyses were completed in MATLAB using serial algorithms spread across scripts that were not organized into functions. We sought to refactor this codebase with three specific goals: first, to improve code interpretability in preparation for public release; second, to improve code generality, so that the techniques and code used in this application can be easily adapted to similar problems; and third, to use parallel computing to substantially speed up program execution. During the early phase of this refactoring, data structures and algorithms were selected to work with the tools in the MATLAB Parallel Computing Toolbox. Once the codebase was sufficiently developed, a series of parallel timing studies was performed to assess the extent of realizable time savings; in general, these savings were substantial.

Resource type

document

Resource locator

Unique resource identifier

code

http://n2t.net/ark:/85065/d76976dm

Dataset language

eng

Classification of spatial data and services

Topic category

geoscientificInformation

Keywords

Keyword set

keyword value

Text

originating controlled vocabulary

title

Resource Type

reference date

date type

publication

effective date

2016-01-01T00:00:00Z

Keyword set

keyword value

EARTH SCIENCE SERVICES > MODELS > ATMOSPHERIC CHEMISTRY MODELS

EARTH SCIENCE > HUMAN DIMENSIONS > ENVIRONMENTAL IMPACTS > BIOMASS BURNING

EARTH SCIENCE > ATMOSPHERE > ATMOSPHERIC CHEMISTRY > CARBON AND HYDROCARBON COMPOUNDS > ATMOSPHERIC CARBON MONOXIDE

originating controlled vocabulary

title

U.S. National Aeronautics and Space Administration Global Change Master Directory

reference date

date type

revision

effective date

2021-09-17

Temporal reference

Dataset reference date

date type

publication

effective date

2018-08-22T00:00:00Z

Constraints related to access and use

Constraint set

Use constraints

Copyright Author(s). This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Limitations on public access

None

Responsible organisations

Responsible party

contact position

OpenSky Support

organisation name

UCAR/NCAR - Library

full postal address

PO Box 3000

Boulder

80307-3000

email address

opensky@ucar.edu

web address

http://opensky.ucar.edu/

name: homepage

responsible party role

pointOfContact

Metadata on metadata

Metadata point of contact

contact position

OpenSky Support

organisation name

UCAR/NCAR - Library

full postal address

PO Box 3000

Boulder

80307-3000

email address

opensky@ucar.edu

web address

http://opensky.ucar.edu/

name: homepage

responsible party role

pointOfContact

Metadata date

2023-08-18T18:06:43.351726

Metadata language

eng; USA