Identification

Title

Accelerating CMIP data analysis with parallel computing in R

Abstract

In this Technical Note we examine eight schemes for parallelizing Extreme Value Analysis (EVA) on Coupled Model Intercomparison Project (CMIP) data using the R packages foreach, doParallel, and doMPI. We perform strong scaling studies to delineate the performance impacts of factors such as R cluster type (TCP/IP sockets and MPI), communication protocol (Ethernet, IP over InfiniBand, and MPI), loop parallelization (outer or inner loop), and approaches to reading data from the NCAR GLADE parallel filesystem. We elucidate peculiarities of R memory management and the overhead associated with interprocess communication, and we discuss broadcast limitations of Rmpi. The best-performing scheme parallelizes the outer EVA loop across latitude and reads only the subset of the data operated on in the inner loop over longitude; the different cluster types and communication protocols all perform about equally for this scheme. This configuration achieves a parallel speedup of 50 with 96 R workers and is scalable to EVA on larger problem sizes than those presented here.

Resource type

document

Unique resource identifier

code

http://n2t.net/ark:/85065/d7z89fpb

Dataset language

eng

Classification of spatial data and services

Topic category

geoscientificInformation

Keywords

Keyword set

keyword value

Text

originating controlled vocabulary

title

Resource Type

reference date

date type

publication

effective date

2016-01-01T00:00:00Z

Keyword set

keyword value

EARTH SCIENCE SERVICES > DATA ANALYSIS AND VISUALIZATION > STATISTICAL APPLICATIONS

EARTH SCIENCE SERVICES > MODELS > COUPLED CLIMATE MODELS

originating controlled vocabulary

title

U.S. National Aeronautics and Space Administration Global Change Master Directory

reference date

date type

revision

effective date

2021-09-17

Temporal reference

Dataset reference date

date type

publication

effective date

2017-06-30T00:00:00Z

Constraints related to access and use

Constraint set

Use constraints

Copyright Author(s). This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Limitations on public access

None

Responsible organisations

Responsible party

contact position

OpenSky Support

organisation name

UCAR/NCAR - Library

full postal address

PO Box 3000

Boulder

80307-3000

email address

opensky@ucar.edu

web address

http://opensky.ucar.edu/

name: homepage

responsible party role

pointOfContact

Metadata on metadata

Metadata point of contact

contact position

OpenSky Support

organisation name

UCAR/NCAR - Library

full postal address

PO Box 3000

Boulder

80307-3000

email address

opensky@ucar.edu

web address

http://opensky.ucar.edu/

name: homepage

responsible party role

pointOfContact

Metadata date

2023-08-18T18:06:48.405376

Metadata language

eng; USA