CEDA document repository

Semantic storage of climate data on object stores

Massey, Neil R. Semantic storage of climate data on object stores. In: RCUK Cloud Workshop 2018. (Unpublished)

Slideshow
RCUK_cloud_2018_nrmassey.pptx - Presentation
Available under License Creative Commons Attribution Non-commercial.

Abstract

Object stores, and cloud-based storage that uses the Amazon S3 HTTP API, present a number of opportunities and challenges for storing climate data. The distributed nature of the objects allows large data sets to be broken into “fragments”, each containing a subset of the data. The fragments can then be accessed in parallel, improving the performance of reading the data across a network. However, this approach raises two problems: first, determining the fragment size and the optimum method of splitting multi-dimensional data; second, enabling meaningful search of the data when it may be widely distributed. This talk will present a new method of splitting netCDF files into fragments and storing each fragment as an object in an S3-compatible object store. The locations of the fragments, together with the metadata and dimensions for each climate variable, are stored in a master file, which can be written to a location outside the object store, for example a POSIX file system on an SSD. This allows fast search of the metadata without having to reassemble the fragments. Each fragment is stored as a self-contained netCDF file, allowing the data to be reconstructed if the master file is lost.
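
The idea of a master record that holds fragment locations alongside variable metadata can be illustrated with a minimal sketch. It is not the format presented in the talk: the function names, the JSON layout, the object naming scheme, the bucket URL and the variable name "tas" are all assumptions made for illustration, and no netCDF or S3 library is used. The sketch only computes fragment bounds for a multi-dimensional variable, records them in a master record, and searches that record without touching any fragment.

```python
import itertools
import json
import math


def fragment_slices(shape, fragment_shape):
    """Yield one tuple of (start, stop) pairs per fragment, covering `shape`."""
    counts = [math.ceil(n / f) for n, f in zip(shape, fragment_shape)]
    for index in itertools.product(*(range(c) for c in counts)):
        yield tuple(
            (i * f, min((i + 1) * f, n))  # clip the last fragment to the edge
            for i, f, n in zip(index, fragment_shape, shape)
        )


def build_master(variable, dims, shape, fragment_shape, bucket_url):
    """Build a master record: variable metadata plus fragment locations."""
    fragments = []
    for number, bounds in enumerate(fragment_slices(shape, fragment_shape)):
        fragments.append({
            # Hypothetical object naming scheme, not taken from the talk.
            "object": f"{bucket_url}/{variable}_{number:04d}.nc",
            "bounds": {d: list(b) for d, b in zip(dims, bounds)},
        })
    return {
        "variable": variable,
        "dimensions": dict(zip(dims, shape)),
        "fragments": fragments,
    }


def fragments_overlapping(master, dim, start, stop):
    """Use only the master record to find objects overlapping [start, stop)."""
    return [
        f["object"] for f in master["fragments"]
        if f["bounds"][dim][0] < stop and f["bounds"][dim][1] > start
    ]


# Illustrative values only: a 10 x 4 x 8 variable split into 4 x 4 x 4 fragments.
master = build_master(
    variable="tas",
    dims=("time", "lat", "lon"),
    shape=(10, 4, 8),
    fragment_shape=(4, 4, 4),
    bucket_url="s3://example-bucket",
)
master_json = json.dumps(master, indent=2)  # could be written to a POSIX file
hits = fragments_overlapping(master, "time", 0, 4)
```

Here `hits` lists only the objects whose time range overlaps the first four time steps, so a reader would need to fetch just those fragments, and could fetch them in parallel. The fragment shape is left as a free parameter because choosing it well is one of the open problems the abstract names.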

Item Type: Conference or Workshop Item (Lecture)
Subjects: Atmospheric Sciences; Computer Science; Data and Information
Depositing User: Dr Neil Massey
Date Deposited: 28 Feb 2018 11:56
Last Modified: 28 Feb 2018 11:56
URI: http://cedadocs.ceda.ac.uk/id/eprint/1361
