SI7: Data pre-processing system for the secondary storage
SI7 Report [PDF 1,379Kb].
Objectives
The aim of this workpackage is to establish a cost-effective data pre-processing system. The system will refine, integrate and store synchronous and asynchronous data streams from instruments and sensors in the secondary storage for later use in research. This workpackage focuses primarily on the large-scale datasets from the protein crystallography and climate modelling research groups. These datasets require computationally intensive pre-processing.
This workpackage provides the following services:
- Data processing service
- Data security service
- Data transfer service
- Data archiving service
- Data compression service
- Data replication service
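As a rough illustration of how some of these services might be chained for a single file, the following Python sketch compresses a file into an archive directory, records a SHA-256 checksum, and copies the archive to replica locations, verifying each copy. It is a minimal sketch under assumptions, not the workpackage's actual implementation: the function names (`sha256_of`, `preprocess`), the use of gzip, and the directory layout are all hypothetical.

```python
import gzip
import hashlib
import shutil
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def preprocess(source: Path, archive_dir: Path, replica_dirs: List[Path]) -> Dict:
    """Compress, archive and replicate one file (hypothetical pipeline)."""
    # Compression + archiving: write a gzip copy into the archive directory.
    archive_dir.mkdir(parents=True, exist_ok=True)
    archived = archive_dir / (source.name + ".gz")
    with source.open("rb") as src, gzip.open(archived, "wb") as dst:
        shutil.copyfileobj(src, dst)

    # Integrity check: the checksum of the archived file is the reference.
    checksum = sha256_of(archived)

    # Replication: copy the archive to each replica directory and verify it.
    replicas = []
    for replica_dir in replica_dirs:
        replica_dir.mkdir(parents=True, exist_ok=True)
        copy = Path(shutil.copy2(archived, replica_dir))
        if sha256_of(copy) != checksum:
            raise IOError(f"replica {copy} does not match the archived file")
        replicas.append(copy)

    return {"archive": archived, "sha256": checksum, "replicas": replicas}
```

A call such as `preprocess(Path("frame_0001.img"), Path("archive"), [Path("replica_a"), Path("replica_b")])` (file and directory names are placeholders) returns a small manifest with the archive path, its checksum and the verified replica paths. Data transfer and access control are out of scope for this sketch.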
Descriptions
This project will initially develop a pre-processing service for asynchronous data (data sourced from CD/DVD media and MonashSunGrid). If time permits, the service will later be extended to synchronous data (data sourced from real-time instruments and sensors).
Basic Flowchart
Case 1: Protein Crystallography Data Processing
The primary aim of this research group is to understand the role of proteins in biology and disease by determining their atomic structures using X-ray crystallography, which requires massive computational power.
Some sample outputs from processed protein crystallography data are shown below (Source: "The Critical Role of Computer Power in Structural Biology" presentation by Ashley Buckle):
Case 2: Regional Climate Modelling Data Processing
The primary aims of this research group are to:
- Simulate a complete climate model for consistency and efficiency, which requires massive computational power
- Simulate the timespan 2000 to 2005, which produces about 250 GB of data per experiment
- Control various scenario experiments, including real-world setups, which require about 1.5 TB of storage
Some sample outputs from processed climate model data are shown below (Source: "The impact of abrupt land cover changes by savanna fire on contemporary north Australian climate" presentation by K. Görgen, A. Lynch, C. Enticott, J. Beringer, D. Abramson, P. Uotila, A. Marshall, N. Tapper):