SI6: Simulation data dynamic retrieval or regeneration

SI5 & 6 Report [PDF 105Kb].

In SI6 we propose a data life cycle that starts when the data is first generated and tracks its progress through replication, distribution, deletion and possible re-computation. We have designed and implemented an infrastructure, called Active Data, which combines existing Grid middleware to support the scientific data life cycle in a platform-neutral environment.

Overview

We have developed a prototype system that manages the Grid Data Life Cycle (GDLC) of computational models and workflows in a platform-neutral environment. This system, called Active Data, provides mechanisms that let existing applications and workflows run on the Grid. It also allows them to access data stored in different replica management systems, to associate metadata that describes how the data is computed across multiple replica systems, to ensure that data cannot be removed unless sufficient metadata for regeneration is associated with it, and to regenerate the required data transparently when it is needed during execution. Importantly, because Active Data is built under our GriddLeS system, no source code modification is needed.
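The delete-guard and transparent-regeneration behaviour described above can be illustrated with a minimal, in-memory sketch. This is not the Active Data implementation or its API: the class `LifecycleCatalogue`, its methods, and the idea of storing a Python callable as the "regeneration metadata" are all assumptions made purely for illustration, and no real replica management system is contacted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Entry:
    """One logical data set tracked by the hypothetical catalogue."""
    data: Optional[bytes]                          # None once the stored copy is deleted
    recipe: Optional[Callable[[], bytes]] = None   # assumed form of regeneration metadata


class LifecycleCatalogue:
    """Hypothetical stand-in for a data life cycle manager (illustration only)."""

    def __init__(self) -> None:
        self._entries: Dict[str, Entry] = {}

    def register(self, name: str, data: bytes,
                 recipe: Optional[Callable[[], bytes]] = None) -> None:
        """Record freshly generated data, optionally with a way to recompute it."""
        self._entries[name] = Entry(data, recipe)

    def attach_recipe(self, name: str, recipe: Callable[[], bytes]) -> None:
        """Associate regeneration metadata with an existing data set."""
        self._entries[name].recipe = recipe

    def delete(self, name: str) -> None:
        """Drop the stored copy, but only if the data can be regenerated later."""
        entry = self._entries[name]
        if entry.recipe is None:
            raise PermissionError(f"{name}: no regeneration metadata, refusing to delete")
        entry.data = None

    def read(self, name: str) -> bytes:
        """Return the data, recomputing it first if the stored copy was deleted."""
        entry = self._entries[name]
        if entry.data is None:
            # delete() guarantees a recipe exists whenever data is None
            entry.data = entry.recipe()
        return entry.data


def recompute_squares() -> bytes:
    """Toy 'simulation' used as the regeneration recipe in the example."""
    return b" ".join(str(i * i).encode() for i in (1, 2, 3))


catalogue = LifecycleCatalogue()
catalogue.register("run/output", recompute_squares(), recipe=recompute_squares)
catalogue.delete("run/output")            # allowed: a recipe is attached
regenerated = catalogue.read("run/output")  # recomputed on demand
```

In this sketch the caller of `read` never needs to know whether the data was stored or recomputed, which mirrors the "transparent" regeneration described in the overview. A real system would record a job description rather than a Python callable.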
Grid Data Life Cycle (GDLC)

For full details about the software developed for this Work Package, see SI5.
