FRIEDA

FRIEDA stands for Flexible Robust Intelligent Elastic Data Management. Scientific applications increasingly use cloud resources for their data analysis workflows. We use the term "cloud" loosely to signify transient environments. However, managing data effectively and efficiently over these resources is challenging because of the myriad storage choices with different performance and cost trade-offs, the complex application choices, and the complexity associated with elasticity and failure rates in these environments. The different data access patterns of data-intensive scientific applications require a more flexible and robust data management solution than those currently in existence. FRIEDA is a data management framework that employs a range of data management strategies in elastic environments.

FRIEDA was initially developed as a way to manage scientific data on top of end-to-end provisioned clouds over provisioned network resources, as part of the SDCI project The Missing Link. It has since evolved into a data management framework for understanding the various trade-offs of scientific data management in elastic, transient environments. We are investigating:

  • Semi-automated storage choices and data management strategies for scientific data analysis workflows.
  • Management of the life cycle of scientific data, including its movement, in hybrid environments that use HPC and cloud resources.

Specifically, FRIEDA will a) provide an interface for semi-automated storage choices for the application, b) manage the data life cycle in elastic and transient environments, c) provide a hierarchical data system that includes transient VM components in conjunction with HPC resources, and d) consider trade-offs in performance, consistency, cost, and fault tolerance.
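
As an illustration only, the idea of a semi-automated storage choice driven by trade-offs in performance, consistency, cost, and fault tolerance can be sketched as a weighted ranking that the user may override. This is not FRIEDA's interface: the names `StorageOption`, `rank_storage`, `suggest_storage`, and `EXAMPLE_OPTIONS` are hypothetical, and the scores are invented placeholders, not measurements.

```python
from dataclasses import dataclass

# The four trade-off dimensions named in the text.
CRITERIA = ("performance", "consistency", "cost", "fault_tolerance")


@dataclass
class StorageOption:
    """Hypothetical description of one storage choice."""
    name: str
    # Placeholder scores in [0, 1]; higher is better (for cost, higher = cheaper).
    scores: dict


def rank_storage(options, weights):
    """Sort options from best to worst by normalized weighted score."""
    total = sum(weights.get(c, 0.0) for c in CRITERIA)
    if total <= 0:
        raise ValueError("at least one criterion needs a positive weight")

    def weighted(option):
        return sum(
            weights.get(c, 0.0) * option.scores.get(c, 0.0) for c in CRITERIA
        ) / total

    return sorted(options, key=weighted, reverse=True)


def suggest_storage(options, weights, override=None):
    """Return the top-ranked option unless the user names a different one.

    The override is what makes the choice semi-automated rather than automatic.
    """
    ranked = rank_storage(options, weights)
    if override is None:
        return ranked[0]
    for option in ranked:
        if option.name == override:
            return option
    raise KeyError(f"unknown storage option: {override}")


# Invented example data, for illustration only.
EXAMPLE_OPTIONS = [
    StorageOption("local_vm_disk", {
        "performance": 0.9, "consistency": 0.6,
        "cost": 0.8, "fault_tolerance": 0.2,
    }),
    StorageOption("object_store", {
        "performance": 0.4, "consistency": 0.5,
        "cost": 0.6, "fault_tolerance": 0.9,
    }),
]

if __name__ == "__main__":
    choice = suggest_storage(EXAMPLE_OPTIONS, {"performance": 1.0})
    print(choice.name)
```

In this sketch, changing the weights shifts the suggestion: an application that weights only fault tolerance would be pointed at the second option, while an explicit `override` always wins over the ranking.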