Alex Szalay to build petabyte-scale geographically-distributed Open Storage Network

IDIES Director Alex Szalay will lead a two-year, $1.8 million project funded by the National Science Foundation (NSF) to create what may become the world’s largest scientific data storage network. NSF’s investment in the Open Storage Network (OSN) builds on $1 million in seed money from Schmidt Futures.

Open Storage Network systems will be placed at universities across the nation to form a geographically-distributed network supporting big data research.

Each university’s network node will be uniform and expandable, and universities will be encouraged to expand their starter systems. Preliminary projections indicate that the full buildout could cost more than $20 million and grow the network to 200 PB and beyond, which would make the OSN the largest distributed scientific data storage network in the world.

The Open Storage Network “could completely change the academic big data landscape,” said Alex Szalay. Increased availability and transparency of data stored on the network will boost research by letting researchers know which datasets are available and how to access them. Future big data research programs will be able to focus on data management and stewardship rather than reinventing storage solutions.

The project leverages key data storage partners throughout the U.S., including the National Data Service and sites affiliated with four NSF-funded Big Data Regional Innovation Hubs: the San Diego Supercomputer Center (SDSC), the Midwest BD Hub at the National Center for Supercomputing Applications (NCSA), the Southern BD Hub at the Renaissance Computing Institute (RENCI), and the Northeast BD Hub at the Massachusetts Green High Performance Computing Center (MGHPCC) and Pittsburgh Supercomputing Center (PSC).