SciServer Compute: Bringing Analysis Close to the Data

SciServer Compute features Jupyter notebooks running in server-side Docker containers attached to large relational databases and file storage to bring advanced analysis capabilities close to the data. SciServer Compute is a component of SciServer, a big-data infrastructure project developed at Johns Hopkins University that will provide a common environment for computational research.

With SciServer, launched in May 2016, users write freeform SQL queries through CasJobs and SkyServer against large SciServer-hosted relational databases, such as the Sloan Digital Sky Survey’s astronomy dataset. In addition to downloading query results, users can save them in the cloud in several ways: in their personal MyDB, in the shared temporary storage areas MyScratchDB and FileScratch, or in the file storage service SciDrive. Each of these SciServer-provided storage options minimizes data movement, keeping the data close to the SQL-based analysis tools.

Figure: Major components and data sources for the SciServer System.

SciServer Compute, added in June, expands users’ capabilities by letting them perform scientific analysis in scripting languages such as Python, R, and MATLAB, using Jupyter notebooks deployed in Docker containers on a dedicated, scalable cluster of servers. SciServer Compute integrates with other SciServer tools and data stores: users can query SciServer-hosted and other databases and file systems, and read and write their results to MyDB, MyScratchDB and FileScratch, and SciDrive. Compute also provides direct access to large data archives and to additional libraries and tools through shared, read-only data volumes.
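
The query-then-store workflow described above can be illustrated with a minimal Python sketch. This is not the SciServer API: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for a hosted catalog, and the table names (objects, bright_objects), the sample rows, and the helper run_query_and_store are all hypothetical. The point is only that the query result is written to a table next to the source data, and just a row count travels back to the notebook.

```python
import sqlite3


def run_query_and_store(conn, query, dest_table):
    """Run `query` inside the database and store the result in `dest_table`.

    Only the row count is returned to the caller; the result rows stay
    in the database, next to the data they were derived from.
    """
    # dest_table is interpolated into SQL, so restrict it to a plain identifier.
    if not dest_table.isidentifier():
        raise ValueError(f"invalid table name: {dest_table!r}")
    conn.execute(f"DROP TABLE IF EXISTS {dest_table}")
    conn.execute(f"CREATE TABLE {dest_table} AS {query}")
    conn.commit()
    (count,) = conn.execute(f"SELECT COUNT(*) FROM {dest_table}").fetchone()
    return count


# Stand-in for a hosted catalog: a tiny in-memory table of made-up objects.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (obj_id INTEGER, ra REAL, dec REAL, mag REAL)")
conn.executemany(
    "INSERT INTO objects VALUES (?, ?, ?, ?)",
    [(1, 10.0, -5.0, 17.2), (2, 10.5, -4.8, 21.9), (3, 11.0, -5.1, 18.4)],
)

# Keep only the brighter objects (smaller magnitude) in a personal result table.
n_bright = run_query_and_store(
    conn,
    "SELECT obj_id, ra, dec, mag FROM objects WHERE mag < 19",
    "bright_objects",
)
print(n_bright)
```

In an actual Compute notebook, the stand-in connection would be replaced by whatever database access SciServer provides, and the destination would be a store such as MyDB; the sketch only shows the shape of the workflow.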

SciServer currently supports and collaborates with researchers in astronomy, cosmology, genomics, turbulence, environmental science, oceanography, and materials science, and plans to expand to more scientific domains to support researchers at Johns Hopkins and beyond.

SciServer is a collaborative research environment for large-scale data-driven science. It is developed and administered by the Institute for Data Intensive Engineering and Science at Johns Hopkins University. The SciServer team includes Dmitry Medvedev, Barbara J. Souter, Lance Joseph, Jai Won Kim, Gerard Lemson, Victor Paul, M. Jordan Raddick, Michael Rippin, Alexander S. Szalay, Manuchehr Taghizadeh-Popp, Aniruddha Thakar, Suzanne Werner, Alainna White, and Jan Vandenberg. SciServer is funded by National Science Foundation Award ACI-1261715. For more information about SciServer, please visit www.sciserver.org.