Science

A vast storage network for deep sea discovery

GEOMAR, a leading oceanography research facility, partners with ELEMENTS

Topic: Storing large data sets
Project: GEOMAR
Location: Kiel (GER)
Challenge: Preserving huge amounts of raw images from research vessels and remotely operated underwater vehicles
System used: ELEMENTS ONE + JBOD, CUBE, WORKER, Media Library

As global concerns grow about our impact on the planet and its environment, research into the natural world becomes critical for our future. GEOMAR, the Helmholtz Centre for Ocean Research based in Kiel, Germany, is one of the leading institutions in marine research, focusing on critical analysis on behalf of governments and public bodies. Half of the staff are scientists, including 40 professors.

“The challenge of studying the deep ocean is that it is an inhospitable place for people,” said Dr Carsten Schirnick of GEOMAR. “Our understanding comes from detailed mapping and imagery of the ocean floor, using remote vehicles capable of moving precisely at depths of 1000 metres or more. Our scientists then study the images and data back in the labs.”

ROV (remotely operated underwater vehicles). Photo: ROV-Team, GEOMAR

The probes deployed to the ocean can either be tethered, such as remotely operated vehicles (ROVs) or camera frames controlled from a research ship, or free moving, like autonomous underwater vehicles (AUVs), pre-programmed for specific missions.

Whether the vehicle is an ROV or an AUV, the primary goal of any underwater scientific mission is to capture high-resolution imagery of the ocean, including the water column and video footage, along with precise coordinates of the seabed. The images, enriched with metadata, are then meticulously stitched together to create an accurate map of the ocean’s structure and geography, including marine life from surface to floor, at scales ranging from microscopic to macroscopic.

Storing large data sets is a necessity

GEOMAR acquired its first ROV in 2007, and back then storing the high-resolution images was impractical due to insufficient data recording methods. Since then, GEOMAR has worked tirelessly to ensure that all collected information can be stored effectively.

“We developed storage systems based on proprietary technology,” said Schirnick. “However, these systems gradually became harder to support, and we recognised the need for a new solution. We required secure, large-scale storage built on open-source models wherever possible.”

Collecting this vast array of images and videos is essential for providing valuable resources to researchers within GEOMAR and beyond. Since these scientists each have specialist tools, it’s crucial to simplify access to the data using application-specific software compatible with macOS, Windows, and Linux.

Dr. Carsten Schirnick, Data Management, GEOMAR. Photo: Ilka Thomsen, GEOMAR

While the work is driven initially by specific projects, it is also essential that the images and information become available to whoever needs the data. GEOMAR is committed to the scientific community’s FAIR principle: all research data and results must be Findable, Accessible, Interoperable and Reusable. Again, this means using open formats for access.

The data archive is continuously growing, which is crucial for earth sciences, where much research focuses on understanding changes over time. To comprehend these changes, it’s essential to compare data spanning years or even decades.

Implementing large-scale, open storage

Having identified the need for a new storage platform, GEOMAR researched what was available on the market. Dr Schirnick and his fellow data engineers first encountered ELEMENTS at a trade exhibition.

“Talking to ELEMENTS, we realised that they understood our specific requirements,” he recalled. “When we said we wanted a storage network that researchers could access using Python, they nodded and said, ‘Of course’.”

ELEMENTS offers vast storage capacity, which is crucial for GEOMAR. A recent GEOMAR research vessel returned with over 80 terabytes of raw data from a single expedition, showcasing the substantial storage demands of the research.

AUV (autonomous underwater vehicles). Photo: Nikolas Linke, GEOMAR

Equally important is the need to preserve that original data precisely as captured. In the future, there may be new ways of transforming the raw images to get more detail, to look at some factors we do not consider today, or to apply AI to derive further information.

Researchers will first process the raw data to generate a set of images, performing tasks like color correction, matching, and some rotation and geometric adjustments to ensure alignment. These processed versions are also crucial to store.

Following that, positional metadata will be employed to seamlessly stitch together multiple images, resulting in a comprehensive depiction of expansive oceanic regions.
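The idea of placing images by position can be illustrated with a minimal Python sketch. This is not GEOMAR's or ELEMENTS' actual pipeline: the `Tile` class, its field names, and the `build_mosaic` function are hypothetical, and the sketch assumes every tile has the same scale, is already aligned, and carries the seabed position of its top-left corner in metres. Real mosaicking would also involve blending and finer registration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Tile:
    """One processed image plus the position of its top-left corner (hypothetical fields)."""
    x_m: float                # horizontal position in metres, increasing to the right
    y_m: float                # vertical position in metres, increasing downward
    pixels: list[list[int]]   # grayscale rows


def build_mosaic(tiles, metres_per_pixel):
    """Paste tiles onto one canvas using only their positional metadata."""
    # Convert each metric position into a (row, column) pixel offset.
    placed = []
    for tile in tiles:
        row = round(tile.y_m / metres_per_pixel)
        col = round(tile.x_m / metres_per_pixel)
        placed.append((row, col, tile.pixels))

    # The canvas must cover the bounding box of all placed tiles.
    min_row = min(row for row, _, _ in placed)
    min_col = min(col for _, col, _ in placed)
    height = max(row + len(px) for row, _, px in placed) - min_row
    width = max(col + len(px[0]) for _, col, px in placed) - min_col

    # None marks areas that no tile covers.
    canvas = [[None] * width for _ in range(height)]
    for row, col, pixels in placed:
        for dr, line in enumerate(pixels):
            for dc, value in enumerate(line):
                # Later tiles overwrite earlier ones where they overlap.
                canvas[row - min_row + dr][col - min_col + dc] = value
    return canvas
```

Two overlapping 2x2 tiles placed half a metre apart at 0.5 metres per pixel would yield a canvas three pixels wide, with the second tile covering the overlap.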

Given the intensive processing involved, storing the resulting mosaic is more efficient than repeating the compositing process, further reinforcing the need for extensive storage capabilities.

In addition to coordinates, other metadata such as weather conditions, water temperatures, temperature gradients, and proportions of specific nutrients and minerals are collected. These details are stored alongside images and videos, anticipating the potential correlations future researchers might seek and acknowledging the evolving insights enabled by technologies like AI.
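One common way to keep such details alongside an image in an open format is a JSON "sidecar" file. The sketch below is only an illustration of that general pattern, not a description of how GEOMAR or the ELEMENTS Media Library stores metadata; the function names, the `.json` naming convention, and the metadata keys in the usage note are all assumptions.

```python
import json
from pathlib import Path


def sidecar_path(image_path):
    """Return the sidecar location: the image file name with '.json' appended."""
    image_path = Path(image_path)
    return image_path.with_name(image_path.name + ".json")


def write_sidecar(image_path, metadata):
    """Store metadata next to the image without touching the image itself."""
    target = sidecar_path(image_path)
    # Sorted keys and indentation keep the file readable and diff-friendly.
    target.write_text(json.dumps(metadata, indent=2, sort_keys=True), encoding="utf-8")
    return target


def read_sidecar(image_path):
    """Load the metadata stored for an image."""
    return json.loads(sidecar_path(image_path).read_text(encoding="utf-8"))
```

For example, `write_sidecar("dive_0001.tif", {"water_temperature_c": 4.2})` would create `dive_0001.tif.json` beside the image; both the file name and the key are made up for illustration. Because the raw image is never modified, the original data stays exactly as captured.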

Collaborating with ELEMENTS as technology partner

“Engaging with ELEMENTS was instrumental in shaping our requirements,” Schirnick explained. “They visited GEOMAR to provide a live demonstration, not only showcasing the current capabilities of their product but also freely discussing its future trajectory, including both prominent and visionary features.”

During these discussions, the critical importance of efficiently capturing large volumes of data during research cruises became apparent. ELEMENTS proposed their portable data storage solution – CUBE – as the ideal platform for rapid data transfer and secure storage.

ELEMENTS Worker nodes. Photo: Ilka Thomsen, GEOMAR

Given the high costs of sending research vessels to remote oceanic regions, enhancing productivity within narrow time frames is paramount. Capitalizing on the downtime when ROVs and AUVs are brought onboard for recharging provides an ideal opportunity to offload data, freeing up space for new data from upcoming missions.

Onboard storage must adhere to stringent volume and weight constraints, necessitating a minimalist approach for ROVs and AUVs. Here, the open access provided by ELEMENTS proves invaluable: data can be swiftly transferred to the CUBE, ready for replication and processing once onboard.

Similarly, when the research vessel returns to base, the CUBE interconnects with the master ELEMENTS storage network to transfer all the data as quickly as possible so researchers have immediate access. The amount of data is vast, so automating much of this work (integrating metadata, controlling access, and transferring to primary and backup stores) is crucial. It allows the GEOMAR team to focus resources on science rather than administration as much as possible.
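An automated transfer step of this kind can be sketched in Python using only the standard library. This is a generic illustration, not the ELEMENTS automation or GEOMAR's implementation: the function names and directory layout are hypothetical, and the sketch assumes both the portable store and the archive are mounted as ordinary directories. Each file is copied and then verified by SHA-256 checksum so that the archived copy matches the original byte for byte.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so that large raw files never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_verified(source_dir, target_dir):
    """Copy every file under source_dir into target_dir and verify each copy.

    Returns a manifest mapping relative paths to checksums.
    Raises IOError if a copy does not match its source.
    """
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    manifest = {}
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        relative = src.relative_to(source_dir)
        dst = target_dir / relative
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves timestamps where possible
        expected = sha256_of(src)
        if sha256_of(dst) != expected:
            raise IOError(f"checksum mismatch for {relative}")
        manifest[relative.as_posix()] = expected
    return manifest
```

The returned manifest could be stored with the data and rechecked later, which also supports the goal of preserving the original data precisely as captured.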

“The ELEMENTS team proved ideal partners for us,” Schirnick concluded. “They showed us how the storage, Media Library software and other tools can be used to achieve our specific goals. While we developed most of the implementation ourselves, they were always on hand with active support when we needed it, ensuring we created the exact functionality we needed.”

Following this installation, two more major marine research centres with solid links to GEOMAR – AWI at Bremerhaven and Hereon at Geesthacht – have also implemented ELEMENTS systems to support their research.
