For over 20 years, ESIP meetings have brought together the most innovative thinkers and leaders around Earth science data, forming a community dedicated to making Earth science data more discoverable, accessible and useful to researchers, practitioners, policymakers, and the public. The theme of the July meeting is "Data for All People: From Generation to Use and Understanding."

Registered attendees can join us virtually at https://2022julyesipmeeting.qiqochat.com/.
Friday, July 22 • 11:00am - 12:30pm
HDF Town Hall


Earth science data in the HDF5 format is prevalent, although sometimes under different names. As cloud computing gains wider adoption in the geosciences, the migration of HDF5 data from on-prem to cloud-based storage and its impact on data analysis workflows pose a unique set of challenges. The session's goal is to provide the latest technical information and best-practice suggestions relevant to HDF data generation and migration scenarios. Data producers, cloud data managers, DevOps engineers, and geoscientists should be aware of this information in order to reduce data duplication, shorten migration time, and avoid any loss of data usability.

Session plan:
Dana Robinson (HDF Group): HDF5 Roadmap and New Features
This talk will cover the HDF5 roadmap into 2023 and new features, including the new implementation of the single-writer/multiple-readers (SWMR) functionality and the Onion VFD. The talk will also cover API changes that will be made after the 1.14.0 release ("HDF5 2.0").

H. Joe Lee (NASA EED-3/HDF Group): Accessing Cloud Data and Services Using EDL, Pydap, MATLAB


James Gallagher (OPeNDAP): Hyrax: Serving Data from S3
This presentation will cover how to use the Hyrax OPeNDAP server with HDF5/NetCDF4 files stored in S3. Topics will include building the metadata files that enable in-place access and subsetting in S3 and customizing those metadata files for unusual datasets. In addition to configuration options for use with generic Web Object Store (WOS) configurations, the special options for use with NASA's NGAP-based Earthdata cloud system will be described.
 

Kent Yang (NASA EED-3/HDF Group): HDF5 OPeNDAP Handler Updates, and Performance Discussion
The OPeNDAP Hyrax service has been in operational use by NASA Earth data centers for more than a decade. The HDF4 and HDF5 OPeNDAP handlers are the core components that allow Hyrax to serve HDF4, HDF5, HDF-EOS2, HDF-EOS5 and netCDF-4 products. In this presentation, we will give an update on the latest HDF5 handler development, mainly the mapping of HDF5/netCDF-4 to DAP4. We will also share the results of a proof-of-concept study on using the HDF5 handler and OPeNDAP Hyrax's fileout netCDF module to access NASA HDF5/netCDF-4 data. The study shows that a significant performance improvement can be achieved by using an advanced HDF5 library feature inside Hyrax and the netCDF library.

Aleksandar Jelenak (NASA EED-3/HDF Group): Creating Cloud-Optimized HDF5 Files

John Readey (HDF Group): Highly Scalable Data Service (HSDS) Performance Features
HSDS (REST-based HDF Service) is an open-source, cloud-native implementation of HDF that can be deployed in Docker, Kubernetes, or serverless (with AWS Lambda). HSDS is designed to work effectively with object-based storage platforms such as AWS S3 and to scale from one to hundreds of cores. In this talk, we'll cover some of the recent HSDS developments, with a focus on how they can improve performance while eliminating some of the restrictions that were present in previous versions. We'll show how this works in practice with some example applications.




Friday July 22, 2022 11:00am - 12:30pm EDT
King's Garden 3