Description
Relates to #264, which might be avoidable altogether if we never fetch an .nwb file in full twice. This might also be of interest for https://github.com/OpenSourceBrain/DANDIArchiveShowcase, which @anhknguyen96 is working on.
https://pynwb.readthedocs.io/en/stable/tutorials/advanced_io/streaming.html gives an example of how to use the ros3 HDF5 driver to access a remote file in an S3 bucket (e.g. dandiarchive) without downloading it in full.
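For reference, a minimal sketch of the ros3 approach, following the pattern in the linked tutorial. The function name `read_remote_nwb_metadata` and the choice of returned fields are mine, for illustration only. It assumes an h5py/HDF5 build with the ros3 driver enabled, which not every distribution of h5py provides.

```python
def read_remote_nwb_metadata(s3_url):
    """Read top-level metadata of a remote .nwb file without downloading it in full.

    Hypothetical helper: assumes pynwb is installed and that the underlying
    h5py/HDF5 build was compiled with the ros3 driver.
    """
    # Imported lazily so this module can be imported where pynwb is absent.
    from pynwb import NWBHDF5IO

    # driver="ros3" makes HDF5 fetch only the byte ranges it needs from S3.
    with NWBHDF5IO(s3_url, mode="r", load_namespaces=True, driver="ros3") as io:
        nwbfile = io.read()
        # Copy plain values out before the file is closed.
        return {
            "identifier": nwbfile.identifier,
            "session_description": nwbfile.session_description,
        }
```

Usage would be `read_remote_nwb_metadata(url)`, where `url` is the S3 URL of an asset's blob (not spelled out here). Only the metadata and whatever datasets are actually sliced get transferred.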
Another approach is HDF5-agnostic and uses fsspec, but it would require pynwb to be able to open a file from an existing file handle, which I am not sure is possible; filed NeurodataWithoutBorders/pynwb#1525. (An alternative is a FUSE file system for that file, like the one provided by https://github.com/datalad/datalad-fuse/. That might be too ad hoc/heavy, although it is quite possible by FUSE-mounting an entire bucket whenever a request comes in and using a local cache with some garbage-collection routines to prune it once in a while.)
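To make the fsspec idea concrete, here is a rough sketch at the h5py level only. h5py can open Python file-like objects, and fsspec's HTTP file system returns one that fetches byte ranges on demand. The function name `list_remote_hdf5_groups` is hypothetical, and it assumes fsspec (with its HTTP dependencies) and h5py are installed. Whether the resulting `h5py.File` can then be handed to pynwb is exactly the question raised in NeurodataWithoutBorders/pynwb#1525, so that step is left out.

```python
def list_remote_hdf5_groups(url):
    """List the top-level groups of a remote HDF5 (.nwb) file via fsspec.

    Hypothetical helper: no special HDF5 driver is needed, since all remote
    access goes through the file-like object that fsspec provides.
    """
    # Imported lazily so this module can be imported where they are absent.
    import fsspec
    import h5py

    fs = fsspec.filesystem("http")
    # fs.open returns a seekable file-like object backed by HTTP range requests.
    with fs.open(url, "rb") as remote_file:
        # h5py reads through the handle instead of a local path.
        with h5py.File(remote_file, "r") as h5:
            return sorted(h5.keys())
```

Because nothing here is S3- or HDF5-driver-specific, the same handle could in principle come from any fsspec backend, optionally with fsspec's local caching layered on top.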