File Systems & Storage
Ceph: Unified Distributed Storage
★ 4.4
GlusterFS: Scalable Network File System
★ 4.0
Install (Ceph): pip install ceph
Install (GlusterFS): N/A (system package; install via the package manager)
Python data engineers in on-premise or private cloud environments use Ceph's S3-compatible RADOS Gateway as a drop-in replacement for AWS S3: boto3 and awswrangler work with existing code once they are pointed at the Ceph endpoint URL. CephFS can also be mounted as a shared file system that multiple Python pipeline worker nodes read from and write to simultaneously.
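Pointing boto3 at a RADOS Gateway endpoint can be sketched as below. This is a minimal illustration rather than a recommended setup: the helper names `rgw_client_kwargs` and `make_rgw_client`, the placeholder region, and the choice of path-style addressing are assumptions made for the example, not details taken from the text above.

```python
def rgw_client_kwargs(endpoint_url, access_key, secret_key):
    """Build keyword arguments for boto3.client() aimed at a Ceph RGW endpoint.

    Hypothetical helper for illustration; only the endpoint and the
    credentials differ from a plain AWS S3 client.
    """
    return {
        "service_name": "s3",
        "endpoint_url": endpoint_url,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        # Placeholder region (assumption): boto3 needs one to sign requests.
        "region_name": "us-east-1",
    }


def make_rgw_client(endpoint_url, access_key, secret_key):
    """Create an S3 client that talks to the gateway instead of AWS."""
    # Imported here so the helper above stays usable without boto3 installed.
    import boto3
    from botocore.config import Config

    # Path-style addressing (http://host/bucket/key) is an assumption that
    # avoids relying on wildcard DNS for bucket subdomains.
    config = Config(s3={"addressing_style": "path"})
    kwargs = rgw_client_kwargs(endpoint_url, access_key, secret_key)
    return boto3.client(config=config, **kwargs)
```

With a client from `make_rgw_client("http://rgw.example.internal:8080", access_key, secret_key)` (a made-up endpoint), calls such as `client.upload_file("output.parquet", "pipeline-bucket", "runs/output.parquet")` are written the same way as against AWS S3. awswrangler has a comparable global setting, `wr.config.s3_endpoint_url`.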
Python data engineers in HPC and on-premise environments use GlusterFS as a shared storage layer that multiple pipeline worker nodes can access simultaneously. Python jobs write output files to a GlusterFS mount point, and other nodes in the cluster can read those files directly from the same mount without an explicit copy step. This simplifies distributed batch processing and removes the dependency on object storage.
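The write side of this pattern can be sketched as follows. Because other nodes may open a file as soon as it appears on the shared mount, the example writes to a temporary file in the same directory and then renames it, so a reader should not pick up a half-written file. The function name `publish_output` and the mount path in the usage note are hypothetical, and the code uses only standard POSIX file operations rather than any GlusterFS-specific API.

```python
import os
import tempfile


def publish_output(mount_dir, filename, data: bytes) -> str:
    """Write data under mount_dir and make it visible with a single rename."""
    final_path = os.path.join(mount_dir, filename)
    # Keep the temp file in the target directory so the rename stays
    # within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=mount_dir, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes out before publishing
        # mkstemp creates the file as owner-only; relax that so jobs running
        # under other accounts can read it (assumption about the cluster).
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, final_path)
    except BaseException:
        # Do not leave a stray temp file behind on failure.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return final_path
```

A worker might call `publish_output("/mnt/gluster/results", "part-0000.csv", payload)`, where `/mnt/gluster/results` stands in for wherever the volume is mounted. How quickly the new file shows up on other nodes can depend on client-side caching settings, so readers that poll for files should tolerate a short delay.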
Individual Tool Pages