A distributed file system designed to run on commodity hardware as part of the Apache Hadoop ecosystem. HDFS provides high-throughput access to application data and is the foundation for storing massive datasets in Hadoop-based data platforms.
A memory-centric distributed storage system that acts as a caching layer between compute frameworks and storage systems. Alluxio accelerates data access by serving hot data from memory instead of re-reading it from the underlying storage.
A unified, distributed storage system providing object, block, and file storage in a single platform. Ceph is designed for performance, reliability, and scalability, and is widely used in cloud infrastructure and data center environments.
A high-performance, cloud-native file system built on object storage. JuiceFS provides a POSIX-compatible interface backed by object storage such as Amazon S3, making it possible to mount cloud storage as a local file system for data processing workloads.
A scalable, distributed network file system suitable for data-intensive tasks such as cloud storage and media streaming. GlusterFS aggregates disk storage from multiple servers into a single global namespace for large-scale data access.
A simple and highly scalable distributed file system designed for fast, efficient storage and retrieval of billions of files. SeaweedFS offers an S3-compatible API, erasure coding, and FUSE mounting for flexible data access.
A file system that stores all its data online using storage services such as Google Cloud Storage, Amazon S3, or OpenStack Swift. S3QL provides a standard POSIX file system interface with features like deduplication, compression, and encryption.
A software-defined storage solution that is distributed, parallel, scalable, fault-tolerant, and geo-redundant. LizardFS provides a highly available file system with automatic data replication and self-healing capabilities.