The Kubernetes platform offers a variety of storage options, and which one you choose depends on characteristics like scalability, performance, and cost. Seán McCord of Sidero Labs spoke on Wednesday at the KubeCon + CloudNativeCon North America 2022 conference about how teams can evaluate when to use which storage solution.
He said storage is very unlike hosting applications on the cloud: it is inherently stateful and not easily or quickly replicated, and moving data around consumes large amounts of network and CPU capacity. In general, storage eats up a lot of infrastructure resources. Storage options on Kubernetes typically fall into three categories:
- Object Stores
- Block Stores (well suited to persistent volume (PV) storage; support standards like iSCSI and NVMe-oF), and
- Shared File Systems (e.g., NFS, which has been in use for decades; it is the least common denominator: easy to set up, but file locking remains a challenge)
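To make the block-versus-shared distinction concrete, here is a minimal sketch using the official Kubernetes Python client to request one raw-block volume and one shared-filesystem volume. The StorageClass names (`nvme-of-fast`, `nfs-shared`) are illustrative assumptions, not classes named in the talk.

```python
# Sketch: a raw-block PVC vs. a shared-filesystem PVC via the official
# Kubernetes Python client. StorageClass names are assumed for illustration.
from kubernetes import client, config

config.load_kube_config()  # use load_incluster_config() when running in a pod
core = client.CoreV1Api()

block_pvc = client.V1PersistentVolumeClaim(
    api_version="v1",
    kind="PersistentVolumeClaim",
    metadata=client.V1ObjectMeta(name="db-block"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],     # block stores typically attach to one node
        volume_mode="Block",                # hand the pod a raw device, no filesystem
        storage_class_name="nvme-of-fast",  # assumed class backed by, e.g., NVMe-oF
        resources=client.V1ResourceRequirements(requests={"storage": "100Gi"}),
    ),
)

shared_pvc = client.V1PersistentVolumeClaim(
    api_version="v1",
    kind="PersistentVolumeClaim",
    metadata=client.V1ObjectMeta(name="assets-shared"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteMany"],     # shared file systems mount on many nodes
        volume_mode="Filesystem",
        storage_class_name="nfs-shared",    # assumed NFS-backed class
        resources=client.V1ResourceRequirements(requests={"storage": "50Gi"}),
    ),
)

for pvc in (block_pvc, shared_pvc):
    core.create_namespaced_persistent_volume_claim(namespace="default", body=pvc)
```

The key difference is the access mode: a ReadWriteOnce block volume serves a single node, while a ReadWriteMany shared file system can be mounted by pods across the cluster.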
Other factors, like location, also matter when deciding on a storage solution. If your organization uses a single cloud vendor, it is usually better to use that vendor's storage system. Another important consideration is whether the storage is managed in-cluster or out-of-cluster.
McCord discussed three characteristics of storage: scalability, performance, and cost.
Scalability: Traditional RAID relies on a single controller and is highly centralized, with limited replication factors. SAS expanders add redundant controllers but are still highly centralized over a single SAS channel and offer only limited tiering. The more interesting developments are happening in storage clusters: these eliminate single points of failure (SPoF), scale horizontally, and can get faster as they grow, while offering dynamic, fine-grained replication and topology awareness.
Performance: Benchmarks are misleading, especially in storage. Real performance depends on factors like the drives themselves, controllers and interfaces, workload characteristics, and unexpected scaling effects. He advised application teams to test as close to their real workload as possible before using any performance metrics for decision-making.
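A purpose-built tool like fio, driven with the workload's actual block sizes and queue depths, is the right way to do such testing; the rough Python probe below merely illustrates the kind of numbers worth collecting (sequential write throughput, random-read latency) on a mounted volume. The mount path and sizes are assumptions.

```python
# Rough probe of a mounted volume: sequential write MB/s and random-read
# latency. Illustrative only; the OS page cache will flatter the read numbers,
# so real evaluations should use fio with direct I/O and realistic queue depths.
import os
import random
import time

PATH = "/mnt/testvol/probe.bin"  # assumed mount point of the volume under test
SIZE_MB = 256
BLOCK = 4 * 1024                 # 4 KiB, a common random-I/O block size

# Sequential write throughput
chunk = os.urandom(1024 * 1024)
start = time.perf_counter()
with open(PATH, "wb") as f:
    for _ in range(SIZE_MB):
        f.write(chunk)
    f.flush()
    os.fsync(f.fileno())         # ensure the data actually reached the device
elapsed = time.perf_counter() - start
print(f"sequential write: {SIZE_MB / elapsed:.1f} MB/s")

# Random 4 KiB read latency (1,000 samples)
latencies = []
with open(PATH, "rb") as f:
    for _ in range(1000):
        f.seek(random.randrange(0, SIZE_MB * 1024 * 1024 - BLOCK))
        t0 = time.perf_counter()
        f.read(BLOCK)
        latencies.append(time.perf_counter() - t0)
print(f"random 4 KiB read p99: {sorted(latencies)[989] * 1e6:.0f} µs")

os.remove(PATH)
```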
Cost: With hardware components like disks and controllers, storage infrastructure can get complex very quickly. Maintenance is also a significant cost factor, because drives fail regularly. Growth and scalability affect overall cost too: the more centralized the infrastructure, the more likely you are to hit a ceiling beyond which you cannot grow. The cost benefits of horizontal scaling are therefore substantial.
McCord also talked about storage interfaces such as iSCSI, NVMe-oF, Ceph, and NFS. iSCSI is an old standard; it is slow but supported by many vendors, and the Linux implementation, Open-iSCSI, requires local sockets and a local configuration file to set up. NVMe-oF is a newer standard: cleaner, simpler, and faster. Ceph supports both RBD block devices and the CephFS file system. NFS is also an option. Shared file system contenders include NFS, Gluster via kadalu, CephFS, MinIO, and Linstor, a popular and highly pluggable option.
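These interfaces typically reach a cluster through CSI drivers, so one quick way to see what a given cluster actually offers is to list each StorageClass alongside its provisioner. A minimal sketch with the official Python client, assuming local kubeconfig credentials:

```python
# Sketch: list each StorageClass and the CSI provisioner behind it
# (e.g. rbd.csi.ceph.com for Ceph RBD, nfs.csi.k8s.io for NFS).
from kubernetes import client, config

config.load_kube_config()
for sc in client.StorageV1Api().list_storage_class().items:
    default = (sc.metadata.annotations or {}).get(
        "storageclass.kubernetes.io/is-default-class", "false"
    )
    print(f"{sc.metadata.name}: provisioner={sc.provisioner} default={default}")
```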
He then discussed storage cluster options such as the OpenEBS family, which is modeled on Amazon's Elastic Block Store (EBS): plain block storage with limited replication and topology control. He also covered the OpenEBS engines cStor, Jiva, and Mayastor, as well as Rancher's Longhorn, SeaweedFS, and Rook/Ceph.
McCord summarized his presentation with the following K8s storage recommendations:
- If you can delegate storage to someone else, pay a vendor to handle it (e.g., Portworx).
- If you don't have any special storage requirements, use a simple solution like Linstor.
- If you need control over scaling but Ceph is too complicated, use OpenEBS/Mayastor if you favor performance over ruggedness, and OpenEBS/cStor if you favor ruggedness over performance.
- For the best storage features, scaling, and fault tolerance, use Ceph for overall stability; on Kubernetes, it is typically deployed via Rook/Ceph.
For more information on this and other sessions, check out the conference's main website.