Tuesday, September 10, 2019

Tech Tip: Why CephFS is Great

We use the term Ceph a lot, but have you ever wondered what it actually stands for? Our lead engineer, Brett Kelly, gives a great explanation in this week's tech tip! He also talks about CephFS and why he thinks it's the best distributed file system out there. Hear what he has to say below!

Don't forget to subscribe to our YouTube channel, where we release a new tech tip every Tuesday at 3pm!

Hey guys, Brett Kelly here. Welcome to another Tech Tip Tuesday at 45Drives. Today we are talking about Ceph, in particular CephFS, and why it is the best distributed file system.
There are a couple of things that make CephFS the best distributed file system. First of all, it is designed to be POSIX-compliant and has full support for Linux extended attributes. This gives it broad compatibility with Linux applications. If you're working in an environment with Linux clients, you can mount CephFS almost as if it were a local file system (despite it being a network file system) without running into problems.
Ceph also integrates with Samba, which gives Windows clients access to the cluster. Not only do you get Windows access, you also get full control over permissions through Windows access control lists (ACLs), which is a big draw for a lot of end users.
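To make the point about extended attributes concrete, here is a minimal Python sketch. It assumes a Linux client and uses only the standard library; the function name `tag_and_read`, the file name, and the attribute key are made up for illustration. Nothing in it is Ceph-specific, which is the point: the same code should work whether the directory is on a local disk or a CephFS mount.

```python
import errno
import os
from pathlib import Path


def tag_and_read(directory, key, value):
    """Write a file, attach a user.* extended attribute, and read it back.

    Returns the attribute value as a string, or None if the underlying
    file system does not support user extended attributes.
    """
    path = Path(directory) / "example.txt"
    # An ordinary POSIX write; no Ceph-specific calls are involved.
    path.write_text("hello from a regular POSIX write\n")

    name = f"user.{key}"  # unprivileged attributes live in the user.* namespace
    try:
        os.setxattr(path, name, value.encode())
        return os.getxattr(path, name).decode()
    except OSError as exc:
        if exc.errno in (errno.ENOTSUP, errno.EOPNOTSUPP):
            return None
        raise
```

Calling `tag_and_read("/mnt/cephfs/projects", "project", "demo")` on a CephFS mount (the mount path here is hypothetical) would store and return the tag just as it would on a local file system.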
Another great thing about Ceph is its quick and easy snapshots. Nowadays almost all distributed file systems can take snapshots, but in GlusterFS, for example, a snapshot can take a few minutes to complete. Compared to Gluster, Ceph's snapshots are very quick. A snapshot will take longer to complete as your file system fills, but Ceph's snapshots are always comparatively quick.
Ceph's snapshots are also very easy to access, unlike in other systems where you may have to clone or mount a snapshot before it is accessible. To access a Ceph snapshot, simply take the path of the directory you're snapshotting and append ".snap" to the end, followed by the snapshot's name.
Ceph's integration with Samba also works very well with shadow copy. Shadow copy gives users the ability to access their own snapshots and get their files back. That can really lighten the load on administrators who would otherwise have to dig through snapshots to retrieve users' files.
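The path convention described above can be sketched in a few lines of Python. This assumes CephFS's default hidden snapshot directory name, `.snap`, and relies on the fact that CephFS creates a snapshot when you make a directory inside it. The helper names, the mount path, and the snapshot name are all hypothetical, and `create_snapshot` will only work on a file system where snapshots are enabled and the client is allowed to create them.

```python
import os
from pathlib import PurePosixPath


def snapshot_path(directory, snapshot_name, relative=""):
    """Build the path to a snapshot of `directory`, or to an item inside it."""
    base = PurePosixPath(directory) / ".snap" / snapshot_name
    return str(base / relative) if relative else str(base)


def create_snapshot(directory, snapshot_name):
    """Take a snapshot of `directory` by making a directory under .snap."""
    os.mkdir(snapshot_path(directory, snapshot_name))


# Where last Monday's copy of report.txt would live (illustrative names only):
print(snapshot_path("/mnt/cephfs/projects", "monday", "report.txt"))
```

Restoring a file is then just an ordinary copy from the snapshot path back into the live directory.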
The last thing covered in this video that makes CephFS great is directory pinning. To understand directory pinning, you need to understand how Ceph organizes its data into storage pools. Storage pools are distinguished by:
  • Data protection type – erasure coding or replication
  • Device type – HDD or SSD
This gives Ceph the ability to use multiple storage pools under one file system namespace. You can have hard disks in the same file system as SSDs, or you can have scratch storage with one replica and a big pool with erasure coding in the same file system. You can then pin certain directories to different pools.
For example, say your use case involves video editors whose main concern is speed, while the rest of your data is colder storage that doesn't require high speeds. You would be able to pin one directory to SSDs for your editors to work with, while the rest sits on more economical erasure-coded HDDs.
Fun fact about Ceph: its name is short for cephalopod, the class of animals that includes octopuses and squid. Cephalopods have many arms, and if one loses an arm it can regrow it and carry on, just like Ceph.
To learn more, check out the series of videos we did on Ceph, or the knowledge base for articles about Ceph. To learn more about 45Drives clustering solutions, click here! Or reach out to an account manager, who would be happy to help you with any questions.
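Here is a rough sketch of how the video-editing example could be set up from a Linux client. In Ceph's own documentation this mechanism is described under file layouts: a directory is assigned to a pool by setting the `ceph.dir.layout.pool` extended attribute on it. The directory and pool names below are invented, and the sketch assumes the pools already exist and have been added to the file system as data pools.

```python
import os

# Virtual extended attribute CephFS uses to choose a directory's data pool.
LAYOUT_POOL_XATTR = "ceph.dir.layout.pool"


def plan_layout(assignments):
    """Turn {directory: pool_name} into (directory, xattr_name, value) tuples."""
    return [
        (directory, LAYOUT_POOL_XATTR, pool.encode())
        for directory, pool in sorted(assignments.items())
    ]


def apply_layout(assignments):
    """Apply the plan; only meaningful on a mounted CephFS file system."""
    for directory, name, value in plan_layout(assignments):
        os.setxattr(directory, name, value)


# Hypothetical layout: fast SSD pool for editors, erasure-coded HDDs for the rest.
example = {
    "/mnt/cephfs/editing": "ssd_replicated",
    "/mnt/cephfs/archive": "hdd_erasure",
}
for step in plan_layout(example):
    print(step)
```

One thing to keep in mind is that a layout change only affects files created after it is set, so files already in the directory stay in the pool they were written to.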
