We use the term Ceph a lot, but have you ever wondered what it actually stands for? Our lead engineer, Brett Kelly, gives a great explanation in this week's tech tip! He also talks about CephFS and why he thinks it's the best distributed file system out there. Hear what he has to say below!
Don't forget to subscribe to our YouTube channel, where we release a new tech tip every Tuesday at 3pm!
Hey guys, Brett Kelly here. Welcome to another tech tip Tuesday at 45Drives. Today we are talking about Ceph, in particular CephFS, and why it is the best distributed file system.
There are a couple of things that make CephFS the best distributed file system. First of all, it's fully POSIX compliant and has full support for Linux extended attributes, which gives it broad compatibility with Linux applications. If you're working in an environment with Linux clients, you can mount CephFS almost like a local file system (despite it being a network filesystem) without any problems.
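For example, on a Linux client you can mount CephFS with the kernel client or the FUSE client. The monitor address, user name, and keyring paths below are placeholders; substitute your own cluster's values:

```shell
# Kernel client mount (monitor address and secret file are examples)
sudo mount -t ceph 192.168.1.10:6789:/ /mnt/cephfs \
    -o name=admin,secretfile=/etc/ceph/admin.secret

# Or the FUSE client, which tends to track new CephFS features sooner
sudo ceph-fuse -m 192.168.1.10:6789 /mnt/cephfs
```

Once mounted, applications read and write `/mnt/cephfs` just like a local POSIX filesystem.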
Ceph also integrates with Samba, which gives Windows clients seamless access to the cluster. Not only can you have Windows access, you get full control over permissions through Windows access control lists (ACLs), which is a big draw for a lot of end users.
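Samba ships a `vfs_ceph` module that talks to the cluster directly, without a local mount. A minimal share definition might look like the following; the share name and `user_id` are examples, not requirements:

```ini
[cephshare]
    path = /
    vfs objects = ceph
    ceph:config_file = /etc/ceph/ceph.conf
    ceph:user_id = samba
    read only = no
```

With Windows ACL support enabled on the share, permissions set from a Windows client are stored and enforced on the CephFS side.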
Another great thing about Ceph is its quick and easy snapshots. Nowadays almost all distributed filesystems take snapshots, but in GlusterFS, for example, a snapshot can take a few minutes to complete. Ceph's snapshots are very quick by comparison. They will take longer to complete as your filesystem fills, but they remain comparatively fast.
Ceph's snapshots are also very easy to access, unlike other systems where you may have to clone or mount a snapshot before it is accessible. To access a Ceph snapshot, simply take the path of whatever you're snapshotting and append ".snap" along with the snapshot's name.
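In practice, snapshots are both created and accessed through the hidden `.snap` directory. The directory and snapshot names below are examples:

```shell
# Create a snapshot of /mnt/cephfs/projects by making a directory under .snap
mkdir /mnt/cephfs/projects/.snap/before-upgrade

# Browse the snapshot like any other directory
ls /mnt/cephfs/projects/.snap/before-upgrade

# Remove the snapshot when you no longer need it
rmdir /mnt/cephfs/projects/.snap/before-upgrade
```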
Ceph's Samba integration also works very well with shadow copy. Shadow copy gives users the ability to browse their own snapshots and get their files back themselves, which can really lighten the load on administrators who would otherwise have to dig through snapshots to retrieve users' files.
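Newer Samba releases include a `ceph_snapshots` VFS module that exposes CephFS `.snap` snapshots to Windows clients as "Previous Versions". A sketch of a share using it, assuming your Samba build includes the module:

```ini
[cephshare]
    path = /
    vfs objects = ceph_snapshots ceph
    ceph:config_file = /etc/ceph/ceph.conf
```

Users can then right-click a file in Explorer and restore an earlier version on their own.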
The last thing covered in this video that makes CephFS great is directory pinning. To understand directory pinning, you need to understand how Ceph organizes its data into storage pools. Storage pools are separated by:
- Data protection type – erasure coding or replication
- Device type – HDD or SSD
This gives Ceph the ability to use multiple storage pools under one file system namespace. You can have hard disks in the same filesystem as SSDs, or you can have scratch storage with a single replica alongside a big erasure-coded pool in the same filesystem, and you can pin certain directories to different pools.
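In CephFS this is done with file layout extended attributes: you add a second data pool to the filesystem, then point a directory at it, and new files created under that directory land in that pool. Pool names, PG counts, and paths below are examples:

```shell
# Create an SSD-backed pool (example PG count) and add it to the filesystem
ceph osd pool create cephfs-ssd 64 64
ceph fs add_data_pool cephfs cephfs-ssd

# Point a directory at the new pool via its layout extended attribute
setfattr -n ceph.dir.layout.pool -v cephfs-ssd /mnt/cephfs/fast

# Verify the layout
getfattr -n ceph.dir.layout /mnt/cephfs/fast
```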
For example, say your use case involves video editors whose main concern is speed, while the rest of your data is colder storage that doesn't require high speeds. You could pin one directory to SSDs for your editors to work in, while the rest lives on more economical erasure-coded HDDs.
Fun fact about Ceph: its name is short for cephalopod, the class of animals that includes octopuses and squid. A cephalopod has many arms, and if it loses one it can regrow it and carry on, just like Ceph.
To learn more, check out the series of videos we did on Ceph, or the knowledge base for articles about Ceph. To learn more about 45Drives clustering solutions, click here! Or reach out to an account manager who would be happy to help with any questions.