MooseFS Introduction

I have been a user of storage products for years, and as my data grew, so did my storage systems. I started off as you probably did, with one hard drive in my computer, on which I stored pictures, music and movies. Over time, I moved my storage to a dedicated server running a RAID, which was great, right until I outgrew that solution. I found myself buying larger servers and more hard drives almost every 2 years.

Each time I went to a larger solution, I could not simply add that space to the previous solution. I had to migrate the data from the older, smaller server to the larger server. As my data grew, so did the time it took to perform that process, eventually taking as long as 3-5 days to copy my data to the new solution.

Here is a breakdown of my storage over the years:

Floppy Disk

This was a huge improvement over the cassette tape. You could load your program quite a bit faster, and because the floppy disk, unlike the tape, was not linear, you could write anywhere on the disk: the beginning, middle, or end.

The first practical floppy disk, the 5 1/4 inch, became very popular with platforms such as the Commodore 64. Disks were cheap, and this one was actually "floppy". They were made by vendors such as Elephant, but had limited capacity. In those days, we'd say "Gee, why would you need any more space?", but soon we were clipping notches in the disk to write to both sides (giving you 340KB) and wanting more space.

Enter the 3 1/2 inch (non) floppy disk. Compact, durable, and faster, these could hold anywhere from 720KB to 1440KB on average. We were using these until the early '90s, and they are still in use today.

Hard Disk

The first practical hard disks started to enter the market about this time, and were popular on platforms such as the IBM PS/2. They were so much faster than floppy disks, and compact enough that you could store one inside the computer. Now you could practically boot a PC from a hard disk and store files on a massive scale.

Large capacity back in those days was 20MB, and it was still delivered via serial connections. You could really hear the drive spin up. It sounded like it was going to take off!

Over time, capacity grew. The interface changed from RLL to IDE to SATA over the years, and SATA 2/3 is what I am using today!

4TB hard drives are $85 today, OEM.

RAID

So, now that you have 1 hard drive, how do you ensure that your data is safe? You add more hard drives.

A RAID is basically a collection of hard disks that are formatted in a way that provides redundancy based on your needs. Traditionally, this was done with a RAID card installed in the computer, with the member drives hooked up to it. This approach is still used today, with vendors such as Adaptec.

More recently, software RAID has become popular. Instead of having a RAID card, you present the hard drives to the computer as JBOD (Just a Bunch Of Disks) and have software such as mdadm or ZFS manage the RAID. The big advantage here is portability: if your system drive dies, you can replace it and expect the RAID you built on the failed drive to become usable on the new drive, thus (hopefully) preserving your data.

Distributed Storage

Now, how can you protect yourself if the server that has the RAID you are using dies? Simply add more servers!

Distributed storage is fairly new, but has been around for a while now. Popular platforms are Ceph and Gluster. You can think of it as RAIDing systems together, or more technically, clustering systems together to provide yet more redundancy.

You have the option of using RAID on these servers, but in a lot of cases, these are considered "No RAID" systems, as they do not conform to the standards of RAID. Each has its own way of distributing information to make your data redundant.

MooseFS is distributed storage. It does not use RAID at all, but provides a very high level of performance and redundancy.

It's bulletproof! MooseFS has provided me with worry-free, stable storage for 2 years now. It has lived through power failures, drive failures, and server failures, and has never lost data for me.

It's fast. I am averaging 75MB/s write speeds on large files, where my WD EX4 NAS with RAID5 can only achieve 18MB/s average on the same operation.

It's free, or at least the version that I am using, MFS 3.X. If you want the features that the PRO version has, and want the support, you can buy it from MooseFS. I have had no issues running the free version.

It's scalable. If I need more space, I add another hard drive to a chunk server, or add a new chunk server to the cluster, and the available space changes on the fly. You don't need to manage LUNs or make special workarounds to add more space. This is one of the only file systems I have seen that does this!

It runs on old hardware. I am using HP G6 series servers that were purchased on eBay for under $150 each. I have no issues with performance, and have built MFS clusters on faster servers with about the same performance.

MooseFS consists of 3 different parts, meant to be run on different servers.

Master Server:

The master server is the "head" of the cluster, and contains the database that keeps track of which bit is where on the chunk servers.

The Pro version of MooseFS has the ability to run 2 heads in a High Availability (HA) cluster, so if you lose a master server, services will still be available via the backup master server. I will be using the free version of MooseFS, which also has some protection built in, available via a meta logger server. I have found in the 2 years of running MooseFS that simply running a meta logger server is protection enough for my needs.

The master server's requirements depend on the type of data you are going to store. For larger files, you need fewer resources, but for a large number of small files, you need more. I am running 110TB on a Master Server with 2 X 2.8GHz 4 core x5000 series Xeon processors, maybe 73GB of storage, and 32GB of PC-8500R memory. I am barely breaking a sweat on heavy loads: maybe 1-2% CPU and 750MB of RAM used with the cluster 50% full.

Registered ECC RAM is highly recommended.

All other servers involved in running MooseFS need to communicate with the Master Server via IP networking.

Chunk Server:

The chunk server is the storage unit of MooseFS, and is meant to be filled with hard drives for maximum effect.

You can use drives of any size, and mix and match sizes and brands. It does not matter. Use old hard drives if you want. If you lose a drive, the data will automatically reshuffle onto the remaining drives. You can even have multiple failures, as long as you have space remaining to copy the chunks from the failed drives onto.

I recommend building at least 3 of these servers. I am using HP SE316M1 and SE326M1 servers with 12GB-16GB of ECC Registered memory. I also have a "white box" in the mix with an Asus server board and 32GB of RAM.

You can have as many chunk servers as you want. In fact, the more you have, the more redundancy you get. If you lose an entire host, which has happened to me a lot, the data will, again, reshuffle around to the other hosts, provided you have the space.
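To make the "provided you have the space" condition concrete, here is a minimal Python sketch. The function name `can_reshuffle` is my own and is not part of MooseFS, and the model is deliberately simplified: it only checks that the data held by the lost host fits into the free space on the surviving hosts, and ignores how MooseFS actually places individual chunks.

```python
def can_reshuffle(capacity_tb, used_tb, lost):
    """Return True if the data on the lost chunk server fits on the survivors.

    capacity_tb and used_tb are per-server lists (hypothetical inputs);
    lost is the index of the failed server.
    """
    needed = used_tb[lost]
    # Free space on every server except the one that failed.
    free_elsewhere = sum(
        cap - used
        for i, (cap, used) in enumerate(zip(capacity_tb, used_tb))
        if i != lost
    )
    return needed <= free_elsewhere


capacity = [16, 16, 32]
print(can_reshuffle(capacity, [8, 8, 16], lost=2))    # True: 16 TB needed, 16 TB free
print(can_reshuffle(capacity, [12, 12, 24], lost=2))  # False: 24 TB needed, 8 TB free
```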

The chunk servers will also balance space based on size available, so if you use uneven servers, like 2 X 16TB and 1 X 32TB chunk servers, and you fill it 50%, each 16TB server will have 8TB stored, and the 32TB server will have 16TB stored.
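The arithmetic in that example can be sketched in a few lines of Python. This assumes usage is spread strictly in proportion to each server's capacity, which is my simplification of the balancing described above; `balanced_usage` is a made-up helper name, not a MooseFS tool.

```python
def balanced_usage(capacity_tb, fill_fraction):
    """Return the data stored on each chunk server, in TB, assuming every
    server is filled to the same fraction of its capacity."""
    return [cap * fill_fraction for cap in capacity_tb]


# Two 16TB chunk servers and one 32TB chunk server, cluster 50% full.
print(balanced_usage([16, 16, 32], 0.5))  # [8.0, 8.0, 16.0]
```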

Meta Logger Server:

The Meta Logger server simply stores a copy of the database from the master server. That's it. It cannot be a master server itself, nor does it provide any other function to MooseFS.

This server is not required, but is highly recommended.

I have lost the database or hard drives in the master server and have had to copy it back from the meta logger server plenty of times, and have never lost data! I might have a little downtime while I copy the data back, but it's small, and the process can probably be automated.

MooseFS achieves redundancy by breaking all of your data into "chunks", then duplicating these chunks and distributing them among the chunk servers. By doing this, we can lose a whole chunk server and still recover using the copied chunks available on the other servers.

By default, each chunk is stored as two copies, providing a RAID1-like redundancy. You can change this value to a higher one depending on your needs. For example, if this is your company's accounting information, you can set the number of copies to 3 or 4, thus keeping 3 or 4 copies of that data. If you have 4 chunk servers and make 4 copies, you can safely lose any 3 of the chunk servers, but where's your ROI?
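As a rough rule of thumb, the number of chunk servers you can lose is one less than the number of copies. The Python sketch below assumes each copy of a chunk lives on a different chunk server, so you can never have more useful copies than servers; `survivable_server_losses` is a hypothetical name, not a MooseFS command.

```python
def survivable_server_losses(copies, chunk_servers):
    """Return how many chunk servers can fail while at least one copy of
    every chunk survives, assuming one copy per server at most."""
    effective_copies = min(copies, chunk_servers)
    return max(effective_copies - 1, 0)


print(survivable_server_losses(copies=2, chunk_servers=3))  # 1
print(survivable_server_losses(copies=4, chunk_servers=4))  # 3
```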

Each chunk takes space. If you have the default 2 copy method, for each 1 TB of data, you'll use 2 TB of space in MooseFS. For a 4 copy method, for 1TB of data, you'll use 4TB of space in MooseFS.

We use the terms Raw Space and Real Space to differentiate between how much space your data is and how much it will take in MooseFS.

In my case, I am using the default 2 copy method, and have 110 TB of Raw Space on my MooseFS cluster. All mounted partitions show 110 TB of space. However, I have 25 TB Real Space of data, which is taking up 50 TB of Raw Space on my MooseFS cluster.
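The Raw Space versus Real Space relationship is simple multiplication and division. Here is a small Python sketch using my numbers; the function names are my own, and it ignores any filesystem overhead.

```python
def raw_space_used(real_tb, copies):
    """Raw Space consumed on the cluster by real_tb of data."""
    return real_tb * copies


def real_space_capacity(raw_tb, copies):
    """How much Real Space of data a cluster with raw_tb of Raw Space can hold."""
    return raw_tb / copies


print(raw_space_used(25, copies=2))        # 50
print(real_space_capacity(110, copies=2))  # 55.0
print(raw_space_used(1, copies=4))         # 4
```

So with the default 2 copies, my 110 TB cluster tops out at about 55 TB of actual data.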

Storage Evolution

1976-1982

Cassette Tape

1983-1992

Floppy Disk

1984-present

Hard Disk

1992-Present

RAID

2011-Present

Distributed Storage

Why Moose?

Three Types Of Servers

Master Server:

Chunk Server:

Meta Logger Server:

MooseFS Requirements

Small Deploy

Medium Deploy

Large Deploy

Master Server

Chunk Server

Meta Logger

MooseFS Redundancy
