Using two USB Sticks as a RAID on a Raspberry Pi

Oct 21, 2023

tech 100DaysToOffload

The other day I mentioned in passing on Mastodon that I am using two USB sticks as a RAID array on my home server, which is a Raspberry Pi 4. Much to my surprise, two people actually asked how this was done, so I promised I would write it up as a blog post. And so here we are. Let’s have a look at how it works.

Disclaimer: I’m not an IT expert. Some things in this tutorial could be wrong or even lead to data loss, so use the information here at your own risk. At the bottom of this post are links to more in-depth articles and tutorials; I highly recommend giving those a read, as they contain much more information than this post.

Introduction

First, what is a RAID? RAID stands for “redundant array of independent disks”, which simply means that two or more physical drives are combined to form one logical drive.

There are a variety of ways to set this up, called RAID levels. I won’t get into them here, but Wikipedia and many other pages have you covered.

I set up my USB sticks as a RAID 1 array, which means the data is mirrored across two drives (or more, for extra redundancy), so they all contain the exact same information. This way, if one drive fails, the other still has everything saved and no data is lost. Of course, when this happens, the faulty drive needs to be replaced as soon as possible, before the other one fails as well.

I got two inexpensive SanDisk 64GB USB drives from Amazon that I connected to my Raspi and set up as a RAID 1 array. This is not a buying recommendation by the way, but I never had a problem with SanDisk drives in the past and these were reasonably cheap, so I went with them.

I plugged them into my Raspi, and that’s the hardware setup taken care of.

Software configuration

I’ll walk through the installation here; for reference, a list of all the commands is at the bottom of the post.

The RAID is configured through a utility called ‘mdadm’, which you can install through the package manager of your distro or compile from source if you’re hardcore. I’m not hardcore, so I simply installed it.

$ sudo apt install mdadm

Next, I checked if the USB drives are available.

$ lsblk
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda           8:0    1 57,3G  0 disk 
sdb           8:16   1 57,3G  0 disk 
mmcblk0     179:0    0 29,1G  0 disk 
├─mmcblk0p1 179:1    0  256M  0 part /boot
└─mmcblk0p2 179:2    0 28,9G  0 part /

I can see they’re connected as drives sda and sdb, so I’m good to continue.
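
Since picking the wrong device in the next step would wipe it, it can help to list only the removable whole disks first. The following is a minimal sketch and not part of the original setup: the helper name `removable_disks` is made up for illustration, and it assumes the sticks report themselves as removable (RM = 1 in lsblk), which USB hard drives often don't.

```shell
# Hypothetical helper (not from the original post): reads lines in the
# format "NAME RM SIZE TYPE" on stdin and prints the device path of every
# whole disk that is flagged as removable.
removable_disks() {
    awk '$2 == 1 && $4 == "disk" { print "/dev/" $1 }'
}

# Usage on a live system:
#   lsblk -dn -o NAME,RM,SIZE,TYPE | removable_disks
```

With the drives from the listing above, this should print /dev/sda and /dev/sdb and leave out the SD card.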

The next command creates a level 1 (mirrored) RAID array from the two USB devices. Make sure you use the correct device names, because whatever is on those devices will be lost!

$ sudo mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb

Here’s the output from the command. It tells me that the drives already have a partition table, which doesn’t matter because they will be reformatted, and asks me to confirm. Then it returns to the console and the array is synced in the background:

$ sudo mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb
mdadm: partition table exists on /dev/sda
mdadm: partition table exists on /dev/sda but will be lost or
       meaningless after creating array
mdadm: Note: this array has metadata at the start and
    may not be suitable as a boot device.  If you plan to
    store '/boot' on this device please ensure that
    your boot-loader understands md/v1.x metadata, or use
    --metadata=0.90
mdadm: partition table exists on /dev/sdb
mdadm: partition table exists on /dev/sdb but will be lost or
       meaningless after creating array
mdadm: size set to 60029952K
Continue creating array? (y/n) y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.

Reading the file ‘/proc/mdstat’ shows the progress of this operation. Depending on the size and speed of the drives, this can take a long time; for my two 64GB USB sticks it took about an hour.

$ cat /proc/mdstat
Personalities : [raid1] 
md0 : active raid1 sdb[1] sda[0]
      60029952 blocks super 1.2 [2/2] [UU]
      [>....................]  resync =  0.5% (336448/60029952) finish=62.0min speed=16021K/sec
      
unused devices: <none>

I followed the process with the ‘watch’ command:

$ watch 'cat /proc/mdstat'
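
If a single number is more useful than the whole status screen, the percentage can be pulled out of the resync line. This is a rough sketch that assumes the line format shown in the output above. `sync_progress` is a made-up helper name, and its optional file argument exists only so it can be tried on a saved copy of /proc/mdstat.

```shell
# Hypothetical helper: print the resync/recovery percentage found in an
# mdstat-style file (defaults to /proc/mdstat). Prints nothing when no
# sync is running.
sync_progress() {
    sed -nE 's/.*(resync|recovery) *= *([0-9.]+)%.*/\2/p' "${1:-/proc/mdstat}"
}

# Usage on a live system:
#   sync_progress
```

For scripts that simply need to block until the sync is done, mdadm also has a `--wait` option (`sudo mdadm --wait /dev/md0`).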

According to the tutorial I followed, I can start using the array even before the initial sync has finished, so the next thing I did was create a file system on it:

$ sudo mkfs.ext4 -F /dev/md0
mke2fs 1.46.2 (28-Feb-2021)
/dev/md0 contains a ext4 file system
	last mounted on Fri Oct 20 11:48:52 2023
Creating filesystem with 15007488 4k blocks and 3751936 inodes
Filesystem UUID: 2361e08e-3703-46f5-b425-ad4d519b555c
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 
	4096000, 7962624, 11239424

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (65536 blocks): 
done
Writing superblocks and filesystem accounting information: done   

I then created a mount point and mounted the new drive:

$ sudo mkdir -p /mnt/md0
$ sudo mount /dev/md0 /mnt/md0

And looking at my drives with ‘df’, I can see that the array is mounted and ready to use (shortened for readability):

$ df -hT
Filesystem     Type      Size    Used Avail  Use% Mounted on
...
/dev/md0       ext4       57G     24K   54G    1% /mnt/md0

That’s almost everything; just a few small things remain.

First, save the array configuration:

$ sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
ARRAY /dev/md0 metadata=1.2 name=pi4srv:0 UUID=402c7741:7e70698e:b2ed9be5:c6e3531e
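
One thing to be aware of: `tee -a` appends, so running the command a second time leaves a duplicate ARRAY line in the file. A small guard like the following sketch avoids that. `append_once` is a hypothetical helper, not an mdadm feature. It assumes a single array (one ARRAY line) and needs write access to the target file, so for /etc/mdadm/mdadm.conf it has to run as root.

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already in it. $1 = line, $2 = file.
append_once() {
    line=$1
    file=$2
    # -x: match whole lines only, -F: treat the pattern as a fixed string
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage, from a root shell:
#   append_once "$(mdadm --detail --scan)" /etc/mdadm/mdadm.conf
```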

Add the array to fstab so it is automatically mounted at startup:

$ echo '/dev/md0 /mnt/md0 ext4 rw,user,exec,defaults,nofail,discard 0 0' | sudo tee -a /etc/fstab
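
Device names such as /dev/md0 are not guaranteed to stay the same across reboots (an array can come up as /dev/md127, for example), so a common alternative is to reference the file system by its UUID, the one `mkfs.ext4` printed earlier. The sketch below only builds the fstab line. `fstab_line` is a made-up helper, and the shorter option list (`defaults,nofail`) is an assumption of this sketch, not the options used in this post.

```shell
# Hypothetical helper: build an fstab entry that mounts an ext4 file
# system by UUID. $1 = file system UUID, $2 = mount point.
fstab_line() {
    printf 'UUID=%s %s ext4 defaults,nofail 0 0\n' "$1" "$2"
}

# Usage on a live system (blkid prints the UUID of the file system on md0):
#   fstab_line "$(sudo blkid -s UUID -o value /dev/md0)" /mnt/md0 | sudo tee -a /etc/fstab
```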

And change the access rights of the mount folder so a regular user can write to it:

$ sudo chmod 777 /mnt/md0/

And with that, I have a working RAID 1 array consisting of two cheap USB drives running on my Raspberry Pi homeserver, which I can use for backup.

I have no idea how reliable these drives are or how long they will last; time will tell. But that’s why I got two: if one fails, I’ll hopefully have time to replace it before the other fails as well.

I’m planning to mirror my server config files and Docker setup from the internal SD card to this external storage, and also use it to back up and distribute important files across my home network.

Before I do this though, I want to test what happens if one of the drives ‘fails’ and how the data can be accessed/restored in this case. After all, a backup is only a backup if it can be restored.
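
For such a test, the quickest indicator is the status field in /proc/mdstat: `[UU]` means both members are up, and an underscore (as in `[U_]`) marks a missing or failed one. Here is a minimal sketch for reading that field. `md_status` is a hypothetical helper name, and the optional second argument exists only so it can be tried on a saved copy of the file.

```shell
# Hypothetical helper: print the member status field (e.g. [UU] or [U_])
# of one array. $1 = array name such as md0, $2 = mdstat-style file
# (defaults to /proc/mdstat). Prints nothing if the array is not listed.
md_status() {
    awk -v dev="$1" '
        $1 == dev && $2 == ":" { found = 1; next }   # header line of the array
        found && match($0, /\[[U_]+\]/) {            # status field on a following line
            print substr($0, RSTART, RLENGTH)
            exit
        }
        found && NF == 0 { exit }                    # blank line ends the array block
    ' "${2:-/proc/mdstat}"
}

# Usage on a live system:
#   md_status md0
```

`sudo mdadm --detail /dev/md0` gives a fuller report, including the overall state of the array and of each member.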

But I’m still in the process of setting this all up, so that might be a topic for another day.

Caveat

In all of this, ‘mdadm’ is functioning as the RAID controller, and the data on the drives is only available through mdadm. Simply plugging one of the drives into a computer and expecting to see all the data stored on the RAID doesn’t work; the drive needs to be opened through mdadm first, and only then is the data visible.
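
As an illustration of what “opened through mdadm” can look like on another Linux machine with mdadm installed: `--examine` shows the RAID metadata on a member drive, and `--assemble --run` starts the array even though only one of its two members is present. This is an untested sketch, not something from the setup described here. The device names and mount point are placeholders, and the `run` wrapper with its `DRY_RUN` variable is a made-up convenience for printing the commands instead of executing them.

```shell
# Hypothetical wrapper: with DRY_RUN set, print the command instead of
# running it, so the steps can be reviewed first.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Open a single RAID 1 member on another machine.
# $1 = member drive (placeholder, e.g. /dev/sdX), $2 = mount point
open_member() {
    run sudo mdadm --examine "$1"                   # show the RAID metadata on the drive
    run sudo mdadm --assemble --run /dev/md0 "$1"   # start the array with one member
    run sudo mkdir -p "$2"
    run sudo mount /dev/md0 "$2"
}

# Usage (prints the four commands without running them):
#   DRY_RUN=1 open_member /dev/sdX /mnt/rescue
```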

The links at the bottom of this post give more information about setting up and using RAIDs in this way.

Summary

For future reference, here’s the list of commands I ran.

sudo mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb  # make sure you have the correct devices selected
cat /proc/mdstat
sudo mkfs.ext4 -F /dev/md0
sudo mkdir -p /mnt/md0
sudo mount /dev/md0 /mnt/md0
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
echo '/dev/md0 /mnt/md0 ext4 rw,user,exec,defaults,nofail,discard 0 0' | sudo tee -a /etc/fstab
sudo chmod 777 /mnt/md0/

References

Post 010/100 of the 100DaysToOffload-Challenge