Setup a 3 Node Ceph Storage Cluster on Ubuntu 16
- Author: Ruan Bekker (@ruanbekker)
For some time now I have wanted to set up Ceph, and I finally got the time to do it. This setup was done on Ubuntu 16.04.
What is Ceph
Ceph is a storage platform that implements object storage on a single distributed computer cluster and provides interfaces for object, block and file-level storage.
- Object Storage:
Ceph provides seamless access to objects via native language bindings or via the RadosGW REST interface, and it is also compatible with applications written for S3 and Swift.
- Block Storage:
Ceph's Rados Block Device (RBD) provides access to block device images that are replicated and striped across the storage cluster.
- File System:
Ceph provides a network file system (CephFS) that aims for high performance.
Our Setup
We will have 4 nodes: 1 admin node from which we will deploy our cluster, and 3 nodes that will hold the data:
- ceph-admin (10.0.8.2)
- ceph-node1 (10.0.8.3)
- ceph-node2 (10.0.8.4)
- ceph-node3 (10.0.8.5)
Host Entries
If you don't have DNS for your servers, set up the /etc/hosts file so that the names resolve to the IP addresses:
10.0.8.2 ceph-admin
10.0.8.3 ceph-node1
10.0.8.4 ceph-node2
10.0.8.5 ceph-node3
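To quickly verify that the names resolve as expected (just a sanity check of my own, using ceph-node1 as an example), you can query the resolver on each node:
$ getent hosts ceph-node1
This should print the IP address and hostname from the hosts file.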
User Accounts and Passwordless SSH
Set up the ceph-system user account on all the servers:
$ useradd -d /home/ceph-system -s /bin/bash -m ceph-system
$ passwd ceph-system
Add the created user to the sudoers configuration so that it can issue sudo commands without a password:
$ echo "ceph-system ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/ceph-system
$ chmod 0440 /etc/sudoers.d/ceph-system
Switch to the ceph-system user, generate SSH keys on the ceph-admin server, and copy the keys to the ceph nodes:
$ sudo su - ceph-system
$ ssh-keygen -t rsa -f ~/.ssh/id_rsa -P ""
$ ssh-copy-id ceph-system@ceph-node1
$ ssh-copy-id ceph-system@ceph-node2
$ ssh-copy-id ceph-system@ceph-node3
$ ssh-copy-id ceph-system@ceph-admin
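Before moving on, it's worth confirming that both passwordless SSH and passwordless sudo work from the ceph-admin node, since ceph-deploy relies on them. This quick check (my own, using ceph-node1 as an example) should print root without prompting for a password:
$ ssh ceph-system@ceph-node1 'sudo whoami'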
Prerequisite Software:
Install Python and ceph-deploy on each node:
$ sudo apt-get install python -y
$ sudo apt install ceph-deploy -y
Note: Please skip this section if you have additional disks on your servers.
The instances that I'm using to test this setup only have one disk, so I will be creating loop block devices backed by allocated files. This is not recommended, as when the disk fails, all the files/block device images will be gone with it. But since I'm only demonstrating this, I will create the block devices from a file.
I will be creating a 12GB file on each node:
$ sudo mkdir /raw-disks
$ sudo dd if=/dev/zero of=/raw-disks/rd0 bs=1M count=12288
Then use losetup to create the loop0 block device:
$ sudo losetup /dev/loop0 /raw-disks/rd0
As you can see, the loop device shows up when listing the block devices:
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop0 7:0 0 12G 0 loop
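One thing to keep in mind with this file-backed approach (my own note, based on how losetup behaves): the loop device mapping does not survive a reboot, so it would have to be re-created before the OSD can start again. On Ubuntu 16.04 a simple sketch is to add the losetup call to /etc/rc.local so it looks something like this:
#!/bin/sh -e
# re-create the loop device backing the Ceph OSD at boot
losetup /dev/loop0 /raw-disks/rd0
exit 0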
Install Ceph
Now let's install Ceph on all our nodes using ceph-deploy:
$ sudo apt update && sudo apt upgrade -y
$ ceph-deploy install ceph-admin ceph-node1 ceph-node2 ceph-node3
The version I was running at the time:
$ ceph --version
ceph version 10.2.9
Initialize Ceph
Initialize the Cluster with 3 Monitors:
$ ceph-deploy new ceph-node1 ceph-node2 ceph-node3
Deploy the initial monitors defined in the previous command and gather the keys:
$ ceph-deploy mon create-initial
At this point, we should be able to scan the block devices on our nodes:
$ ceph-deploy disk list ceph-node3
[ceph-node3][INFO ] Running command: sudo /usr/sbin/ceph-disk list
[ceph-node3][DEBUG ] /dev/loop0 other
Prepare the Disks:
First we will zap the block devices and then prepare them, which creates the data and journal partitions:
$ ceph-deploy disk zap ceph-node1:/dev/loop0 ceph-node2:/dev/loop0 ceph-node3:/dev/loop0
$ ceph-deploy osd prepare ceph-node1:/dev/loop0 ceph-node2:/dev/loop0 ceph-node3:/dev/loop0
[ceph_deploy.osd][DEBUG ] Host ceph-node1 is now ready for osd use.
[ceph_deploy.osd][DEBUG ] Host ceph-node2 is now ready for osd use.
[ceph_deploy.osd][DEBUG ] Host ceph-node3 is now ready for osd use.
When you scan the nodes for their disks, you will notice that the partitions have been created:
$ ceph-deploy disk list ceph-node1
[ceph-node1][DEBUG ] /dev/loop0p2 ceph journal, for /dev/loop0p1
[ceph-node1][DEBUG ] /dev/loop0p1 ceph data, active, cluster ceph, osd.0, journal /dev/loop0p2
Now let's activate the OSDs by using the data partitions:
$ ceph-deploy osd activate ceph-node1:/dev/loop0p1 ceph-node2:/dev/loop0p1 ceph-node3:/dev/loop0p1
Redistribute Keys:
Copy the configuration files and admin key to your admin node and ceph data nodes:
$ ceph-deploy admin ceph-admin ceph-node1 ceph-node2 ceph-node3
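At this point the three OSDs should be registered with the cluster. You can confirm this from any of the nodes with the osd tree subcommand (just a sanity check on my side); the three OSDs should show as up and in:
$ sudo ceph osd tree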
If you would like to add more OSDs (not tested):
$ ceph-deploy disk zap ceph-node1:/dev/loop1 ceph-node2:/dev/loop1 ceph-node3:/dev/loop1
$ ceph-deploy osd prepare ceph-node1:/dev/loop1 ceph-node2:/dev/loop1 ceph-node3:/dev/loop1
$ ceph-deploy osd activate ceph-node1:/dev/loop1p1:/dev/loop1p2 ceph-node2:/dev/loop1p1:/dev/loop1p2 ceph-node3:/dev/loop1p1:/dev/loop1p2
$ ceph-deploy admin ceph-node1 ceph-node2 ceph-node3
Ceph Status:
Have a look at your cluster status:
$ sudo ceph -s
cluster 8d704c8a-ac19-4454-a89f-89a5d5b7d94d
health HEALTH_OK
monmap e1: 3 mons at {ceph-node1=10.0.8.3:6789/0,ceph-node2=10.0.8.4:6789/0,ceph-node3=10.0.8.5:6789/0}
election epoch 10, quorum 0,1,2 ceph-node2,ceph-node3,ceph-node1
osdmap e14: 3 osds: 3 up, 3 in
flags sortbitwise,require_jewel_osds
pgmap v29: 64 pgs, 1 pools, 0 bytes data, 0 objects
100 MB used, 18298 MB / 18398 MB avail
64 active+clean
Everything looks good. Also change the permissions on the admin keyring on all the nodes, so that the ceph and rados commands can be executed without root privileges:
$ sudo chmod +r /etc/ceph/ceph.client.admin.keyring
Storage Pools:
List your pool in your Ceph Cluster:
$ rados lspools
rbd
Let's create a new storage pool called mypool:
$ ceph osd pool create mypool 32 32
pool 'mypool' created
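The two values of 32 passed to the create command are the pool's pg_num and pgp_num (placement group) settings. If you want to confirm them afterwards (an optional check), you can query the pool:
$ ceph osd pool get mypool pg_num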
Let's list the storage pools again:
$ rados lspools
rbd
mypool
You can also use the ceph command to list the pools:
$ ceph osd pool ls
rbd
mypool
Create a Block Device Image:
$ rbd create --size 1024 mypool/disk1 --image-feature layering
List the Block Device Images under your Pool:
$ rbd list mypool
disk1
Retrieve information from your image:
$ rbd info mypool/disk1
rbd image 'disk1':
size 1024 MB in 256 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.1021643c9869
format: 2
features: layering
flags:
create_timestamp: Thu Jun 7 23:48:23 2018
Create a local mapping of the image to a block device:
$ sudo rbd map mypool/disk1
/dev/rbd0
Now we have a block device available at /dev/rbd0.
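As the image is a raw device it first needs a filesystem before it can be mounted; ext4 is assumed here, since resize2fs is used further down:
$ sudo mkfs.ext4 /dev/rbd0
Then go ahead and mount it to /mnt: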
$ sudo mount /dev/rbd0 /mnt
We can then see it when we list our mounted disk partitions:
$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 19G 13G 5.2G 72% /
/dev/rbd0 976M 1.3M 908M 1% /mnt
We can also resize the disk on the fly. Let's resize it from 1GB to 2GB:
$ rbd resize mypool/disk1 --size 2048
Resizing image: 100% complete...done.
To grow the space we can use resize2fs for ext4 partitions and xfs_growfs for xfs partitions:
$ sudo resize2fs /dev/rbd0
resize2fs 1.42.13 (17-May-2015)
Filesystem at /dev/rbd0 is mounted on /mnt; on-line resizing required
old_desc_blocks = 1, new_desc_blocks = 1
The filesystem on /dev/rbd0 is now 524288 (4k) blocks long.
When we look at our mounted partitions again, you will notice that the size of the mounted partition has increased:
$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 19G 13G 5.2G 72% /
/dev/rbd0 2.0G 1.5M 1.9G 1% /mnt
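As a side note, if you want to detach the image again when you are done with it (not something we need for the rest of this post), you can unmount it and remove the kernel mapping:
$ sudo umount /mnt
$ sudo rbd unmap /dev/rbd0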
Object Storage with RADOS
Let's create a new pool where we will store our objects:
$ ceph osd pool create object-pool 32 32
pool 'object-pool' created
We will now create a local file, push it to our object storage service, delete the local copy, download it again under a different name, and read its contents:
Create the local file:
$ echo "ok" > test.txt
Push the local file to our pool in our object storage:
$ rados put objects/data/test.txt ./test.txt --pool object-pool
List the pool (note that this can be executed from any node):
$ rados ls --pool object-pool
objects/data/test.txt
Delete the local file, download the file from our object storage and read the contents:
$ rm -rf test.txt
$ rados get objects/data/test.txt ./newfile.txt --pool object-pool
$ cat ./newfile.txt
ok
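You can also inspect an object's metadata, such as its size and modification time, with the stat subcommand (an optional extra check):
$ rados stat objects/data/test.txt --pool object-pool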
View the disk usage of our storage pool:
$ rados df --pool object-pool
pool name KB objects clones degraded unfound rd rd KB wr wr KB
object-pool 1 1 0 0 0 0 0 1 1
total used 261144 37
total avail 18579372
total space 18840516