Charmed Ceph manual install
This guide shows how to perform a general install of Charmed Ceph. It details the fundamental concepts and shows how a more customised Ceph cluster can be achieved. Before you begin, ensure that the base requirements have been met.
What you will need
- a snapd-compatible host to run the Juju client
- a MAAS cluster (with a user account at your disposal)
Cluster specifications
The Ceph cluster will have three Monitors and six OSDs. The OSDs will be provided by three storage nodes, with two OSDs hosted per node (backed by devices /dev/sdb and /dev/sdc).
A Monitor will be containerised on each of the storage nodes. This means that you will require three machines for the Ceph cluster. One additional machine will be needed for the Juju controller. The MAAS cluster must therefore consist of at least four machines.
The MAAS nodes will be running Ubuntu 22.04 LTS (Jammy), as will any LXD containers created during the deployment. Using Jammy gives us the opportunity to include the Ubuntu Cloud Archive (UCA) in our configuration. Ceph Reef will be deployed.
Set up the environment
Ensure that you have a host with Juju installed and that it has network connectivity to the MAAS cluster. You must also have a user created on the MAAS cluster.
Inform Juju about the remote MAAS cluster and provide credentials for accessing it. Here the MAAS cluster will be added to Juju as a cloud called ‘my-maas’.
juju add-cloud
juju add-credential my-maas
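Both commands prompt interactively by default. As a non-interactive alternative, the cloud and the credential can be described in YAML files and passed with -f. The following is a minimal sketch only: the file names (maas-cloud.yaml, maas-creds.yaml), the endpoint URL, the credential name ‘admin’, and the API key are placeholders assumed for illustration and must be replaced with values from your own MAAS.

```shell
# Cloud definition for the MAAS cluster (the endpoint is a placeholder).
cat > maas-cloud.yaml <<'EOF'
clouds:
  my-maas:
    type: maas
    auth-types: [oauth1]
    endpoint: http://10.0.0.2:5240/MAAS
EOF

# Credential for the MAAS user (credential name and API key are placeholders).
cat > maas-creds.yaml <<'EOF'
credentials:
  my-maas:
    admin:
      auth-type: oauth1
      maas-oauth: REPLACE-WITH-MAAS-API-KEY
EOF

# Once the placeholders are filled in, the files would be registered with:
#   juju add-cloud --client -f maas-cloud.yaml my-maas
#   juju add-credential --client -f maas-creds.yaml my-maas
```

Keeping these definitions in files makes the setup repeatable, but the credentials file contains a secret and should be protected accordingly.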
We’ll then create a Juju controller management node as well as a Juju model to house the Ceph cluster. The controller will be called ‘my-controller’ and the model will be called ‘ceph’:
juju bootstrap --bootstrap-series=jammy my-maas my-controller
juju add-model --config default-series=jammy ceph
Please see the relevant Juju documentation if you need help with carrying out the above commands.
Configuration options
In this deployment, charm configuration options will be placed in a YAML file as opposed to stating them directly on the command line. The file ceph.yaml will contain the options for all the charms we will use:
ceph-mon:
  customize-failure-domain: true
  monitor-count: 3
  expected-osd-count: 3
  source: cloud:jammy-bobcat
ceph-osd:
  customize-failure-domain: true
  osd-devices: /dev/sdb /dev/sdc
  source: cloud:jammy-bobcat
The meanings of the above options are explained below, categorised by charm.
ceph-mon
- customize-failure-domain
  This option determines how a Ceph CRUSH map is configured. A value of ‘false’ (the default) will lead to a map that will replicate data across hosts (implemented as Ceph bucket type ‘host’). With a value of ‘true’ all MAAS-defined zones will be used to generate a map that will replicate data across Ceph availability zones (implemented as bucket type ‘rack’). This option is also supported by the ceph-osd charm, and its value must be the same for both charms.
- expected-osd-count
  This option states the number of OSDs expected to be deployed in the cluster. This value can influence the number of placement groups (PGs) to use per pool. The PG calculation is based either on the actual number of OSDs or this option’s value, whichever is greater. The default value is ‘0’, which tells the charm to only consider the actual number of OSDs.
- monitor-count
  This option gives the number of ceph-mon units in the monitor cluster (where one ceph-mon unit represents one MON). The default value is ‘3’ and is generally a good choice. For other monitor counts, it is good practice to set this explicitly to avoid a possible race condition during the formation of the cluster. The capacity for fault tolerance is based upon the preservation of quorum (majority of the original MON cluster). For example, a count of five can tolerate two failures (quorum: ⅗), a count of four can tolerate one failure (quorum: ¾), and a count of three can also tolerate one failure (quorum: ⅔).
- source
  This option states the software sources. A common value is an OpenStack UCA release (e.g. ‘cloud:jammy-bobcat’). See Ceph and the UCA. The underlying host’s existing apt sources will be used if this option is not specified (this behaviour can be explicitly chosen by using the value of ‘distro’). This option is also supported by the ceph-osd charm, and its value must be the same for both charms.
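The quorum figures quoted for monitor-count, and the ‘whichever is greater’ rule for expected-osd-count, both come down to simple arithmetic. The sketch below is illustrative only: the function names are hypothetical helpers and are not part of Juju, the charms, or Ceph.

```shell
# Smallest majority of N monitors: floor(N/2) + 1.
mon_quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of monitors that can fail while a majority still survives.
mon_fault_tolerance() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

# OSD count used as input to the PG calculation: the greater of the
# actual OSD count ($1) and the expected-osd-count option ($2).
effective_osd_count() {
    if [ "$1" -gt "$2" ]; then echo "$1"; else echo "$2"; fi
}

for n in 3 4 5; do
    echo "monitors=$n quorum=$(mon_quorum_size "$n") tolerated_failures=$(mon_fault_tolerance "$n")"
done

# In this guide: six actual OSDs, expected-osd-count of 3.
echo "effective OSD count: $(effective_osd_count 6 3)"
```

Note that four monitors tolerate no more failures than three, which is one reason odd monitor counts are usually preferred.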
ceph-osd
- customize-failure-domain
  This option’s description is the same as that of the identically-named ceph-mon option. Each charm’s option must have the same value.
- osd-devices
  This option lists what block devices can be used for OSDs across the cluster. This list may affect newly added ceph-osd units as well as existing units (the option may be modified after units have been added). The charm will attempt to activate as Ceph storage any listed device that is visible to the unit’s underlying machine.
- source
  This option’s description is the same as that of the identically-named ceph-mon option. Each charm’s option must have the same value.
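Because customize-failure-domain and source must match between the two charms, it can be worth checking the configuration file before deploying. The following is a rough sketch that assumes a simple two-level YAML file like the ceph.yaml used in this guide; charm_option and check_shared_options are hypothetical helper names, and the awk parsing is not a general YAML parser.

```shell
# Print the value of option $3 under top-level key $2 in file $1.
charm_option() {
    awk -v charm="$2" -v opt="$3" '
        # An unindented, non-comment line starts a new charm section.
        /^[^ #]/ { section = $1; sub(/:$/, "", section); next }
        section == charm && $1 == (opt ":") {
            sub(/^[ ]*[^:]+:[ ]*/, "")   # strip "  option: " prefix
            print
            exit
        }
    ' "$1"
}

# Return non-zero if an option shared by both charms differs in file $1.
check_shared_options() {
    rc=0
    for opt in customize-failure-domain source; do
        mon=$(charm_option "$1" ceph-mon "$opt")
        osd=$(charm_option "$1" ceph-osd "$opt")
        if [ "$mon" = "$osd" ]; then
            echo "ok: $opt = $mon"
        else
            echo "MISMATCH: $opt (ceph-mon='$mon', ceph-osd='$osd')"
            rc=1
        fi
    done
    return $rc
}

# Demonstration against a temporary copy of this guide's configuration.
demo=$(mktemp)
cat > "$demo" <<'EOF'
ceph-mon:
  customize-failure-domain: true
  monitor-count: 3
  expected-osd-count: 3
  source: cloud:jammy-bobcat
ceph-osd:
  customize-failure-domain: true
  osd-devices: /dev/sdb /dev/sdc
  source: cloud:jammy-bobcat
EOF
check_shared_options "$demo"
rm -f "$demo"
```

In practice the function would be pointed at the real file, for example check_shared_options ./ceph.yaml.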
In particular, we see that OpenStack 2023.2 (code-named Bobcat) on Jammy has been chosen as the software source for Ceph. From Ceph and the UCA it can be deduced that the resulting Ceph version will be Reef.
Data resiliency
Ceph storage is natively highly available. This means that, by design, data objects are stored in such a way that data resiliency is ensured in the event of an OSD failure. By default the ceph-osd charm will maintain three replicas of each storage object across the cluster.
The storage of these objects is organised on a per pool basis, where each pool is of one of two types:
- replicated (default)
- erasure coded
For a detailed explanation of these two pool types see section Pool types in this guide.
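As a rough worked example of what replication costs in capacity: with three replicas every object is stored three times, so usable capacity is approximately the raw capacity divided by three. The sketch below uses a hypothetical helper, usable_gib, applies it to a cluster with 180 GiB of raw capacity, and deliberately ignores Ceph's own overhead and near-full thresholds.

```shell
# Approximate usable capacity of a replicated pool.
# $1 = raw capacity in GiB, $2 = number of replicas (integer division).
usable_gib() {
    echo $(( $1 / $2 ))
}

echo "usable capacity: $(usable_gib 180 3) GiB"
```

Erasure-coded pools trade capacity differently, so this estimate applies to replicated pools only.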
Deploy Ceph
The actual deployment of Ceph is straightforward:
juju deploy -n 3 --config ./ceph.yaml ceph-osd
juju deploy -n 3 --to lxd:0,lxd:1,lxd:2 --config ./ceph.yaml ceph-mon
juju integrate ceph-osd:mon ceph-mon:osd
As planned, a containerised Monitor is placed on each storage node. We’ve assumed that the machines spawned by the first command are assigned the IDs 0, 1, and 2. This is the default behaviour (i.e. a model’s first machine will have an ID of 0).
The output of the juju status command should look similar to this:
Model  Controller     Cloud/Region     Version  SLA          Timestamp
ceph   my-controller  my-maas/default  3.5.2    unsupported  14:31:23Z

App       Version  Status  Scale  Charm     Store       Rev  OS      Notes
ceph-mon  18.2.0   active      3  ceph-mon  jujucharms  210  ubuntu
ceph-osd  18.2.0   active      3  ceph-osd  jujucharms  589  ubuntu

Unit         Workload  Agent  Machine  Public address  Ports  Message
ceph-mon/0*  active    idle   0/lxd/0  10.0.0.130             Unit is ready and clustered
ceph-mon/1   active    idle   1/lxd/0  10.0.0.131             Unit is ready and clustered
ceph-mon/2   active    idle   2/lxd/0  10.0.0.132             Unit is ready and clustered
ceph-osd/0*  active    idle   0        10.0.0.127             Unit is ready (2 OSD)
ceph-osd/1   active    idle   1        10.0.0.128             Unit is ready (2 OSD)
ceph-osd/2   active    idle   2        10.0.0.129             Unit is ready (2 OSD)

Machine  State    DNS         Inst id              Series  AZ       Message
0        started  10.0.0.127  node1                jammy   default  Deployed
0/lxd/0  started  10.0.0.130  juju-385085-0-lxd-0  jammy   default  Container started
1        started  10.0.0.128  node2                jammy   default  Deployed
1/lxd/0  started  10.0.0.131  juju-385085-1-lxd-0  jammy   default  Container started
2        started  10.0.0.129  node3                jammy   default  Deployed
2/lxd/0  started  10.0.0.132  juju-385085-2-lxd-0  jammy   default  Container started
Above we see the three Monitors and the six OSDs (two per storage node). It should also be clear that each Monitor is containerised (e.g. 1/lxd/0 indicates “LXD machine 0 on machine 1”).
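The OSD total can also be tallied from the status text rather than by eye. The following is a minimal sketch that assumes the ceph-osd unit messages have the ‘Unit is ready (N OSD)’ form shown in this guide; count_osds is a hypothetical helper, and parsing human-readable output in this way is fragile.

```shell
# Sum the N in "(N OSD)" across ceph-osd unit lines read from stdin.
count_osds() {
    awk '/^ceph-osd\// && match($0, /\([0-9]+ OSD\)/) {
        # Skip the opening parenthesis; awk converts "2 OSD)" to 2.
        total += substr($0, RSTART + 1, RLENGTH - 1) + 0
    } END { print total + 0 }'
}

# Demonstration using unit lines in the format shown in this guide.
count_osds <<'EOF'
ceph-osd/0* active idle 0 10.0.0.127 Unit is ready (2 OSD)
ceph-osd/1 active idle 1 10.0.0.128 Unit is ready (2 OSD)
ceph-osd/2 active idle 2 10.0.0.129 Unit is ready (2 OSD)
EOF
```

Against a live model the same function could be fed from the client, e.g. juju status ceph-osd | count_osds.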
Under the ‘Version’ column for either the ceph-mon or ceph-osd application a value of ‘18.2.0’ is shown. This corresponds to Ceph Reef.
Verification
Verify the state of the Ceph cluster by displaying the output of the traditional ceph status command. Invoke this command on one of the Monitors:
juju ssh ceph-mon/0 -- sudo ceph status
Sample output is:
  cluster:
    id:     de95f820-c7e1-11ea-a916-b9347f3c399e
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum juju-385085-0-lxd-0,juju-385085-1-lxd-0,juju-385085-2-lxd-0 (age 5d)
    mgr: juju-385085-0-lxd-0(active, since 5d), standbys: juju-385085-2-lxd-0, juju-385085-1-lxd-0
    osd: 6 osds: 6 up (since 5d), 6 in (since 5d)

  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   6.0 GiB used, 174 GiB / 180 GiB avail
    pgs:     1 active+clean
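For scripted checks, the health line can be tested rather than read. The following sketch assumes the ceph status text format shown in this guide; ceph_is_healthy is a hypothetical helper that reads the status text on standard input.

```shell
# Succeed only if the input contains a "health:" line reporting HEALTH_OK.
ceph_is_healthy() {
    awk '$1 == "health:" { found = 1; ok = ($2 == "HEALTH_OK") }
         END { exit !(found && ok) }'
}

# Demonstration using status text in the format shown in this guide.
ceph_is_healthy <<'EOF' && echo "cluster is healthy"
  cluster:
    id:     de95f820-c7e1-11ea-a916-b9347f3c399e
    health: HEALTH_OK
EOF
```

Against a live cluster the input would come from a Monitor, e.g. juju ssh ceph-mon/0 -- sudo ceph status | ceph_is_healthy.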
You now have a Ceph cluster up and running.