====== PetaSAN ======
Ceph for dummies
[[http://www.petasan.org/]]

====== Ansible ======
[[https://docs.ceph.com/projects/ceph-ansible/en/latest/]]

====== List all pools ======
<code>ceph osd pool ls detail</code>

====== OSD disk free ======
<code>ceph osd df tree</code>

====== CEPH rebalance ======
<code>ceph osd reweight-by-utilization</code>

====== Check OSD Blocklist ======
<code>ceph osd blocklist ls
ceph osd blocklist rm 127.0.0.1:0/3710147553</code>

====== Set minimum version ======
<code>ceph osd require-osd-release octopus</code>

====== Remove OSD hard ======
<code>dd if=/dev/zero of=/dev/sd{X} bs=1M count=10 conv=fsync</code>
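
Wiping the disk only destroys the on-disk data; the OSD usually also has to be removed from the cluster. A minimal sketch, assuming {id} is the numeric OSD id and the OSDs are systemd-managed:
<code># assumption: {id} is the numeric OSD id
ceph osd out {id}
systemctl stop ceph-osd@{id}
ceph osd purge {id} --yes-i-really-mean-it</code>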

====== Insert object into RADOS ======
<code>rados -p {pool} put {object} {filename}
rados -p {pool} ls</code>

====== Copy Pool ======
<code>pool={poolname}
ceph osd pool create $pool.new 128 128 erasure EC_RGW
rados cppool $pool $pool.new
ceph osd pool rename $pool $pool.old
ceph osd pool rename $pool.new $pool</code>

====== Where is the data? ======
<code>ceph osd map {pool} {object} -f json-pretty</code>

====== CEPH Print key ======
<code>ceph -k /etc/ceph/ceph.client.admin.keyring auth print-key {entity}</code>

====== CEPH for Windows ======
[[https://github.com/dokan-dev/dokany/releases]] - required\\
[[https://cloudbase.it/ceph-for-windows/]]

====== CEPH list pool ======
<code>ceph osd lspools</code>

====== CEPH delete pool ======
<code>ceph osd pool delete <pool-name> <pool-name> --yes-i-really-really-mean-it</code>
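
Deletion is refused unless the monitors allow it; a sketch of temporarily enabling the mon_allow_pool_delete flag around the delete:
<code>ceph config set mon mon_allow_pool_delete true
ceph osd pool delete <pool-name> <pool-name> --yes-i-really-really-mean-it
ceph config set mon mon_allow_pool_delete false</code>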

====== CEPH Create Erasure pool ======
<code>ceph osd pool create {name} {pgsize} erasure
ceph osd pool set {name} allow_ec_overwrites true
ceph osd pool application enable {name} rbd</code>

====== RADOSGW ======
<code>ceph-authtool --create-keyring /etc/ceph/ceph.client.radosgw.keyring
ceph-authtool /etc/ceph/ceph.client.radosgw.keyring -n client.radosgw.node01 --gen-key
ceph-authtool -n client.radosgw.node01 --cap osd 'allow rwx' --cap mon 'allow rwx' /etc/ceph/ceph.client.radosgw.keyring
ceph -k /etc/ceph/ceph.client.admin.keyring auth add client.radosgw.node01 -i /etc/ceph/ceph.client.radosgw.keyring</code>

===== ceph.conf =====
<code>
[client.radosgw.node01]
       host = node01
       keyring = /etc/ceph/ceph.client.radosgw.keyring
       log file = /var/log/ceph/client.radosgw.$host.log
</code>

apt install radosgw\\
systemctl restart ceph-radosgw.target\\
\\
http://node01:7480
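
A quick smoke test of the gateway is to create and inspect a throwaway S3 user (the uid //test// is just an example):
<code># uid and display name are examples
radosgw-admin user create --uid=test --display-name="Test User"
radosgw-admin user info --uid=test</code>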

====== ISCSI ======
<code>
sudo apt install ceph-iscsi targetcli-fb
systemctl daemon-reload
systemctl enable rbd-target-gw
systemctl start rbd-target-gw
systemctl enable rbd-target-api
systemctl start rbd-target-api
</code>
Start //gwcli//:
<code>cd /iscsi-targets
create iqn.2003-01.com.janforman.iscsi-gw:iscsi-igw
cd /iscsi-targets/iqn.2003-01.com.janforman.iscsi-gw:iscsi-igw/gateways
create {nodename} {IP}
cd /disks
create pool=rbd image=disk_1 size=90G
cd /iscsi-targets/iqn.2003-01.com.janforman.iscsi-gw:iscsi-igw/hosts
create iqn.1994-05.com.janforman:client
cd /iscsi-targets/iqn.2003-01.com.janforman.iscsi-gw:iscsi-igw/hosts/iqn.1994-05.com.janforman:client
auth username=myiscsiusername password=myiscsipassword
disk add rbd/disk_1
</code>
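
On a Linux initiator the exported disk can then be discovered and logged into with open-iscsi; a sketch, assuming {IP} is a gateway address and the CHAP credentials are the ones set above:
<code># {IP} = gateway address; target/client IQNs as created above
iscsiadm -m discovery -t sendtargets -p {IP}
iscsiadm -m node -T iqn.2003-01.com.janforman.iscsi-gw:iscsi-igw -p {IP} -o update -n node.session.auth.authmethod -v CHAP
iscsiadm -m node -T iqn.2003-01.com.janforman.iscsi-gw:iscsi-igw -p {IP} -o update -n node.session.auth.username -v myiscsiusername
iscsiadm -m node -T iqn.2003-01.com.janforman.iscsi-gw:iscsi-igw -p {IP} -o update -n node.session.auth.password -v myiscsipassword
iscsiadm -m node -T iqn.2003-01.com.janforman.iscsi-gw:iscsi-igw -p {IP} --login</code>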

====== Insert it into dashboard ======
Put the gateway API URL (including credentials) into a file, here called //file//, containing: http://admin:admin@10.160.1.15:5001
<code>
ceph dashboard iscsi-gateway-add -i file
</code>
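For example, with a hypothetical temporary file:
<code># hypothetical temporary file name
echo "http://admin:admin@10.160.1.15:5001" > /tmp/iscsi-gw.txt
ceph dashboard iscsi-gateway-add -i /tmp/iscsi-gw.txt
rm /tmp/iscsi-gw.txt</code>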

===== Set OSD configs =====
<code>
ceph tell osd.* config set osd_heartbeat_grace 20
ceph tell osd.* config set osd_heartbeat_interval 5
</code>

====== OSD dump info ======
<code>ceph osd dump</code>

====== CEPH Repair ======
<code>ceph health detail</code>

<code>
HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent
OSD_SCRUB_ERRORS 1 scrub errors
PG_DAMAGED Possible data damage: 1 pg inconsistent
    pg 3.31 is active+clean+inconsistent, acting [5,2,0]
</code>
PG 3.31 is inconsistent; its acting set is OSDs 5, 2 and 0. Repair it:
<code>ceph pg repair 3.31</code>

<code>
2019-07-29 10:01:54.975649 mon.cloud-gis00 (mon.0) 21584 : cluster [INF] Health check cleared: OSD_SCRUB_ERRORS (was: 1 scrub errors)
2019-07-29 10:01:54.975690 mon.cloud-gis00 (mon.0) 21585 : cluster [INF] Health check cleared: PG_DAMAGED (was: Possible data damage: 1 pg inconsistent)
2019-07-29 10:01:54.975709 mon.cloud-gis00 (mon.0) 21586 : cluster [INF] Cluster is now healthy
2019-07-29 10:01:52.358272 osd.5 (osd.5) 428 : cluster [ERR] 3.31 shard 0 soid 3:8df0528b:::rbd_data.9f8f474b0dc51.0000000000002485:head : candidate had a read error
2019-07-29 10:01:52.358608 osd.5 (osd.5) 429 : cluster [ERR] 3.31 repair 0 missing, 1 inconsistent objects
2019-07-29 10:01:52.358616 osd.5 (osd.5) 430 : cluster [ERR] 3.31 repair 1 errors, 1 fixed
</code>
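
After the repair finishes, a deep scrub of the PG can confirm it stays clean:
<code>ceph pg deep-scrub 3.31
ceph health detail</code>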

====== Edit crush-map ======
<code>
# dump and decompile the current CRUSH map
ceph osd getcrushmap -o /tmp/crushmap
crushtool -d /tmp/crushmap -o crush_map

# edit crush_map, then recompile and inject it back
crushtool -c crush_map -o /tmp/crushmap
ceph osd setcrushmap -i /tmp/crushmap
</code>

====== Turn cache on ======
RBD client cache (ceph.conf on the client):
<code>
[client]
rbd_cache = true</code>

May improve performance:
<code>
osd_enable_op_tracker = false
throttler perf counter = false
</code>

====== Change device class ======
If the automatic device class detection gets something wrong (e.g., because the device driver is not properly exposing information about the device via /sys/block), you can also adjust device classes from the command line:
<code>
$ ceph osd crush rm-device-class osd.2 osd.3
done removing class of osd(s): 2,3
$ ceph osd crush set-device-class ssd osd.2 osd.3
set osd(s) 2,3 to class 'ssd'
</code>
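
Device classes are mostly useful for steering pools onto specific media; a sketch of a CRUSH rule restricted to the //ssd// class (the rule name //ssd-only// is arbitrary):
<code># "ssd-only" is an arbitrary rule name; failure domain = host
ceph osd crush rule create-replicated ssd-only default host ssd
ceph osd pool set {pool} crush_rule ssd-only</code>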

====== CRUSH bucket types ======
<code>
# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root
</code>

====== CEPH LVM List ======
<code>
ceph-volume lvm list
</code>

====== OSD Weight ======
<code>
ceph osd crush set osd.0 0.5 root=default host=proxmox01
ceph osd crush set osd.1 0.5 root=default host=proxmox02
ceph osd crush set osd.2 0.5 root=default host=proxmox03
</code>

====== Benchmark ======
<code>
rados -p ceph bench 60 write --no-cleanup
</code>
Default object size is 4 MB, and the default number of simulated threads (parallel writes) is 16.\\
-t (threads)\\
Modes: write / seq / rand
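
For example, a heavier write run followed by a sequential-read pass over the objects left behind by --no-cleanup:
<code>rados -p ceph bench 60 write -t 32 --no-cleanup
rados -p ceph bench 60 seq -t 32</code>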

===== Remove benchmark data =====
<code>rados -p {pool} cleanup --prefix benchmark_data</code>

====== Show pool stats ======
<code>
rados -p ceph df
</code>

====== Enable dashboard ======
<code>ceph mgr module enable dashboard</code>
Generate a self-signed certificate
<code>ceph dashboard create-self-signed-cert</code>
or disable TLS entirely
<code>ceph config set mgr mgr/dashboard/ssl false</code>

Create an administrator user:
<code>
ceph dashboard ac-user-create <username> -i <file-containing-password> administrator
</code>
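
A worked example with a hypothetical user //admin2// and a temporary password file:
<code># username and password are examples only
echo -n "SuperSecret123" > /tmp/dashboard-pass.txt
ceph dashboard ac-user-create admin2 -i /tmp/dashboard-pass.txt administrator
rm /tmp/dashboard-pass.txt</code>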

====== Add new MON ======
<code>
ceph auth get mon. -o /tmp/keyring
ceph mon getmap -o /tmp/map
sudo ceph-mon -i {HOSTNAME} --mkfs --monmap /tmp/map --keyring /tmp/keyring
chown -R ceph:ceph /var/lib/ceph/mon
</code>
Manual run:
<code>ceph-mon -f -i {HOSTNAME} --public-addr {IP}</code>
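
To run the new monitor under systemd instead of in the foreground (assuming the standard ceph-mon@ unit shipped with the Ceph packages):
<code>systemctl enable --now ceph-mon@{HOSTNAME}</code>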

====== CEPH List Auth ======
<code>ceph auth list</code>

====== Show clock-skew ======
<code>ceph time-sync-status</code>

====== Ceph Evict Client ======
[[https://docs.ceph.com/en/latest/cephfs/eviction/]]

====== Replication ======
<code>
ceph osd pool set data size 3
ceph osd pool set data min_size 2
</code>

For n = 4 nodes, each with 1 OSD and 1 MON, and a pool with size 4 and min_size 1, three OSDs can fail but only one MON can fail (monitor quorum requires more than half of the monitors to stay up). To survive two failed monitors you need 4 + 1 = 5 monitors, so at least one has to run on an external node without an OSD. With 8 monitors (four of them external) three MONs can fail, so even three nodes with 1 OSD and 1 MON each can go down; running 8 monitors is possible, but an even count tolerates no more failures than 7 monitors would.\\

For three nodes, each with one MON and one OSD, the only reasonable settings are min_size 2 with size 3 or 2; only one node can fail. With external monitors you could set min_size to 1 (this is very dangerous) and size to 2 or 1, and then two nodes can be down. But with a single replica (no copy, only the original data) you can lose your data, and quite possibly your job, very soon.
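
To check the effective pool settings and the current monitor quorum:
<code>ceph osd pool get data size
ceph osd pool get data min_size
ceph quorum_status -f json-pretty</code>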

====== Placement group count ======
Ensure you have a realistic number of placement groups: approximately 100 per OSD. That is, the total number of OSDs multiplied by 100, divided by the number of replicas (i.e., osd pool default size). So for 10 OSDs and osd pool default size = 4, roughly (100 * 10) / 4 = 250 placement groups.
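
A sketch of applying that to a pool ({pool} is a placeholder; 256 is the nearest power of two to 250, which Ceph prefers), or let the pg_autoscaler handle it:
<code># pg_num should be a power of two; 250 rounds up to 256
ceph osd pool set {pool} pg_num 256
ceph osd pool set {pool} pgp_num 256
ceph osd pool autoscale-status</code>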