====== PetaSAN ======
Ceph for dummies
[[http://

====== Ansible ======
[[https://

====== List all pools ======
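Either of these lists the pools:
<code>
ceph osd lspools
rados lspools
</code>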

====== OSD disk free ======
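Per-OSD usage and free space, then cluster-wide usage:
<code>
ceph osd df
ceph df
</code>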

====== CEPH rebalance ======
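Rebalancing is usually either a reweight of over-full OSDs or handing the job to the balancer module, for example:
<code>
ceph osd reweight-by-utilization
# or let the balancer do it
ceph balancer on
ceph balancer mode upmap
</code>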

====== Check OSD Blocklist ======
<code>ceph osd blocklist rm 127.0.0.1:</code>
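List the current entries first, then remove one using the full addr:port/nonce shown by the listing:
<code>
ceph osd blocklist ls
ceph osd blocklist rm {addr:port/nonce}
</code>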

====== Set minimum version ======
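Assuming this refers to the minimum client release the cluster will accept (release name is just an example):
<code>
ceph osd set-require-min-compat-client luminous
</code>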

====== Remove OSD hard ======
<code>dd if=/</code>
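A likely full sequence, including a wipe of the old disk (device name is an example; double-check it before running dd):
<code>
ceph osd out {id}
ceph osd purge {id} --yes-i-really-mean-it
dd if=/dev/zero of=/dev/sdX bs=1M count=100
</code>
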
====== Insert object into RADOS ======
<code>rados -p pool ls</code>
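The upload itself would be something like:
<code>
rados -p pool put {objectname} {/path/to/file}
</code>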

====== Copy Pool ======
<code>
ceph osd pool create $pool.new 128 128 erasure EC_RGW
rados cppool $pool $pool.new
ceph osd pool rename $pool $pool.old
ceph osd pool rename $pool.new $pool
</code>

====== Where is the data? ======
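Mapping an object to its PG and OSDs:
<code>
ceph osd map {pool} {objectname}
</code>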

====== CEPH Print key ======
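For example, for the admin user:
<code>
ceph auth print-key client.admin
</code>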

====== CEPH for Windows ======
[[https://
[[https://

====== CEPH list pool ======
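Probably the detailed pool listing:
<code>
ceph osd pool ls detail
</code>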

====== CEPH delete pool ======
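Deletion needs the pool name twice plus the safety flag, and may require enabling the mon option first:
<code>
ceph config set mon mon_allow_pool_delete true
ceph osd pool delete {name} {name} --yes-i-really-really-mean-it
</code>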

====== CEPH Create Erasure pool ======
<code>
ceph osd pool set {name} allow_ec_overwrites true;
ceph osd pool application enable {name} rbd;
</code>
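Creating the pool in the first place typically looks like this (example profile with k=2, m=1):
<code>
ceph osd erasure-code-profile set {profile} k=2 m=1 crush-failure-domain=host
ceph osd pool create {name} 128 128 erasure {profile}
</code>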

====== RADOSGW ======

===== ceph.conf =====
<code>
[client.radosgw.node01]
host = node01
log file = /
</code>

apt install radosgw\\
systemctl restart radosgw\\
\\
http://
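A test S3 user for the gateway can be created with (uid and display name are examples):
<code>
radosgw-admin user create --uid=testuser --display-name="Test User"
</code>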

====== ISCSI ======
<code>
sudo apt install ceph-iscsi targetcli-fb
systemctl daemon-reload
systemctl enable rbd-target-gw
systemctl start rbd-target-gw
systemctl enable rbd-target-api
systemctl start rbd-target-api
</code>
start //gwcli//
<code>
create iqn.2003-01.com.janforman.iscsi-gw:
cd /
create {nodename} {IP}
cd /disks
create pool=rbd image=disk_1 size=90G
cd /
create iqn.1994-05.com.janforman:
cd /
auth username=myiscsiusername password=myiscsipassword
disk add rbd/disk_1
</code>

====== Insert it into dashboard ======
file: http://
<code>
ceph dashboard iscsi-gateway-add -i file
</code>
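The file given to -i should contain the URL of the rbd-target-api service, including its credentials, e.g. (address and credentials are examples):
<code>
echo "http://admin:admin@192.168.1.10:5000" > file
ceph dashboard iscsi-gateway-add -i file
</code>
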
===== Set OSD configs =====
<code>
ceph tell osd.* config set osd_heartbeat_grace 20
ceph tell osd.* config set osd_heartbeat_interval 5
</code>

====== OSD dump info ======
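Most likely ''ceph osd dump''; ''ceph osd tree'' gives the hierarchy view:
<code>
ceph osd dump
ceph osd tree
</code>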

====== CEPH Repair ======

<code>
HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent
OSD_SCRUB_ERRORS 1 scrub errors
PG_DAMAGED Possible data damage: 1 pg inconsistent
pg 3.31 is active+clean+inconsistent,
</code>
Corrupted PG on OSD 5,2,0
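The repair step for the inconsistent PG shown above would be:
<code>
ceph health detail
rados list-inconsistent-obj 3.31 --format=json-pretty
ceph pg repair 3.31
</code>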

====== Edit crush-map ======
<code>
ceph osd getcrushmap -o /
crushtool -d /

crushtool -c crush_map -o /
ceph osd setcrushmap -i /
</code>
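A full round trip with example file names:
<code>
ceph osd getcrushmap -o /tmp/crush_map.bin
crushtool -d /tmp/crush_map.bin -o /tmp/crush_map
# edit /tmp/crush_map
crushtool -c /tmp/crush_map -o /tmp/crush_map.new
ceph osd setcrushmap -i /tmp/crush_map.new
</code>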

====== Turn cache on ======
<code>
[client]
rbd_cache = true
</code>

May improve performance:
<code>
osd_enable_op_tracker = false
throttler perf counter = false
</code>
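These appear to be [osd] settings in ceph.conf; they can also be injected at runtime in the same way as the heartbeat options above:
<code>
ceph tell osd.* config set osd_enable_op_tracker false
</code>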

====== Change device class ======
If the automatic device class detection gets something wrong (e.g., because the device driver is not properly exposing information about the device via /sys/block), you can adjust the device classes from the command line:
<code>
$ ceph osd crush rm-device-class osd.2 osd.3
done removing class of osd(s): 2,3
$ ceph osd crush set-device-class ssd osd.2 osd.3
set osd(s) 2,3 to class 'ssd'
</code>

====== Partitions ======
<code>
# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root
</code>
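These are the CRUSH hierarchy (bucket) types; creating a bucket of a given type and moving a host into it looks like this (names are examples):
<code>
ceph osd crush add-bucket rack1 rack
ceph osd crush move node01 rack=rack1
</code>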

====== CEPH LVM List ======

<code>
ceph-volume lvm list
</code>

====== OSD Weight ======
<code>
ceph osd crush set 0 0.5 pool=default host=proxmox01
ceph osd crush set 1 0.5 pool=default host=proxmox02
ceph osd crush set 2 0.5 pool=default host=proxmox03
</code>

====== Benchmark ======
<code>
rados -p ceph bench 60 write --no-cleanup
</code>
The default object size is 4 MB, and the default number of simulated threads (parallel writes) is 16.\\
-t sets the number of threads.\\
Modes: write / seq / rand (seq and rand are read benchmarks).
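The objects left behind by --no-cleanup can then be re-read, for example:
<code>
rados -p ceph bench 60 seq -t 16
rados -p ceph bench 60 rand -t 16
</code>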

===== Remove benchmark data =====
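Presumably:
<code>
rados -p ceph cleanup
</code>
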
====== Show pool stats ======
<code>
rados -p ceph df
</code>

====== Enable dashboard ======
Generate self-signed certificate\\
Disable TLS\\
<code>
ceph dashboard ac-user-create <
</code>
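A plausible full sequence (password file and role are examples):
<code>
ceph mgr module enable dashboard
ceph dashboard create-self-signed-cert
ceph config set mgr mgr/dashboard/ssl false
echo "secret" > /tmp/pass.txt
ceph dashboard ac-user-create {user} -i /tmp/pass.txt administrator
</code>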

====== Add new MON ======
<code>
ceph auth get mon. -o /
ceph mon getmap -o /tmp/map
sudo ceph-mon -i {HOSTNAME} --mkfs --monmap /tmp/map --keyring /
chown -R ceph:ceph /
</code>
manual run
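Running the new monitor by hand (instead of via systemd) would be roughly:
<code>
sudo -u ceph ceph-mon -i {HOSTNAME} --public-addr {IP}
</code>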

====== CEPH List Auth ======
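Most likely:
<code>
ceph auth ls
</code>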

====== Show clock-skew ======
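Probably:
<code>
ceph time-sync-status
</code>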

====== Ceph Evict Client ======
[[https://

====== Replication ======
<code>
ceph osd pool set data size 3
ceph osd pool set data min_size 2
</code>

For n = 4 nodes, each with 1 OSD and 1 MON, and a pool with min_size 1 and size 4, up to three OSDs can fail, but only one MON can fail (the monitor quorum requires that more than half of the monitors survive). To tolerate two failed monitors you need 4 + 1 = 5 monitors (at least one of them external, without an OSD). With 8 monitors (four of them external), three MONs can fail, so even three nodes each with 1 OSD and 1 MON could go down. Eight monitors are possible but not useful: an even count adds no fault tolerance (7 monitors also tolerate three failures).\\

For three nodes, each with one monitor and one OSD, the only reasonable settings are min_size 2 with size 3 (or 2); only one node can fail. With external monitors you can set min_size to 1 (this is very dangerous) and size to 2 or 1, and then two nodes can be down. But with a single replica (no copy, only the original data) you can lose your data, and your job, very quickly.
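To verify what a pool actually uses and whether the monitors still have quorum:
<code>
ceph osd pool get data size
ceph osd pool get data min_size
ceph quorum_status --format json-pretty
</code>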

Ensure you have a realistic number of placement groups. Approximately 100 per OSD is recommended: the total number of OSDs multiplied by 100, divided by the number of replicas (i.e. osd pool default size). So for 10 OSDs and osd pool default size = 4, that gives approximately (100 * 10) / 4 = 250.
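The result is usually rounded to a power of two (256 for the example above) and applied per pool, e.g. (pool name is a placeholder):
<code>
ceph osd pool set {name} pg_num 256
ceph osd pool set {name} pgp_num 256
</code>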