Pacemaker


Creating a Pacemaker cluster

Introduction

This lab configures a Pacemaker cluster that uses shared storage over iSCSI. A minimum of three machines is required:

  • Two nodes that will form the Pacemaker cluster
  • An additional machine acting as the SAN storage server

The goal is to simulate the shared disks of a storage array, as found in a real infrastructure.

Lab topology

  • SAN server
    • Icecube — 192.168.1.80
  • Pacemaker nodes
    • Nodo1 — 192.168.1.81
    • Nodo2 — 192.168.1.82

SAN storage configuration

Before deploying the cluster, the shared storage must already be configured. That process is detailed in the dedicated iSCSI section (see References).
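
For quick reference, a minimal sketch of the initiator-side steps on each node, assuming the portal and target IQN that appear in the session output below (your values come from the iSCSI setup):

[root@nodo1 ~]# iscsiadm -m discovery -t sendtargets -p 192.168.1.80
[root@nodo1 ~]# iscsiadm -m node -T iqn.2026-01.icecube:storage.target01 -p 192.168.1.80 --login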

Verifying the shared disk

Check that the new disk is visible on both nodes:

Nodo1:
[root@nodo1 ~]# iscsiadm -m session -o show
tcp: [1] 192.168.1.80:3260,1 iqn.2026-01.icecube:storage.target01 (non-flash)
[root@nodo1 ~]# lsblk
NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda               8:0    0 10.5G  0 disk
├─sda1            8:1    0    1G  0 part /boot
└─sda2            8:2    0  9.5G  0 part
  ├─centos-root 253:0    0  8.4G  0 lvm  /
  └─centos-swap 253:1    0    1G  0 lvm  [SWAP]
sdb               8:16   0    8G  0 disk
sr0              11:0    1 1024M  0 rom

Nodo2:
[root@nodo2 ~]#  iscsiadm -m session -o show
tcp: [1] 192.168.1.80:3260,1 iqn.2026-01.icecube:storage.target01 (non-flash)
[root@nodo2 ~]# lsblk
NAME                MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
sda                   8:0    0   20G  0 disk
├─sda1                8:1    0    1G  0 part /boot
└─sda2                8:2    0   19G  0 part
  ├─rhel-root       253:0    0   17G  0 lvm  /
  └─rhel-swap       253:1    0    2G  0 lvm  [SWAP]
sdb                   8:16   0   40G  0 disk
sr0                  11:0    1 1024M  0 rom

Pacemaker

Installation

Pacemaker on RHEL 9

[root@nodo1 ~]# subscription-manager repos \
  --enable=rhel-9-for-x86_64-highavailability-rpms
Repository 'rhel-9-for-x86_64-highavailability-rpms' is enabled for this system.

[root@nodo2 ~]# subscription-manager repos \
  --enable=rhel-9-for-x86_64-highavailability-rpms
Repository 'rhel-9-for-x86_64-highavailability-rpms' is enabled for this system.
[root@nodo1 ~]# dnf install -y pacemaker pcs fence-agents-all lvm2
[root@nodo2 ~]# dnf install -y pacemaker pcs fence-agents-all lvm2

Cluster configuration

[root@nodo1 ~]# systemctl enable --now pcsd
Created symlink /etc/systemd/system/multi-user.target.wants/pcsd.service → /usr/lib/systemd/system/pcsd.service.
[root@nodo1 ~]#

[root@nodo2 ~]# systemctl enable --now pcsd
Created symlink /etc/systemd/system/multi-user.target.wants/pcsd.service → /usr/lib/systemd/system/pcsd.service.
[root@nodo2 ~]#
[root@nodo1 ~]# passwd hacluster
Changing password for user hacluster.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
[root@nodo1 ~]#

[root@nodo2 ~]# passwd hacluster
Changing password for user hacluster.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
[root@nodo2 ~]#
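
The password can also be set non-interactively on both nodes; a hedged one-liner, where 'Str0ngPass' is a placeholder you should replace:

[root@nodo1 ~]# echo 'hacluster:Str0ngPass' | chpasswd
[root@nodo2 ~]# echo 'hacluster:Str0ngPass' | chpasswd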


[root@nodo1 ~]# firewall-cmd --add-service=high-availability --permanent
success
[root@nodo1 ~]# firewall-cmd --reload
success
[root@nodo1 ~]#


[root@nodo2 ~]# firewall-cmd --add-service=high-availability --permanent
success
[root@nodo2 ~]# firewall-cmd --reload
success

[root@nodo1 ~]# pcs host auth nodo1 nodo2
Username: hacluster
Password:
nodo1: Authorized
nodo2: Authorized
[root@nodo1 ~]#

[root@nodo2 ~]#  pcs host auth nodo1 nodo2
Username: hacluster
Password:
nodo1: Authorized
nodo2: Authorized
[root@nodo2 ~]#
[root@nodo2 ~]# pcs cluster setup iscsi-cluster nodo1 nodo2
No addresses specified for host 'nodo1', using 'nodo1'
No addresses specified for host 'nodo2', using 'nodo2'
Destroying cluster on hosts: 'nodo1', 'nodo2'...
nodo1: Successfully destroyed cluster
nodo2: Successfully destroyed cluster
Requesting remove 'pcsd settings' from 'nodo1', 'nodo2'
nodo2: successful removal of the file 'pcsd settings'
nodo1: successful removal of the file 'pcsd settings'
Sending 'corosync authkey', 'pacemaker authkey' to 'nodo1', 'nodo2'
nodo2: successful distribution of the file 'corosync authkey'
nodo2: successful distribution of the file 'pacemaker authkey'
nodo1: successful distribution of the file 'corosync authkey'
nodo1: successful distribution of the file 'pacemaker authkey'
Sending 'corosync.conf' to 'nodo1', 'nodo2'
nodo2: successful distribution of the file 'corosync.conf'
nodo1: successful distribution of the file 'corosync.conf'
Cluster has been successfully set up.
[root@nodo2 ~]#
[root@nodo2 ~]# pcs status
Error: error running crm_mon, is pacemaker running?
  crm_mon: Connection to cluster failed: Connection refused
[root@nodo2 ~]# pcs cluster start --all
nodo1: Starting Cluster...
nodo2: Starting Cluster...
[root@nodo2 ~]# pcs cluster enable --all
nodo1: Cluster Enabled
nodo2: Cluster Enabled
[root@nodo2 ~]# pcs status
Cluster name: iscsi-cluster

WARNINGS:
No stonith devices and stonith-enabled is not false
error: Resource start-up disabled since no STONITH resources have been defined
error: Either configure some or disable STONITH with the stonith-enabled option
error: NOTE: Clusters with shared data need STONITH to ensure data integrity
warning: Node nodo1 is unclean but cannot be fenced
warning: Node nodo2 is unclean but cannot be fenced
error: CIB did not pass schema validation
Errors found during check: config not valid

Cluster Summary:
  * Stack: unknown (Pacemaker is running)
  * Current DC: NONE
  * Last updated: Sat Jan  3 00:28:04 2026 on nodo2
  * Last change:  Sat Jan  3 00:27:58 2026 by hacluster via hacluster on nodo2
  * 2 nodes configured
  * 0 resource instances configured

Node List:
  * Node nodo1: UNCLEAN (offline)
  * Node nodo2: UNCLEAN (offline)

Full List of Resources:
  * No resources

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@nodo2 ~]# pcs property set stonith-enabled=false
[root@nodo2 ~]# pcs status
Cluster name: iscsi-cluster
Cluster Summary:
  * Stack: corosync (Pacemaker is running)
  * Current DC: nodo2 (version 2.1.10-1.el9-5693eaeee) - partition with quorum
  * Last updated: Sat Jan  3 00:34:35 2026 on nodo2
  * Last change:  Sat Jan  3 00:34:28 2026 by root via root on nodo2
  * 2 nodes configured
  * 0 resource instances configured

Node List:
  * Online: [ nodo1 nodo2 ]

Full List of Resources:
  * No resources

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@nodo2 ~]#
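
Note: disabling STONITH is acceptable only in a lab. As the warnings above state, clusters with shared data need fencing to guarantee integrity. A hedged sketch of what a fence device could look like on IPMI-capable hardware, using the fence_ipmilan agent from the fence-agents-all package installed earlier (one such device would be needed per node; the BMC IP, username and password here are placeholders):

[root@nodo1 ~]# pcs stonith create fence_nodo1 fence_ipmilan pcmk_host_list=nodo1 ip=192.168.1.91 username=admin password=secret lanplus=1
[root@nodo1 ~]# pcs property set stonith-enabled=true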

LVM configuration

Edit /etc/lvm/lvm.conf

Apply the same /etc/lvm/lvm.conf settings on every cluster node, then run dracut -f and reboot so the new configuration is baked into the initramfs. The two relevant settings are system_id_source = "uname", which stamps new volume groups with the node's hostname, and an empty auto_activation_volume_list, which prevents the OS from auto-activating volume groups at boot so that Pacemaker controls activation.

[root@nodo1 ~]# grep -vE '^\s*#|^\s*$' /etc/lvm/lvm.conf
config {
}
devices {
}
allocation {
}
log {
}
backup {
}
shell {
}
global {
        system_id_source = "uname"
}
activation {
        auto_activation_volume_list = [ ]
}
report {
}
dmeventd {
}
[root@nodo1 ~]# dracut -f
[root@nodo1 ~]# reboot
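
The same steps have to be repeated on nodo2 (identical lvm.conf settings, then rebuild the initramfs and reboot); sketched here for completeness:

[root@nodo2 ~]# dracut -f
[root@nodo2 ~]# reboot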

Post-reboot checks:

[root@nodo1 ~]# uname -n
nodo1

[root@nodo1 ~]# lvm systemid
  system ID: nodo1

Creating the PV, VG and LV on the shared LUN

The VG must be created on a single cluster node only.

Create the Physical Volume:

[root@nodo1 ~]# pvcreate /dev/sdb
  Physical volume "/dev/sdb" successfully created.

Create the Volume Group (--setautoactivation n keeps the VG from being auto-activated outside the cluster's control):

[root@nodo1 ~]# vgcreate --setautoactivation n vg_shared /dev/sdb
  Volume group "vg_shared" successfully created with system ID nodo1
[root@nodo1 ~]# vgs -o+systemid
  VG        #PV #LV #SN Attr   VSize   VFree  System ID
  rhel        1   2   0 wz--n- <19.00g     0
  vg_shared   1   0   0 wz--n-  39.96g 39.96g nodo1
[root@nodo1 ~]#

Create the Logical Volume:

[root@nodo1 ~]# lvcreate -n lv_data -l 100%FREE vg_shared
  Wiping xfs signature on /dev/vg_shared/lv_data.
  Logical volume "lv_data" created.

Format it as XFS:

[root@nodo1 ~]# mkfs.xfs /dev/vg_shared/lv_data
meta-data=/dev/vg_shared/lv_data isize=512    agcount=4, agsize=2618880 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=1 inobtcount=1 nrext64=0
data     =                       bsize=4096   blocks=10475520, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=16384, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[root@nodo1 ~]#

Resource configuration

Prerequisites

The VG must be deactivated so that the cluster manages its activation, and the directory where the FS will be mounted must be created:

[root@nodo1 ~]# vgchange -an vg_shared
  0 logical volume(s) in volume group "vg_shared" now active
[root@nodo1 ~]# lvscan
  ACTIVE            '/dev/rhel/swap' [2.00 GiB] inherit
  ACTIVE            '/dev/rhel/root' [<17.00 GiB] inherit
  inactive            '/dev/vg_shared/lv_data' [39.96 GiB] inherit
[root@nodo1 ~]# mkdir -p /srv/shared
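
The mount point presumably needs to exist on nodo2 as well, since the Filesystem resource agent expects the directory to be present when the group fails over:

[root@nodo2 ~]# mkdir -p /srv/shared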

Creating the resources

The creation order matters: resources in a group start in the order they were added and stop in reverse, so we activate the VG first, then mount the filesystem, then bring up the IP.

Create the resource that activates the VG:

[root@nodo1 ~]# pcs resource create vg_shared ocf:heartbeat:LVM-activate vgname=vg_shared vg_access_mode=system_id --group SHARED
Deprecation Warning: Using '--group' is deprecated and will be replaced with 'group' in a future release. Specify --future to switch to the future behavior.

Create the resource that mounts the LV:

[root@nodo1 ~]# pcs resource create fs_shared ocf:heartbeat:Filesystem device="/dev/vg_shared/lv_data" directory="/srv/shared"  fstype="xfs" --group SHARED
Deprecation Warning: Using '--group' is deprecated and will be replaced with 'group' in a future release. Specify --future to switch to the future behavior.

Create the resource that brings up the service IP:

[root@nodo1 ~]# pcs resource create ip_shared ocf:heartbeat:IPaddr2 ip=192.168.1.83 cidr_netmask=24 nic=enp0s3 --group SHARED
Deprecation Warning: Using '--group' is deprecated and will be replaced with 'group' in a future release. Specify --future to switch to the future behavior.
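
The resulting group and its start order can be reviewed at any time (pcs resource config is the current form of the older pcs resource show):

[root@nodo1 ~]# pcs resource config SHARED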

Verification

[root@nodo1 ~]#  pcs status
Cluster name: iscsi-cluster
Cluster Summary:
  * Stack: corosync (Pacemaker is running)
  * Current DC: nodo2 (version 2.1.10-1.el9-5693eaeee) - partition with quorum
  * Last updated: Sat Jan  3 11:02:49 2026 on nodo1
  * Last change:  Sat Jan  3 10:52:08 2026 by root via root on nodo1
  * 2 nodes configured
  * 3 resource instances configured

Node List:
  * Online: [ nodo1 nodo2 ]

Full List of Resources:
  * Resource Group: SHARED:
    * vg_shared (ocf:heartbeat:LVM-activate):    Started nodo1
    * fs_shared (ocf:heartbeat:Filesystem):      Started nodo1
    * ip_shared (ocf:heartbeat:IPaddr2):         Started nodo1

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@nodo1 ~]#

Failover tests (moving the resource group)

Verify that the resources are running on nodo1, and check that the VG is active, the LV is mounted and the IP is up:

[root@nodo1 ~]# pcs status
Cluster name: iscsi-cluster
Cluster Summary:
  * Stack: corosync (Pacemaker is running)
  * Current DC: nodo2 (version 2.1.10-1.el9-5693eaeee) - partition with quorum
  * Last updated: Sat Jan  3 11:24:14 2026 on nodo1
  * Last change:  Sat Jan  3 10:52:08 2026 by root via root on nodo1
  * 2 nodes configured
  * 3 resource instances configured

Node List:
  * Online: [ nodo1 nodo2 ]

Full List of Resources:
  * Resource Group: SHARED:
    * vg_shared (ocf:heartbeat:LVM-activate):    Started nodo1
    * fs_shared (ocf:heartbeat:Filesystem):      Started nodo1
    * ip_shared (ocf:heartbeat:IPaddr2):         Started nodo1

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@nodo1 ~]#
[root@nodo1 ~]# lvscan
  ACTIVE            '/dev/rhel/swap' [2.00 GiB] inherit
  ACTIVE            '/dev/rhel/root' [<17.00 GiB] inherit
  ACTIVE            '/dev/vg_shared/lv_data' [39.96 GiB] inherit

[root@nodo1 ~]# df -hT /srv/shared
Filesystem                    Type  Size  Used Avail Use% Mounted on
/dev/mapper/vg_shared-lv_data xfs    40G  318M   40G   1% /srv/shared

[root@nodo1 ~]# ip a show enp0s3
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 08:00:27:f3:3c:51 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.81/24 brd 192.168.1.255 scope global noprefixroute enp0s3
       valid_lft forever preferred_lft forever
    inet 192.168.1.83/24 brd 192.168.1.255 scope global secondary enp0s3
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fef3:3c51/64 scope link tentative noprefixroute
       valid_lft forever preferred_lft forever

Now move the group to nodo2:

[root@nodo1 ~]# pcs resource move SHARED
Location constraint to move resource 'SHARED' has been created
Waiting for the cluster to apply configuration changes...
Location constraint created to move resource 'SHARED' has been removed
Waiting for the cluster to apply configuration changes...
resource 'SHARED' is running on node 'nodo2'
[root@nodo1 ~]#

Verify that the resources are now running on nodo2, and check that the VG is active, the LV is mounted and the IP is up:

[root@nodo1 ~]# pcs status
Cluster name: iscsi-cluster
Cluster Summary:
  * Stack: corosync (Pacemaker is running)
  * Current DC: nodo2 (version 2.1.10-1.el9-5693eaeee) - partition with quorum
  * Last updated: Sat Jan  3 11:25:34 2026 on nodo1
  * Last change:  Sat Jan  3 11:24:31 2026 by root via root on nodo1
  * 2 nodes configured
  * 3 resource instances configured

Node List:
  * Online: [ nodo1 nodo2 ]

Full List of Resources:
  * Resource Group: SHARED:
    * vg_shared (ocf:heartbeat:LVM-activate):    Started nodo2
    * fs_shared (ocf:heartbeat:Filesystem):      Started nodo2
    * ip_shared (ocf:heartbeat:IPaddr2):         Started nodo2

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@nodo1 ~]#
[root@nodo2 ~]# lvscan
  ACTIVE            '/dev/rhel/swap' [2.00 GiB] inherit
  ACTIVE            '/dev/rhel/root' [<17.00 GiB] inherit
  ACTIVE            '/dev/vg_shared/lv_data' [39.96 GiB] inherit
[root@nodo2 ~]# df -hT /srv/shared
Filesystem                    Type  Size  Used Avail Use% Mounted on
/dev/mapper/vg_shared-lv_data xfs    40G  318M   40G   1% /srv/shared
[root@nodo2 ~]# ip a show enp0s3
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 08:00:27:f3:3c:51 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.82/24 brd 192.168.1.255 scope global noprefixroute enp0s3
       valid_lft forever preferred_lft forever
    inet 192.168.1.83/24 brd 192.168.1.255 scope global secondary enp0s3
       valid_lft forever preferred_lft forever
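
A complementary test is to put the active node in standby and watch the group move back; a short sketch using the standard pcs node commands:

[root@nodo2 ~]# pcs node standby nodo2
[root@nodo2 ~]# pcs status
[root@nodo2 ~]# pcs node unstandby nodo2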

Final notes

  • The iSCSI disk is now available as a shared resource managed by Pacemaker.
  • All nodes must see the same block device.
  • On RHEL 8/9, pcs resource move performs a temporary move: the location constraint is created and removed automatically once the move completes.
  • To pin a resource to a node permanently, an explicit location constraint must be created, as sketched below.
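
A hedged example of such a constraint (the score 100 is arbitrary; INFINITY would pin the group unconditionally):

[root@nodo1 ~]# pcs constraint location SHARED prefers nodo1=100
[root@nodo1 ~]# pcs constraint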


References

  • iSCSI – Shared storage configuration
  • LVM – Logical volume management