Updated ocf:alteeve:server to look for '/tmp/<resource>.fail' and, if that file exists, exit with rc:1. This allows a resource failure to be triggered on demand for testing.
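
The check described above can be sketched in shell as follows. This is an illustration only, not the resource agent's code (which is Perl). The function name `check_test_fail` and the `FAIL_DIR` variable are hypothetical, introduced so the sketch can be run without touching a live cluster.

```shell
#!/bin/sh
# Sketch of the test-fail check. FAIL_DIR and check_test_fail are
# hypothetical names; the real agent hard-codes /tmp and is written in Perl.
FAIL_DIR="${FAIL_DIR:-/tmp}"

check_test_fail() {
    # $1 is the resource name (OCF_RESKEY_name in the agent's environment).
    # With no name set, the check is skipped, as in the agent.
    [ -n "$1" ] || return 0

    fail_file="$FAIL_DIR/$1.fail"
    if [ -e "$fail_file" ]; then
        echo "Test fail file: [$fail_file] found, returning 1." >&2
        return 1    # OCF_ERR_GENERIC
    fi
    return 0
}
```

For example, with a server named srv02-b (the name used in the notes in this commit), `touch /tmp/srv02-b.fail` should make the agent exit with 1 each time it is invoked, and deleting the file should restore normal operation.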

Signed-off-by: digimer <mkelly@alteeve.ca>
main
digimer 2 years ago
parent 9bf0f50084
commit f9689a7106
  Anvil/Tools/Cluster.pm | 5
  notes                  | 2
  ocf/alteeve/server     | 12
  share/words.xml        | 1

@@ -863,9 +863,10 @@ sub check_node_status
=head2 check_server_constraints
This method checks to see if the peer node is offline, and the local node is online. If this is the case, the location constraints for servers are checked to ensure that they favour the current host. If not, the location constraint is updated.
This checks to see if the constraints on a server are sane. Specifically;
This is meant to be used to prevent servers from automatically migrating back to a node after it was fenced.
* If the server is on a sub-node and the peer is offline, ensure that the location constraints prefer the current host. This prevents migrations back to the old host.
* Check to see if a DRBD resource constraint was applied against a node, and the node's DRBD resource is UpToDate. If so, remove the constraint.
This method takes no parameters.

@@ -17,6 +17,8 @@ Common queries;
* SELECT a.host_name, b.file_name, c.file_location_active FROM hosts a, files b, file_locations c WHERE a.host_uuid = c.file_location_host_uuid AND b.file_uuid = c.file_location_file_uuid ORDER BY b.file_name ASC, a.host_name ASC;
* SELECT a.dr_link_uuid, b.host_name, c.anvil_name, a.dr_link_note FROM dr_links a, hosts b, anvils c WHERE a.dr_link_host_uuid = b.host_uuid AND a.dr_link_anvil_uuid = c.anvil_uuid ORDER BY c.anvil_name ASC, b.host_name ASC;
# Fail a resource for testing purposes.
crm_resource --fail --resource srv02-b -N vm-a01n01
uname -r; grubby --default-kernel; lsinitrd -m /boot/initramfs-4.18.0-448.el8.x86_64.img | grep lvm; systemctl is-enabled scancore.service;
dnf -y update; systemctl disable --now anvil-daemon; systemctl disable --now scancore

@@ -256,6 +256,18 @@ if (not $anvil->data->{switches}{monitor})
	show_environment($anvil, 3);
}
# If there's a test fail file, return '1' to cause pacemaker to fail this resource.
if ($anvil->data->{environment}{OCF_RESKEY_name})
{
	my $test_fail_file = "/tmp/".$anvil->data->{environment}{OCF_RESKEY_name}.".fail";
	$anvil->Log->variables({source => $THIS_FILE, line => __LINE__, level => 2, list => { test_fail_file => $test_fail_file }});
	if (-e $test_fail_file)
	{
		$anvil->Log->entry({source => $THIS_FILE, line => __LINE__, level => 1, priority => "alert", key => "warning_0150", variables => { fail_file => $test_fail_file }});
		$anvil->nice_exit({exit_code => 1});
	}
}
### What are we being asked to do?
# start -Starts the resource.
# stop -Shuts down the resource.

@@ -3583,6 +3583,7 @@ The error was:
<key name="warning_0147">[ Warning ] - The interface: [#!variable!interface!#] appears to be down (state: [#!variable!state!#]). The system uptime is: [#!variable!uptime!#], so it might be a problem where the interface didn't start on boot as it should have. So we're going to bring the interface up.</key>
<key name="warning_0148">[ Warning ] - The IPMI stonith resource: [#!variable!resource!#] is in the role: [#!variable!role!#] (should be 'Started'). Will check the IPMI config now.</key>
<key name="warning_0149">[ Warning ] - Failed to find a valid IP address or password to be used to setup the DR host's IPMI.</key>
<key name="warning_0150">[ Warning ] - The test "fail file": [#!variable!fail_file!#] was found. So long as this file exists, the ocf:alteeve:server RA will return 'OCF_ERR_GENERIC' (exit code 1). Delete the file to resume normal operation.</key>
</language>
<!-- 日本語 -->
<language name="jp" long_name="日本語" description="Anvil! language file.">
