* Updated ScanCore->post_scan_analysis_striker() to check the return code
  (RC) when booting an unexpectedly off host, and only update its power
  state if the boot actually succeeded.

* Started work on a new anvil-manage-daemons tool and
  anvil-monitor-daemons systemd unit.

Signed-off-by: digimer <mkelly@alteeve.ca>
main
digimer 10 months ago
parent 27152845fd
commit 835d9e79cb
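  1. Anvil/Tools/ScanCore.pm (26 lines changed)
  2. man/Makefile.am (1 line changed)
  3. man/anvil-manage-daemons.8 (45 lines changed)
  4. share/words.xml (1 line changed)
  5. tools/Makefile.am (1 line changed)
  6. tools/anvil-manage-daemons (0 lines changed)
  7. units/Makefile.am (1 line changed)
  8. units/anvil-monitor-daemons.service (13 lines changed)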

Anvil/Tools/ScanCore.pm
@@ -2721,15 +2721,27 @@ LIMIT 1;";
 $anvil->Log->entry({source => $THIS_FILE, line => __LINE__, level => 1, key => "log_0673", variables => { host_name => $host_name }});
 $shell_call =~ s/--action status/ --action on/;
-my ($output, $return_code) = $anvil->System->call({debug => $debug, timeout => 30, shell_call => $shell_call});
+my ($output, $return_code) = $anvil->System->call({debug => 1, timeout => 30, shell_call => $shell_call});
 $anvil->Log->variables({source => $THIS_FILE, line => __LINE__, level => $debug, list => { shell_call => $shell_call }});
-# Mark it as booting.
-$anvil->Database->update_host_status({
-	debug       => $debug,
-	host_uuid   => $host_uuid,
-	host_status => "booting",
-});
+if ($return_code)
+{
+	# Failed to boot.
+	$anvil->Log->entry({source => $THIS_FILE, line => __LINE__, level => 1, key => "warning_0170", variables => {
+		host_name   => $host_name,
+		return_code => $return_code,
+		output      => $output,
+	}});
+}
+else
+{
+	# Mark it as booting.
+	$anvil->Database->update_host_status({
+		debug       => $debug,
+		host_uuid   => $host_uuid,
+		host_status => "booting",
+	});
+}
 }
 }
 }
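
The ScanCore.pm change above gates the status update on the return code of the boot call. A minimal standalone sketch of that pattern follows. It is an illustration only, not the Anvil! code: call_shell(), boot_host() and %host_status are hypothetical stand-ins for $anvil->System->call(), the boot logic and the hosts table, and the `true` / `false` commands stand in for the real fence call.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for $anvil->System->call(): runs a shell command and
# returns its combined output and its exit status.
sub call_shell
{
	my ($shell_call) = @_;
	my $output       = `$shell_call 2>&1`;
	my $return_code  = $? >> 8;
	$output = "" if not defined $output;
	chomp($output);
	return($output, $return_code);
}

# Hypothetical stand-in for the host status stored in the database.
my %host_status = (node1 => "powered off", node2 => "powered off");

# Try to boot a host. The status is only changed to 'booting' when the
# command exits with 0; any other return code is logged and ignored.
sub boot_host
{
	my ($host_name, $shell_call) = @_;
	my ($output, $return_code)   = call_shell($shell_call);
	if ($return_code)
	{
		# Failed to boot, leave the recorded status alone.
		warn "Boot of [$host_name] failed, rc: [$return_code], output: [$output]\n";
		return(0);
	}
	$host_status{$host_name} = "booting";
	return(1);
}

boot_host("node1", "true");	# exits 0, status becomes 'booting'
boot_host("node2", "false");	# exits non-zero, status is left unchanged
print "$_: $host_status{$_}\n" foreach sort keys %host_status;
```

Checking the return code before touching the status avoids recording a host as 'booting' when the power-on request never succeeded, which is the failure mode this commit addresses.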

man/Makefile.am
@@ -20,6 +20,7 @@ dist_man8_MANS = \
 anvil-join-anvil.8 \
 anvil-maintenance-mode.8 \
 anvil-manage-alerts.8 \
+anvil-manage-daemons.8 \
 anvil-manage-dr.8 \
 anvil-manage-files.8 \
 anvil-manage-firewall.8 \

man/anvil-manage-daemons.8 (new file)
@@ -0,0 +1,45 @@
.\" Manpage for the Anvil! daemon managers
.\" Contact mkelly@alteeve.com to report issues, concerns or suggestions.
.TH anvil-manage-daemons "8" "August 02 2022" "Anvil! Intelligent Availability™ Platform"
.SH NAME
anvil-manage-daemons \- Tool used to monitor and manage Anvil! daemons.
.SH SYNOPSIS
.B anvil-manage-daemons
\fI\,<command> \/\fR[\fI\,options\/\fR]
.SH DESCRIPTION
When run with '\fB\-\-monitor\fR', it will run as a daemon, checking all other Anvil! daemons. If any are found to be 'failed', they will be stopped and restarted.
.SH OPTIONS
.TP
\-?, \-h, \fB\-\-help\fR
Show this man page.
.TP
\fB\-\-log\-secure\fR
When logging, record sensitive data, like passwords.
.TP
\-v, \-vv, \-vvv
Set the log level to 1, 2 or 3 respectively. Be aware that level 3 generates a significant amount of log data.
.SS "Commands:"
.TP
\fB\-\-enable\fR
All Anvil! daemons that are not enabled will be enabled.
.TP
\fB\-\-disable\fR
All Anvil! daemons that are not disabled will be disabled.
.TP
\fB\-\-monitor\fR
Run as a daemon, checking all other Anvil! daemons. If any are found to be 'failed', they will be stopped and restarted.
.TP
\fB\-\-now\fR
This can be used with \fB\-\-enable\fR or \fB\-\-disable\fR to have the daemons started or stopped immediately.
.TP
\fB\-\-start\fR
This will start all daemons that are not already running.
.TP
\fB\-\-stop\fR
This will stop all daemons that are not already stopped.
.IP
.SH AUTHOR
Written by Madison Kelly, Alteeve staff and the Anvil! project contributors.
.SH "REPORTING BUGS"
Report bugs to users@clusterlabs.org

share/words.xml
@@ -4172,6 +4172,7 @@ We will try to proceed anyway.</key>
 </key>
 <key name="warning_0168">Please specify a storage group to use to add the new drive to.</key>
 <key name="warning_0169">[ Warning ] - After reconfiguring the network, we've failed to connect to any database for two minutes. Rebooting in case this fixes the connection.</key>
+<key name="warning_0170">[ Warning ] - The attempt to boot: [#!variable!host_name!#] appears to have failed. The return code received was: [#!variable!return_code!#] (expected '0'). The output, if any, was: [#!variable!output!#].</key>
 </language>
 <!-- 日本語 -->

tools/Makefile.am
@@ -15,6 +15,7 @@ dist_sbin_SCRIPTS = \
 anvil-join-anvil \
 anvil-maintenance-mode \
 anvil-manage-alerts \
+anvil-manage-daemons \
 anvil-manage-dr \
 anvil-manage-files \
 anvil-manage-firewall \

units/Makefile.am
@@ -3,6 +3,7 @@ MAINTAINERCLEANFILES = Makefile.in
 servicedir = $(SYSTEMD_UNIT_DIR)
 dist_service_DATA = \
 anvil-daemon.service \
+anvil-monitor-daemons.service \
 anvil-monitor-network.service \
 anvil-monitor-performance.service \
 anvil-safe-start.service \

units/anvil-monitor-daemons.service (new file)
@@ -0,0 +1,13 @@
[Unit]
Description=Anvil! Intelligent Availability Platform - Daemon Monitor
Wants=network.target
[Service]
Type=simple
ExecStart=/usr/sbin/anvil-manage-daemons --monitor
ExecStop=/bin/kill -WINCH ${MAINPID}
Restart=always
RestartSec=60
[Install]
WantedBy=multi-user.target