* Created Database->get_power() that loads data from the special 'power' table.
* Fixed a bug where some calls to Network->ping() weren't formatted properly for receiving two string variables.
* Updated Database->get_anvils() to record the machine types when recording host information.
* Updated Database->get_hosts_info() to also load the 'host_ipmi' column.
* Updated Database->get_upses() to store the link to the 'power' -> 'power_uuid', when available.
* Created ScanCore->call_scan_agents() that does the work of actually calling scan agents, moving the logic out from the scancore daemon.
* Created ScanCore->check_power() that takes a host and the anvil it is in and returns whether it's on batteries or not. If it is, the time on batteries and the estimated hold-up time are returned. If not, the highest charge percentage is returned.
* Created ScanCore->post_scan_analysis() that is a wrapper for calling the new ->post_scan_analysis_dr(), ->post_scan_analysis_node() and ->post_scan_analysis_striker(). Of these, _dr and _node are still empty, but _striker is complete.
** ->post_scan_analysis_striker() is complete. It now boots a node after a power loss if the UPSes powering it are OK (at least one has mains power, and the mains-powered UPS(es) have reached the minimum charge percentage). If the node was shut down for thermal reasons, IPMI is called and, so long as at least one thermal sensor is found and all sensors found are OK, the node is booted. For now, M2's thermal reboot delay logic hasn't been replicated, as it added a lot of complexity and didn't prove practically useful.
* Created System->collect_ipmi_data() and moved scan-ipmitool's ipmitool call and parsing into that method. This was done to allow ScanCore->post_scan_analysis_striker() to also call IPMI on a remote machine during thermal down events without reimplementing the logic.
* Updated scan-ipmitool to only record temperature data for data collected locally. Also renamed 'machine' variables and hash keys to 'host_name' to clarify what is being stored.
* Updated scancore to clear the 'system::stop_reason' variable.
* Added missing packages to striker-manage-install-target.
Signed-off-by: Digimer <digimer@alteeve.ca>
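
The restart rules that the commit message gives for ->post_scan_analysis_striker() can be sketched roughly as follows. This is a minimal, standalone illustration and not the code from this commit: the helper names, the hash keys ('on_mains', 'charge_percentage', 'status') and the minimum charge value of 45 are all assumptions made for the example. It also assumes that the "highest charge percentage" among the mains-powered UPSes is what gets compared against the minimum.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper (not part of the Anvil! code base): may a host that was shut down after a
# power loss be booted, given the UPSes that feed it?
sub power_ok_to_boot
{
    my ($upses, $minimum_charge) = @_;

    # At least one UPS must be on mains power.
    my @on_mains = grep { $_->{on_mains} } @{$upses};
    return(0) if not @on_mains;

    # Assumption: the highest charge among the mains-powered UPSes is compared to the minimum.
    my $highest_charge = max(map { $_->{charge_percentage} } @on_mains);
    return($highest_charge >= $minimum_charge ? 1 : 0);
}

# Hypothetical helper: may a host that was shut down for thermal reasons be booted?
sub thermal_ok_to_boot
{
    my ($sensors) = @_;

    # At least one thermal sensor must have been found...
    return(0) if not @{$sensors};

    # ...and every sensor that was found must read as OK.
    foreach my $sensor (@{$sensors})
    {
        return(0) if $sensor->{status} ne "ok";
    }
    return(1);
}

# Example data; the names and values are invented for illustration.
my $upses = [
    { name => "ups-1", on_mains => 1, charge_percentage => 62 },
    { name => "ups-2", on_mains => 0, charge_percentage => 15 },
];
print "power: ", (power_ok_to_boot($upses, 45) ? "boot" : "wait"), "\n";
```

Keeping the two checks as separate functions mirrors the two stop reasons handled in the commit (power loss and thermal), so each rule can be read and tested on its own.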
If the health is C<< 1 >>, the "time on batteries" and "estimated hold up time" will be C<< 0 >> and the highest charge percentage will be set.
If the health is C<< 2 >>, the "time on batteries" will be the number of seconds since the last UPS to lose power was found to be running on batteries. The estimated hold up time of the strongest UPS is also returned in seconds.
@@ -15,8 +15,8 @@ NOTE: All string keys MUST be prefixed with the agent name! ie: 'scan_ipmitool_l
 <!-- Messages entries -->
 <key name="scan_ipmitool_message_0001">No IPMI BMC found on this host nor where any other machines with IPMI found or where accessible. Nothing to do.</key>
-<key name="scan_ipmitool_message_0002">There was no IPMI sensor value units set for sensor: [#!variable!sensor!#] on the machine: [#!variable!machine!#].</key>
+<key name="scan_ipmitool_message_0002">There was no IPMI sensor value units set for sensor: [#!variable!sensor!#] on the machine: [#!variable!host_name!#].</key>
-<key name="scan_ipmitool_message_0003">There was no IPMI sensor value set for sensor: [#!variable!sensor!#] on the machine: [#!variable!machine!#].</key>
+<key name="scan_ipmitool_message_0003">There was no IPMI sensor value set for sensor: [#!variable!sensor!#] on the machine: [#!variable!host_name!#].</key>
 <key name="scan_ipmitool_message_0004">
 The sensor: [#!variable!sensor_name!#] has changed.
 Note: If you are listening to 'critical' level alerts only, you will not get the alert telling you when the temperature is back to normal.
 </key>
-<key name="scan_ipmitool_message_0015">There was no IPMI sensor value units set for sensor: [#!variable!sensor!#] on the machine: [#!variable!machine!#].</key>
+<key name="scan_ipmitool_message_0015">There was no IPMI sensor value units set for sensor: [#!variable!sensor!#] on the machine: [#!variable!host_name!#].</key>
 <key name="scan_ipmitool_message_0016">
 The sensor: [#!variable!sensor_name!#] has changed.
-<key name="scan_ipmitool_message_0017">There was no IPMI sensor value units set for sensor: [#!variable!sensor!#] on the machine: [#!variable!machine!#].</key>
+<key name="scan_ipmitool_message_0017">There was no IPMI sensor value units set for sensor: [#!variable!sensor!#] on the machine: [#!variable!host_name!#].</key>
-<key name="scan_ipmitool_message_0018">There was no IPMI sensor value set for sensor: [#!variable!sensor!#] on the machine: [#!variable!machine!#].</key>
+<key name="scan_ipmitool_message_0018">There was no IPMI sensor value set for sensor: [#!variable!sensor!#] on the machine: [#!variable!host_name!#].</key>
 <key name="scan_ipmitool_message_0019">
-The new sensor: [#!variable!sensor_name!#] has been found on the machine: [#!variable!machine!#].
+The new sensor: [#!variable!sensor_name!#] has been found on the machine: [#!variable!host_name!#].
@@ -110,9 +110,9 @@ The new sensor: [#!variable!sensor_name!#] has been found on the machine: [#!var
 </key>
 <!-- Log entries -->
-<key name="scan_ipmitool_log_0001">Starting to read the IPMI sensor values for: [#!variable!machine!#]</key>
+<key name="scan_ipmitool_log_0001">Starting to read the IPMI sensor values for: [#!variable!host_name!#]</key>
-<key name="scan_ipmitool_log_0002">Failed to query node: [#!variable!machine!#]'s IPMI interface using the call: [#!variable!call!#]. Is the password correct?</key>
+<key name="scan_ipmitool_log_0002">Failed to query node: [#!variable!host_name!#]'s IPMI interface using the call: [#!variable!call!#]. Is the password correct?</key>
 <key name="scan_ipmitool_log_0004">The sensor named: [#!variable!sensor_name!#] appears to have vanished, but this is the first scan that it vanished. This is generally harmless and just a sensor read issue.</key>
 <key name="scan_ipmitool_log_0005">The sensor named: [#!variable!sensor_name!#] has returned.</key>
 <key name="error_0165">The temperature: [#!variable!temperature!#] does not appear to be valid..</key>
 <key name="error_0166">The resource: [#!variable!resource!#] in the config file: [#!variable!file!#] was found, but does not appear to be a valid UUID: [#!variable!uuid!#].</key>
 <key name="error_0167">The resource: [#!variable!resource!#] in the config file: [#!variable!file!#] was found, and we were asked to replace the 'scan_drbd_resource_uuid' but the new UUID: [#!variable!uuid!#] is not a valud UUID.</key>
+<key name="error_0168">The 'fence_ipmilan' command: [#!variable!command!#] does not appear to be valid.</key>
 <!-- Table headers -->
 <key name="header_0001">Current Network Interfaces and States</key>
@@ -1071,6 +1072,24 @@ The file: [#!variable!file!#] needs to be updated. The difference is:
 ====
 </key>
 <key name="log_0557">Scan agent: [#!variable!agent_name!#] exited after: [#!variable!runtime!#] seconds with the return code: [#!variable!return_code!#].</key>
+<key name="log_0558">I'm not on the same network as: [#!variable!host_name!#]. Unable to check the power state.</key>
+<key name="log_0559">The host: [#!variable!host_name!#] appears to be off, but there's no IPMI information, so unable to check the power state or power on the machine.</key>
+<key name="log_0560">The host: [#!variable!host_name!#] has no IPMI information. Wouldn't be able to boot it, even if it's off, so skipping it.</key>
+<key name="log_0561">The host: [#!variable!host_name!#] will be checked to see if it needs to be booted or not.</key>
+<key name="log_0562">The host: [#!variable!host_name!#] is up, no need to check if it needs booting.</key>
+<key name="log_0563">The host: [#!variable!host_name!#] couldn't be reached directly, but IPMI reports that it is up. Could the IPMI BMC be hung or unplugged?</key>
+<key name="log_0564">The host: [#!variable!host_name!#] is off. Will check now if it should be booted.</key>
+<key name="log_0565">The host: [#!variable!host_name!#] has no stop reason, so we'll leave it off.</key>
+<key name="log_0566">The host: [#!variable!host_name!#] was stopped by the user, so we'll leave it off.</key>
+<key name="log_0567">The host: [#!variable!host_name!#] was powered off because of power loss. Checking to see if it is now safe to restart it.</key>
+<key name="log_0568">The host: [#!variable!host_name!#] was powered off because of thermal issues. Checking to see if it is now safe to restart it.</key>
+<key name="log_0569">Unable to find an install manifest for the Anvil! [#!variable!anvil_name!#]. As such, unable to determine what UPSes power the machine: [#!variable!host_name!#]. Unable to determine if the power feeding this node is OK or not.</key>
+<key name="log_0570">Unable to parse the install manifest uuid: [#!variable!manifest_uuid!#] for the Anvil! [#!variable!anvil_name!#]. As such, unable to determine what UPSes power the machine: [#!variable!host_name!#]. Unable to determine if the power feeding this node is OK or not.</key>
+<key name="log_0571">The UPS referenced by the 'power_uuid': [#!variable!power_uuid!#] under the host: [#!variable!host_name!#] has no record of being on mains power, so we can't determine how long it's been on batteries. Setting the "shortest time on batteries" to zero seconds.</key>
+<key name="log_0572">Clearing the host's stop reason.</key>
+<key name="log_0573">The host: [#!variable!host_name!#] is off, but there appears to be a problem translating the 'fence_ipmilan' into a workable 'ipmitool' command. Unable to check the thermal data of the host, and so, unable to determine if it's safe to boot the node.</key>
+<key name="log_0574">The host: [#!variable!host_name!#] was powered off because of power loss. Power is back and the UPSes are sufficiently charged. Booting it back up now.</key>
+<key name="log_0575">The host: [#!variable!host_name!#] was powered off for thermal reasons. All available thermal sensors read as OK now. Booting it back up now.</key>
 <!-- Messages for users (less technical than log entries), though sometimes used for logs, too. -->
 <key name="message_0001">The host name: [#!variable!target!#] does not resolve to an IP address.</key>
@@ -1662,6 +1681,7 @@ If you are comfortable that the target has changed for a known reason, you can s
 <key name="striker_0276">This tracks the last time a given mail server was configured for use. It allows for a round-robin switching of mail servers when one mail server stops working and two or more mail servers have been configured.</key>
 <key name="striker_0277">No UPSes</key>
 <key name="striker_0278">This is a condition record, used by programs like scan agents to track how long a condition has existed for.</key>
+<key name="striker_0279">This indicated why a machine was powered off. This is used by ScanCore to decide if or when to power up the target host.</key>
 <!-- These are generally units and appended to numbers -->
$anvil->Log->variables({source => $THIS_FILE, line => __LINE__, level => 2, list => { variable_uuid => $variable_uuid }});
return(0);
}
=pod
"I'm sorry, but I don't want to be an emperor. That's not my business. I don't want to rule or conquer anyone. I should like to help everyone if possible - Jew, Gentile - black man - white.