48 Commits (b631446cd747fb3b63a674f13c32af9f1ae15342)

Author SHA1 Message Date
digimer ab0b1a262b Reworked Network->wait_for_bonds() to be ->wait_for_networks() 10 months ago
digimer 0f1ff02e78 Added alarms around remote calls to better handle dropped networks. 10 months ago
digimer 495cb90ca6 Created Network->wait_for_network to hold startup for NM to be up. 10 months ago
digimer 05de34c7bc Scancore and anvil-daemon now hold for bonds to be up. 10 months ago
digimer de4bb0d001 Bumped logging for debugging. 11 months ago
digimer 827cf1f331 Fixed a bug that was crashing anvil-daemon 11 months ago
digimer 5ec395c53a Reworked DB resync logic. 1 year ago
digimer be290bf561 This commit fixes a bug where the drbd kernel module build was being killed mid-compile, leaving DRBD unusable. 1 year ago
digimer fb70836126 This moves the call of anvil-safe-start out of scancore and into a new, dedicated systemd unit that runs on boot only. 2 years ago
Digimer b3b185a43c * Added the alteeve-repo-setup man page and updated it to show that when called with '-h'. 2 years ago
Digimer 1b70b49cf8 * Updated Network->find_matches() to try to populate the first and second parameters if they're not passed in. 3 years ago
Digimer d70b9a4956 Updated scancore and anvil-daemon to check their RAM use at the end of each loop and, if it's using more than 1 GiB of RAM, it sends an alert and exits. 3 years ago
Digimer 4c7bb45ab9 Fixed a race condition where configuring the IPMI BMC would appear to fail because the BMC wouldn't report the user list after a cold reset. 3 years ago
Digimer 8abb5b46e0 * Added support for setting per-agent log-level and log secure values in anvil.conf. 3 years ago
Digimer aec22bb79c Added a check in scan-network that finds/removes duplicate network interface names. 3 years ago
Digimer 16c20ae69c * Updated Tools->catch_sig() to use return code 0 instead of 255 so that systemd doesn't think our daemons failed on stop. 4 years ago
Digimer 4dcd505753 * Biggest change in this commit; scan-apc-pdu and scan-apc-ups now only run on Striker dashboards! This was because we found that if two machines ran their agents at the same time, the response time from SNMP read requests grew a lot. This meant it was likely a third, fourth and so on machine would also then have their scan agent runs while the existing runs were still trying to process, causing the SNMP reads to get slower still until timeouts popped. 4 years ago
Digimer 6abe06f125 The theme of these commits is improving DB responsiveness. 4 years ago
Digimer 41cd1e0319 * Several bugs fixed and enhancements; 4 years ago
Digimer 3fb81c1a0a * Updated Convert->time() to silently return if the given time was '--'. 4 years ago
Digimer 4a87ee71db * This commit started with work on webui endpoint set_power, but then switched to scancore debugging and I neglected to switch branches. 4 years ago
Digimer 416f51323a * Created tools/striker-boot-machine to, well, boot machines. It uses host_ipmi or, failing that, other fence methods when available to boot a node. 4 years ago
Digimer ca7052dd53 The core logic is done!!!! Still need to finish end-points for the WebUI to hook into, but the core of M3 is complete! Many, many bugs are expected, of course. :) 4 years ago
Digimer 15e71768a1 * Started work on anvil-safe-start. The enable/disable logic and how it runs automatically is controlled by the database and the tool can be used to control anvil-safe-start on both the local and peer node. It will be started by ScanCore, if scancore starts within 10 minutes of the node booting. It will always be able to run manually. 4 years ago
Digimer fb0836f912 * The get_cpu endpoint was completed. 4 years ago
Digimer 59b867cc25 * Updated DRBD->gather_data() to check if drbdadm exists before trying to call it to avoid scary errors in the logs. Also moved some strings that pulled from the scan-drbd agent into the main words file. 4 years ago
Digimer 1a520b03d5 * Cleaned up a lot of logging in anvil-daemon and tools it calls. 4 years ago
Digimer 6009590352 * Fixed a bug in scan-apc-ups where changes in the transfer reason were not being recorded. 4 years ago
Digimer 8d0f873912 * Updated scan-storcli to check if a MegaRAID controller exists and neither storcli64 nor perccli64 exists. If a controller is found but no RPM is installed, it checks to see if the host is Dell and then decides to try and install perccli or storcli. 4 years ago
Digimer 1d03a386d3 * Created Database->get_bridges() that, surprise, loads data from the 'bridges' table. 4 years ago
Digimer 96bc1f0b78 * Created Convert->fence_ipmilan_to_ipmitool() that takes a 'fence_ipmilan' call and converts it into a direct 'ipmitool' call. 4 years ago
Digimer 713f77bc78 * Finally finished scan-apc-ups! Proved way harder than anticipated... (over a solid week of work!) In M3, this agent is no longer host-bound, and the UPSes to scan are based on entries in 'upses' using this scan agent. 4 years ago
Digimer 2f4a06f2e0 * Updated System->call() to take the 'timeout' parameter which, when set, prepends the call with 'timeout X <shell_call>' to make it easier to deal with calls that could potentially hang. 4 years ago
Digimer 46f1a05789 * Got the code in scan-server to the point where it _should_ now gracefully and automatically detect changes to a server's definition originating from the database (via Striker), directly editing the on-disk definition file, or editing via libvirt tools (like virt-manager). Still needs to be tested though. 4 years ago
Digimer 1a1fa7ce88 * Created Cluster->get_anvil_uuid() that returns the 'anvil_uuid' of a given 'host_uuid'. 4 years ago
Digimer 0a1dc809a2 * Created the ScanCore.pm module with the first 'agent_startup' method which generalized scan agent start up. 4 years ago
Digimer 925664762a * Created Database->check_for_schema() (not finished) that will check/add a schema for a scan agent. 4 years ago
Digimer 767148b538 * Updated Database->get_mail_servers() to clear old stored data, and to pull out the list of when a mail server was last used. 4 years ago
Digimer b2c7fd95fb * Renamed the ScanCore unit file to scancore. 4 years ago
Digimer e35800c413 * Fixed up ocf:alteeve:server (though more testing/work is needed) to get it working with DRBD resources referenced using '/dev/drbd/by-res/...'. 4 years ago
Digimer 453f5c6223 * Fixed a bug where $anvil->nice_exit() was being passed 'exit' instead of 'exit_code' as a parameter. 5 years ago
Digimer e764eccf6e * Started work on Email->check_alert_recipients(). 5 years ago
Digimer c3869a2ff6 * Started adding in front-end support for managing email servers and alert recipients. Added the new 'Email' module to (later) handle all email-related tasks. 5 years ago
Digimer 9c0f6b8f79 * Added automatic 'echo return_code:$?' to System->call and Remote->call which is parsed out and returned automatically on all calls. 6 years ago
Digimer ff5ef43940 * Continued work on the ssh configuration system in anvil-daemon. 6 years ago
Digimer 53295a0d7f * Updated the variables used for logging and log handles to be more in line with other variable names. 6 years ago
Digimer 8fad67fc5a * Updated Words->read to default to 'path::words::words.xml' when the 'file' parameter is not passed. Also updated it to check to see if the words file was read before and, if so, clear the data from the previous read before re-reading it. 6 years ago
Digimer 946fce018a * Renamed the 'ScanCore' executable to just 'scancore', moved it into the standard 'tools' directory and changed the agents directory to '/usr/sbin/scancore-agents'. 6 years ago