48 Commits (5b3ff87c1f869c2669dc2296dea0da0d0919bfe0)

Author SHA1 Message Date
digimer cf73d8ed36 * Updated System->configure_ipmi() to auto-configure DR hosts once they've been assigned a BCN IP address. 2 years ago
digimer 645f54ab89 This commit has more changes than I would normally like, but it's all linked to changing file uploads to rsync serially. 2 years ago
Digimer f6cbe7d1d2 * Fixed a bug in System->collect_ipmi_data() where double-quoted passwords were preventing reading of the sensor data. 2 years ago
Digimer 4ba1982183 This is the start of a set of changes needed to rework how we handle DRBD fence requests, so that they create location constraints instead of triggering a full stonith fence. 2 years ago
Digimer b8bb7cc423 * Changed the default trigger of live migrations to require a health score difference of 2 or higher. This can be user-adjusted using the new 'feature::scancore::threshold::preventative-live-migration' anvil.conf option. 2 years ago
Digimer 1770e9e0e0 * Fixed a bug where Database resyncs were trying to resync tables without history schema entries. 2 years ago
Digimer 1b70b49cf8 * Updated Network->find_matches() to try to populate the first and second parameters if they're not passed in. 2 years ago
Digimer aa7d9bdf14 * Fixed a bug where resync'ing the database was missing tables. 3 years ago
Digimer d70b9a4956 Updated scancore and anvil-daemon to check their RAM use at the end of each loop and, if it's using more than 1 GiB of RAM, it sends an alert and exits. 3 years ago
Digimer 9eec6c4977 * Created ScanCore->check_temperature_direct() based on the start logic from the ScanCore->post_scan_analysis_striker() temperature check, and updated the latter to use the former. 3 years ago
Digimer 32d47f70f1 * Fixed bugs around ScanCore->check_power() so that time on batteries and highest charge are now returned properly. 3 years ago
Digimer 213babaaf2 Trying to fix a bug where vnet devices keep reporting as having returned. 3 years ago
Digimer 8abb5b46e0 * Added support for setting per-agent log-level and log secure values in anvil.conf. 3 years ago
Digimer c449e2edf0 Resetting scan agent timeout to 30 seconds, 60 didn't help with a random 3 years ago
Digimer 15d8309095 This commit adds scan agent DB connection info caching to help minimize the number of unnecessary DB resync checks that happen. 3 years ago
Digimer 4800f7181f * Updated ScanCore to boot a node that is off without a stop reason. 3 years ago
Digimer ccd89f923b Fixed two small bugs that were preventing proactive live migration from working. 3 years ago
Digimer 6777104398 * Fixed a bug in anvil-daemon where, when an anvil-manage-power reboot run had triggered a reboot, anvil-daemon didn't set the job_progress to '100', causing constant reboots. Also fixed a bug where the log level was hard-set to '1' instead of '2' needed during debugging. 3 years ago
Digimer 607c097fc8 * Fixed a bug where, once a DRBD resource was allowed to be dual-primary for migration, that wasn't properly disabled post-migration. 3 years ago
Digimer 0c475d2a2e * Fixed a couple logging bugs. 3 years ago
Digimer 4dcd505753 * Biggest change in this commit; scan-apc-pdu and scan-apc-ups now only run on Striker dashboards! This was because we found that if two machines ran their agents at the same time, the response time from SNMP read requests grew a lot. This meant it was likely a third, fourth and so on machine would also then have their scan agent runs while the existing runs were still trying to process, causing the SNMP reads to get slower still until timeouts popped. 3 years ago
Digimer 6abe06f125 The theme of these commits is improving DB responsiveness. 3 years ago
Digimer 41cd1e0319 * Several bugs fixed and enhancements; 3 years ago
Digimer fc0954d0c8 * Started work on, but not at all finished, anvil-manage-server which will allow manipulation of a server's resources. 4 years ago
Digimer 44864ce321 * Updated Database->resync_databases() to set a default schema of 'public'. Also fixed a bug where, when the difference in record numbers between two line was > 999, it would not trigger a resync. 4 years ago
Digimer 7abbc938af * Renamed tools/striker-purge-host to tools/striker-purge-target and moved the code from test.pl over to it. No longer provides interactive selection, but now does work with Anvil! systems as well as hosts. 4 years ago
Digimer 3fb81c1a0a * Updated Convert->time() to silently return if the given time was '--'. 4 years ago
Digimer 416f51323a * Created tools/striker-boot-machine to, well, boot machines. It uses host_ipmi or, failing that, other fence methods when available to boot a node. 4 years ago
Digimer ca7052dd53 The core logic is done!!!! Still need to finish end-points for the WebUI to hook into, but the core of M3 is complete! Many, many bugs are expected, of course. :) 4 years ago
Digimer 15dab8aab7 * Started working on the node post-scan logic in ScanCore. Created ScanCore->check_temperature() to get a thermal score against a node. 4 years ago
Digimer fb0836f912 * The get_cpu endpoint was completed. 4 years ago
Digimer 59b867cc25 * Updated DRBD->gather_data() to check if drbdadm exists before trying to call it to avoid scary errors in the logs. Also moved some strings that pulled from the scan-drbd agent into the main words file. 4 years ago
Digimer 48d7a8d611 * Fixed bugs in scan-apc-ups and scan-apc-pdu that allowed PDUs and UPSes to be recorded multiple times in the database. Fixed multiple bugs in scan_apc_ups from when we cloned PDU as its base. 4 years ago
Digimer 1b65f53faa * Remove host-health from the 'hosts' table as it wasn't needed, given the 'health' table. Bumped the SQL version to 0.0.2 4 years ago
Digimer 26f268a4aa * Updated the 'hosts' table and relevant Database methods to add columns for 'host_status' and 'host_health'. These will simplify tracking a node's (power) status and overall health. 4 years ago
Digimer 6009590352 * Fixed a bug in scan-apc-ups where changes in the transfer reason were not being recorded. 4 years ago
Fabio M. Di Nitto 8f9892650b [build] first pass at adding a build system to integrate with CI 4 years ago
Digimer 1d03a386d3 * Created Database->get_bridges() that, surprise, loads data from the 'bridges' table. 4 years ago
Digimer 96bc1f0b78 * Created Convert->fence_ipmilan_to_ipmitool() that takes a 'fence_ipmilan' call and converts it into a direct 'ipmitool' call. 4 years ago
Digimer 4d5ec72026 * Started work on the scan-drbd scan agent. Got it to the point that it is gathering needed data. 4 years ago
Digimer 9c92b6bbb8 * Created Database->get_tables_from_schema() that returns an ordered list of tables in a database schema file, removing the need for scan agents to manually provide a list for agent start-up / DB purge. 4 years ago
Digimer d677d19ca0 * Moved Database->check_condition_age to Alert. 4 years ago
Digimer be88be6d30 * Did a bunch of testing / bugfixes for scan-server. 4 years ago
Digimer 1a1fa7ce88 * Created Cluster->get_anvil_uuid() that returns the 'anvil_uuid' of a given 'host_uuid'. 4 years ago
Digimer e240a32a19 * Created Cluster->parse_crm_mon and updated Cluster->parse_cib() to determine what state a server is in and which host has a server. 4 years ago
Digimer 44dc4f4b47 * Fixed a bug in Words->string() where not having the parameter 'file' set caused the default 'words.xml' to be specified, preventing strings from scan agents from being used. 4 years ago
Digimer 4be943ebf3 * Finished (initial) testing of scan-hardware. The first M3 scan agent is done! 4 years ago
Digimer 0a1dc809a2 * Created the ScanCore.pm module with the first 'agent_startup' method which generalized scan agent start up. 4 years ago
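
Commit d70b9a4956 above describes a daemon loop that checks its own RAM use at the end of each pass and, above 1 GiB, sends an alert and exits. The following is a minimal sketch of that pattern only, written in Python for illustration; it is not the project's code. The names `ram_limit_exceeded`, `run_loop`, `read_ram_bytes`, `send_alert` and `max_loops` are all hypothetical, and the way RAM use is measured is left to the caller.

```python
GIB = 1024 ** 3  # 1 GiB in bytes, the limit named in the commit message


def ram_limit_exceeded(used_bytes, limit_bytes=GIB):
    """Return True when the process is using more than the allowed RAM."""
    return used_bytes > limit_bytes


def run_loop(do_work, read_ram_bytes, send_alert, max_loops=None):
    """Run do_work() repeatedly, checking RAM use at the end of each pass.

    read_ram_bytes and send_alert are caller-supplied callables (assumed
    for this sketch). Returns 1 if the loop stopped because of the RAM
    limit, or 0 if it ran max_loops passes without exceeding it.
    """
    loops = 0
    while max_loops is None or loops < max_loops:
        do_work()
        loops += 1
        used = read_ram_bytes()
        if ram_limit_exceeded(used):
            # Alert first, then leave the loop so a supervisor can restart us.
            send_alert(f"RAM use of {used} bytes exceeds the limit; exiting.")
            return 1
    return 0
```

Passing the measurement and alert functions in as arguments keeps the check itself trivial to test; a real daemon would presumably rely on its service manager to restart it after the exit.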