191 Commits (4f56c7b831e875615c4221134f125692e66e4f6d)

Author SHA1 Message Date
Digimer ccd89f923b Fixed two small bugs that were preventing proactive live migration from working. 3 years ago
Digimer ef10765d8a Fixed a bug with registering health scores against down links in scan-network. 3 years ago
Digimer ad4609b39c Fixed a bug where interfaces in bonds weren't recording their parent bond. 3 years ago
Digimer d62900d712 Fixed a bug where the wrong string key was used when a network interface comes, goes or changes its bond_uuid reference. Also changed the alert level of vnetX interfaces that disappear to notice level. 3 years ago
Digimer 0c77736dc8 * Fixed a bug in Cluster->manage_fence_delay() where removing the 'delay="15"' attribute was failing, now set it to 0 instead. 3 years ago
Digimer a9ce76bd1b * Finished fixing the scan-network interface to bridge mapping problem. 3 years ago
Digimer cebae28716 * WIP - Fixing a bug in scan-network where vnet devices aren't being recorded against their bridge. 3 years ago
Digimer 7e7b91b286 * Updates anvil-join-anvil to update corosync.conf to use the BCN1 link as the main knet network with the SN1 link as the backup link. 3 years ago
Digimer fd5d3c0434 * Finished (though testing still needed) scan-network. 4 years ago
Digimer d7d418ee1b * Fixed a bug in DRBD->gather_data() where the peer node's data was being recorded where the local node's data should have been saved. 4 years ago
Digimer a697011b08 * Disabled debug logging in anvil-daemon. 4 years ago
Digimer 0c475d2a2e * Fixed a couple logging bugs. 4 years ago
Digimer d3052c0229 * Finished Cluster->check_server_constraints() and added it to scan-cluster. This now makes sure servers don't roll back to their old host after it has been fenced and recovers. 4 years ago
Digimer b71ed28f64 * Added Cluster->manage_fence_delay() that reports back and, optionally, sets a preferred node in a fence race. 4 years ago
Digimer c7c6c8dee5 * Reworked the attempt to repair the network in anvil-daemon to not touch the network until the machine has been running for at least two minutes. 4 years ago
Digimer 0b6a9e37fa * Added scan_lvm_pv_sector_size to the scan_lvm_pvs table in the scan-lvm. This will be used later for growing a requested disk size for the DRBD metadata. 4 years ago
Digimer daca6c887b * This contains a fairly major change to how time stamps are handled. All INSERT and UPDATE calls now generate a new timestamp via Database->refresh_timestamp, instead of using 'sys::database::timestamp'. This was done in response to finding a bug where tables in a database differed in both counts of public and private schemas (ip_addresses table, specifically) that failed to resync because the timestamps were re-used too often. 4 years ago
Digimer 96fffb0b96 * Finished updating ocf:alteeve:server to no longer require a database connection. To do this, and still be able to track live migration times, the Server->migrate_virsh() method now writes out the server name and migration time to a /tmp/anvil/migration-duration.<server_name>.<unix_time> file. This file is checked for by the scan-server resource agent and, when found, is parsed and the migration duration is recorded, then the file is purged. 4 years ago
Digimer 73267a8ea9 * WIP - Slowly working on anvil-manage-server 4 years ago
Digimer 78f3fb7b10 * Updated System->configure_ipmi to pull the machine from the anvils table instead of looking for the original job, which isn't useful now that we purge old jobs. 4 years ago
Digimer 4dcd505753 * Biggest change in this commit; scan-apc-pdu and scan-apc-ups now only run on Striker dashboards! This was because we found that if two machines ran their agents at the same time, the response time from SNMP read requests grew a lot. This meant it was likely a third, fourth and so on machine would also then have their scan agent runs while the existing runs were still trying to process, causing the SNMP reads to get slower still until timeouts popped. 4 years ago
Digimer 6abe06f125 The theme of these commits is improving DB responsiveness. 4 years ago
Digimer 41cd1e0319 * Several bugs fixed and enhancements; 4 years ago
Digimer ad4a1ecc78 * Increased the scancore agent run timeout to 60 seconds. 4 years ago
Digimer 473c728117 * Updated scan-ipmitool to use 'jump' thresholds for a common sensor name where duplicates with the hex address appended may exist. 4 years ago
Digimer 44864ce321 * Updated Database->resync_databases() to set a default schema of 'public'. Also fixed a bug where, when the difference in record numbers between two line was > 999, it would not trigger a resync. 4 years ago
Digimer f833c311ba * To address issues with scancore debugging, we needed a tool to purge old anvils and hosts from the database. The 'test.pl' in this commit contains the new logic that will be merged into tools/striker-purge-host shortly. 4 years ago
Digimer a74be60469 * Fixed a bug where the log message for a changed CIB wasn't useful. 4 years ago
Digimer 4a87ee71db * This commit started with work on webui endpoint set_power, but then switched to scancore debugging and I neglected to switch branches. 4 years ago
Digimer ca7052dd53 The core logic is done!!!! Still need to finish end-points for the WebUI to hook into, but the core of M3 is complete! Many, many bugs are expected, of course. :) 4 years ago
Digimer 53cd0bdf3a * Now with 100% less typos. 4 years ago
Digimer e3ba64cb83 * Fixed a typo in the Makefile.am. 4 years ago
Digimer 2e37691116 * Updated DRBD->gather_data() to store data on peers so that the peer's LV path and backing disk is recorded. Also fixed a bug in ->get_status() where the return code for local calls was stored as a host name. 4 years ago
Digimer 798518ba5e * While working on the boot/shutdown server tools, ran into and fixed a bug where files uploaded before an Anvil! was added could not have those files sync'ed. This was fixed through the new Database->check_file_locations() method. 4 years ago
Digimer faf1399440 * Continued work on anvil-safe-start. Got it to the point where it detects shared networks with its peer node and waits for all networks to be up. 4 years ago
Digimer fb0836f912 * The get_cpu endpoint was completed. 4 years ago
Digimer 70dc0598f2 * Created Storage->manage_lvm_conf() that checks / updates lvm.conf to add a filter to avoid seeing DRBD devices as LVM components. This is now called from striker-initialize-host and scan-drbd. 4 years ago
Digimer 5640eda9f2 * Fixed typos for scan-filesystems in Makefile.am 4 years ago
Digimer 82fa42fe83 * Added scan-filesystems to the Makefile.am 4 years ago
Digimer 59b867cc25 * Updated DRBD->gather_data() to check if drbdadm exists before trying to call it to avoid scary errors in the logs. Also moved some strings that pulled from the scan-drbd agent into the main words file. 4 years ago
Digimer 48d7a8d611 * Fixed bugs in scan-apc-ups and scan-apc-pdu that allowed PDUs and UPSes to be recorded duplicate times in the database. Fixed multiple bugs in scan_apc_ups from when we cloned PDU as its base. 4 years ago
Digimer 265e3c74d6 * Updated Database->connect to track previous connected DB count to current one (only useful for daemons). If the connection count has not changed, a check for resync is not performed. 4 years ago
Digimer 9fa24750d6 * Fixed a bug in Convert->round() where the requested number of digits after the decimal place was coming back one too long. Also added logging that should have been there for a while now. 4 years ago
Digimer 53d654fd9d * Got scan-filesystem to the point where it now collects the data it needs. No processing done yet though. Also reworked the SQL schema for the agent to record more data. 4 years ago
Digimer 296556328b * Fixed a bug in Convert->bytes_to_human_readable() to handle being passed in bytes (with the size units of 'b' or 'bytes'). 4 years ago
Digimer a1eede2757 * Added new jumps to scan-ipmitool to make it less likely to trigger a jump alert for 'Temp{1..4}' sensors. 4 years ago
Digimer 1ec03c9718 * Removing 'test.pl' from git. 4 years ago
Digimer 6009590352 * Fixed a bug in scan-apc-ups where changes in the transfer reason were not being recorded. 4 years ago
Digimer 8d0f873912 * Updated scan-storcli to check if a MegaRAID controller exists and neither storcli64 nor perccli64 exist. If a controller is found but no RPM is installed, it checks to see if the host is Dell and then decides to try and install perccli or storcli. 4 years ago
Digimer f4bf1fd54a * Removed some XML insertions into strings as they break inserting into strings. 4 years ago