482 Commits (a7a2cc70d7c469dbdddda3e4ce4a55585816da1c)

Author SHA1 Message Date
Digimer 6eb99a2168 * Finished the anvil-manage-alerts tool. It can now send test alerts at a user-requested alert level. 2 years ago
Digimer 8b7a44cf75 * Finished cleaning up the output of Machines. 2 years ago
Digimer 3e53c87a6b Formatted the output of anvil-manage-alerts data (not yet machines) to be more presentable. 2 years ago
Digimer 622fb84652 * Renamed the 'notifications' table to 'alert-override', better reflecting what it does. 2 years ago
Digimer 586ce6e5b9 * Got recipients working in anvil-manage-alerts(). 2 years ago
Digimer 35cf0c37fb * Updated System->check_ram_use() to set the maximum RAM based on the host type, and set those values in _set_default() so that the user can override if they want. 2 years ago
Digimer a6cd5c6604 * Started work on the new anvil-manage-alerts, which will (when done) allow for management of mail servers, alert recipients, and notification overrides, and for triggering test alerts. 2 years ago
Digimer bde0b2e7ec * Fixed a bug where deleting ports from a fence device in an Install Manifest would not cause the fence methods to be removed from the associated cluster. 2 years ago
Digimer 93427a7a38 * Updated Get->switches() to always support job-uuid. 2 years ago
Digimer c23c79cdf0 Added 'system::all::configured' to anvil-join-anvil to mark an explicit end of config. 2 years ago
Digimer 596855405f * Added variables to record when pacemaker and DRBD are configured. 2 years ago
Digimer 3b721b849c * Fixed a bug in anvil-configure-host where if the same MAC address was assigned to two interfaces, it would cause an endless reboot loop. 2 years ago
Digimer 599373816f * Fixed bugs that came up in testing. Was now able to set up long-throw DR! 2 years ago
Digimer 2fab7bc1b7 This adds support (testing needed) for "Long-Throw" DR, which is a wrapper for using 'drbd-proxy' to provide larger transmit buffers for slow/high-latency DR hosts. 2 years ago
Digimer c8ee75420d * Updated anvil-manage-dr to check if a server is protected before processing a --connect or --disconnect request. Also made it smarter if an attempt to connect a resource fails. 2 years ago
Digimer e90dae96f7 * In Server->shutdown_virsh(), disabled trying to resume a paused VM. Also updated the logging around not waiting for a VM to stop. 2 years ago
Digimer d271ffec26 * Updated Cluster->parse_crm_mon() to record the role of stonith resources. 2 years ago
Digimer 89121a2b3b * Fixed a bug in Alert->check_condition_age() where not setting a host_uuid caused the returned age to always be 0. 2 years ago
Digimer 99a6593fe6 * Fixed a bug when connecting to databases when one DB has no variable entries, making it seem like a DB was disabled. 2 years ago
Digimer b8bb7cc423 * Changed the default trigger of live migrations to require a health score difference of 2 or higher. This can be user-adjusted using the new 'feature::scancore::threshold::preventative-live-migration' anvil.conf option. 2 years ago
Digimer 9675ebf986 * Added --remove support to anvil-manage-dr, completing all the features for this tool. 2 years ago
Digimer 93e6a59841 * Added 'vnc-server' to the list of firewall services enabled on strikers. 2 years ago
Digimer 29a28ee97a * Fixed a bug with anvil-provision-server where running the command line menu from a Striker would not assign the job to the target Anvil!. 2 years ago
Digimer cbb441759e * Fixed a couple bugs in anvil-manage-files where a file moved from incoming to files or definitions wasn't having the directory updated properly in the database. Also made an explicit check when looking for missing files to check to see if the file exists in another managed directory and, if so and if a striker, update the DB. 2 years ago
Digimer a81478f2bc * Updated 'db_in_use' state to add the caller's name to the state name. This is pulled out when logging stale locks that are being reaped, to help debug where stale locks are coming from. 2 years ago
Digimer e7cf8ac789 * Got more work done on anvil-manage-files. It now picks up new files on nodes/dr hosts in an Anvil! and downloads them if needed. 2 years ago
Digimer be84a23924 * There were still references in anvil-manage-files to 'file_locations' -> 'file_location_host_uuid'. Had to rework some logic to get things working. More testing needed, but so far at least the "missing file" function is working again. 2 years ago
Digimer 15aadc3a4e * Updated scan-network to check for inactive or activating interfaces and manually bring them up, if the uptime is less than 10 minutes. 2 years ago
Digimer 171ea74000 * There is a fix in this commit to resolve a race condition where, when reconfiguring the network, the request to set a job to reboot would fail because the connections to all Strikers could be lost, causing Database->_test_access() to error out, blocking the reboot. When restarted, the network would not be changed, so no reboot would be requested, leaving the machine in an inaccessible state. 3 years ago
Digimer bce9e2caaf This is the first attempt at enabling firewalld completely. There is a decent chance that problems exist, so it won't be a surprise if a few more commits are needed to this branch before things work. 3 years ago
Digimer b2ea4f9adc * Moved System->manage_firewall() to Network->manage_firewall(). Started working on actually implementing it, which basically involves fully rewriting it. 3 years ago
Digimer f2d06fa9b1 * Updated striker-parse-oui to only run if/when the system has been running for at least one hour. 3 years ago
Digimer ab9b00a2f7 * Updated anvil-daemon, in its daily checks, to disable ksm and ksmtuned daemons. 3 years ago
Digimer b77bb81343 * Found a bug where, if a record was deleted from the public schema but not from the history schema, and then later a resync was performed, the record would be added to the peer database's public schema (while still not existing locally). This condition should never occur as data in history should only exist to track the public record. This update checks for this condition and purges those records prior to resync'ng a database table. 3 years ago
Digimer 3caf43ed42 Updated striker-purge-target to check for problems on write of DELETEs. 3 years ago
Digimer 911f7cfb6a This is another big commit with a lot of DB work. Getting closer to sorting out the frequent resyncs. 3 years ago
Digimer 24f5d39dff This is a set of changes all stemming from trying to debug frequent resyncs. More bugs still to be fixed. 3 years ago
Digimer 1b70b49cf8 * Updated Network->find_matches() to try to populate the first and second parameters if they're not passed in. 3 years ago
Digimer 7ec4cee143 Created the new anvil-show-local-ips that shows the IPs on the host in an easier to read format, compared to 'ip addr list'. 3 years ago
Digimer e9a9e0dd4b * Finished (but needs more testing) the new 'anvil-report-usage' tool. 3 years ago
Digimer d2973e603b Updated anvil-update-states to set the speed of links to 10000 when they are virtio interfaces. 3 years ago
Digimer 4751c6e747 Updated DRBD->get_devices() and Server->parse_definition() to take 'anvil_uuid' so that server data can be parsed from anywhere. 3 years ago
Digimer c9633aa3b0 * Updated Database->_find_behind_databases() to not run unless it's on a Striker. 3 years ago
Digimer aa7d9bdf14 * Fixed a bug where resync'ing the database was missing tables. 3 years ago
Digimer 74b7719cf5 * Created the new anvil-manage-host that can check/set if a host is configured. On Strikers, it can age out data, resync data, and check/set if the local database is active. 3 years ago
Digimer 422d248cbe * Updated Database->insert_or_update_states() to not actually record unless the state_host_uuid exists in all available databases. 3 years ago
Digimer 7b090e1623 * Updated Database->shutdown() to disconnect, stop the postgresql daemon, then reconnect. 3 years ago
Digimer 3fd0db15bf * This rather heavily reworks how database shutdown works. It adds a much more intelligent shutdown: tracking who is using the database, being able to mark a database as "offline", and waiting for users of the database to disconnect before it shuts down. 3 years ago
Digimer b234b79544 Updated anvil-daemon to check if anvil-sync-shared is running if the reported RAM use is too high. If so, it doesn't exit. This fixes an issue where anvil-sync-shared would loop forever as it would constantly be killed when downloading large files. 3 years ago
Digimer 7023ffb56b Further improved the DRBD startup logic in ocf:alteeve:server. Specifically, it will start up if a local resource/volume is sync'ing. 3 years ago