51 Commits (d7a5ce747165970737b287fdec324fe1e3eb7bc3)

Author SHA1 Message Date
digimer 8ba613952c Typo fix. 2 years ago
digimer 83a527f4fa * Removed enabling anvil-safe-start out of the RPM and into anvil-join-anvil. 2 years ago
digimer efebd135eb * Removed more references to 'dr1_host_uuid' from the old way of linking DR hosts to Anvil! nodes. 2 years ago
digimer fea10e5bb1 * Prefixed all 'virsh' calls with 'setsid --wait' to help prevent future hangs if the call happens without a shell. 2 years ago
digimer a3988cc3e5 * Added System->configure_logind() to ensure that nodes are configured to ignore ACPI power button events so that IPMI-based fences work immediately. 2 years ago
Digimer c23c79cdf0 Added 'system::all::configured' to anvil-join-anvil to mark an explicit end of config. 2 years ago
Digimer 596855405f * Added variables to record when pacemaker and DRBD are configured. 2 years ago
Digimer e37f487704 Fixed a bug in System->check_ssh_keys where the 'admin' user's RSA keys were owned by root. 3 years ago
Digimer 4c7bb45ab9 Fixed a race condition where configuring the IPMI BMC would appear to fail because the BMC wouldn't report the user list after a cold reset. 3 years ago
Digimer 6cbdc388d4 Fixed a bug where corosync's configuration of a backup ring was broken. 3 years ago
Digimer 04cb116c1b Updated anvil-parse-fence-agents to verify that each fence agent's metadata is valid before adding it to the unified XML. 3 years ago
Digimer cebae28716 * WIP - Fixing a bug in scan-network where vnet devices aren't being recorded against their bridge. 3 years ago
Digimer 7e7b91b286 * Updates anvil-join-anvil to update corosync.conf to use the BCN1 link as the main knet network with the SN1 link as the backup link. 3 years ago
Digimer 6777104398 * Fixed a bug in anvil-daemon where, when an anvil-manage-power reboot run had triggered a reboot, anvil-daemon didn't set the job_progress to '100', causing constant reboots. Also fixed a bug where the log level was hard-set to '1' instead of '2' needed during debugging. 3 years ago
Digimer 0f43961568 * This commit lowers the logging levels of some debug log entries. It's to help diagnose occasional function test failures with an unknown source. 3 years ago
Digimer 5b4bfa747c * Reworked the anvil-join-anvil job parsing to help diagnose occasional faults. Also changed a fatal parse error to one that allows the run to be retried. 3 years ago
Digimer 6abe06f125 The theme of these commits is improving DB responsiveness. 3 years ago
Digimer 49a700d68f * Fixed a bug in anvil-join-anvil where the desired DNS servers were not matching the existing list of DNS servers in use, even when they were already the same. 3 years ago
Digimer e036515df3 * Got anvil-safe-start to the point where it starts the cluster stack. Need to create the 'anvil-boot-server' and 'anvil-shutdown-server' before it can be completed, so those files have been added. 4 years ago
Digimer fb0836f912 * The get_cpu endpoint was completed. 4 years ago
Digimer 5e9e7e4dde * Removed debug logging from tools. 4 years ago
Digimer 54496cbeb0 * Added a check to Database->get_ip_addresses() to check if a hash is set before using it, to help avoid uninitialized variable messages. 4 years ago
Digimer 5db09f565d * Updated anvil-join-anvil to actively call a cluster start once per minute while waiting for initial startup. 4 years ago
Digimer 3733220b50 * Updated Log->entry() to prefix log lines with the short 'job-uuid', when the log entry is coming from a program running as a job. This is meant to make it easier to break up what log lines belong to what jobs, if multiple jobs are running at the same time (ie: when initializing multiple nodes / dr hosts in parallel). 4 years ago
Digimer 1a520b03d5 * Cleaned up a lot of logging in anvil-daemon and tools it calls. 4 years ago
Digimer a7f0676a0f * Got the 'anvil-provision-server' script to the point where it actually saves the new server job. 4 years ago
Digimer 1d03a386d3 * Created Database->get_bridges() that, surprise, loads data from the 'bridges' table. 4 years ago
Digimer 0f7267eae1 * Moved the '_host_name', '_short_host_name', and '_domain_name' private methods in Tools.pm over to Get.pm (removing the leading '_' in the method names). 4 years ago
Digimer 4f39272d9a * Fixed a bug in Jobs->get_job_details() where jobs weren't being found via 'switches::job-uuid'. 4 years ago
Digimer 1498e1b53c * Got server migration working using ocf:alteeve:server in a test environment! 4 years ago
Madison Kelly 30f2b3fa8e * Switched all hash 'local' keys to be the host's short user name. Untested, likely bugs to be fixed in the next commit. 4 years ago
Digimer e35800c413 * Fixed up ocf:alteeve:server (though more testing/work is needed) to get it working with DRBD resources referenced using '/dev/drbd/by-res/...'. 4 years ago
Digimer d647014ad1 * Created (finished but not yet tested) DRBD->update_global_common() to update DRBD's global_common.conf file. 4 years ago
Digimer ef208fd3fb * Finished the logic for adding stonith devices and levels to pacemaker! More testing is needed though, bugs expected, but it adds them. 4 years ago
Digimer c27cc7507f * Renamed striker-parse-fence-agents to anvil-parse-fence-agents and changed anvil-daemon to run it on all machines. 4 years ago
Digimer 61f4dcc41f * Updated Cluster->parse_cib() to pull out fencing (stonith) devices and levels. 4 years ago
Digimer 3c2f25a860 * Added 'fence_delay' fence agent to handle the corner cases where an IPMI BMC had crashed until a power cycle, and PDU fencing was effected, but failed to report as such. 4 years ago
Digimer d2d5d7b460 * Fixed a bug in Striker->load_manifest() where fences were parsed twice, the second time missing a hash reference. 4 years ago
Digimer dcfdf1127c * Got more work done on System->configure_ipmi(). It should now configure the IP address, subnet mask and default gateway using information from the manifest and anvil-join-anvil data. 4 years ago
Digimer 1fa63d2ea3 * Added 'anvil_uuid' as a set parameter in Database->get_hosts(). 4 years ago
Digimer 345d2e33d4 * Updated Cluster->parse_cib() to pre-fill some hashes to avoid undefined errors. 4 years ago
Digimer dcd1fd1492 * Created Cluster->check_node_status() that checks the status of a node (in pacemaker). 4 years ago
Digimer 597d9413a5 * Created the skeleton Cluster.pm. 4 years ago
Digimer 76b6550ac6 * Created Database->get_ip_addresses() that pulls the IPs out and stores them in a hash that allows for easy referencing to associated interfaces and networks. 4 years ago
Digimer aa2fdfb609 * Fixed a bug in Database->get_anvils() that was clearing the manifests hash. 4 years ago
Digimer 453f5c6223 * Fixed a bug where $anvil->nice_exit() was being passed 'exit' instead of 'exit_code' as a parameter. 4 years ago
Digimer 726a4374d1 * Renamed the database table 'host_keys' to 'ssh_keys' to better represent what it stores. 4 years ago
Digimer e9e18f8e3b * Fixed a bug where the interface name wasn't quoted when down/up'ing an interface. 4 years ago
Digimer 530fb31478 * Updated Jobs->get_job_details() to use the --job-uuid switch or, failing that, look for an incomplete job on this host with the same command as the calling program. 4 years ago
Digimer 7a247aca4e * Fixed a bug in Database->insert_or_update_bonds() where a NULL 'bond_bridge_uuid' would cause a SQL error. 4 years ago