693 Commits (37f427aa648e08628ba8662896aed34a11be749c)

Author SHA1 Message Date
Digimer 6664c5b77f * Fixed a bug where scan-drbd, with DR configured, was not recording TCP ports assigned to connections properly. 3 years ago
Digimer 20a784baa2 * Continuing work on anvil-manage-dr. Got it to the point where it should (but doesn't yet) create the new DRBD config and the LV(s) on DR. 3 years ago
Digimer 5b35204af4 * Updated DRBD->get_next_resource() to take the new 'dr_tcp_ports' ports which, if set, returns two free TCP ports. 3 years ago
Digimer 76eb09393f Fixed a bug found in the last commit causing newly-added VGs to storage group membership to use the wrong UUID as the internal VG UUID. 3 years ago
Digimer 9edf698c37 Updated Database->get_storage_group_data() to determine when a node or DR host needs to be removed from a Storage group, or when a member of an Anvil! needs to be added to a storage group. 3 years ago
Digimer 221f468b6c * Fixed a bug where duplicate IPMI sensor names of type 'Volts' wasn't being processed properly, causing sensor data to not be recorded. 3 years ago
Digimer 2f8b1fb72e Updated anvil-provision-server so that when the OS type is 'win7', set the disk to sata and the NIC to e1000e. Also updated it to store the virt-install call in the 'variables' table and write it out to /mnt/shared/provision. 3 years ago
Digimer 06f679d7e7 * Added the ability to disable anvil-daemon management of /etc/hosts. 3 years ago
Digimer 213babaaf2 Trying to fix a bug where vnet devices keep reporting as having returned. 3 years ago
Digimer e40d0e2444 Fixed a bug where if a database is pingable but the pgsql database is down, and it's the first database tested (or local), then the DB handle used to read / quote fails. 3 years ago
Digimer 4c7bb45ab9 Fixed a race condition where configuring the IPMI BMC would appear to fail because the BMC wouldn't report the user list after a cold reset. 3 years ago
Digimer 6cbdc388d4 Fixed a bug where corosync's configuration of a backup ring was broken. 3 years ago
Digimer 04cb116c1b Updated anvil-parse-fence-agents to validate each fence agent's metadata is valid before adding it to the unified XML. 3 years ago
Digimer 8abb5b46e0 * Added support for setting per-agent log-level and log secure values in amvil.conf. 3 years ago
Digimer c449e2edf0 Resetting scan agent timeout to 30 seconds, 60 didn't help with a random 3 years ago
Digimer 15d8309095 This commit adds scan agent DB connection info caching to help minimize the number of unnecessary DB resync checks that happen. 3 years ago
Digimer 4800f7181f * Updated ScanCore to boot a node that is off without a stop reason. 3 years ago
Digimer acaacd9a86 * Created Storage->get_size_of_block_device() that takes a block device path and returns the size of the path, if it's found in the database. 3 years ago
Digimer 606bd8f1f0 Continuing work on anvil-manage-server. 3 years ago
Tsu-ba-me 92335b29cc fix(Anvil): add clean up logic when failed to validate Apache conf after modify 3 years ago
Tsu-ba-me d2d7a5380c fix(Anvil): search all args for Access-Control value 3 years ago
Tsu-ba-me 18ec7b1c1a fix(Anvil): abort when no new Apache conf created 3 years ago
Tsu-ba-me 3de9912f51 fix(Anvil): use augeas to modify Apache conf 3 years ago
Digimer 28865780f8 * Updated Database->get_server_definitions() to take a specific server UUID, allowing just the one definition to be loaded. Also had it clear previous loads. 3 years ago
Digimer ccd89f923b Fixed two small bugs that were preventing proactive live migration from working. 3 years ago
Digimer 548c52701a Updates Jobs->update_progress() to take a 'variables' hash reference, and to support logging as well. 3 years ago
Digimer 1e159f548e Added a couple notes for later dev. 3 years ago
Digimer 6db16ca313 * Fixed a bug in Database->insert_or_update_network_interfaces() where the passed-in network_interface_uuid parameter was not being set properly. 3 years ago
Digimer 0c77736dc8 * Fixed a bug in Cluster->manage_fence_delay() where removing the 'delay="15"' attribute was failing, now set it to 0 instead. 3 years ago
Digimer 7e7b91b286 * Updates anvil-join-anvil to update corosync.conf to use the BCN1 link as the main knet network with the SN1 link as the backup link. 3 years ago
Digimer fd5d3c0434 * Finished (though testing still needed) scan-network. 3 years ago
Digimer d7d418ee1b * Fixed a bug in DRBD->gather_data() where the peer node's data was being recorded where the local node's data should have been saved. 3 years ago
Digimer 6777104398 * Fixed a bug in anvil-daemon where, when an anvil-manage-power reboot run had triggered a reboot, anvil-daemon didn't set the job_progress to '100', causing constant reboots. Also fixed a bug where the log level was hard-set to '1' instead of '2' needed during debugging. 3 years ago
Digimer 607c097fc8 * Fixed a bug where, once a DRBD resource was allowed to be dual-primary for migration, that wasn't properly disabled post-migration. 3 years ago
Digimer 0c475d2a2e * Fixed a couple logging bugs. 3 years ago
Digimer d3052c0229 * Finished Cluster->check_server_constraints() and added it to scan-cluster. This now makes sure servers don't roll back to their old host after it has been fenced and recovers. 3 years ago
Digimer 87b31a16bb * Clear out the bond health in Network->check_network(). 3 years ago
Digimer 30f478267a * Forced anvil-daemon to log-level 2 and to enable secure logging to continue debugging setup issues. 3 years ago
Digimer 023f43eda9 * In the never-ending attempt to resolve the build consistency issues, this commit enables extra debugging logging and, hopefully, implements a fix in anvil-daemon where a job could be started repeatedly. 3 years ago
Digimer 5a343d6d75 * WIP; Started work on Cluster->check_server_constraints() that will track when a server's location constraint needs to be updated when the old preferred node is lost. 3 years ago
Digimer b71ed28f64 * Added Cluster->manage_fence_delay() that reports back and, optionally, sets a preferred node in a fence race. 3 years ago
Digimer 08a958ec60 * Finished updating Network->check_network() to check/heal bridges. 3 years ago
Digimer bd24c1c5bb * I _might_ have fixed the network configuration issue in anvil-configure-host... Updated it so that if 'nmcli' doesn't report a valid device name, it looks for it in the ifcfg-X file, and uses 'X' if not found there. 3 years ago
Digimer 11b1900e1b Note: Continuing to resolve the build issues with network startup. Expect breakage. 3 years ago
Digimer a1b06e4355 * Continuing to try to get the network to reliably start during configuration... 3 years ago
Digimer 3f32a56d0c * Created Network->check_bonds() that checks to see if any bonds are down, or if any interfaces configured to be in a bond are not actually in it. It accepts a 'heal' parameter that, by default, will bring up a bond with no active links, but leaves degraded bonds alone. It call also take 'all' and will try to bring up any missing interfaces. This distinction exists so that if a link is flaky and someone takes it down manually until it can be repaired, it doesn't get turned back on. 3 years ago
Digimer 1a8215a783 * Fixed a bug in Network->get_ips() bridge detection bashlet. 3 years ago
Digimer 80bdac8e34 * Updated the pacemaker server config to drop the stop timeout to 5 minutes and the migration timeout to 10 minutes. This will avoid blocking the entire cluster when a stop or migrate operation times out. Will update scan-server to clean these up when they happen. 3 years ago
Digimer daca6c887b * This contains a fairly major change to how time stamps are handled. All INSERT and UPDATE calls now generate a new timestamp via Database->refresh_timestamp, instead of using 'sys::database::timestamp'. This was done in responce to finding a bug where tables in a database differed in both counts of public and private schemas (ip_addresses table, specifically) that failed to resync because the timestamps were re-used too often. 3 years ago
Digimer 96fffb0b96 * Finished updating ocf:alteeve:server to no longer require a database connection. To do this, and still be able to track live migration times, the Server->migrate_virsh() method now writes out the server name and migration time to a /tmp/anvil/migration-duration.<server_name>.<unix_time> file. This file is checked for by the scan-server resource agent and, when found, is parsed and the migration duration is recorded, then the file is purged. 3 years ago