868 Commits (652cce4efc60e33602ef51bcf7316e0092c7e142)

Author SHA1 Message Date
Tsu-ba-me 7d9013a60b fix(tools): allow striker-manage-vnc-pipes to be executed as a job 3 years ago
Tsu-ba-me 0935b9a990 feat(tools): move manage_vnc_pipes endpoint core logic to separate script 3 years ago
Tsu-ba-me 5459e610aa fix(tools): auto-end tunnel script when connection breaks 3 years ago
Tsu-ba-me d5724c1457 chore(tools): rename striker-start-ssh-tunnel->striker-open-ssh-tunnel 3 years ago
Tsu-ba-me 23d818cfff fix(cgi-bin): avoid direct SSH calls 3 years ago
Digimer e3d65d654c * Continuing work on anvil-manage-server. 3 years ago
Digimer 3f1c2dd38f * Couple of small cleanups for fence_delay. 3 years ago
Digimer 8d2e454d69 * Updated fence_delay to set the ownership of the log file to 'hacluster:haclient'. This should address https://github.com/digimer/fence_delay/issues/1 3 years ago
Digimer bc8b9274cb WIP; Reworked anvil-manage-server to have a more interactive menu system (for the sections done so far). 3 years ago
Digimer 28865780f8 * Updated Database->get_server_definitions() to take a specific server UUID, allowing just the one definition to be loaded. Also had it clear previous loads. 3 years ago
Digimer 623dbb0863 WIP; Restarted work on anvil-manage-server. 3 years ago
Digimer 548c52701a Updates Jobs->update_progress() to take a 'variables' hash reference, and to support logging as well. 3 years ago
Digimer 1e159f548e Added a couple notes for later dev. 3 years ago
Digimer 39236e9b3f Switched default graphics for new servers to 'vnc' instead of spice. 4 years ago
Digimer cebae28716 * WIP - Fixing a bug in scan-network where vnet devices aren't being recorded against their bridge. 4 years ago
Digimer 7e7b91b286 * Updates anvil-join-anvil to update corosync.conf to use the BCN1 link as the main knet network with the SN1 link as the backup link. 4 years ago
Digimer d7d418ee1b * Fixed a bug in DRBD->gather_data() where the peer node's data was being recorded where the local node's data should have been saved. 4 years ago
Digimer a697011b08 * Disabled debug logging in anvil-daemon. 4 years ago
Digimer 6777104398 * Fixed a bug in anvil-daemon where, when an anvil-manage-power reboot run had triggered a reboot, anvil-daemon didn't set the job_progress to '100', causing constant reboots. Also fixed a bug where the log level was hard-set to '1' instead of the '2' needed during debugging. 4 years ago
Fabio M. Di Nitto 7aea5e1b11 Switch to kmod-drbd 4 years ago
Digimer 04f7571097 * Fixed a typo causing anvil-manage-power to not compile. 4 years ago
Digimer 0c475d2a2e * Fixed a couple logging bugs. 4 years ago
Digimer d3052c0229 * Finished Cluster->check_server_constraints() and added it to scan-cluster. This now makes sure servers don't roll back to their old host after it has been fenced and recovers. 4 years ago
Digimer e7a06fce72 * Disabling the periodic network health check in anvil-daemon. 4 years ago
Digimer 30f478267a * Forced anvil-daemon to log-level 2 and to enable secure logging to continue debugging setup issues. 4 years ago
Digimer 47fa126a3c * Fixed a typo that blocked anvil-daemon from starting. 4 years ago
Digimer 023f43eda9 * In the never-ending attempt to resolve the build consistency issues, this commit enables extra debugging logging and, hopefully, implements a fix in anvil-daemon where a job could be started repeatedly. 4 years ago
Digimer 5a343d6d75 * WIP; Started work on Cluster->check_server_constraints() that will track when a server's location constraint needs to be updated when the old preferred node is lost. 4 years ago
Digimer 76689aa245 * I've decided that live reconfiguring of NetworkManager interfaces is too unreliable. This commit disables all attempts to reconfigure the network while it's up, and simply reboots on changes. 4 years ago
Digimer 629c2b8e8c * Moved up when the reboot happens, when it's needed, avoiding a network reload when a reboot is going to happen anyway. 4 years ago
Digimer bbee77d265 * Re-enabled reboot 4 years ago
Digimer 08a958ec60 * Finished updating Network->check_network() to check/heal bridges. 4 years ago
Digimer 6a8a192cfd * Added an explicit delete call when network changes. 4 years ago
Digimer bd24c1c5bb * I _might_ have fixed the network configuration issue in anvil-configure-host... Updated it so that if 'nmcli' doesn't report a valid device name, it looks for it in the ifcfg-X file, and uses 'X' if not found there. 4 years ago
Digimer c7c6c8dee5 * Reworked the attempt to repair the network in anvil-daemon to not touch the network until the machine has been running for at least two minutes. 4 years ago
Digimer 11b1900e1b Note: Continuing to resolve the build issues with network startup. Expect breakage. 4 years ago
Digimer a1b06e4355 * Continuing to try to get the network to reliably start during configuration... 4 years ago
Digimer 1e7847d4dd * Added a call to Network->check_bonds() to be called while non-Striker machines wait to connect to a database. 4 years ago
Digimer 3f32a56d0c * Created Network->check_bonds() that checks to see if any bonds are down, or if any interfaces configured to be in a bond are not actually in it. It accepts a 'heal' parameter that, by default, will bring up a bond with no active links, but leaves degraded bonds alone. It can also take 'all' and will try to bring up any missing interfaces. This distinction exists so that if a link is flaky and someone takes it down manually until it can be repaired, it doesn't get turned back on. 4 years ago
Digimer 0dd92a08c5 * Small change to variable name to help make logs clearer. 4 years ago
Digimer 0b6a9e37fa * Added scan_lvm_pv_sector_size to the scan_lvm_pvs table in scan-lvm. This will be used later for growing a requested disk size for the DRBD metadata. 4 years ago
Digimer 80bdac8e34 * Updated the pacemaker server config to drop the stop timeout to 5 minutes and the migration timeout to 10 minutes. This will avoid blocking the entire cluster when a stop or migrate operation times out. Will update scan-server to clean these up when they happen. 4 years ago
Digimer 19c41c9171 * Added more logging while chasing a function test bug. 4 years ago
Digimer 0f43961568 * This commit lowers the logging levels of some debug log entries. It's to help diagnose occasional function test failures with an unknown source. 4 years ago
Digimer daca6c887b * This contains a fairly major change to how time stamps are handled. All INSERT and UPDATE calls now generate a new timestamp via Database->refresh_timestamp, instead of using 'sys::database::timestamp'. This was done in response to finding a bug where tables in a database differed in both counts of public and private schemas (ip_addresses table, specifically) that failed to resync because the timestamps were re-used too often. 4 years ago
Digimer 5b4bfa747c * Reworked the anvil-join-anvil job parsing to help diagnose occasional faults. Also changed a fatal parse error to one that allows the run to be retried. 4 years ago
Digimer 96fffb0b96 * Finished updating ocf:alteeve:server to no longer require a database connection. To do this, and still be able to track live migration times, the Server->migrate_virsh() method now writes out the server name and migration time to a /tmp/anvil/migration-duration.<server_name>.<unix_time> file. This file is checked for by the scan-server resource agent and, when found, is parsed and the migration duration is recorded, then the file is purged. 4 years ago
Digimer e15c1651ed * Fixed a bug with deleting bad keys where jobs to delete keys on non-dashboard machines weren't being assigned to the proper target machine. 4 years ago
Digimer 16c20ae69c * Updated Tools->catch_sig() to use return code 0 instead of 255 so that systemd doesn't think our daemons failed on stop. 4 years ago
Digimer 24ec17f8f7 * Added a new parameter called 'sensitive' to Database->connect() that returns after connections before any ancillary checks are done, minimizing connect time. 4 years ago
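
Note on commit 3f32a56d0c: the 'heal' policy described for Network->check_bonds() can be sketched roughly as below. This is a minimal illustration in Python, not the project's Perl code. The Bond class, the links_to_bring_up() helper, and the default mode name "down_only" are all hypothetical; the commit message names only the 'all' value.

```python
from dataclasses import dataclass


@dataclass
class Bond:
    name: str
    configured: list  # interfaces configured to be in the bond
    active: list      # interfaces currently up inside the bond


def links_to_bring_up(bond, heal="down_only"):
    """Return the interfaces a healer would try to bring up.

    heal="down_only" (hypothetical name for the default): act only when
    the bond has no active links at all; degraded bonds are left alone so
    a link that was taken down on purpose is not turned back on.
    heal="all": try to bring up every missing interface.
    Any other value: do nothing.
    """
    missing = [i for i in bond.configured if i not in bond.active]
    if heal == "all":
        return missing
    if heal == "down_only" and not bond.active:
        return missing
    return []
```

With the default mode, a bond that still has one active link yields an empty list, while a bond with no active links yields all of its configured interfaces.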