167 Commits (cf8198ac9a2650b96b195e7e84626a4d8bfc8193)

Author SHA1 Message Date
Digimer 7fd6185445 * Disabled firewalling for now. There appears to be an issue starting up with DRBD. 2 years ago
Digimer bce9e2caaf This is the first attempt at enabling firewalld completely. There is a decent chance that problems exist, so it won't be a surprise if a few more commits are needed to this branch before things work. 3 years ago
Digimer f2d06fa9b1 * Updated striker-parse-oui to only run if/when the system has been running for at least one hour. 3 years ago
Digimer ab9b00a2f7 * Updated anvil-daemon, in its daily checks, to disable ksm and ksmtuned daemons. 3 years ago
Digimer 911f7cfb6a This is another big commit with a lot of DB work. Getting closer to sorting out the frequent resyncs. 3 years ago
Digimer e6dcff1cf1 * Added a missing modified_date to ip_addresses in Database->get_ip_addresses(). 3 years ago
Digimer 1b70b49cf8 * Updated Network->find_matches() to try to populate the first and second parameters if they're not passed in. 3 years ago
Digimer 142be7674e * Fixed a bug in striker-scan-network where the scan wasn't running properly when no network was specifically given. 3 years ago
Digimer 0b41029db2 Reworked Database->_find_behind_databases to loop through tables, then databases when evaluating for resync. This is still racy but should be less racy as the time between counts of columns for a given table should be a lot shorter. Also re-enabled triggering resyncs based on the age of the most recent record. 3 years ago
Digimer 7212ea1c2f Fixed a bug where reaping db_in_use states wasn't restricted to the caller's host_uuid. 3 years ago
Digimer 74b7719cf5 * Created the new anvil-manage-host that can check/set if a host is configured. On Strikers, it can age out data, resync data, and check/set if the local database is active. 3 years ago
Digimer edf51adaec * Changed 'anvil-manage-power' to no longer set the job progress to 50 prior to calling a reboot. It now sets to 100 immediately. Also reduced the uptime timer to five minutes from ten. 3 years ago
Digimer 7b090e1623 * Updated Database->shutdown() to disconnect, stop the postgresql daemon, then reconnect. 3 years ago
Digimer 3fd0db15bf * This rather heavily reworks how database shutdowns work. It adds much more intelligent shutdown, tracking who is using the database, being able to mark a database as "offline" and waiting for users of the database to disconnect before it shuts down. 3 years ago
Digimer b234b79544 Updated anvil-daemon to check if anvil-sync-shared is running if the reported RAM use is too high. If so, it doesn't exit. This fixes an issue where anvil-sync-shared would loop forever as it would constantly be killed when downloading large files. 3 years ago
Digimer 68b1d12545 Updated anvil-daemon to not shut down a striker DB until the striker host has been running for at least an hour. 3 years ago
Digimer f77f486775 Fixed a typo in scan-network 3 years ago
Digimer d70b9a4956 Updated scancore and anvil-daemon to check their RAM use at the end of each loop and, if it's using more than 1 GiB of RAM, it sends an alert and exits. 3 years ago
Digimer a633ab7f63 Added a periodic check to ensure all users can ping. This fixes a bug where a local striker dashboard whose DB was stopped wouldn't work. 3 years ago
Digimer e37f487704 Fixed a bug in System->check_ssh_keys where the 'admin' user's RSA keys were owned by root. 3 years ago
Digimer 892a475881 * Fixed a bug in Convert->format_mmddyy_to_yymmdd() where being passed '--' didn't return the same. 3 years ago
Digimer 652f87ec74 * Updated scan-network to also clean up the media type. 3 years ago
Digimer 72038e8358 * Fixed a bug where ethtool's Media type contained tab characters that broke JSON when configuring the network interfaces. 3 years ago
Digimer 3346d31194 * Created Get->kernel_release() that returns the current kernel release (version) in use on the host or on a remote system. 3 years ago
Digimer 65dfc22a38 Added an eval{} call around Database->query()'s ->prepare() DBI call to better handle lost database handle. 3 years ago
Digimer 034c38fdeb Disabled calling striker-prep-database from the spec file, and enabled scancore. 3 years ago
Digimer 8e41814ca2 * Updated anvil-daemon->prep_database() to start the postgresql daemon if it's not running and no databases are available. 3 years ago
Digimer b517117bc1 * Did more work on trying to figure out why initial setup of the database was failing. I believe it was because, in anvil-daemon, after calling 'prep_database' we called ->connect() _without_ 'check_if_configured' set. Next round of function testing should help confirm if this was the case. 3 years ago
Digimer 3445d008d2 Removed a stray debug die. 3 years ago
Digimer 63c45430bb * Updated scan-network to clear duplicate IP addresses. 3 years ago
Digimer e60a1b46b3 Fixed bugs related to automatic database startup and conditional backup loading. 3 years ago
Digimer 4e9882812d * Fixed a bug where the periodic database dumps on the primary database Striker were not sync'ing to peers. Also fixed a bug where these periodic dumps weren't running at all. 3 years ago
Digimer 72b17ff1f9 * Reworked how databases are stopped, now being handled in anvil-daemon. This way, initial starts will still do traditional resyncs, then shut down. This should allow the best of both worlds, where data is not lost on striker start/stop loss/recovery, but operate normally otherwise without delays. 3 years ago
Madison Kelly 922899ea78 * WIP: Working on a new method of failing over between which Striker is the active database, instead of running N-number of databases all the time. 3 years ago
Digimer a697011b08 * Disabled debug logging in anvil-daemon. 4 years ago
Digimer 6777104398 * Fixed a bug in anvil-daemon where, when an anvil-manage-power reboot run had triggered a reboot, anvil-daemon didn't set the job_progress to '100', causing constant reboots. Also fixed a bug where the log level was hard-set to '1' instead of '2' needed during debugging. 4 years ago
Digimer 0c475d2a2e * Fixed a couple logging bugs. 4 years ago
Digimer d3052c0229 * Finished Cluster->check_server_constraints() and added it to scan-cluster. This now makes sure servers don't roll back to their old host after it has been fenced and recovers. 4 years ago
Digimer e7a06fce72 * Disabling the periodic network health check in anvil-daemon. 4 years ago
Digimer 30f478267a * Forced anvil-daemon to log-level 2 and to enable secure logging to continue debugging setup issues. 4 years ago
Digimer 47fa126a3c * Fixed a typo that blocked anvil-daemon from starting. 4 years ago
Digimer 023f43eda9 * In the never-ending attempt to resolve the build consistency issues, this commit enables extra debugging logging and, hopefully, implements a fix in anvil-daemon where a job could be started repeatedly. 4 years ago
Digimer bd24c1c5bb * I _might_ have fixed the network configuration issue in anvil-configure-host... Updated it so that if 'nmcli' doesn't report a valid device name, it looks for it in the ifcfg-X file, and uses 'X' if not found there. 4 years ago
Digimer c7c6c8dee5 * Reworked the attempt to repair the network in anvil-daemon to not touch the network until the machine has been running for at least two minutes. 4 years ago
Digimer 1e7847d4dd * Added a call to Network->check_bonds() to be called while non-Striker machines wait to connect to a database. 4 years ago
Digimer 3f32a56d0c * Created Network->check_bonds() that checks to see if any bonds are down, or if any interfaces configured to be in a bond are not actually in it. It accepts a 'heal' parameter that, by default, will bring up a bond with no active links, but leaves degraded bonds alone. It can also take 'all' and will try to bring up any missing interfaces. This distinction exists so that if a link is flaky and someone takes it down manually until it can be repaired, it doesn't get turned back on. 4 years ago
Digimer 19c41c9171 * Added more logging while chasing a function test bug. 4 years ago
Digimer daca6c887b * This contains a fairly major change to how time stamps are handled. All INSERT and UPDATE calls now generate a new timestamp via Database->refresh_timestamp, instead of using 'sys::database::timestamp'. This was done in response to finding a bug where tables in a database differed in both counts of public and private schemas (ip_addresses table, specifically) that failed to resync because the timestamps were re-used too often. 4 years ago
Digimer 96fffb0b96 * Finished updating ocf:alteeve:server to no longer require a database connection. To do this, and still be able to track live migration times, the Server->migrate_virsh() method now writes out the server name and migration time to a /tmp/anvil/migration-duration.<server_name>.<unix_time> file. This file is checked for by the scan-server resource agent and, when found, is parsed and the migration duration is recorded, then the file is purged. 4 years ago
Digimer 24ec17f8f7 * Added a new parameter called 'sensitive' to Database->connect() that returns after connections before any ancillary checks are done, minimizing connect time. 4 years ago