366 Commits (40d13433cf32e2a216e787da93b1e86061e2e44c)

Author SHA1 Message Date
digimer 9751c883cb * Updated Cluster->assemble_storage_groups() to remove references to anvil_dr1_host_uuid. Also added the logic for auto-adding DR host's VGs to a storage group. Commented it out though as, for now, this might be a bad idea. Needs more thought. 2 years ago
digimer e012d6016c The major point of this commit is to add the new 'anvil-manage-storage-groups' program that, well, manages storage groups. 2 years ago
digimer 1a217d21cf * Updated anvil-manage-dr to provide the ability to link anvil nodes to dr hosts. Also began work on making it work with the new DR links system. 2 years ago
digimer 16fc4e131c * Fixed a bug where, if a specific request to do a DB resync was made but the active_uuid wasn't matching the host, it wouldn't resync. This broke peering Strikers when the peer source was not the active_uuid. 2 years ago
digimer 985338a064 Fixed typo that broke compilation. 2 years ago
digimer 17863404e3 * Updated Database->_age_out_data() to only run once per day, unless explicitly called with --age-out-database. 2 years ago
digimer 3d6f71f27e * Updated Database->connect to clean up duplicates on setting the read UUID and database handle. 2 years ago
digimer 26a1fe1491 * Updated Database->connect() to allow local reads on strikers, regardless of the active DB. 2 years ago
digimer 5fcbb1643c * Updated Database->connect() to set an 'active_uuid', and the host with that UUID will be the only one to do resyncs. This might help with frequent resyncs, which could be caused by simultaneous resyncs happening on both nodes stepping on each other. This should help with issue #276 2 years ago
digimer 6ca0e0da90 * Updated Database->connect() to only try to load from dump files if 2+ databases are configured in striker. 2 years ago
digimer a3988cc3e5 * Added System->configure_logind() to ensure that nodes are configured to ignore ACPI power button events so that IPMI-based fences work immediately. 2 years ago
Digimer 9194eb3d09 * Updated System->check_if_configured() to record that a host is configured in /etc/anvil to make the system auto-mark as configured if the host is removed from the DB (or, more specifically, variables -> system::configured is lost). 2 years ago
Digimer f9ca6fb170 * This adds the new anvil-version-change tool which anvil-daemon will call on startup to handle checks for changes made over releases/updates. 2 years ago
Digimer 33b4516dea Fix a variable quoting bug in Database->locking(). 2 years ago
Digimer 622fb84652 * Renamed the 'notifications' table to 'alert-override', better reflecting what it does. 2 years ago
Digimer a6cd5c6604 * Started work on the new anvil-manage-alerts, which will (when done) allow for management of mail servers, alert recipients, and notification overrides, and will be able to trigger test alerts. 2 years ago
Digimer 3b721b849c * Fixed a bug in anvil-configure-host where if the same MAC address was assigned to two interfaces, it would cause an endless reboot loop. 2 years ago
Digimer 99a6593fe6 * Fixed a bug when connecting to databases when one DB has no variable entries, making it seem like a DB was disabled. 2 years ago
Digimer 4ecc6097d3 * Cleaned up some old 'die' calls with better nice_exit() calls to help avoid dangling db_in_use flags. 2 years ago
Digimer ef3ac86162 * Fixed a bug when setting the db_in_use flag without a valid $ENV{_}. 2 years ago
Digimer 21738ab0d4 Added a bit more logging to the Database->mark_active method. 2 years ago
Digimer a81478f2bc * Updated 'db_in_use' state to add the caller's name to the state name. This is pulled out when logging stale locks that are being reaped, to help debug where stale locks are coming from. 2 years ago
Digimer e7cf8ac789 * Got more work done on anvil-manage-files. It now picks up new files on nodes/dr hosts in an Anvil! and downloads them if needed. 2 years ago
Digimer cd220e97dc Disabled striker-prep-database and set Database->configure_pgsql() calls to use debug => 2. 2 years ago
Digimer 7fd6185445 * Disabled firewalling for now. There appears to be an issue starting up with DRBD. 2 years ago
Digimer 171ea74000 * There is a fix in this commit to resolve a race condition where, when reconfiguring the network, the request to set a job to reboot would fail because the connections to all Strikers could be lost, causing Database->_test_access() to error out, blocking the reboot. When restarted, the network would not be changed, so no reboot would be requested, leaving the machine in an inaccessible state. 3 years ago
Digimer b2ea4f9adc * Moved System->manage_firewall() to Network->manage_firewall(). Started working on actually implementing it, which involves basically fully rewriting it. 3 years ago
Digimer 1580ffbb24 Added the 'oui' table to the resync list again. 3 years ago
Digimer b154ec816a * Added network_interfaces, bonds, bridges and ip_addresses tables to the age-out list. 3 years ago
Digimer b77bb81343 * Found a bug where, if a record was deleted from the public schema but not from the history schema, and then later a resync was performed, the record would be added to the peer database's public schema (while still not existing locally). This condition should never occur as data in history should only exist to track the public record. This update checks for this condition and purges those records prior to resync'ing a database table. 3 years ago
Digimer 6c5f48e8ca * Fixed a bug (I think) where initial synchronization was failing because the new locking system tried to register a lock against the peer striker before the peer striker was in the DB. 3 years ago
Digimer 911f7cfb6a This is another big commit with a lot of DB work. Getting closer to sorting out the frequent resyncs. 3 years ago
Digimer 24f5d39dff This is a set of changes all stemming from trying to debug frequent resyncs. More bugs still to be fixed. 3 years ago
Digimer 1770e9e0e0 * Fixed a bug where Database resyncs were trying to resync tables without history schema entries. 3 years ago
Digimer e6dcff1cf1 * Added a missing modified_date to ip_addresses in Database->get_ip_addresses(). 3 years ago
Digimer 572167d034 * Updated Database->get_storage_group_data() to record the VG name for a given host's VG in a given storage group. 3 years ago
Digimer d26a16e711 * Updated anvil-provision-server to handle human-readable sizes for disk and ram. 3 years ago
Digimer 142be7674e * Fixed a bug in striker-scan-network where the scan wasn't running properly when no network was specifically given. 3 years ago
Digimer c9633aa3b0 * Updated Database->_find_behind_databases() to not run unless it's on a Striker. 3 years ago
Digimer dd9d5e6ba0 Updated resync to no longer be tied to a host_uuid. 3 years ago
Digimer 0b41029db2 Reworked Database->_find_behind_databases to loop through tables, then databases when evaluating for resync. This is still racy but should be less racy as the time between counts of columns for a given table should be a lot shorter. Also re-enabled triggering resyncs based on the age of the most recent record. 3 years ago
Digimer aa7d9bdf14 * Fixed a bug where resync'ing the database was missing tables. 3 years ago
Digimer 74b7719cf5 * Created the new anvil-manage-host that can check/set if a host is configured. On Strikers, it can age out data, resync data, and check/set if the local database is active. 3 years ago
Digimer 8fbf594002 Updated striker-prep-database to stop -> start postgres post-configure, and to connect -> disconnect to run the schema load logic. 3 years ago
Digimer 422d248cbe * Updated Database->insert_or_update_states() to not actually record unless the state_host_uuid exists in all available databases. 3 years ago
Digimer 7b090e1623 * Updated Database->shutdown() to disconnect, stop the postgresql daemon, then reconnect. 3 years ago
Digimer 513ce3b74e Created 'striker-db-status' that reports the status of the databases to external tools. It's basic, but it works. 3 years ago
Digimer 3fd0db15bf * This rather heavily reworks how database shutdowns work. It adds much more intelligent shutdown, tracking who is using the database, being able to mark a database as "offline" and waiting for users of the database to disconnect before it shuts down. 3 years ago
Digimer b234b79544 Updated anvil-daemon to check if anvil-sync-shared is running if the reported RAM use is too high. If so, it doesn't exit. This fixes an issue where anvil-sync-shared would loop forever as it would constantly be killed when downloading large files. 3 years ago
Digimer ec3b3d2ac9 Fixed a bug in Database->_age_out_data() where checking if a table existed was hard coded to one table. 3 years ago