333 Commits (d7a5ce747165970737b287fdec324fe1e3eb7bc3)

Author SHA1 Message Date
digimer 02c3d204ea * Updated anvil-update-system to set 'job_data' to track reboots, and striker-update-cluster to read it. 1 year ago
digimer 65af56d5bd * Updated Database->insert_or_update_jobs() to not look for jobs that are complete when no job_uuid is passed. 1 year ago
digimer e0316da88b * Got anvil-manage-server-storage working enough to grow existing disk's hard drive sizes, and to insert/eject optical disks. 1 year ago
digimer 47f7a35df3 The main purpose of this commit is to add serial execution of similar jobs to help reduce race conditions for scripted jobs, like multiple server creation. 1 year ago
digimer c82bd9d73a * Created the new anvil-watch-power tool that shows the status of UPSes known on the system, including their "on battery" state, charge percentage, estimated hold up time, etc. 1 year ago
digimer c50a1936c0 * This adds the new 'file_locations' -> 'file_location_ready' column and associated methods. This is set to TRUE/1 when the file referenced is found on disk and it is the expected size and md5sum. This is meant to allow programs to wait/watch for a file to be ready if they need to use it. Files are now checked periodically via anvil-daemon. 2 years ago
digimer 510db70253 Another attempt to resolve the storage group race condition. This moves the check for auto-assembly to scan-lvm. It only works for the first assemble, after that the user can/should use anvil-manage-storage-groups. 2 years ago
digimer dc7b909bfc More logging to debug storage group race condition 2 years ago
digimer ddc6965b60 * Fixed a bug where references to files on Anvil! nodes were broken in anvil-provision-server and anvil-manage-files. 2 years ago
digimer efebd135eb * Removed more references to 'dr1_host_uuid' from the old way of linking DR hosts to Anvil! nodes. 2 years ago
digimer 41fb8baeda * Fixed a bug in Database->get_storage_group_data() that was deleting DR host storage group members. 2 years ago
digimer 8ff40ec42c * Fixed a SQL query bug in Database->get_drbd_data(). 2 years ago
digimer 040bc02e26 * This adds the new Database->get_drbd_data() that, like ->get_lvm_data, collates the DRBD data collected by scan-drbd into a more readily parsable data structure. 2 years ago
digimer 8e0e51544c * Continued work on anvil-manage-server-storage. 2 years ago
digimer b144976853 This resolves Issue #310. 2 years ago
digimer 254f7ef4e2 This should fix the tracking of what files belong where, using the new DR links system. It also should finish (though testing is still needed) the serial rsync issue. 2 years ago
digimer 645f54ab89 This commit has more changes than I would normally like, but it's all linked to changing file uploads to rsync serially. 2 years ago
digimer 9751c883cb * Updated Cluster->assemble_storage_groups() to remove references to anvil_dr1_host_uuid. Also added the logic for auto-adding DR host's VGs to a storage group. Commented it out though as, for now, this might be a bad idea. Needs more thought. 2 years ago
digimer e012d6016c The major point of this commit is to add the new 'anvil-manage-storage-groups' program that, well, manages storage groups. 2 years ago
digimer 1a217d21cf * Updated anvil-manage-dr to provide the ability to link anvil nodes to dr hosts. Also began work on making it work with the new DR links system. 2 years ago
digimer 16fc4e131c * Fixed a bug where, if a specific request to do a DB resync was made but the active_uuid wasn't matching the host, it wouldn't resync. This broke peering Strikers when the peer source was not the active_uuid. 2 years ago
digimer 985338a064 Fixed typo that broke compilation. 2 years ago
digimer 17863404e3 * Updated Database->_age_out_data() to only run once per day, unless explicitly called with --age-out-database. 2 years ago
digimer 3d6f71f27e * Updated Database->connect to clean up duplicates on setting the read UUID and database handle. 2 years ago
digimer 26a1fe1491 * Updated Database->connect() to allow local reads on strikers, regardless of the active DB. 2 years ago
digimer 5fcbb1643c * Updated Database->connect() to set an 'active_uuid', and the host with that UUID will be the only one to do resyncs. This might help with frequent resyncs, which could be caused by simultaneous resyncs happening on both nodes stepping on each other. This should help with issue #276 2 years ago
digimer 6ca0e0da90 * Updated Database->connect() to only try to load from dump files if 2+ databases are configured in striker. 2 years ago
digimer a3988cc3e5 * Added System->configure_logind() to ensure that nodes are configured to ignore ACPI power button events so that IPMI-based fences work immediately. 2 years ago
Digimer 9194eb3d09 * Updated System->check_if_configured() to record that a host is configured in /etc/anvil to make the system auto-mark as configured if the host is removed from the DB (or, more specifically, variables -> system::configured is lost). 2 years ago
Digimer f9ca6fb170 * This adds the new anvil-version-change tool which anvil-daemon will call on startup to handle checks for changes made over releases/updates. 2 years ago
Digimer 33b4516dea Fix a variable quoting bug in Database->locking(). 2 years ago
Digimer 622fb84652 * Renamed the 'notifications' table to 'alert-override', better reflecting what it does. 2 years ago
Digimer a6cd5c6604 * Started work on the new anvil-manage-alerts, which will (when done) allow for management of mail servers, alert recipients and notification over-rides, and for triggering test alerts. 2 years ago
Digimer 3b721b849c * Fixed a bug in anvil-configure-host where if the same MAC address was assigned to two interfaces, it would cause an endless reboot loop. 2 years ago
Digimer 99a6593fe6 * Fixed a bug when connecting to databases when one DB has no variable entries, making it seem like a DB was disabled. 2 years ago
Digimer 4ecc6097d3 * Cleaned up some old 'die' calls with better nice_exit() calls to help avoid dangling db_in_use flags. 2 years ago
Digimer ef3ac86162 * Fixed a bug when setting the db_in_use flag without a valid $ENV{_}. 2 years ago
Digimer 21738ab0d4 Added a bit more logging to the Database->mark_active method. 2 years ago
Digimer a81478f2bc * Updated 'db_in_use' state to add the caller's name to the state name. This is pulled out when logging stale locks that are being reaped, to help debug where stale locks are coming from. 2 years ago
Digimer e7cf8ac789 * Got more work done on anvil-manage-files. It now picks up new files on nodes/dr hosts in an Anvil! and downloads them if needed. 2 years ago
Digimer cd220e97dc Disabled striker-prep-databas and set Database->configure_pgsql() calls to use debug => 2. 2 years ago
Digimer 7fd6185445 * Disabled firewalling for now. There appears to be an issue starting up with DRBD. 2 years ago
Digimer 171ea74000 * There is a fix in this commit to resolve a race condition where, when reconfiguring the network, the request to set a job to reboot would fail because the connections to all Strikers could be lost, causing Database->_test_access() to error out, blocking the reboot. When restarted, the network would not be changed, so no reboot would be requested, leaving the machine in an inaccessible state. 2 years ago
Digimer b2ea4f9adc * Moved System->manage_firewall() to Network->manage_firewall(). Started working on actually implementing it, which involves basically fully rewriting it. 2 years ago
Digimer 1580ffbb24 Added the 'oui' table to the resync list again. 2 years ago
Digimer b154ec816a * Added network_interfaces, bonds, bridges and ip_addresses tables to the age-out list. 2 years ago
Digimer b77bb81343 * Found a bug where, if a record was deleted from the public schema but not from the history schema, and then later a resync was performed, the record would be added to the peer database's public schema (while still not existing locally). This condition should never occur as data in history should only exist to track the public record. This update checks for this condition and purges those records prior to resyncing a database table. 2 years ago
Digimer 6c5f48e8ca * Fixed a bug (I think) where initial synchronization was failing because the new locking system tried to register a lock against the peer striker before the peer striker was in the DB. 2 years ago
Digimer 911f7cfb6a This is another big commit with a lot of DB work. Getting closer to sorting out the frequent resyncs. 2 years ago
Digimer 24f5d39dff This is a set of changes all stemming from trying to debug frequent resyncs. More bugs still to be fixed. 2 years ago