530 Commits (99dc4ba6ba5c13af72554a70aa7d2d1ccdbd6bbd)

Author SHA1 Message Date
digimer b0c54b6dae * Updated anvil-update-system to check if another instance of anvil-update-system is running and, if so, exit. 1 year ago
digimer 7bd76c10dc Major thing in this commit is reworking striker-update-cluster to work without expecting anvil-daemon to be running on target machines. Similarly, they had to be able to work when the Striker DBs were not available. This is to account for cases where the Striker dashboards have updated, and the schema has changed, preventing the not-yet-updated DR hosts and subnodes from being able to use the DB. To do this, anvil-safe-stop, anvil-update-system, and anvil-shutdown-server had to be updated to use the new --no-db switch, which tells them to run without the database being available. 1 year ago
digimer 9bc78860a6 * Updated anvil-update-system to detect kmod-drbd upgrade problems and fix them. 1 year ago
digimer 42b44ac864 * Updated the log showing why anvil-daemon isn't exiting when a job is running to include the job's current progress. 1 year ago
digimer d741f4aa6f * Updated anvil-daemon to not exit on high RAM use if any job is running. 1 year ago
digimer 751687129a * Updated anvil-daemon to not exit on RAM use if anvil-update-system is running. 1 year ago
digimer 3016fb875b * Reworked striker-update-cluster to use anvil-update-system for on-system OS updates. 1 year ago
digimer 1b8b0bc493 * Created the new 'anvil-manage-server-storage' with the first role of reloading a DRBD resource. 1 year ago
digimer ea95d26cc5 * Fixed a bug in DRBD->get_next_resource() where reserved minor numbers were not being released. Also added a new parameter, "minor_only", that returns the next minor number but doesn't bother processing TCP ports. 1 year ago
digimer 88cc76914d This is an attempt to fix issue #341. It switches the search for SN IPs from Network->find_matches() to Network->find_access(), the latter of which doesn't care about the interface the IP was found on. 1 year ago
digimer c9e11fbbfc * Added checks to anvil-provision-server to fail out if either of the SN IPs are not found when generating a DRBD resource config. 1 year ago
digimer 156a0ca201 Updated anvil-daemon's new job launching logic to allow the restart of a running job that failed out early. 1 year ago
digimer 47f7a35df3 The main purpose of this commit is to add serial execution of similar jobs to help reduce race conditions for scripted jobs, like multiple server creation. 1 year ago
digimer b6a249d5e7 * Updated Cluster->add_server() to set the preferred host based first on if the server is running on a node, and if not, on the primary node (where before it defaulted to node 1). 1 year ago
digimer b7abc481e6 Updated scan-cluster to check to see that migrate_to and migrate_from are given a timeout of 600s and an on-fail of "block". Updated Cluster->add_server() to set migrate_from to timeout=600s and on-fail=block as well. 1 year ago
digimer c82bd9d73a * Created the new anvil-watch-power tool that shows the status of UPSes known on the system, including their "on battery" state, charge percentage, estimated hold up time, etc. 1 year ago
digimer 0e57836c8f This commit addresses (hopefully) issue #329. 1 year ago
digimer 110dceb55e * Added a check to make sure files were ready before provisioning a server. 2 years ago
digimer c50a1936c0 * This adds the new 'file_locations' -> 'file_location_ready' column and associated methods. This is set to TRUE/1 when the file referenced is found on disk and it is the expected size and md5sum. This is meant to allow programs to wait/watch for a file to be ready if they need to use it. Files are now checked periodically via anvil-daemon. 2 years ago
digimer 895f1ec262 This fixes a race condition when multiple servers are provisioned at (nearly) the same time. 2 years ago
digimer 0874ad571a Updated anvil-safe-start to not give up on starting corosync/pacemaker if it fails on the first try. 2 years ago
digimer 83a527f4fa * Removed enabling anvil-safe-start out of the RPM and into anvil-join-anvil. 2 years ago
digimer 89eae7098e NOTE: This updates the reserved RAM to 8 GiB from 4 GiB! 2 years ago
digimer f9689a7106 Updated ocf:alteeve:server to look for '/tmp/<resource>.fail' and, if that file exists, exit with rc:1. This is done to allow for testing. 2 years ago
digimer cf73d8ed36 * Updated System->configure_ipmi() to auto-configure DR hosts once they've been assigned a BCN IP address. 2 years ago
digimer efebd135eb * Removed more references to 'dr1_host_uuid' from the old way of linking DR hosts to Anvil! nodes. 2 years ago
Fabio M. Di Nitto 856809c723 Fix typo in log message 2 years ago
Fabio M. Di Nitto a6f2c2271e Fix typo in log message 2 years ago
digimer b144976853 This resolves Issue #310. 2 years ago
digimer 645f54ab89 This commit has more changes than I would normally like, but it's all linked to changing file uploads to rsync serially. 2 years ago
digimer 7773e5f9b8 * Updated logging in DRBD->get_devices(). 2 years ago
digimer e012d6016c The major point of this commit is to add the new 'anvil-manage-storage-groups' program that, well, manages storage groups. 2 years ago
digimer f8743a7435 * Further work on anvil-manage-dr. Now properly sanity checks that a valid server is passed. 2 years ago
digimer 1a217d21cf * Updated anvil-manage-dr to provide the ability to link anvil nodes to dr hosts. Also began work on making it work with the new DR links system. 2 years ago
digimer 17863404e3 * Updated Database->_age_out_data() to only run once per day, unless explicitly called with --age-out-database. 2 years ago
digimer ff69916a85 * Applied typo fixes from PR #286 (thanks, Deezzir!). Also moved all the raw prints into words.xml. 2 years ago
digimer 9d2f9c4d88 * Fixed a string key name typo. 2 years ago
digimer b8b4352117 * Added support for Migration Network configs in old striker and anvil-configure-host 2 years ago
digimer b27a43eaf7 * Updated striker to only require 6 interfaces when configuring a node. 2 years ago
digimer 0fa6ddebc5 Updated scan-network to see an interface state of 'activated' as up (used to check specifically for 'active'). 2 years ago
digimer a3988cc3e5 * Added System->configure_logind() to ensure that nodes are configured to ignore ACPI power button events so that IPMI-based fences work immediately. 2 years ago
digimer dfa93a1837 * Added 'setsid' to all 'virsh' calls as nested calls (ie: crm_resource -> ocf:alteeve:server -> virsh) would fail because virsh couldn't connect to a terminal. See: 2 years ago
digimer b666caec64 * Updated anvil-provision-server to handle startup when the peer doesn't create/connect its DRBD resource (ie: node is offline). 2 years ago
digimer a5cee52153 * Fixed a bug in DRBD->get_devices() where old test host UUIDs were left hard-coded. 2 years ago
Digimer 6d59399c73 * Updated the short OS list. 2 years ago
Digimer f9ca6fb170 * This adds the new anvil-version-change tool which anvil-daemon will call on startup to handle checks for changes made over releases/updates. 2 years ago
Digimer 02e371ac56 Updated virsh OS list. 2 years ago
Digimer f6cbe7d1d2 * Fixed a bug in System->collect_ipmi_data() where double-quoted passwords were preventing reading of the sensor data. 2 years ago
Digimer 4ba1982183 This is the start of a set of changes needed to rework how we handle DRBD fence requests, so that they create location constraints instead of triggering a full stonith fence. 2 years ago
Digimer 6eb99a2168 * Finished the anvil-manage-alerts tool. It can now send test alerts at a user-requested alert level. 2 years ago