61 Commits (560a0f9158aff6e6117099d5ab0a703dd0cc689e)

Author SHA1 Message Date
digimer efebd135eb * Removed more references to 'dr1_host_uuid' from the old way of linking DR hosts to Anvil! nodes. 2 years ago
digimer 9751c883cb * Updated Cluster->assemble_storage_groups() to remove references to anvil_dr1_host_uuid. Also added the logic for auto-adding DR host's VGs to a storage group. Commented it out though as, for now, this might be a bad idea. Needs more thought. 2 years ago
digimer e012d6016c The major point of this commit is to add the new 'anvil-manage-storage-groups' program that, well, manages storage groups. 2 years ago
digimer 9d2f9c4d88 * Fixed a string key name typo. 2 years ago
digimer a3988cc3e5 * Added System->configure_logind() to ensure that nodes are configured to ignore ACPI power button events so that IPMI-based fences work immediately. 2 years ago
Digimer 4ba1982183 This is the start of a set of changes needed to rework how we handle DRBD fence requests, so that they create location constraints instead of triggering a full stonith fence. 2 years ago
Tsu-ba-me c413e62798 fix(striker-ui-api): pass Remote->test_access() user to Cluster->get_primary_host_uuid() 2 years ago
Digimer bde0b2e7ec * Fixed a bug where deleting ports from a fence device in an Install Manifest would not cause the fence methods to be removed from the associated cluster. 2 years ago
Digimer d271ffec26 * Updated Cluster->parse_crm_mon() to record the role of stonith resources. 2 years ago
Digimer d8f31d9d84 * Added the anvil-boot-server man page. 2 years ago
Digimer 1e159f548e Added a couple notes for later dev. 4 years ago
Digimer 0c77736dc8 * Fixed a bug in Cluster->manage_fence_delay() where removing the 'delay="15"' attribute was failing; it is now set to 0 instead. 4 years ago
Digimer 7e7b91b286 * Updated anvil-join-anvil to update corosync.conf to use the BCN1 link as the main knet network with the SN1 link as the backup link. 4 years ago
Digimer 607c097fc8 * Fixed a bug where, once a DRBD resource was allowed to be dual-primary for migration, that wasn't properly disabled post-migration. 4 years ago
Digimer d3052c0229 * Finished Cluster->check_server_constraints() and added it to scan-cluster. This now makes sure servers don't roll back to their old host after it has been fenced and recovers. 4 years ago
Digimer 5a343d6d75 * WIP; Started work on Cluster->check_server_constraints() that will track when a server's location constraint needs to be updated when the old preferred node is lost. 4 years ago
Digimer b71ed28f64 * Added Cluster->manage_fence_delay() that reports back and, optionally, sets a preferred node in a fence race. 4 years ago
Digimer 80bdac8e34 * Updated the pacemaker server config to drop the stop timeout to 5 minutes and the migration timeout to 10 minutes. This will avoid blocking the entire cluster when a stop or migrate operation times out. Will update scan-server to clean these up when they happen. 4 years ago
Digimer 16c20ae69c * Updated Tools->catch_sig() to use return code 0 instead of 255 so that systemd doesn't think our daemons failed on stop. 4 years ago
Digimer fc0954d0c8 * Started work on, but not at all finished, anvil-manage-server which will allow manipulation of a server's resources. 4 years ago
Digimer 4a87ee71db * This commit started with work on webui endpoint set_power, but then switched to scancore debugging and I neglected to switch branches. 4 years ago
Digimer 416f51323a * Created tools/striker-boot-machine to, well, boot machines. It uses host_ipmi or, failing that, other fence methods when available to boot a node. 4 years ago
Digimer ca7052dd53 The core logic is done!!!! Still need to finish end-points for the WebUI to hook into, but the core of M3 is complete! Many, many bugs are expected, of course. :) 4 years ago
Digimer 3a6902d899 * Made good progress on anvil-safe-stop. It will now stop or migrate servers (testing needed). 4 years ago
Digimer 711a04999e * Finished anvil-migrate-server and anvil-safe-start! Lots of testing still needed for both though, and 'anvil-safe-start' does not run as a job yet, but the logic is all there. 4 years ago
Digimer eec14cb013 * Finished tools/anvil-boot-server and tools/anvil-shutdown-server. 4 years ago
Digimer a480357049 * Fixed a bug in Cluster->assemble_storage_groups() where, if a group is created during an anvil-provision-server run, the group would get created multiple times. 4 years ago
Digimer b36093671b * Updated Database queries that were passing 'debug => $debug' to not do that, as it was causing far too much (useless) noise in the logs. 4 years ago
Digimer e036515df3 * Got anvil-safe-start to the point where it starts the cluster stack. Need to create the 'anvil-boot-server' and 'anvil-shutdown-server' before it can be completed, so those files have been added. 4 years ago
Digimer fb0836f912 * The get_cpu endpoint was completed. 4 years ago
Digimer 5536e8ff47 * Updated Cluster->assemble_storage_groups() and Cluster->anvil_name_from_uuid() and ->available_resources() to try to detect the anvil_uuid if not passed in. 4 years ago
Digimer 0ec1bf6b6a * Updated DRBD->delete_resource() to return a success if asked to delete a non-existent resource (as can happen when partial anvil-delete-server runs are re-run). 4 years ago
Digimer 4b9ec56106 * Updated DRBD->delete_resource() to return a success if asked to delete a non-existent resource (as can happen when partial anvil-delete-server runs are re-run). 4 years ago
Digimer 864d67b0a7 * Finished fixing automatic building of Storage Groups on systems where VGs are deleted. 4 years ago
Digimer 413a4f73c2 * Updated Tools->_anvil_version() and Get->anvil_version() to now pick up a SchemaVersion from anvil.sql. This will change only when the schema changes and is used when Database->connect() is checking compatibility with other anvil database hosts. This will make it only break connection when there is a reason to do so. The anvil_version still remains as an informational version that will help when supporting users later. 4 years ago
Digimer 89dec8e1f9 * Finished anvil-delete-server! (More testing needed though) 4 years ago
Digimer 549dbad635 * Created Cluster->delete_server(), which deletes a server resource from pacemaker (stopping it first, if needed). 4 years ago
Digimer 05b1fccdb3 * Created Cluster->add_server() which, well, adds a server to a pacemaker cluster, including sorting out location constraints to favour the node the server is running on, if it's running. 4 years ago
Digimer a7f0676a0f * Got the 'anvil-provision-server' script to the point where it actually saves the new server job. 4 years ago
Digimer f30cce3c5a * Created the new tools/anvil-provision-server tool which will handle provisioning new servers, as well as having an interactive menu system to provision servers from the command line. 4 years ago
Digimer d677d19ca0 * Moved Database->check_condition_age to Alert. 4 years ago
Digimer 33101f969a * Fixed several bugs related to tracking server boots, migrations and shut downs in the anvil database. The 'ocf:alteeve:server' now has (mostly?) safe integration with the Anvil! database. This was mostly done by updating Servers->boot_virsh(), ->shutdown_virsh() and ->migrate_server(). 4 years ago
Digimer 262cbccb35 * Finished scan-server, though lots of testing needed. 4 years ago
Digimer 46f1a05789 * Got the code in scan-server to the point where it _should_ now gracefully and automatically detect changes to a server's definition originating from the database (via Striker), directly editing the on-disk definition file, or editing via libvirt tools (like virt-manager). Still needs to be tested though. 4 years ago
Digimer 1a1fa7ce88 * Created Cluster->get_anvil_uuid() that returns the 'anvil_uuid' of a given 'host_uuid'. 4 years ago
Digimer e6e4c7d530 * Moved Server->_parse_definition() to ->parse_definition() to make it a public method. 4 years ago
Digimer e240a32a19 * Created Cluster->parse_crm_mon and updated Cluster->parse_cib() to determine what state a server is in and which host has a server. 4 years ago
Digimer 4dfe0cb5a0 * Created Cluster->boot_server, ->shutdown_server and ->migrate_server methods that handle booting, migrating and shutting down servers. Also created the private method ->_set_server_constraint which is used by migrate and boot to set resource constraints to control where a server boots or migrates to. 4 years ago
Digimer 0f7267eae1 * Moved the '_host_name', '_short_host_name', and '_domain_name' private methods in Tools.pm over to Get.pm (removing the leading '_' in the method names). 4 years ago
Digimer b2c7fd95fb * Renamed the ScanCore unit file to scancore. 4 years ago