77 Commits (da9dc03d04c9d19d4f13f2eb04dbb50f2b551b6f)

Author SHA1 Message Date
Digimer 607c097fc8 * Fixed a bug where, once a DRBD resource was allowed to be dual-primary for migration, that wasn't properly disabled post-migration. 4 years ago
Digimer 5b4bfa747c * Reworked the anvil-join-anvil job parsing to help diagnose occasional faults. Also changed a fatal parse error to one that allows the run to be retried. 4 years ago
Digimer 96fffb0b96 * Finished updating ocf:alteeve:server to no longer require a database connection. To do this, and still be able to track live migration times, the Server->migrate_virsh() method now writes out the server name and migration time to a /tmp/anvil/migration-duration.<server_name>.<unix_time> file. This file is checked for by the scan-server resource agent and, when found, is parsed and the migration duration is recorded, then the file is purged. 4 years ago
Digimer 16c20ae69c * Updated Tools->catch_sig() to use return code 0 instead of 255 so that systemd doesn't think our daemons failed on stop. 4 years ago
Digimer 24ec17f8f7 * Added a new parameter called 'sensitive' to Database->connect() that returns once connections are made, before any ancillary checks are done, minimizing connect time. 4 years ago
Digimer 7abbc938af * Renamed tools/striker-purge-host to tools/striker-purge-target and moved the code from test.pl over to it. No longer provides interactive selection, but now does work with Anvil! systems as well as hosts. 4 years ago
Digimer 4a87ee71db * This commit started with work on webui endpoint set_power, but then switched to scancore debugging and I neglected to switch branches. 4 years ago
Digimer 3a6902d899 * Made good progress on anvil-safe-stop. It will now stop or migrate servers (testing needed). 4 years ago
Digimer 2e37691116 * Updated DRBD->gather_data() to store data on peers so that the peer's LV path and backing disk are recorded. Also fixed a bug in ->get_status() where the return code for local calls was stored as a host name. 4 years ago
Digimer 711a04999e * Finished anvil-migrate-server and anvil-safe-start! Lots of testing still needed for both though, and 'anvil-safe-start' doesn't run as a job yet, but the logic is all there. 4 years ago
Digimer eec14cb013 * Finished tools/anvil-boot-server and tools/anvil-shutdown-server. 4 years ago
Fabio M. Di Nitto 8f9892650b [build] first pass at adding a build system to integrate with CI 4 years ago
Digimer 413a4f73c2 * Updated Tools->_anvil_version() and Get->anvil_version() to now pick up a SchemaVersion from anvil.sql. This will change only when the schema changes and is used when Database->connect() is checking compatibility with other anvil database hosts. This will make it only break connection when there is a reason to do so. The anvil_version still remains as an informational version that will help when supporting users later. 4 years ago
Digimer 549dbad635 * Created Cluster->delete_server(), which deletes a server resource from pacemaker (stopping it first, if needed). 4 years ago
Digimer 05b1fccdb3 * Created Cluster->add_server() which, well, adds a server to a pacemaker cluster, including sorting out location constraints to favour the node the server is running on, if it's running. 4 years ago
Digimer 1d03a386d3 * Created Database->get_bridges() that, surprise, loads data from the 'bridges' table. 4 years ago
Digimer d677d19ca0 * Moved Database->check_condition_age to Alert. 4 years ago
Digimer 33101f969a * Fixed several bugs related to tracking server boots, migrations and shut downs in the anvil database. The 'ocf:alteeve:server' now has (mostly?) safe integration with the Anvil! database. This was mostly done by updating Servers->boot_virsh(), ->shutdown_virsh() and ->migrate_server(). 4 years ago
Digimer be88be6d30 * Did a bunch of testing / bugfixes for scan-server. 4 years ago
Digimer 46f1a05789 * Got the code in scan-server to the point where it _should_ now gracefully and automatically detect changes to a server's definition, whether they originate from the database (via Striker), from directly editing the on-disk definition file, or from editing via libvirt tools (like virt-manager). Still needs to be tested though. 4 years ago
Digimer 4dfe0cb5a0 * Created Cluster->boot_server, ->shutdown_server and ->migrate_server methods that handle booting, migrating and shutting down servers. Also created the private method ->_set_server_constraint which is used by migrate and boot to set resource constraints to control where a server boots or migrates to. 4 years ago
Digimer 0f7267eae1 * Moved the '_host_name', '_short_host_name', and '_domain_name' private methods in Tools.pm over to Get.pm (removing the leading '_' in the method names). 4 years ago
Digimer b2c7fd95fb * Renamed the ScanCore unit file to scancore. 4 years ago
Digimer 14bf323627 * Fixed an issue with ocf:alteeve:server where, after a migration, the target host would invoke the RA as if it was trying to migrate, instead of verifying the server (resource) was OK post migration. 4 years ago
Digimer 1498e1b53c * Got server migration working using ocf:alteeve:server in a test environment! 4 years ago
Madison Kelly 30f2b3fa8e * Switched all hash 'local' keys to be the host's short user name. Untested, likely bugs to be fixed in the next commit. 4 years ago
Digimer 47203490a9 * Working on getting live migration to work with ocf:anvil:striker using the environment variables that pacemaker sets. Incomplete, but getting close. 4 years ago
Digimer cc1e0e2f77 * Updated ocf:alteeve:server to properly report a server's status when a monitor action is called. 4 years ago
Digimer 6eeb4e48c7 * Added 'timeout' and 'wfc-timeout' to drbd's global-common.conf in DRBD->update_global_common(). 4 years ago
Digimer e35800c413 * Fixed up ocf:alteeve:server (though more testing/work is needed) to get it working with DRBD resources referenced using '/dev/drbd/by-res/...'. 4 years ago
Digimer 01974d7efe * Finished (though testing is needed) the updated ocf:alteeve:server resource agent. It now handles starting and stopping libvirtd and drbd daemons on-demand. 5 years ago
Digimer dcd1fd1492 * Created Cluster->check_node_status() that checks the status of a node (in pacemaker). 5 years ago
Digimer 726a4374d1 * Renamed the database table 'host_keys' to 'ssh_keys' to better represent what it stores. 5 years ago
Digimer 934c9b1286 * Updated logging to now log anything with 'priority' set to a new 'anvil.alert.log' file (while still also logging as normal to anvil.log). This should make it easier to watch for alert messages. 5 years ago
Digimer 7cdd2f60e9 * Created Network->download() to handle downloading a file on the local system. Created ->bridge_info() to parse 'bridge' output. Created ->load_ips() to load IP address information from the database (as opposed to ->get_ips() which queries a system). 5 years ago
Digimer 3a86bed694 * Fixed tools/striker-initialize-host so that it sets the hostname on the target, not locally. 5 years ago
Digimer bc341809ca * Finished (for now) ocf:alteeve:server! It can boot, migrate and stop a server cleanly. It still checks to see if DRBD needs to be started and does so when needed, but it won't stop it anymore. 5 years ago
Digimer 8a2c86d088 * Renamed striker-configure-host (back) to anvil-configure-host, and started updating it to work on any machine type. 5 years ago
Digimer c0dd34334e * Fixed another bug in making ocf:alteeve:server work in pacemaker. 5 years ago
Digimer ed2e83a1a4 * Fixed a few more bugs in 'ocf:alteeve:anvil', but it's still failing when invoked by pacemaker. 5 years ago
Digimer f5caec52dc * Made DRBD->allow_two_primaries() smarter about finding the 'target_node_id' when it wasn't passed. 5 years ago
Digimer 113a44ecc6 * Got 'migrate_to' working in ocf:alteeve:server. 'migrate_from' still needs work. 5 years ago
Digimer 7db542b9b0 * Fixed a bug affecting definition files that used '<source file='X'/>' instead of '<source dev='X'/>' for the backing block device for disks. 5 years ago
Digimer 873ed3e2b0 * Fixed some typo bugs. 5 years ago
Digimer dff74102db * Created (but not yet tested) Server->shutdown() to, well, shut down servers. 5 years ago
Digimer d224be9344 * Created DRBD->manage_resource() that allows for up/down/primary/secondary'ing a resource on a local or remote system. 5 years ago
Digimer b1ddf945e2 * Got ocf:alteeve:server working again to boot servers. It's now smarter, knowing when the server is running locally already (success), running on the other node (hard error) and running on DR (fatal error). 5 years ago
Digimer 324ef351fe * Updated DRBD->get_devices() to properly identify the peer node, when run on an actual node in the cluster (not DR or Striker). 5 years ago
Digimer 16f79ca244 * Created System->get_bridges() that gets a list of bridges (and connected interfaces, and data). Also created ->get_free_memory() that returns the amount of available RAM. 5 years ago
Digimer 4a93682447 * Started rebuilding ocf:alteeve:server using the new module methods. 5 years ago