Commit Graph

749 Commits

Author SHA1 Message Date
Digimer
27259d1d53 * Finished anvil-rename-server!
* Created Storage->delete_file() that, well, deletes files (locally or on a peer).

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-04-22 13:29:50 -04:00
digimer-bot
2f17a0d402
Merge pull request #77 from ClusterLabs/anvil-tools-dev
* Updated DRBD->gather_data() to store data on peers so that the peer…
2021-04-20 23:25:53 -04:00
Digimer
53cd0bdf3a * Now with 100% less typos.
Signed-off-by: Digimer <digimer@alteeve.ca>
2021-04-20 23:22:45 -04:00
Digimer
e3ba64cb83 * Fixed a typo in the Makefile.am.
Signed-off-by: Digimer <digimer@alteeve.ca>
2021-04-20 23:00:03 -04:00
Digimer
2e37691116 * Updated DRBD->gather_data() to store data on peers so that the peer's LV path and backing disk are recorded. Also fixed a bug in ->get_status() where the return code for local calls was stored as a host name.
* Added the scan-hpacucli scan agent. It's been done for a while and should have been added ages ago.
* Updated anvil-rename-server to get to the point where it will take down the DRBD resources on all machines, but waits if there is a sync under way. It also verifies that the server is off on all systems from virsh's perspective.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-04-20 22:46:51 -04:00
digimer-bot
591b550085
Merge pull request #76 from ClusterLabs/anvil-tools-dev
* Finished anvil-migrate-server and anvil-safe-start! Lots of testing…
2021-04-19 00:36:40 -04:00
Digimer
711a04999e * Finished anvil-migrate-server and anvil-safe-start! Lots of testing still needed for both though, and 'anvil-safe-start' does not run as a job yet, but the logic is all there.
* Fixed a bug in Cluster->migrate_server() where waiting for the server to migrate would never exit.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-04-19 00:32:13 -04:00
digimer-bot
a6a11abe01
Merge pull request #75 from ClusterLabs/anvil-tools-dev
Anvil tools dev
2021-04-18 20:02:50 -04:00
Digimer
eec14cb013 * Finished tools/anvil-boot-server and tools/anvil-shutdown-server.
* Fixed a bug where, in rare cases, $anvil->hostname() would call 'hostnamectl' and get a dbus error during shutdown, which would then cause the hostname to be changed to the error in the database.
* Fixed a bug in Cluster->boot_server() where it would never verify that a server has started successfully.
* Updated Database->get_ip_addresses() to store the IPs we manage in 'ip_addresses::<ip_address_address>::X'.
* Updated ocf:alteeve:server to work from command line calls, though more testing is still needed.
* Started work on 'anvil-rename-server', but haven't gotten far with it yet.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-04-18 19:54:58 -04:00
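The hostname bug described in the commit above (an error message from 'hostnamectl' ending up stored as the host name) is the kind of problem a validation step before the write can catch. The commit does not say how the fix was implemented, so the following is only a minimal Python sketch of one possible guard, not the project's Perl code; `accept_hostname`, its parameters, and the regular expression are all hypothetical.

```python
import re

# Hypothetical pattern: dot-separated labels of letters, digits and hyphens,
# at most 253 characters in total. Error text with spaces will not match.
_HOSTNAME_RE = re.compile(
    r"^(?=.{1,253}$)"
    r"[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
    r"(\.[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?)*$"
)


def accept_hostname(output, return_code, fallback):
    """Return the command output only if it looks like a real host name.

    'output' and 'return_code' stand in for whatever the caller captured
    from the host name lookup; 'fallback' is the previously known name.
    """
    candidate = output.strip()
    if return_code == 0 and _HOSTNAME_RE.match(candidate):
        return candidate
    # Anything else (non-zero exit, dbus error text, empty output) is rejected
    # so that the stored host name is left unchanged.
    return fallback
```

With a guard like this, a dbus failure during shutdown would leave the previously stored name in place rather than overwriting it.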
Digimer
a480357049 * Fixed a bug in Cluster->assemble_storage_groups() where, if a group is created during an anvil-provision-server run, the group would get created multiple times.
Signed-off-by: Digimer <digimer@alteeve.ca>
2021-04-15 18:51:10 -04:00
digimer-bot
83b140511e
Merge pull request #74 from ClusterLabs/anvil-safe-start-work
Anvil safe start work
2021-04-15 02:43:51 -04:00
Digimer
b36093671b * Updated Database queries that were passing 'debug => $debug' to not do that, as it was causing far too much (useless) noise in the logs.
* Turned on print to console for logging in anvil-provision-server. Also updated it to check if the cluster is running and hold until it is.
* Cleaned up some code in Get->available_resources() that proved hard to debug.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-04-15 02:35:58 -04:00
Digimer
798518ba5e * While working on the boot/shutdown server tools, ran into and fixed a bug where files uploaded before an Anvil! was added could not be sync'ed. This was fixed through the new Database->check_file_locations() method.
Signed-off-by: Digimer <digimer@alteeve.ca>
2021-04-14 22:56:18 -04:00
digimer-bot
ff5cefd1c2
Merge pull request #73 from ClusterLabs/anvil-safe-start-work
* Got anvil-safe-start to the point where it starts the cluster stack…
2021-04-14 00:35:20 -04:00
Digimer
426e16fbdf
Merge branch 'master' into anvil-safe-start-work
2021-04-14 00:32:40 -04:00
Digimer
e036515df3 * Got anvil-safe-start to the point where it starts the cluster stack. Need to create the 'anvil-boot-server' and 'anvil-shutdown-server' before it can be completed, so those files have been added.
* Created Cluster->parse_quorum() to check if a node is quorate as 'have-quorum' in the pacemaker CIB doesn't appear to be super accurate during startup.
* Fixed a bug in striker-manage-install-target where if a node didn't have any registered IPs, it would break before generating the repo data.
* Fixed a bug in anvil-join-anvil where if the database had to be reconnected, the job data was lost.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-04-14 00:26:06 -04:00
digimer-bot
9ddd649383
Merge pull request #72 from ClusterLabs/anvil-safe-start-work
* Continued work on anvil-safe-start. Got it to the point where it de…
2021-04-12 20:52:53 -04:00
Digimer
faf1399440 * Continued work on anvil-safe-start. Got it to the point where it detects shared networks with its peer node and waits for all networks to be up.
* Fixed a bug in scan-drbd where the volume_uuid wasn't being stored in the proper hash, breaking insertions into scan_drbd_peers in some cases.
* Updated System->pids() to work with remote targets (will be used later to check for parallel runs of anvil-safe-start).

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-04-12 20:46:30 -04:00
digimer-bot
c745a13991
Merge pull request #71 from ClusterLabs/anvil-safe-start-work
* Started work on anvil-safe-start. The enable/disable logic and how …
2021-04-12 00:37:38 -04:00
Digimer
15e71768a1 * Started work on anvil-safe-start. The enable/disable logic and how it runs automatically is controlled by the database and the tool can be used to control anvil-safe-start on both the local and peer node. It will be started by ScanCore, if scancore starts within 10 minutes of the node booting. It will always be able to run manually.
Signed-off-by: Digimer <digimer@alteeve.ca>
2021-04-12 00:28:24 -04:00
digimer-bot
75343aadff
Merge pull request #70 from ClusterLabs/webui_anvil_page
* Finished the 'get_X' endpoints so far defined. Added get_servers and…
2021-04-11 16:32:03 -04:00
Digimer
942e0f66bf * Finished the 'get_X' endpoints so far defined. Added get_servers and completed get_status.
Signed-off-by: Digimer <digimer@alteeve.ca>
2021-04-11 16:26:42 -04:00
digimer-bot
9a7d9f235c
Merge pull request #69 from ClusterLabs/webui_anvil_page
* Fixed a typo that broke compiling anvil-daemon in the last commit. …
2021-04-10 01:32:07 -04:00
Digimer
5f0b7740e2 * Fixed a typo that broke compiling anvil-daemon in the last commit. Yay for CI/CD!
Signed-off-by: Digimer <digimer@alteeve.ca>
2021-04-10 01:28:12 -04:00
digimer-bot
9aabf27fe6
Merge pull request #68 from ClusterLabs/webui_anvil_page
* The get_cpu endpoint was completed.
2021-04-09 21:48:06 -04:00
Digimer
2384c44544
Merge branch 'master' into webui_anvil_page
2021-04-09 21:43:11 -04:00
Digimer
fb0836f912 * The get_cpu endpoint was completed.
* The get_memory endpoint was completed.
* The get_replicated_storage endpoint was completed, though it requires testing and likely has issues.

To prepare for the get_status endpoint work, I needed to update ScanCore and modules to track the host_status. This commit contains the work needed for this.
* Updated ScanCore->post_scan_analysis_striker() to use configured fence devices (except PDUs) to check if a target host is off or on, if there is no host_ipmi interface. In all cases, if a machine can be confirmed on or off, the host_status is now updated.
* To support the above fence based power checks, updated scan-cluster to store the on-disk CIB in the new scan_cluster -> scan_cluster_cib column.
* Updated ScanCore->parse_cib() to map stonith primitive IDs to fence agents. Updated ->parse_crm_mon() to not call if the executable doesn't exist to avoid unhelpful error messages in the logs when called from a Striker.
* Updated DRBD->gather_data() to get the size data from '/sys/block/drbd<minor>/size' x '/sys/block/drbd<minor>/queue/logical_block_size' so it works when a device is Secondary (and can't be promoted).
* Updated Database->get_hosts_info() to record the short host name as well as the stored host name. Created ->update_host_status() as a wrapper to ->insert_or_update_hosts() that only updates the host status.
* Updated anvil-join-anvil to disable the ksm and ksmtuned daemons.
* Updated scancore and anvil-daemon to set the host_status to 'online' on startup.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-04-09 20:51:29 -04:00
digimer-bot
b5aa81471c
Merge pull request #66 from ClusterLabs/webui_anvil_page
Webui anvil page
2021-04-02 22:26:41 -04:00
Digimer
c2fe3a2f0a * Finished (initial) get_shared_storage.
Signed-off-by: Digimer <digimer@alteeve.ca>
2021-04-02 22:22:07 -04:00
Digimer
fa3c861a97 * Started work again on get_shared_storage
Signed-off-by: Digimer <digimer@alteeve.ca>
2021-04-02 18:31:54 -04:00
digimer-bot
59cee0185f
Merge pull request #65 from ClusterLabs/scancore-debugging
* Fixed a bug that caused striker-initialize-host to not compile / run.
2021-04-01 11:42:10 -04:00
Digimer
70aa6a7a5b
Merge branch 'master' into scancore-debugging
2021-04-01 11:39:00 -04:00
Digimer
cd87c0f521 * Fixed a bug that caused striker-initialize-host to not compile / run.
Signed-off-by: Digimer <digimer@alteeve.ca>
2021-04-01 11:35:44 -04:00
digimer-bot
f93a5eccc9
Merge pull request #64 from ClusterLabs/scancore-debugging
* Created Storage->manage_lvm_conf() that checks / updates lvm.conf t…
2021-04-01 00:04:29 -04:00
Digimer
70dc0598f2 * Created Storage->manage_lvm_conf() that checks / updates lvm.conf to add a filter to avoid seeing DRBD devices as LVM components. This is now called from striker-initialize-host and scan-drbd.
Signed-off-by: Digimer <digimer@alteeve.ca>
2021-03-31 23:59:19 -04:00
digimer-bot
c2386fabd4
Merge pull request #63 from ClusterLabs/webui_anvil_page
Webui anvil page
2021-03-31 12:22:35 -04:00
Digimer
5640eda9f2 * Fixed typos for scan-filesystems in Makefile.am
Signed-off-by: Digimer <digimer@alteeve.ca>
2021-03-31 12:15:40 -04:00
Digimer
82fa42fe83 * Added scan-filesystems to the Makefile.am
Signed-off-by: Digimer <digimer@alteeve.ca>
2021-03-31 11:56:55 -04:00
Digimer
e1a841b76f
Merge branch 'master' into webui_anvil_page
2021-03-31 01:42:27 -04:00
Digimer
59b867cc25 * Updated DRBD->gather_data() to check if drbdadm exists before trying to call it to avoid scary errors in the logs. Also moved some strings that pulled from the scan-drbd agent into the main words file.
* Fixed a bug in ScanCore->agent_startup() where a (thankfully broken) check to append tables to the 'sys::database::check_tables' would cause an infinite loop as both were pointers to the same anonymous array.
* Fixed a bug in scan-ipmitool where the scan_ipmitool_variables table didn't use a host_uuid reference, causing resyncs of that table to sync for all hosts and cause DB errors when the scan_ipmitool record from another host wasn't sync'ed yet.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-03-31 01:36:11 -04:00
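The ScanCore->agent_startup() fix in the commit above describes an aliasing hazard: when two names refer to the same array, appending to one while looping over the other never terminates. The following is a minimal Python sketch of the hazard and one way to avoid it, not the project's Perl code; `append_tables` and the table names used with it are hypothetical.

```python
def append_tables(check_tables, new_tables):
    """Append tables to 'check_tables' safely, even if both arguments
    are references to the same list object.

    Looping over 'new_tables' directly while appending to 'check_tables'
    would never end when the two are the same list, because each append
    extends the sequence being iterated.
    """
    for table in list(new_tables):  # iterate over a snapshot, not the live list
        if table not in check_tables:
            check_tables.append(table)
    return check_tables
```

Iterating over a copy keeps the loop bounded by the original length, and the membership check avoids duplicate entries when the lists overlap.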
digimer-bot
d96d351bb0
Merge pull request #61 from ClusterLabs/webui_anvil_page
* Fixed a bug where the time check to trigger a rescan test was being…
2021-03-31 00:43:07 -04:00
Digimer
581778b507 * Fixed a bug where the time check to trigger a rescan test was being turned into a string.
Signed-off-by: Digimer <digimer@alteeve.ca>
2021-03-30 23:58:30 -04:00
digimer-bot
50041fadd9
Merge pull request #60 from ClusterLabs/webui_anvil_page
* Fixed bugs in scan-apc-ups and scan-apc-pdu that allowed PDUs and U…
2021-03-30 23:56:52 -04:00
Digimer
bd8021cc7e
Merge branch 'master' into webui_anvil_page
2021-03-30 23:51:26 -04:00
Digimer
48d7a8d611 * Fixed bugs in scan-apc-ups and scan-apc-pdu that allowed PDUs and UPSes to be recorded multiple times in the database. Fixed multiple bugs in scan_apc_ups from when we cloned PDU as its base.
Signed-off-by: Digimer <digimer@alteeve.ca>
2021-03-30 23:40:42 -04:00
digimer-bot
3c7f202f47
Merge pull request #59 from ClusterLabs/webui_anvil_page
* Updated Database->connect to track previous connected DB count to c…
2021-03-30 18:17:55 -04:00
Digimer
265e3c74d6 * Updated Database->connect to track the previous connected DB count against the current one (only useful for daemons). If the connection count has not changed, a check for resync is not performed.
* Updated Database->_find_behind_databases() to not trigger a resync if the only difference in a table is the last-updated time and the difference is less than ten seconds. This should dramatically cut back on unnecessary resyncs and reduce load.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-03-30 16:20:04 -04:00
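The resync rule in the commit above (skip the resync when the only difference is a last-updated time less than ten seconds apart) can be sketched as a small predicate. This is a hedged Python illustration, not the project's Perl code: `needs_resync`, its parameters, and the use of a row count to stand in for "any other difference" are assumptions.

```python
from datetime import datetime, timedelta

# "less than ten seconds", as stated in the commit message.
RESYNC_TOLERANCE = timedelta(seconds=10)


def needs_resync(row_count_a, row_count_b, last_updated_a, last_updated_b,
                 tolerance=RESYNC_TOLERANCE):
    """Decide whether a table differs enough between two databases to resync.

    Assumption: a row-count mismatch represents every difference other than
    the last-updated time.
    """
    if row_count_a != row_count_b:
        return True
    # Only the timestamps differ: resync unless they are within the tolerance.
    return abs(last_updated_a - last_updated_b) >= tolerance
```

For example, two databases with equal row counts whose last-updated times are four seconds apart would be treated as in sync, while a gap of ten seconds or more would still trigger a resync.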
digimer-bot
006737bfd9
Merge pull request #58 from ClusterLabs/webui_anvil_page
* Fixed a bug in Convert->round() where the requested number of digits…
2021-03-27 00:23:07 -04:00
Digimer
9fa24750d6 * Fixed a bug in Convert->round() where the requested number of digits after the decimal place was coming back one too long. Also added logging that should have been there for a while now.
* Finished scan-filesystems!
* Realized that filesystem UUIDs are not always actual UUIDs, and so created an additional column in filesystems -> filesystem_internal_uuid and created a normal filesystem_uuid table.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-03-26 23:37:01 -04:00
digimer-bot
079f724e77
Merge pull request #56 from ClusterLabs/webui_anvil_page
* Updated Database->manage_anvil_conf() to not manually create a back…
2021-03-23 06:10:38 -04:00