Commit Graph

828 Commits

Author SHA1 Message Date
digimer
7018703c24 Fixed a bug where the peer subnode would add a server to pacemaker
* Updated anvil-provision-server to only call add_server_to_cluster() if
  it's NOT the peer.
* Added the new 'ok_if_exists' parameter to Cluster->add_server() to
  return 0 if the server already existed in pacemaker as a resource.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
423f677716 Updated Storage->manage_lvm_conf() to only run on EL8.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
943bf2e8d3 Removed the no-longer-needed Network->check_network() method
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
dd0175e05c Now check for/backup/remove ifcfg-X files on EL8 hosts.
* Added caching to System->check_network_type()
* Changed anvil-configure-host job progress steps to 1.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
b0cede49e3 Removed calls to check apache config.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
e03219d1d8 Fixed a bug where non-strikers hung configuring their network.
* Updated Job->update_progress() to log and return if there are not DB
  connections.
* Bumped some logging in Database->connect().
* Deleted ifcfg code from anvil-configure-host.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
827cf1f331 Fixed a bug that was crashing anvil-daemon
* Network->find_matches() was trying to compare two IPs when the second
  IP wasn't actually defined.
* Disabled scancore's blocking of running before the host is configured.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
282fdbe7e0 Fixed a bug where IPs were being marked repeatedly as DELETEd.
* Database->get_ip_addresses() was marking IPs that weren't on a network
  we managed, the IP would be marked as DELETEd, which caused problems
  with initializing targets, and it generated a lot of repeat alerts.
* Updated logging in Network.pm to help with debugging.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
92ed77e05b Fixed a bug blocking most jobs from running.
* Also updated a bunch of 'apache' ownership calls to now use
  'striker-ui-api'.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
d7aa7966dc Fixed a couple bugs
* Network->collect_data() wasn't deleting old data before rescans.
* anvil-configure-host wasn't checking links that should be in a bond if
  the bond already existed.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
ff0e6c3575 Updated anvil-daemon to call scan-network if no interfaces exist.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
518fddfa82 More progress on the new NM version of anvil-configure-host
* It's technically done, but I know bugs remain.
* Updated Jobs->update_progress() to take 'file' and 'line' to make it
  easier in the logs to see the origin of the message, when logging the
  update.
* Created Network->modify_connection() to update network manager
  variables. Created ->reset_connection() to take an interface down and
  bring it back up again.
* Fixed a bug in scan-network where the device_to_uuid hash wasn't being
  stored.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
83057d0b45 Fixed several bugs around renaming interfaces
* Also fixed problems with scan-network related to the new network
  naming / NM system.
* Updated Database->insert_or_update_network_interfaces() to better
  search for a network_interface_uuid when not specified.
* Updated Network->collect_data() to take the new 'start' parameter
  which, when set, brings up unconfigured connections/devices.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
71735947dc Created Job->bump_progress() to make advancing job progress easier
* Updated Network->collect_data() to find the GENERAL.DEVICES and
  GENERAL.IP-IFACE from match.interface-name when the link is down.
* More work done on anvil-configure-host.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
cad524db9d Removed anvil-update-states
* Created new anvil-monitor-network daemon to trigger scan-server via
  anvil-monitor-network on network events.
* Moved functionality into scan-network

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
a27773a69d scan-network now records interfaces, bonds and bridges!
* Much testing still needed, but this is a significant milestone.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
9c67b97fdd Fixed a bug in initializing DROP'ed DBs.
* Got more work done on adding network_interfaces to the database in
  scan-server.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
ec11335197 Fixed DB initialization bugs.
* More work done on the new network stack also.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
f575507c1e This begins adding support for EL9.
* Added the 'hostname' and 'hostnamectl --transient' to
  Get->host_name().
* Updated Database->insert_or_update_hosts() to log when no host_name,
  host_type or host_uuid is not passed.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
52e7875252 Bumoed logging to find '!!error!!' related parsing errors.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
13a6c44aa4 Updated Cluster->parse_cib() to support new in_ccm and crmd values
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-20 16:33:27 -05:00
digimer
26f4446bf9 Continued work on anvil-watch-servers; Parsed server data now.
* Updated Cluster->parse_cib() to store DRBD fence node restrictions by
  server/node. Also updated to make it easier to get the server's
  preferred node.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-29 01:27:21 -05:00
digimer
207a014ae0 Got anvil-watch-servers showing the status of subnodes.
* Updated System->maintenance_mode() to take 'host_uuid' so that the
  maintenance mode of remote machines can be checked/set.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-27 23:47:29 -05:00
digimer
8ce1f04335 Finished CPU support in anvil-manage-server-system!
* Updated Get->available_resources() to record the maximum cores that
  can be allocated to a server. This is N-1 for hosts with 4 or less
  cores, or N-2 cores otherwise.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-24 16:51:15 -05:00
digimer
b4037fade5 Added RAM change support to anvil-manage-server-system
* Updated Database->insert_or_update_servers() to error if the RAM being
  recorded is less than 640 KiB. This is because, somewhere yet
  undiscovered, the RAM is being recorded in KiB which breaks things.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-23 22:53:36 -05:00
digimer
b5ce2c4871 Added a call to Remote->add_target_to_known_hosts in Remote->test_access
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-23 20:40:57 -05:00
digimer
6bc2601d34 Updated anvil-manage-server-system to change boot device ordering.
* Updated Server->parse_definition() to store a hash to translate a
  device target to a device type.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-23 19:22:46 -05:00
digimer
ef0f04117f Added '--list' to anvil-manage-dr
* Updated Database->get_hosts() to store hosts in a host_type hash.
* Updated Database->get_servers() to store servers by name, regardless
  of host Anvil! node.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-14 17:50:48 -05:00
digimer
d9aa8aee74 Updated anvil-manage-keys to work from the command line
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-13 18:59:24 -05:00
digimer
9bd98951b5 Added backup/restore of partition table in Storage->auto_grow_pv
* Auto-growing PVs (and the backing partition) is now supported by
  anvil-manage-host '--auto-grow-pv'.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-08 20:59:40 -05:00
digimer
edc544255e Rebased with main and resolved conflicts.
This branch resolves issue #462; Auto growing PVs. Specifically, it looks at the LVM PVs on the host and checks to see if there is unused free space after the backing partition. If there is, it auto-grows the partition and then resizes the PV. This featu
re is designed to make life easier for users who deleted the auto-created '/home' partition during the anaconda disk partitioning tool.
* Created Storage->auto_grow_pv() that does the above.
* Added the missing hidden method name _create_rsync_wrapper in the Storage module POD.
* Added a call to Storage->auto_grow_pv() in anvil-configure-host and anvil-version-changes for nodes and DR.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-08 12:00:48 -05:00
digimer
556e892002 Added 'sessions' to the age-out table list
* To help mitigate issue #520.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-03 20:13:47 -04:00
digimer
09fcc45c4a Skipped checking 'sessions' when considering a DB resync.
* This relatesto issue #520

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-03 19:51:00 -04:00
digimer
e0a4bd4d69 Undid the Remote->call to System->call piping
Builds have been failing since this change. Debugging is needed but it's
too large a task for just now.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-02 23:23:25 -04:00
digimer
081c5ea90e Possibly fixed the anvil-delete-server hang bug.
* Updated Server->connect_to_libvirt() to check that the target URI's
  SSH fingerprint is recorded before connecting. Also added an alarm
  wrapper around the Sys::Virt->new() call.
* Continued work on anvil-manage-server-system, working on the boot
  order section now.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-02 15:01:08 -04:00
digimer
747e759248 Updated node restart logic to boot unless power or temps are known bad
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-01 21:01:25 -04:00
digimer
8c97f478a8 Updated Server->update_definition() to undefine a server when needed.
* Boosted logging to debug anvil-delete-server hang in jenkins.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-01 16:06:59 -04:00
digimer
9b55504872 Updated anvil-manage-server-system to update defined servers.
* Updated Server->locate() to take the new 'anvil' parameter to speed up
  searches.
* Updated Server->update_definition() to use Server->locate() to find
  where updates are needed. It now also defines the server with the new
  config.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-01 00:15:13 -04:00
digimer
fd461f940d Fixed a bug in Remote->call()
* If the call to Remote-call() set the target that was actually the
  local short hostname, it would fail to make the call at all. Now if
  the 'target' is local, the shell call is instead passed to
  System->call() instead.
* Cleaned up logging.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-31 13:16:39 -04:00
digimer
49f194eac6 Fixed issue #515; anvil-join-anvil updates hostnames properly now
* Updated Get->host_name() to accept the new 'refresh' parameter. This
  forces a reread of the hostname, instead of using the cached value.
* Updated System->host_name() so that, when it's updating the hostname,
  it updates the database and cached variables.
* Updated Words->center_text() to avoid undefinied parameter issues.
* Updated anvil-join-anvil to ensure the 'sys::host_name' variable.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-27 19:58:37 -04:00
digimer
85f86cda8a Updated Storage->read_file() to test if the target file exists.
This allows for better logging if the file to be copied doesn't exist.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-25 21:21:52 -04:00
digimer
4f6fa4b6ed Working on a bug where broken manifests are saved.
* Updated Striker->generate_manifest() to add pod and make the prefix,
  sequence and domain parameters required.
* Created the check_for_broken_manifests() function for anvil-daemon to
  detect/remove broken manifests.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-24 13:36:30 -04:00
digimer
a5f424d340 Test fix for variable deletion bug.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-23 12:58:15 -04:00
digimer
8bc35b322f More logging to debug variable deletion bug.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-23 00:24:23 -04:00
digimer
56bb18951a Hopefully fixed an empty variable bug in duplicate variable searches.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-22 00:11:24 -04:00
digimer
4d1528f614 Added more logging to debug variable deletion bug.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-21 23:10:31 -04:00
digimer
9ee8f782ee Continuing to try to resolve duplicate variables bug.
* Added a called to Database->_check_for_duplicates to Database->resync_databases
* Added 'check_for_resync => 1' to anvil-configure-host.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-21 14:31:14 -04:00
digimer
2a3f0bab24 Reworked how and when duplicate variables are checked/cleared.
Moved the logic to a new private method, and call it now from the active
Striker in the once per minute loop. The duplicate variable issue seems
to be not entirely uncommon.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-21 13:33:14 -04:00
digimer
ed269aa450 Fixed a bug where duplicate variables were being created in some cases.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-20 22:05:33 -04:00
digimer
be89bfb438 Updated striker-get-screenshots to create the screenshot directory.
Also fixed a typo in the POD for Storage->make_directory().

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-19 21:36:15 -04:00
digimer
5ec395c53a Reworked DB resync logic.
With this new system, a 'primary_db' is chosen (first connected DB UUID when sorted) and only it does resyncs. Further, resyncs have been pulled from all tools except anvil-daemon. So with this new system, the chances of duplicate, simultaneous resyncs should be removed (hopefully for real this time).

* Database->check_agent_data() no longer calls a resync after loading a
  schema.
* Removed the Database->coonnect() 'all' parameter
* The database used to read from is now always the same as the primary,
  even if there is a local DB.
* Database->connect() 'check_for_resync' parameter can now be set to
  '2', which means "check for resync _if_ I am primary", where '1' still
  checks for resync no matter what.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-19 20:41:57 -04:00
digimer
b1f89c2723 Finished initial version of striker-show-jobs
* Updated Database->get_jobs() to take 'job_host_uuid = all' to allow
  loading jobs from all cluster machines. Also updated it to record the
  'job_host_uuid' and the unix timestamp version of 'modified_date'.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-18 20:51:20 -04:00
digimer
0014cc591d Re-enabled DB connections in ocf:alteeve:server.
Added DB connections to ocf:alteeve:server when starting or stopping
servers. This is to ensure that the servers -> server_state are updated
properly.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-18 20:51:20 -04:00
digimer
b3c067b016 Fixed a bug in anvil-manage-files where missing files weren't being downloaded.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-12 01:01:31 -04:00
digimer
7545df1e55 Fixed a bug in which host runs an anvil-delete-server job.
* Updated anvil-delete-server to use the new Server->locate method. This
  was done as the old Server->locate() was failing to find the server
running on the peer when anvil-delete-server was running on the backup
subnode.
* Updated Server->locate() to search hosts for XML definition and DRBD
  configs so that it can record where the server is recorded to run,
even if the server isn't running or defined at the time the locate ran.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-11 22:22:06 -04:00
digimer
55b1380031 Finished (but need more testing) of Server->locate().
This includes the changes in PR#492.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-11 17:22:06 -04:00
digimer
829ae546a2 Beginning work on new Server->locate() method to find servers across an
Anvil! cluster.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-11 17:22:06 -04:00
digimer
f12e001ac2 Finished Server->connect_to_virsh().
* Now, connecting to virsh can detect when still-open connections
  already exist.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-11 17:22:06 -04:00
digimer
245f75de9b Added Server->update_definition()
* This takes a server and new definition XML and updated the database and any available hosts. Does not yet update defined or running servers.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-11 17:22:06 -04:00
digimer
e361d0b424 More progress on anvil-manage-server-system
* It now edits the XML to change the boot menu, but doesn't save yet.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-11 17:22:06 -04:00
digimer
201cd53265 Improved the logic behind Network->find_target_ip()
* This adds the new 'networks' and 'test_access' parameters to allow
  restricting/ordering matched networks, and adds 'test_access' to
  validate the link is working.
* Continued work on anvil-manage-server-system

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-11 17:22:06 -04:00
digimer
ed91435211 Fixed a bug where duplicate files weren't actually being deleted.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-30 00:13:56 -04:00
digimer
74ddb7f3a9 Updated Database-get_files() to detect/remove duplicate file entries.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-29 21:28:49 -04:00
digimer
c3fe39e6a8 Added a check file unlinked files.
* On subnodes and DR hosts, a check is made now in Storage->check_files() for files not linked in file_locations. Any found are added, with a check to see if the file already exists locally and, if so, that the md5sum is accurate or not (to set if the file is ready for use or not).

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-29 00:05:29 -04:00
Digimer
745081b649
Merge branch 'main' into patch-screenshot 2023-09-22 23:28:12 -04:00
digimer
c039c58128 * This commit moves taking screenshots of hosted servers onto the strikers using the Sys::Virt module. This was needed because the screenshots were being taken by scan-server, and that was causing it to take a long time to run. It should never have been handled by the scan agent anyway. This update requires a WebUI fix to use the new screenshot tool. This tool also adds holding multiple screenshots to allow users to "scrub" through screenshots up to 10 hours in the past.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-22 17:15:09 -04:00
digimer
8925dabb9d * Updated anvil-shutdown-server to take the new '--immediate' switch which forces a server to shut down immediately (akin to pulling the power on a traditional machine). This is needed to allow a user to recover a crash or hung server.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-21 18:56:12 -04:00
digimer
4646f4a030 * Quieted logging.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-21 16:39:18 -04:00
digimer
580980717d This commit covers the convertion of 'virsh' shell calls to using 'Sys::Virt' module, and fixes several small bugs related to scan-server;
* Switched all calls to virsh to use Sys::Virt to deal with contention of simultaneous virsh calls.
* Removed collecting screenshots from scan-server.
* Fixed a bad variable substitution in an alert.
* Fixed a bug where a server's boot time wasn't being recorded properly.
* Reworked how we determine which server definition was most recently updated and propogated.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-21 15:59:43 -04:00
digimer
e895e1f264 * Finished writting the anvil-manage-server-storage.
* Fixed handling --eject and --insert to work without a device target specified when only one exists, or to find the file path when only the file name is given.
* Updated anvil-manage-server-storage to show files when processing an optical devices without a file being passed.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-05 16:53:08 -04:00
digimer
d255adc7b4 * Updated anvil-daemon to set the mode of /mnt/shared/* to 0777 during creation and to check that that mode is set for existing sub-directories. This resolves issue #443.
* Cleaned up anvil-manage-dr.8 hyphen escapes.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-08-17 22:14:40 -04:00
digimer
f23768ac32 * Updated Striker->generate_manifest() to remove integral DR support and add migration-network support. This helps address issue #440.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-08-17 18:58:12 -04:00
digimer
4a6ee0c283 Updated Database->get_lvm_data to ignore data from deleted hosts.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-08-10 16:21:55 -04:00
digimer
79ff96cee5 * Fixed a bug in DRBD->manage_resource() that prevented new resources from being created.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-08-09 11:23:57 -04:00
digimer
3ee30e6e24 * Updated DRBD->allow_two_primaries() to gracefully fail if the peer isn't connected.
* Updated DRBD->manage_resource() to check if the host is StandAlone when asked to 'up' a resource and, if so, connect first. Also updated this to error out gracefully if the call to allow_two_primaries() returns non-zero.
* Update Server->migrate_virsh() to error out gracefully if the DRBD->allow_two_primaries() returns non-zero.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-08-08 22:52:16 -04:00
digimer
55aaf7876e Starting work on checking if the peer is connected before managing allow-two-primaries.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-08-08 17:33:39 -04:00
digimer
59ade94124 * Added PID logging as an option, and enabled it in ocf:alteeve:server
* Updated DRBD->manage_resource() to take the task 'adjust'.
* Updated ocf:alteeve:server's start_drbd_resource() to call adjust if startup of a resource isn't needd.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-08-07 22:28:13 -04:00
Fabio M. Di Nitto
fc75bda6ef ocf:alteeve:server: add support for log levels and bump timeouts
also improve logging for migrations

Signed-off-by: Fabio M. Di Nitto <fabbione@fabbione.net>
2023-08-07 08:22:50 +02:00
digimer
a0ff080741 * Deleted some old unused code from Cluster->assemble_storage_groups().
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-07-25 19:30:53 -04:00
digimer
ed480cf1cb * Fixed a double-$ bug in Remote->_check_known_hosts_for_target()
* Updated striker-update-cluster to take '--timeout' and a number of seconds, or 'Xm' or 'Xh' for minutes or hourse, respectively. Also updated to show the remaining time while waiting, and added waiting timeout to the rest of the while loops that prior had no time limit. This addresses issue #383 and issue #382.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-07-25 19:13:41 -04:00
digimer
be290bf561 This commit fixes a bug where the drbd kernel module build was being killed mid-compile, leaving DBRD unusable.
* Created System->wait_on_dnf() which was plucked from anvil-daemon, and now also called in scancore and anvil-safe-start.
* Updated scancore and anvil-safe-start to check on start that DRBD's kernel module is available (and build if not).

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-07-24 22:32:41 -04:00
digimer
d68adb5b4e * Updated anvil-manage-power to not reboot if anvil-version-changes is running (which, if it's taking time, is generating new kmods).
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-07-24 20:44:40 -04:00
digimer
556e91238d * Updated Network->find_access() to clear the data from previous scans, which fixes a bug where checking multiple hosts could return stale data for the previous host.
* Updated anvil-manage-server-storage, striker-collect-debug, and striker-update-cluster to be able to find a connection on an interface when none were found on preferred networks.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-07-24 15:43:54 -04:00
digimer
66c82e5e22 * Fixed a bug in anvil-update-system where updating a single package with --reboot wouldn't request a reboot. Finished reworking it so that a check is made to see if the kernel or DRBD kmod will be updated and, if so, removes the kmod-drbd RPMs prior to doing the update (as opposed to the sloppier check-on-error method).
* Fixed a bug in System->reboot_needed() where the cache file path had a typo in the hash key.
* Updated anvil-daemon to use the full path to dnf when determining if a dnf process was running.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-07-23 21:43:26 -04:00
Tsu-ba-me
c46ff969f3 fix: add UUID to server process during find in Server.pm 2023-07-23 21:43:26 -04:00
Tsu-ba-me
4bdd206e0c fix: replace ps|grep with pgrep to reduce run time 2023-07-23 21:43:25 -04:00
Tsu-ba-me
54c98f89ab fix: allow extend remote call with openssh options 2023-07-23 21:43:25 -04:00
digimer
7bd76c10dc Major thing in this commit is reworking striker-update-cluster to work without expecting anvil-daemon to be running on target machines. Similarly, they had to be able to work when the Striker DBs were not available. This is to account for cases where the Striker dashboards have updated, and the schema has changed, preventing the not-yet-updated DR hosts and subnodes from being able to use the DB. To do this, anvil-safe-stop, anvil-update-system, and anvil-shutdown-server had to be updated to use the new --no-db switch, which tells then to run without the database being available.
* Updated Server->shutdown_virsh() to work without a database connection.
* Updated System->reboot_needed() to store/read from a cache file when the database is not available.
* Updated anvil-safe-start to remove the old --enable/disable/status switches, now that we use anvil-safe-start.service systemd unit.
* Reworked anvil-safe-stop to work without a database connection, and to work on DR hosts.
* Updated anvil-special-operations to add new tasks, but it's likely these new tasks aren't needed and will be removed very shortly.
* Added/updated multiple man pages.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-07-22 18:09:01 -04:00
Digimer
c1e4380a64
Merge branch 'main' into anvil-tools-dev 2023-07-15 00:06:49 -04:00
digimer
458cb267da * Fixed a bug in Cluster->get_primary_host_uuid() where servers were not loaded before trying to calculate RAM use.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-07-15 00:04:12 -04:00
digimer
4dc1b0e117 * Added a check to Network->get_company_from_mac() to manually set the company to KVM/qemu if the prefix is 52:54:00.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-07-14 23:00:16 -04:00
digimer
02c3d204ea * Updated anvil-update-system to set 'job_data' to track reboots, and striker-update-cluster to read it.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-07-14 22:52:51 -04:00
digimer
3016fb875b * Reworded striker-update-cluster to use anvil-update-system for on-system OS updates.
* Updated DRBD->get_status() to take the new 'host' paramter to allow the caller to define the hash key string used in the stored data.
* Updated Get->anvil_version() (and a few other places) to use the new 'striker-ui-api' shell user, replacing the 'apache' user.
* Updated Remote->test_access() to take the new 'close' parameter to close the SSH session used when testing access to the target.
* Fixed a logging bug in anvil-manage-power.
* Updated anvil-update-system to take the '--no-reboot' and 'clear-cache' command line switches.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-07-14 22:29:07 -04:00
Tsu-ba-me
a7751da153 fix: rename, relocate function to find qemu-kvm processes 2023-07-12 18:40:11 -04:00
Tsu-ba-me
c3c69733d9 fix: correct base port check, server info extract, vnc alive assign in Server.pm 2023-07-12 18:27:50 -04:00
Tsu-ba-me
3cce3c39b8 fix: add Server subroutine to extract server VM info from qemu-kvm process(es) 2023-07-12 02:27:26 -04:00
digimer
d56b7f9a84 * Created (but not finished!) the new striker-update-cluster tool.
* Updated Cluster->get_primary_host_uuid() to only load anvils if not already loaded.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-07-07 17:54:57 -04:00
digimer
a7ebe45f76 This adds the new 'striker-collect-debug' tool that collects all potentially useful debug info into a single tarball.
* Fixed a bug in Get->anvil_from_switch() to work when the Anvil! name is passed.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-07-05 21:04:05 -04:00
digimer
1b8b0bc493 * Created the new 'anvil-manage-server-storage' with the first role of reload a DRBD resource.
* Updated Remote->call() to remove the 'background' parameter as it wasn't working.
* Updated anvil-manage-server-storage to use 'anvil-manage-server-storage' to adjust resources in a way that doesn't block.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-06-30 21:02:30 -04:00
digimer
7fbed10864 * Updated Remote->call() to take the new 'background' parameter.
* Continues work on adding new disks (DRBD volumes) to anvil-manage-server-storage.
* Updated DRBD->get_status() to record the peer-role.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-06-29 22:17:58 -04:00