Commit Graph

804 Commits

Author SHA1 Message Date
digimer
ad0a353a89 Fixed a bug where unused interfaces were not being ignored.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-03-25 22:41:34 -04:00
digimer
36525cdeab Added a work-around for an LVM JSON formatting issue
* Related to https://issues.redhat.com/browse/RHEL-29680
* Updated Storage->manage_lvm_conf() to be stricter about when to add
  the filter to lvm.conf

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-03-19 15:33:43 -04:00
digimer
2d92f339c2 Fixed a bug related to changing the hostname during a manifest run
* The original hostname would be used to form the cluster, even though
  the hostname was updated.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-03-18 23:28:42 -04:00
digimer
870c990632 Added support for multiple IP's per interface
* Created Database->get_mac_to_ip()
* Updated Database->insert_or_update_mac_to_ip() to find an entry using
  both the IP and MAC address.
* Updated Network->get_ips() to store only the first IP it finds on an
  interface as the main IP (for use in /etc/hosts, etc) and to store it
  and any other IPs in a new hash.
* Updated scan-network to use the new hash above to record them in the
  'mac_to_ip' table. Similarly, before marking an IP as removed, it
  checks to see if it's an alternate IP.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-03-06 19:06:05 -05:00
digimer
ab0b1a262b Reworked Network->wait_for_bonds() to be ->wait_for_networks()
* Renamed the old ->wait_for_networks() to be ->wait_for_nm_online().
* The new ->wait_for_networks() waits for all interfaces we manage to be
  'activated' before returning.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-29 01:32:32 -05:00
digimer
fe2806b5df Bumped up timeouts
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-28 23:38:50 -05:00
digimer
0f1ff02e78 Added alarms around remote calls to better handle dropped networks.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-28 20:35:00 -05:00
digimer
480745c889 Disabled printing of the countdown.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-28 16:37:31 -05:00
digimer
fef0d8f83f Fixed a spacing issue.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-27 21:54:05 -05:00
digimer
b8c73fd3f2 Replaced hosts management in anvil-join-anvil with System->update_hosts.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-26 18:29:55 -05:00
digimer
495cb90ca6 Created Network->wait_for_network to hold startup for NM to be up.
Added the call to Network->wait_for_network to pause scancore and
anvil-daemon startups until NetworkManager says it's up and running.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-24 17:16:46 -05:00
digimer
b3d1e53623 Added a log message if waiting for bonds times out.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-23 19:15:50 -05:00
digimer
6a193bf710 Added extra checks to Network->wait_for_bonds()
* Added a default timeout of 180 seconds, and updated
  anvil-configure-host to reduce this to 60 seconds while configuring
  the host.
* Added a check for interfaces configured under a bond. If none are
  found, the bond is ignored.
* Updated Storage->update_config() to take the new 'append' attribute to
  allow adding a variable if it wasn't found already in the config.
* Added the new 'network::wait_for_bonds::timeout' variable to enable
  changing the default timeout for Network->wait_for_bonds().

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-22 23:46:49 -05:00
digimer
d5ceca3dc6 Added a message when holding on a bond to activate.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-22 02:06:44 -05:00
digimer
05de34c7bc Scancore and anvil-daemon now holds for bonds to be up.
Created Network->wait_for_bonds(), and added it to the startup for
scancore and anvil-daemon.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-22 02:01:33 -05:00
digimer
741bcfa908 Added default logging level 2 and secure logging in CI tests.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-21 21:46:27 -05:00
digimer
f40d25f2dd Fixed a bug with /etc/hosts generation
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-20 21:56:07 -05:00
digimer
a9850bef4e Added a global variable to force fresh SSH connections.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-16 23:28:58 -05:00
digimer
091ded803c Added an attempt to assemble storage groups if not yet exist.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-15 17:41:26 -05:00
digimer
46b298e7e8 Added debugging of missing bond data
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-14 19:42:56 -05:00
digimer
835d9e79cb Updated Scancore->post_scan_analysis_striker() to check the RC when
booting an unexpectedly off host and only update it's power state if the
boot actually succeeded.

* Started work on a new anvil-manage-daemons tool and
  anvil-monitor-daemons systemd unit.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-13 21:11:22 -05:00
Digimer
04fe4c1bff
Merge branch 'main' into anvil-tools-dev 2024-02-12 23:01:10 -05:00
digimer
43f4201861 Created Get->load_average().
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-12 18:31:00 -05:00
Digimer
6c6216fb88
Merge branch 'main' into issues/561-correct-load-ifaces 2024-02-03 13:45:02 -05:00
digimer
e166d181c4 Disabled RAM counting as a debug step.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-02 23:42:36 -05:00
Tsu-ba-me
2306776c37 fix: rename load_interfces->load_interfaces of Network.pm 2024-01-30 16:16:14 -05:00
digimer
7018703c24 Fixed a bug where the peer subnode would add a server to pacemaker
* Updated anvil-provision-server to only call add_server_to_cluster() if
  it's NOT the peer.
* Added the new 'ok_if_exists' parameter to Cluster->add_server() to
  return 0 if the server already existed in pacemaker as a resource.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
423f677716 Updated Storage->manage_lvm_conf() to only run on EL8.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
943bf2e8d3 Removed the no-longer-needed Network->check_network() method
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
dd0175e05c Now check for/backup/remove ifcfg-X files on EL8 hosts.
* Added caching to System->check_network_type()
* Changed anvil-configure-host job progress steps to 1.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
b0cede49e3 Removed calls to check apache config.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
e03219d1d8 Fixed a bug where non-strikers hung configuring their network.
* Updated Job->update_progress() to log and return if there are not DB
  connections.
* Bumped some logging in Database->connect().
* Deleted ifcfg code from anvil-configure-host.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
827cf1f331 Fixed a bug that was crashing anvil-daemon
* Network->find_matches() was trying to compare two IPs when the second
  IP wasn't actually defined.
* Disabled scancore's blocking of running before the host is configured.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
282fdbe7e0 Fixed a bug where IPs were being marked repeatedly as DELETEd.
* Database->get_ip_addresses() was marking IPs that weren't on a network
  we managed, the IP would be marked as DELETEd, which caused problems
  with initializing targets, and it generated a lot of repeat alerts.
* Updated logging in Network.pm to help with debugging.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
92ed77e05b Fixed a bug blocking most jobs from running.
* Also updated a bunch of 'apache' ownership calls to now use
  'striker-ui-api'.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
d7aa7966dc Fixed a couple bugs
* Network->collect_data() wasn't deleting old data before rescans.
* anvil-configure-host wasn't checking links that should be in a bond if
  the bond already existed.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
ff0e6c3575 Updated anvil-daemon to call scan-network if no interfaces exist.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
518fddfa82 More progress on the new NM version of anvil-configure-host
* It's technically done, but I know bugs remain.
* Updated Jobs->update_progress() to take 'file' and 'line' to make it
  easier in the logs to see the origin of the message, when logging the
  update.
* Created Network->modify_connection() to update network manager
  variables. Created ->reset_connection() to take an interface down and
  bring it back up again.
* Fixed a bug in scan-network where the device_to_uuid hash wasn't being
  stored.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
83057d0b45 Fixed several bugs around renaming interfaces
* Also fixed problems with scan-network related to the new network
  naming / NM system.
* Updated Database->insert_or_update_network_interfaces() to better
  search for a network_interface_uuid when not specified.
* Updated Network->collect_data() to take the new 'start' parameter
  which, when set, brings up unconfigured connections/devices.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
71735947dc Created Job->bump_progress() to make advancing job progress easier
* Updated Network->collect_data() to find the GENERAL.DEVICES and
  GENERAL.IP-IFACE from match.interface-name when the link is down.
* More work done on anvil-configure-host.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
cad524db9d Removed anvil-update-states
* Created new anvil-monitor-network daemon to trigger scan-server via
  anvil-monitor-network on network events.
* Moved functionality into scan-network

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
a27773a69d scan-network now records interfaces, bonds and bridges!
* Much testing still needed, but this is a significant milestone.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
9c67b97fdd Fixed a bug in initializing DROP'ed DBs.
* Got more work done on adding network_interfaces to the database in
  scan-server.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
ec11335197 Fixed DB initialization bugs.
* More work done on the new network stack also.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
f575507c1e This begins adding support for EL9.
* Added the 'hostname' and 'hostnamectl --transient' to
  Get->host_name().
* Updated Database->insert_or_update_hosts() to log when no host_name,
  host_type or host_uuid is not passed.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
52e7875252 Bumoed logging to find '!!error!!' related parsing errors.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
13a6c44aa4 Updated Cluster->parse_cib() to support new in_ccm and crmd values
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-20 16:33:27 -05:00
digimer
26f4446bf9 Continued work on anvil-watch-servers; Parsed server data now.
* Updated Cluster->parse_cib() to store DRBD fence node restrictions by
  server/node. Also updated to make it easier to get the server's
  preferred node.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-29 01:27:21 -05:00
digimer
207a014ae0 Got anvil-watch-servers showing the status of subnodes.
* Updated System->maintenance_mode() to take 'host_uuid' so that the
  maintenance mode of remote machines can be checked/set.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-27 23:47:29 -05:00
digimer
8ce1f04335 Finished CPU support in anvil-manage-server-system!
* Updated Get->available_resources() to record the maximum cores that
  can be allocated to a server. This is N-1 for hosts with 4 or less
  cores, or N-2 cores otherwise.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-24 16:51:15 -05:00