1013 Commits

Author SHA1 Message Date
digimer
27152845fd Attempts to create an existing fence method no longer fail.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-13 16:25:48 -05:00
digimer
4e367acd11 Created anvil-monitor-performance tool and daemon.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-12 22:58:36 -05:00
digimer
43f4201861 Created Get->load_average().
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-12 18:31:00 -05:00
digimer
f180f1adfc Added anvil-monitor-network to Makefile
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-02 22:30:17 -05:00
digimer
a3c3077963 Called pcs directly for CIB data collection.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-30 20:07:29 -05:00
digimer
f100bc94cd Bumped logging to debug striker-collect-debug
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-29 21:18:32 -05:00
digimer
050891d751 Fixed inverted file copy check logic.
This should resolve issue #534

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-28 00:12:13 -05:00
digimer
bb78d65c77 Bumped debug logging.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 21:36:54 -05:00
digimer
b85e38d20d Added the short and full host names to hosts
* Updated anvil_join_anvil->wait_for_etc_hosts() to add the short host
  name and the FQDN to the /etc/hosts file using the first BCN network's
  IP address.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
7018703c24 Fixed a bug where the peer subnode would add a server to pacemaker
* Updated anvil-provision-server to only call add_server_to_cluster() if
  it's NOT the peer.
* Added the new 'ok_if_exists' parameter to Cluster->add_server() to
  return 0 if the server already existed in pacemaker as a resource.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
14022896aa Added a call for non-striker machines to call check_sshd if no DBs.
Also added a check for sshd_config.d so that it doesn't error on EL8
machines.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
bf693ed212 Updated anvil-daemon to enable root SSH access during startup
This is required as we need to be able to ssh into peer strikers and
into nodes and DR hosts during initialization.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
943bf2e8d3 Removed the no-longer-needed Network->check_network() method
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
d8ceb7fbf4 Updated to add all subnode nets to /etc/hosts before forming cluster
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
760d9f53d8 Bumped logging.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
de4bb0d001 Bumped logging for debugging.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
476b285607 Added a wait_for_access() function to anvil-join-anvil
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
023bcf46a4 Fixed a bug with hung cluster startup in some cases
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
dd0175e05c Now check for/back up/remove ifcfg-X files on EL8 hosts.
* Added caching to System->check_network_type()
* Changed anvil-configure-host job progress steps to 1.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
b0cede49e3 Removed calls to check apache config.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
e03219d1d8 Fixed a bug where non-strikers hung configuring their network.
* Updated Job->update_progress() to log and return if there are no DB
  connections.
* Bumped some logging in Database->connect().
* Deleted ifcfg code from anvil-configure-host.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
827cf1f331 Fixed a bug that was crashing anvil-daemon
* Network->find_matches() was trying to compare two IPs when the second
  IP wasn't actually defined.
* Disabled scancore's blocking of running before the host is configured.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
282fdbe7e0 Fixed a bug where IPs were being marked repeatedly as DELETEd.
* Database->get_ip_addresses() was marking IPs that weren't on a network
  we managed as DELETEd, which caused problems with initializing targets
  and generated a lot of repeat alerts.
* Updated logging in Network.pm to help with debugging.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
92ed77e05b Fixed a bug blocking most jobs from running.
* Also updated a bunch of 'apache' ownership calls to now use
  'striker-ui-api'.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
d7aa7966dc Fixed a couple of bugs
* Network->collect_data() wasn't deleting old data before rescans.
* anvil-configure-host wasn't checking links that should be in a bond if
  the bond already existed.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
76218dcd32 Updated logging and debugging.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
1373345f33 Fixed a bug where inactive links weren't started
* anvil-configure-host now requests a start on the first scan of
  Network->collect_data().
* Fixed a minor bug where networks without bonds were being processed
  needlessly.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
ff0e6c3575 Updated anvil-daemon to call scan-network if no interfaces exist.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
518fddfa82 More progress on the new NM version of anvil-configure-host
* It's technically done, but I know bugs remain.
* Updated Job->update_progress() to take 'file' and 'line' so that the
  origin of the message is easier to see in the logs when the update is
  logged.
* Created Network->modify_connection() to update network manager
  variables. Created ->reset_connection() to take an interface down and
  bring it back up again.
* Fixed a bug in scan-network where the device_to_uuid hash wasn't being
  stored.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
83057d0b45 Fixed several bugs around renaming interfaces
* Also fixed problems with scan-network related to the new network
  naming / NM system.
* Updated Database->insert_or_update_network_interfaces() to better
  search for a network_interface_uuid when not specified.
* Updated Network->collect_data() to take the new 'start' parameter
  which, when set, brings up unconfigured connections/devices.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
92ddf27979 Fixed bugs, got X_link2 interfaces configuring properly now
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
71735947dc Created Job->bump_progress() to make advancing job progress easier
* Updated Network->collect_data() to find the GENERAL.DEVICES and
  GENERAL.IP-IFACE from match.interface-name when the link is down.
* More work done on anvil-configure-host.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
ef89a79162 More progress on anvil-configure-host
* Now working on the reconfiguring of interfaces.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
cad524db9d Removed anvil-update-states
* Created new anvil-monitor-network daemon to trigger scan-server via
  anvil-monitor-network on network events.
* Moved functionality into scan-network

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
9c67b97fdd Fixed a bug in initializing DROP'ed DBs.
* Got more work done on adding network_interfaces to the database in
  scan-server.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
ec11335197 Fixed DB initialization bugs.
* More work done on the new network stack also.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
72325b9ed7 Finished IP assignment.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
9c57035b54 Got bridge support added to anvil-monitor-network.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
ac2f9999ae Got anvil-monitor-network assembling bonds properly (I think)
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
a038a1c553 Got anvil-monitor-network successfully renaming interfaces.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
1f88abda04 Further work done on anvil-monitor-network
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
9cbb5c1f52 Got the data collection done for the new anvil-monitor-network tool.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
f575507c1e This begins adding support for EL9.
* Added the 'hostname' and 'hostnamectl --transient' to
  Get->host_name().
* Updated Database->insert_or_update_hosts() to log when host_name,
  host_type or host_uuid is not passed.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
fb363b5b6c Increased logging for debugging issue #339
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
52e7875252 Bumped logging to find '!!error!!' related parsing errors.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
Tsu-ba-me
17bd67d0c4 fix(tools): disable mail server auth translation in manage alerts
2024-01-26 17:52:43 -05:00
Tsu-ba-me
7b389d0ad3 fix(tools): make username, password optional in manage alerts for mail servers
2024-01-26 17:52:43 -05:00
digimer
fd880e2fdf Finished anvil-watch-servers
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-29 11:37:00 -05:00
digimer
26f4446bf9 Continued work on anvil-watch-servers; Parsed server data now.
* Updated Cluster->parse_cib() to store DRBD fence node restrictions by
  server/node. Also updated to make it easier to get the server's
  preferred node.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-29 01:27:21 -05:00
digimer
207a014ae0 Got anvil-watch-servers showing the status of subnodes.
* Updated System->maintenance_mode() to take 'host_uuid' so that the
  maintenance mode of remote machines can be checked/set.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-27 23:47:29 -05:00