Commit Graph

620 Commits

Author SHA1 Message Date
Madison Kelly
8b8be39717 Finished System->check_if_configured({thorough => 1}) support.
* Updated Database->get_variables() to store columns with
  variable_source_table values in a more useful hash.

Signed-off-by: Madison Kelly <mkelly@alteeve.com>
2024-06-08 15:55:49 -04:00
Madison Kelly
4b82c5f2bf Added 'timeout' logging to help debug SIGALRM exits.
Signed-off-by: digimer <mkelly@alteeve.ca>
Signed-off-by: Madison Kelly <mkelly@alteeve.com>
2024-06-06 15:33:56 -04:00
Madison Kelly
9cb2446bea Cleaned up handling of lost DB access
* Updated Database->query() to track when a specific DB to read from is
  passed. If so, and that is lost, return an error. If not, and another
  DB is available, switch to it.
* Updated Database->write() to skip trying to write to a lost DB.

Signed-off-by: Madison Kelly <mkelly@alteeve.com>
2024-06-04 17:13:31 -04:00
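
The read and write handling described in commit 9cb2446bea can be sketched roughly as follows. This is a minimal Python illustration, not the project's code: the function names, the `LostDatabaseError` exception, and the use of plain identifiers for databases are all hypothetical, and raising an error when no database at all is reachable is an assumption the commit message does not state.

```python
class LostDatabaseError(Exception):
    """Hypothetical error raised when a required database is unreachable."""


def pick_read_target(requested, available):
    """Choose which database to read from.

    requested: identifier of a specific database, or None if the caller
               does not care which database answers.
    available: ordered list of identifiers for databases that are reachable.
    """
    if requested is not None:
        # A specific database was asked for: losing it is an error,
        # because silently reading from another one could mislead the caller.
        if requested in available:
            return requested
        raise LostDatabaseError(f"requested database {requested!r} is not reachable")
    # No specific database was asked for: switch to any reachable one.
    if available:
        return available[0]
    # Assumption: with nothing reachable, report an error.
    raise LostDatabaseError("no database is reachable")


def write_targets(configured, available):
    """Return the configured databases to write to, skipping lost ones."""
    return [db for db in configured if db in available]
```

With `available = ["db2"]`, a read that names `"db1"` raises, a read with no preference is served by `"db2"`, and writes go only to `"db2"`.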
Madison Kelly
9db9f81104 Reworked Database->_test_access to do a general reconnect
* Before, it would try to reconnect to just the lost DB, which could
  trigger an error.

Signed-off-by: Madison Kelly <mkelly@alteeve.com>
2024-06-04 16:39:24 -04:00
digimer
8c1c0597da Updated anvil-daemon to run anvil-configure-host in the foreground.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-05-30 14:49:02 -04:00
digimer
25a0454dce Better handling of lost DB connections.
* Added a sync call to Tools->nice_exit() to ensure logs are flushed.
* Updated Database->quote() to be in an eval block to better handle
  cases where the DB handle is lost.
* Added an hourly check to anvil-daemon and moved the memory in use
  check to run only once per hour.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-05-29 20:41:12 -04:00
digimer
4766ceff70 Added logging to debug network config issue.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-05-29 00:35:27 -04:00
digimer
d6c5aa3903 Added a timeout to Database->query() calls.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-05-27 21:11:54 -04:00
digimer
3e63b726d3 Added node 2 joining an Anvil! node if not started by node 1.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-04-14 01:36:28 -04:00
digimer
e00dec7cba Added loading existing corosync/authkey from peer during rebuild.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-04-13 17:46:19 -04:00
digimer
45e3a1e8a9 Updated Remote->_check_known_hosts_for_target() to replace updated keys
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-04-13 16:52:33 -04:00
digimer
60759cd9aa No longer fail if the fence method already exists when trying to create it.
* Also fixed a string insertion variable typo

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-04-10 10:39:17 -04:00
digimer
f38b47f1e2 Reworked stonith levels; this should fix issues #522 and #613
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-04-05 00:25:31 -04:00
Digimer
e63ebf7ba8 Merge branch 'main' into libvirt_fixes
2024-03-27 18:47:39 -04:00
digimer
f684b15efd Removed unused, commented out table
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-03-27 13:33:01 -04:00
digimer
21c8084b2f Updated to support Sys::Virt::Domain generating PNG screenshots
* This should work with older versions still generating PPM screenshots.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-03-26 18:56:07 -04:00
digimer
aaa0350f2a Removed the check of the libvirtd daemon status.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-03-26 18:56:07 -04:00
digimer
2d92f339c2 Fixed a bug related to changing the hostname during a manifest run
* The original hostname would be used to form the cluster, even though
  the hostname was updated.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-03-18 23:28:42 -04:00
digimer
b3d1e53623 Added a log message if waiting for bonds times out.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-23 19:15:50 -05:00
digimer
a65bf5090e Updated anvil-monitor-performance to reduce logging volume.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-23 18:49:58 -05:00
digimer
c79b76e2ee Reduced the frequency of monitoring to 1/min and reduced log size
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-23 18:37:44 -05:00
digimer
6a193bf710 Added extra checks to Network->wait_for_bonds()
* Added a default timeout of 180 seconds, and updated
  anvil-configure-host to reduce this to 60 seconds while configuring
  the host.
* Added a check for interfaces configured under a bond. If none are
  found, the bond is ignored.
* Updated Storage->update_config() to take the new 'append' attribute to
  allow adding a variable if it wasn't found already in the config.
* Added the new 'network::wait_for_bonds::timeout' variable to enable
  changing the default timeout for Network->wait_for_bonds().

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-22 23:46:49 -05:00
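
The waiting behaviour described in commit 6a193bf710 can be sketched as below. This is a minimal Python illustration under stated assumptions, not the project's code: the function signature, the `is_up` callback, the polling interval, and the bond names are hypothetical. Only the 180-second default, the caller's ability to lower the timeout, and the rule that a bond with no configured interfaces is ignored come from the commit message.

```python
import time

DEFAULT_TIMEOUT = 180  # seconds, the default named in the commit message


def wait_for_bonds(bonds, is_up, timeout=DEFAULT_TIMEOUT,
                   poll_interval=1.0, clock=time.monotonic, sleep=time.sleep):
    """Wait until every bond that has member interfaces reports as up.

    bonds:  dict mapping bond name -> list of interfaces configured under it.
    is_up:  callable taking a bond name and returning True once it is up.
    Returns True if all relevant bonds came up, False if the timeout expired.
    """
    # Bonds with no interfaces configured under them are ignored.
    pending = {name for name, members in bonds.items() if members}
    deadline = clock() + timeout
    while pending:
        pending = {name for name in pending if not is_up(name)}
        if not pending:
            break
        if clock() >= deadline:
            return False  # timed out; the caller can log this and move on
        sleep(poll_interval)
    return True
```

A caller that wants the shorter wait used while configuring a host would pass `timeout=60`.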
digimer
d5ceca3dc6 Added a message when holding on a bond to activate.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-22 02:06:44 -05:00
digimer
741bcfa908 Added default logging level 2 and secure logging in CI tests.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-21 21:46:27 -05:00
digimer
f40d25f2dd Fixed a bug with /etc/hosts generation
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-20 21:56:07 -05:00
digimer
4b5894625e Updated anvil-configure-host to enable connection.autoconnect.
This closes issue #576

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-14 15:36:40 -05:00
digimer
e0c4ed6de5 Added log-only option to anvil-manage-daemons and enabled
anvil-monitor-daemons.service to only monitor daemons.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-14 13:55:03 -05:00
digimer
c399053ace Finished new anvil-manage-daemons tool.
This tool (and its parent 'anvil-monitor-daemons' daemon) simplifies
starting, stopping, enabling, and disabling all Anvil! daemons.

More importantly, the daemon will monitor for failed daemons and attempt
to restart them.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-14 01:39:56 -05:00
digimer
835d9e79cb Updated Scancore->post_scan_analysis_striker() to check the RC when
booting an unexpectedly off host and only update its power state if the
boot actually succeeded.

* Started work on a new anvil-manage-daemons tool and
  anvil-monitor-daemons systemd unit.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-13 21:11:22 -05:00
digimer
27152845fd Attempts to create an existing fence method no longer fail.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-13 16:25:48 -05:00
digimer
4e367acd11 Created anvil-monitor-performance tool and daemon.
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-12 22:58:36 -05:00
digimer
43f4201861 Created Get->load_average().
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-12 18:31:00 -05:00
digimer
f25323ba9b Fixed typo in words variable insertion
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-02-02 19:43:36 -05:00
digimer
bf693ed212 Updated anvil-daemon to enable root SSH access during startup
This is required as we need to be able to ssh into peer strikers and
into nodes and DR hosts during initialization.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
d8ceb7fbf4 Updated to add all subnode nets to /etc/hosts before forming cluster
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
dd0175e05c Now check for/backup/remove ifcfg-X files on EL8 hosts.
* Added caching to System->check_network_type()
* Changed anvil-configure-host job progress steps to 1.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
e03219d1d8 Fixed a bug where non-strikers hung configuring their network.
* Updated Job->update_progress() to log and return if there are no DB
  connections.
* Bumped some logging in Database->connect().
* Deleted ifcfg code from anvil-configure-host.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
d7aa7966dc Fixed a couple bugs
* Network->collect_data() wasn't deleting old data before rescans.
* anvil-configure-host wasn't checking links that should be in a bond if
  the bond already existed.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
518fddfa82 More progress on the new NM version of anvil-configure-host
* It's technically done, but I know bugs remain.
* Updated Jobs->update_progress() to take 'file' and 'line' to make it
  easier in the logs to see the origin of the message, when logging the
  update.
* Created Network->modify_connection() to update network manager
  variables. Created ->reset_connection() to take an interface down and
  bring it back up again.
* Fixed a bug in scan-network where the device_to_uuid hash wasn't being
  stored.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
71735947dc Created Job->bump_progress() to make advancing job progress easier
* Updated Network->collect_data() to find the GENERAL.DEVICES and
  GENERAL.IP-IFACE from match.interface-name when the link is down.
* More work done on anvil-configure-host.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
ef89a79162 More progress on anvil-configure-host
* Now working on the reconfiguring of interfaces.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
a27773a69d scan-network now records interfaces, bonds and bridges!
* Much testing still needed, but this is a significant milestone.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
9c67b97fdd Fixed a bug in initializing DROP'ed DBs.
* Got more work done on adding network_interfaces to the database in
  scan-server.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
ec11335197 Fixed DB initialization bugs.
* More work done on the new network stack also.

Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
1f88abda04 * Further work done on anvil-monitor-network
Signed-off-by: digimer <mkelly@alteeve.ca>
2024-01-27 15:39:01 -05:00
digimer
fd880e2fdf Finished anvil-watch-servers
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-29 11:37:00 -05:00
digimer
26f4446bf9 Continued work on anvil-watch-servers; Parsed server data now.
* Updated Cluster->parse_cib() to store DRBD fence node restrictions by
  server/node. Also updated to make it easier to get the server's
  preferred node.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-29 01:27:21 -05:00
digimer
207a014ae0 Got anvil-watch-servers showing the status of subnodes.
* Updated System->maintenance_mode() to take 'host_uuid' so that the
  maintenance mode of remote machines can be checked/set.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-27 23:47:29 -05:00
digimer
8ce1f04335 Finished CPU support in anvil-manage-server-system!
* Updated Get->available_resources() to record the maximum cores that
  can be allocated to a server. This is N-1 for hosts with 4 or fewer
  cores, or N-2 cores otherwise.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-24 16:51:15 -05:00
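
The core limit described in commit 8ce1f04335 amounts to a small calculation. The sketch below is a Python illustration only; the function name is hypothetical and the input check is an assumption, since the commit message gives only the N-1 / N-2 rule.

```python
def max_allocatable_cores(host_cores: int) -> int:
    """Return the most cores a server may be given on a host with N cores.

    Rule from the commit message: N-1 when the host has 4 cores or fewer,
    N-2 otherwise.
    """
    if host_cores < 1:
        # Assumption: a host must report at least one core.
        raise ValueError("host_cores must be at least 1")
    reserved = 1 if host_cores <= 4 else 2
    return host_cores - reserved
```

For example, a 4-core host allows up to 3 cores per server and an 8-core host allows up to 6.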
digimer
b4037fade5 Added RAM change support to anvil-manage-server-system
* Updated Database->insert_or_update_servers() to error if the RAM being
  recorded is less than 640 KiB. This is because, somewhere yet
  undiscovered, the RAM is being recorded in KiB which breaks things.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-23 22:53:36 -05:00