Commit Graph

948 Commits

Author SHA1 Message Date
digimer
e5e958a03b Updated anvil-configure-host to clear "config::map_network"
Also updated anvil-manage-host to properly display the network mapping
status.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-03 17:35:01 -04:00
digimer
3251154366 Updated anvil-daemon to run anvil-configure-host jobs when mapping net
Also fixed a bug in anvil-manage-host that prevented showing if the
network mapping flag was set.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-03 15:25:45 -04:00
digimer
ff3d1983e3 Bumped up logging to debug an anvil-provision-server hang.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-02 23:26:12 -04:00
digimer
081c5ea90e Possibly fixed the anvil-delete-server hang bug.
* Updated Server->connect_to_libvirt() to check that the target URI's
  SSH fingerprint is recorded before connecting. Also added an alarm
  wrapper around the Sys::Virt->new() call.
* Continued work on anvil-manage-server-system, working on the boot
  order section now.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-02 15:01:08 -04:00
digimer
8c97f478a8 Updated Server->update_definition() to undefine a server when needed.
* Boosted logging to debug anvil-delete-server hang in jenkins.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-01 16:06:59 -04:00
digimer
9b55504872 Updated anvil-manage-server-system to update defined servers.
* Updated Server->locate() to take the new 'anvil' parameter to speed up
  searches.
* Updated Server->update_definition() to use Server->locate() to find
  where updates are needed. It now also defines the server with the new
  config.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-11-01 00:15:13 -04:00
digimer
fd461f940d Fixed a bug in Remote->call()
* If the call to Remote-call() set the target that was actually the
  local short hostname, it would fail to make the call at all. Now if
  the 'target' is local, the shell call is instead passed to
  System->call() instead.
* Cleaned up logging.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-31 13:16:39 -04:00
digimer
49f194eac6 Fixed issue #515; anvil-join-anvil updates hostnames properly now
* Updated Get->host_name() to accept the new 'refresh' parameter. This
  forces a reread of the hostname, instead of using the cached value.
* Updated System->host_name() so that, when it's updating the hostname,
  it updates the database and cached variables.
* Updated Words->center_text() to avoid undefinied parameter issues.
* Updated anvil-join-anvil to ensure the 'sys::host_name' variable.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-27 19:58:37 -04:00
digimer
ab6acd594c Fixed a bug where 'all' host jobs would not be shown.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-25 15:45:12 -04:00
digimer
4f6fa4b6ed Working on a bug where broken manifests are saved.
* Updated Striker->generate_manifest() to add pod and make the prefix,
  sequence and domain parameters required.
* Created the check_for_broken_manifests() function for anvil-daemon to
  detect/remove broken manifests.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-24 13:36:30 -04:00
digimer
9ee8f782ee Continuing to try to resolve duplicate variables bug.
* Added a called to Database->_check_for_duplicates to Database->resync_databases
* Added 'check_for_resync => 1' to anvil-configure-host.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-21 14:31:14 -04:00
digimer
2a3f0bab24 Reworked how and when duplicate variables are checked/cleared.
Moved the logic to a new private method, and call it now from the active
Striker in the once per minute loop. The duplicate variable issue seems
to be not entirely uncommon.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-21 13:33:14 -04:00
digimer
1824bb2eed Added forced DB resyncs to striker-manage-peers
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-20 23:41:21 -04:00
digimer
a35e790a4d Fixed a minor striker-boot-machine bug.
Divide by zero error when no hosts with IPMI found.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-20 11:38:15 -04:00
digimer
be89bfb438 Updated striker-get-screenshots to create the screenshot directory.
Also fixed a typo in the POD for Storage->make_directory().

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-19 21:36:15 -04:00
digimer
5ec395c53a Reworked DB resync logic.
With this new system, a 'primary_db' is chosen (first connected DB UUID when sorted) and only it does resyncs. Further, resyncs have been pulled from all tools except anvil-daemon. So with this new system, the chances of duplicate, simultaneous resyncs should be removed (hopefully for real this time).

* Database->check_agent_data() no longer calls a resync after loading a
  schema.
* Removed the Database->coonnect() 'all' parameter
* The database used to read from is now always the same as the primary,
  even if there is a local DB.
* Database->connect() 'check_for_resync' parameter can now be set to
  '2', which means "check for resync _if_ I am primary", where '1' still
  checks for resync no matter what.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-19 20:41:57 -04:00
digimer
b1f89c2723 Finished initial version of striker-show-jobs
* Updated Database->get_jobs() to take 'job_host_uuid = all' to allow
  loading jobs from all cluster machines. Also updated it to record the
  'job_host_uuid' and the unix timestamp version of 'modified_date'.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-18 20:51:20 -04:00
digimer
4398ffe70c Updated striker-boot-machine to support booting all machines.
* Wrote the man page for striker-boot-machine, changing --host-name to
  --host, and adding the '--host all' support.
* Updated anvil-manage-host to support checking/enabling/disabling
  network mapping mode.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-12 22:18:07 -04:00
digimer
b3c067b016 Fixed a bug in anvil-manage-files where missing files weren't being downloaded.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-12 01:01:31 -04:00
digimer
7545df1e55 Fixed a bug in which host runs an anvil-delete-server job.
* Updated anvil-delete-server to use the new Server->locate method. This
  was done as the old Server->locate() was failing to find the server
running on the peer when anvil-delete-server was running on the backup
subnode.
* Updated Server->locate() to search hosts for XML definition and DRBD
  configs so that it can record where the server is recorded to run,
even if the server isn't running or defined at the time the locate ran.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-11 22:22:06 -04:00
digimer
68521cdab7 Updated striker-get-screenshots to set permissions properly.
This updates the /opt/alteeve/screenshot directories and the screenshots
in them to be readible by the WebUI.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-11 17:22:06 -04:00
digimer
55b1380031 Finished (but need more testing) of Server->locate().
This includes the changes in PR#492.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-11 17:22:06 -04:00
digimer
245f75de9b Added Server->update_definition()
* This takes a server and new definition XML and updated the database and any available hosts. Does not yet update defined or running servers.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-11 17:22:06 -04:00
digimer
e361d0b424 More progress on anvil-manage-server-system
* It now edits the XML to change the boot menu, but doesn't save yet.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-11 17:22:06 -04:00
digimer
201cd53265 Improved the logic behind Network->find_target_ip()
* This adds the new 'networks' and 'test_access' parameters to allow
  restricting/ordering matched networks, and adds 'test_access' to
  validate the link is working.
* Continued work on anvil-manage-server-system

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-11 17:22:06 -04:00
digimer
62fe62a44b * Continued work on anvil-manage-server-system. It now displays the boot devices, CPU and RAM info.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-11 17:22:06 -04:00
digimer
7762540f85 Fixed wrappers to handle quoted arguments properly.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-10-10 10:10:54 -04:00
digimer
3d4d7abfe3 Increased logging to debug server install failure.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-29 13:01:58 -04:00
digimer
c3fe39e6a8 Added a check file unlinked files.
* On subnodes and DR hosts, a check is made now in Storage->check_files() for files not linked in file_locations. Any found are added, with a check to see if the file already exists locally and, if so, that the md5sum is accurate or not (to set if the file is ready for use or not).

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-29 00:05:29 -04:00
digimer
0bff40b21e Fixed a bug where files that are ready for use were not found.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-28 21:35:16 -04:00
digimer
5a8f775db4 Removed the reboot job at the end of anvil-configure-host.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-28 18:23:57 -04:00
digimer
30248760b5 Moved the wait_on_subnodes function call to earlier in the script.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-28 17:52:11 -04:00
digimer
77bae80534 Added default values for MTU and DNS if not set in a manifest.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-28 17:47:53 -04:00
digimer
b8fb3d62e3 Added checks for the screenshot directory before collection.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-28 16:30:36 -04:00
digimer
663a1e0527 Quieted screenshot logging in anvil-daemon.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-28 16:27:40 -04:00
digimer
fcbace6713 Updated anvil-join-anvil to hold if either node is still running anvil-configure-host
* Fixed a minor bug and added logging of maintenance_mode calls in anvil-configure-host.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-28 16:01:32 -04:00
digimer
e480337239 Fixed wait loop for subnodes
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-28 11:47:48 -04:00
digimer
582a8b292c Added more job updates to anvil-manage-power.
* This is a test to see if the job waiting for the uptime to be 300s,
  leaving the job_progress as 0, was causing the job to be repeatedly
  called.
* This is related to issue #479

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-28 01:19:52 -04:00
digimer
5b286f2696 Added missing switch reading in anvil-manage-power.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-28 00:24:29 -04:00
digimer
ef042eef25 Cleaned up logging while waiting for subnodes.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-28 00:15:14 -04:00
digimer
5d5270486e Added a wait loop when forming node clusters.
* This adds a check where anvil-join-anvil waits until both subnodes are
  marked as configured and not in maintenance mode.
* Should address issue #479 (maybe, this shouldn't trigger reboots, but
  it was certainly a race condition found while investigating).

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-27 22:38:07 -04:00
digimer
d58521ceca Added screenshot capture to striker-collect-debug.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-27 11:46:56 -04:00
Digimer
745081b649
Merge branch 'main' into patch-screenshot 2023-09-22 23:28:12 -04:00
digimer
b1562e0301 * Added the new screenshot tool.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-22 17:16:19 -04:00
digimer
c039c58128 * This commit moves taking screenshots of hosted servers onto the strikers using the Sys::Virt module. This was needed because the screenshots were being taken by scan-server, and that was causing it to take a long time to run. It should never have been handled by the scan agent anyway. This update requires a WebUI fix to use the new screenshot tool. This tool also adds holding multiple screenshots to allow users to "scrub" through screenshots up to 10 hours in the past.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-22 17:15:09 -04:00
digimer
8925dabb9d * Updated anvil-shutdown-server to take the new '--immediate' switch which forces a server to shut down immediately (akin to pulling the power on a traditional machine). This is needed to allow a user to recover a crash or hung server.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-21 18:56:12 -04:00
digimer
3c9086d1f3 Fixed bugs related to running jobs.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-07 20:06:00 -04:00
digimer
e8a84e1c97 Added job handling to anvil-manage-server-storage (needs more testing though).
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-07 15:37:31 -04:00
digimer
2f429d2bc7 Fixed bugs related to adding drives and extending drives to servers.
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-05 22:53:52 -04:00
digimer
e895e1f264 * Finished writting the anvil-manage-server-storage.
* Fixed handling --eject and --insert to work without a device target specified when only one exists, or to find the file path when only the file name is given.
* Updated anvil-manage-server-storage to show files when processing an optical devices without a file being passed.

Signed-off-by: digimer <mkelly@alteeve.ca>
2023-09-05 16:53:08 -04:00