anvil

Commit Graph

Author	SHA1	Message	Date
digimer	8c1c0597da	Updated anvil-daemon to run anvil-configure-host in the foreground. Signed-off-by: digimer <mkelly@alteeve.ca>	7 months ago
digimer	25a0454dce	Better handling of lost DB connections. * Added a sync call to Tools->nice_exit() to ensure logs are flushed. * Updated Database->quote() to be in an eval block to better handle cases where the DB handle is lost. * Added an hourly check to anvil-daemon and moved the memory in use check to run only once per hour. Signed-off-by: digimer <mkelly@alteeve.ca>	7 months ago
digimer	b86493fff4	More logging to debug apparent hang * Added an explicit 'sync' call when writing to logs. TO BE REMOVED! * Disabled anvil-monitor-daemons and anvil-monitor-performance in case this is somehow triggering program exits. * Converted prints to Log->entry calls in anvil-change-password * Added PID state info logging for running jobs. Signed-off-by: digimer <mkelly@alteeve.ca>	7 months ago
digimer	ab33c716cb	Created a specific check that there's a hosts entry for each DB * This is meant to deal with a case where, when a DB is added to anvil.conf but that new entry is not yet in hosts, the program crashes because of a duplicate key when calling insert_or_update_hosts for all DBs. Signed-off-by: digimer <mkelly@alteeve.ca>	7 months ago
digimer	8e53993f67	Shortened the anvil-daemon job start up delay. Signed-off-by: digimer <mkelly@alteeve.ca>	7 months ago
digimer	3e63b726d3	Added node 2 joining an Anvil! node if not started by node 1. Signed-off-by: digimer <mkelly@alteeve.ca>	8 months ago
digimer	e00dec7cba	Added loading existing corosync/authkey from peer during rebuild. Signed-off-by: digimer <mkelly@alteeve.ca>	8 months ago
digimer	ab0b1a262b	Reworked Network->wait_for_bonds() to be ->wait_for_networks() * Renamed the old ->wait_for_networks() to be ->wait_for_nm_online(). * The new ->wait_for_networks() waits for all interfaces we manage to be 'activated' before returning. Signed-off-by: digimer <mkelly@alteeve.ca>	10 months ago
digimer	2f5fb32769	Quieted logging Signed-off-by: digimer <mkelly@alteeve.ca>	10 months ago
digimer	b8c73fd3f2	Replaced hosts management in anvil-join-anvil with System->update_hosts. Signed-off-by: digimer <mkelly@alteeve.ca>	10 months ago
digimer	495cb90ca6	Created Network->wait_for_network to hold startup for NM to be up. Added the call to Network->wait_for_network to pause scancore and anvil-daemon startups until NetworkManager says it's up and running. Signed-off-by: digimer <mkelly@alteeve.ca>	10 months ago
digimer	5cf0bbc6be	Added Want=NetworkManager to anvil-daemon and scancore unit files. Signed-off-by: digimer <mkelly@alteeve.ca>	10 months ago
digimer	05de34c7bc	Scancore and anvil-daemon now hold for bonds to be up. Created Network->wait_for_bonds(), and added it to the startup for scancore and anvil-daemon. Signed-off-by: digimer <mkelly@alteeve.ca>	10 months ago
digimer	741bcfa908	Added default logging level 2 and secure logging in CI tests. Signed-off-by: digimer <mkelly@alteeve.ca>	10 months ago
digimer	5517e43a81	Forcing anvil-daemon to run with log level 2 and secure logging. Signed-off-by: digimer <mkelly@alteeve.ca>	10 months ago
digimer	14022896aa	Added a call for non-striker machines to call check_sshd if no DBs. Also added a check for sshd_config.d so that it doesn't error on EL8 machines. Signed-off-by: digimer <mkelly@alteeve.ca>	11 months ago
digimer	bf693ed212	Updated anvil-daemon to enable root SSH access during startup This is required as we need to be able to ssh into peer strikers and into nodes and DR hosts during initialization. Signed-off-by: digimer <mkelly@alteeve.ca>	11 months ago
digimer	943bf2e8d3	Removed the no-longer-needed Network->check_network() method Signed-off-by: digimer <mkelly@alteeve.ca>	11 months ago
digimer	b0cede49e3	Removed calls to check apache config. Signed-off-by: digimer <mkelly@alteeve.ca>	11 months ago
digimer	827cf1f331	Fixed a bug that was crashing anvil-daemon * Network->find_matches() was trying to compare two IPs when the second IP wasn't actually defined. * Disabled scancore's blocking of running before the host is configured. Signed-off-by: digimer <mkelly@alteeve.ca>	11 months ago
digimer	282fdbe7e0	Fixed a bug where IPs were being marked repeatedly as DELETEd. * Database->get_ip_addresses() was marking IPs that weren't on a network we managed, the IP would be marked as DELETEd, which caused problems with initializing targets, and it generated a lot of repeat alerts. * Updated logging in Network.pm to help with debugging. Signed-off-by: digimer <mkelly@alteeve.ca>	11 months ago
digimer	92ed77e05b	Fixed a bug blocking most jobs from running. * Also updated a bunch of 'apache' ownership calls to now use 'striker-ui-api'. Signed-off-by: digimer <mkelly@alteeve.ca>	11 months ago
digimer	ff0e6c3575	Updated anvil-daemon to call scan-network if no interfaces exist. Signed-off-by: digimer <mkelly@alteeve.ca>	11 months ago
digimer	cad524db9d	Removed anvil-update-states * Created new anvil-monitor-network daemon to trigger scan-server via anvil-monitor-network on network events. * Moved functionality into scan-network Signed-off-by: digimer <mkelly@alteeve.ca>	11 months ago
digimer	ec11335197	Fixed DB initialization bugs. * More work done on the new network stack also. Signed-off-by: digimer <mkelly@alteeve.ca>	11 months ago
digimer	52e7875252	Bumped logging to find '!!error!!' related parsing errors. Signed-off-by: digimer <mkelly@alteeve.ca>	11 months ago
digimer	3251154366	Updated anvil-daemon to run anvil-configure-host jobs when mapping net Also fixed a bug in anvil-manage-host that prevented showing if the network mapping flag was set. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	4f6fa4b6ed	Working on a bug where broken manifests are saved. * Updated Striker->generate_manifest() to add pod and make the prefix, sequence and domain parameters required. * Created the check_for_broken_manifests() function for anvil-daemon to detect/remove broken manifests. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	9ee8f782ee	Continuing to try to resolve duplicate variables bug. * Added a call to Database->_check_for_duplicates to Database->resync_databases * Added 'check_for_resync => 1' to anvil-configure-host. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	2a3f0bab24	Reworked how and when duplicate variables are checked/cleared. Moved the logic to a new private method, and call it now from the active Striker in the once per minute loop. The duplicate variable issue seems to be not entirely uncommon. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	5ec395c53a	Reworked DB resync logic. With this new system, a 'primary_db' is chosen (first connected DB UUID when sorted) and only it does resyncs. Further, resyncs have been pulled from all tools except anvil-daemon. So with this new system, the chances of duplicate, simultaneous resyncs should be removed (hopefully for real this time). * Database->check_agent_data() no longer calls a resync after loading a schema. * Removed the Database->connect() 'all' parameter * The database used to read from is now always the same as the primary, even if there is a local DB. * Database->connect() 'check_for_resync' parameter can now be set to '2', which means "check for resync _if_ I am primary", where '1' still checks for resync no matter what. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	3d4d7abfe3	Increased logging to debug server install failure. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	663a1e0527	Quieted screenshot logging in anvil-daemon. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	fcbace6713	Updated anvil-join-anvil to hold if either node is still running anvil-configure-host * Fixed a minor bug and added logging of maintenance_mode calls in anvil-configure-host. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	c039c58128	* This commit moves taking screenshots of hosted servers onto the strikers using the Sys::Virt module. This was needed because the screenshots were being taken by scan-server, and that was causing it to take a long time to run. It should never have been handled by the scan agent anyway. This update requires a WebUI fix to use the new screenshot tool. This tool also adds holding multiple screenshots to allow users to "scrub" through screenshots up to 10 hours in the past. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	d255adc7b4	* Updated anvil-daemon to set the mode of /mnt/shared/* to 0777 during creation and to check that that mode is set for existing sub-directories. This resolves issue #443 . * Cleaned up anvil-manage-dr.8 hyphen escapes. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	be290bf561	This commit fixes a bug where the drbd kernel module build was being killed mid-compile, leaving DRBD unusable. * Created System->wait_on_dnf() which was plucked from anvil-daemon, and now also called in scancore and anvil-safe-start. * Updated scancore and anvil-safe-start to check on start that DRBD's kernel module is available (and build if not). Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	f57ab1a78c	* Updated anvil-daemon to not hold jobs at startup if the host isn't configured yet. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	66c82e5e22	* Fixed a bug in anvil-update-system where updating a single package with --reboot wouldn't request a reboot. Finished reworking it so that a check is made to see if the kernel or DRBD kmod will be updated and, if so, removes the kmod-drbd RPMs prior to doing the update (as opposed to the sloppier check-on-error method). * Fixed a bug in System->reboot_needed() where the cache file path had a typo in the hash key. * Updated anvil-daemon to use the full path to dnf when determining if a dnf process was running. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	e278de4b5a	The main change in this commit deals with anvil-daemon startup. During OS updates, it would pick up the queued update job and run it while the other --no-db one was still running. This could become an issue for other tasks in the future, so updated anvil-daemon to not run any jobs for the first minute after startup. Also updated it to see if an OS update is underway (given how it can start mid-RPM update, before packages like kmod-drbd are ready to build). While doing this, implemented caching of daily tasks (like aging out data, archiving data, network scans, etc) to only run once per day, period. As it was before, they would always run on anvil-daemon startup, then wait 24 hours. Note that work has started on reworking anvil-update-system, but it is incomplete (and broken) in this commit. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	d741f4aa6f	* Updated anvil-daemon to not exit on high RAM use if any job is running. * Updated anvil-update-system to reboot a target whose kernel updated using an anvil-manage-power job. * Started making striker-update-cluster run as a job (not at all complete). Fixed a bug where the wrong IP was being used when finding access to a target. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	751687129a	* Updated anvil-daemon to not exit on RAM use if anvil-update-system is running. * Fixed a bug in anvil-safe-stop where it wouldn't trigger a migration when the peer is online. * Updated anvil-update-system to set job_data to 'failed' and exit with rc 4 if the os update failed. * Got striker-update-cluster to error out and exit if a called 'anvil-update-system' job failed. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
Tsu-ba-me	4f46bb43eb	fix(tools): remove server screenshot fetching in anvil-daemon	1 year ago
Tsu-ba-me	d95eb699f9	chore: disable web VNC, screenshot pieces to avoid libvirt deadlock	1 year ago
Tsu-ba-me	d98df4b2a4	fix(tools): isolate non-striker tasks in anvil-daemon	1 year ago
Tsu-ba-me	560d60c7e8	fix(tools): get server screenshots every minute and punt to strikers WIP	1 year ago
digimer	1d12fb32b4	* Completed the new anvil-watch-drbd which replaces watch_drbd. * Updated Email->get_current_server() to always load mail server data from the database. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	c9e11fbbfc	* Added checks to anvil-provision-server to fail out if either of the SN IPs are not found when generating a DRBD resource config. * Added logging to anvil-provision-server and anvil-daemon to try to find the cause of jobs being re-run after completing. May have fixed with a fix to job_progress updates going to 100 too early in some cases. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	156a0ca201	Updated anvil-daemon's new job launching logic to allow the restart of a running job that failed out early. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	47f7a35df3	The main purpose of this commit is to add serial execution of similar jobs to help reduce race conditions for scripted jobs, like multiple server creation. * Fixed a small logging bug in DRBD->allow_two_primaries(). * Updated Database->get_jobs() to record jobs sorted by modified_date so that jobs can be run in the order they were recorded. * Updated anvil-daemon to track which commands need to be run, and when two or more of the same command need to be run, they're run serially, with each subsequent run starting after the previous one completes. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
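
The last commit listed above (47f7a35df3) describes running jobs that share a command serially, in the order they were recorded (sorted by modified_date). The idea can be sketched roughly as follows. This is an illustrative Python sketch only, not the project's code; the field names job_uuid and command and both function names are hypothetical, and only modified_date is taken from the commit message.

```python
from collections import defaultdict


def plan_serial_runs(jobs):
    """Group jobs by command, oldest first (hypothetical helper).

    Jobs that share a command end up in one queue, so they can be
    started one after another instead of all at once.
    """
    queues = defaultdict(list)
    # Sort by modified_date so jobs run in the order they were recorded.
    for job in sorted(jobs, key=lambda j: j["modified_date"]):
        queues[job["command"]].append(job["job_uuid"])
    return dict(queues)


def next_jobs_to_start(queues, running_commands):
    """Return the head of each queue whose command is not already running."""
    return [
        uuids[0]
        for command, uuids in queues.items()
        if uuids and command not in running_commands
    ]
```

Under these assumptions, a caller would start whatever next_jobs_to_start() returns, remove each job from its queue once it completes, and ask again; a second job for the same command is therefore never started while the first is still running.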

236 Commits (feature/softwareraid)