Madison Kelly
4b82c5f2bf
Added 'timeout' logging to help debug SIGALRM exits.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
Signed-off-by: Madison Kelly <mkelly@alteeve.com>
7 months ago
digimer
cfa3432e78
Added a catch for SIGALRM
...
Signed-off-by: digimer <mkelly@alteeve.ca>
7 months ago
Digimer
99eb177da2
Merge pull request #660 from ClusterLabs/net-config
...
Net config
7 months ago
Madison Kelly
5495a82595
Improved handling of lost DB connections.
...
* Updated Database->reconnect() to take 'lost_uuid' and, if passed,
delete the cached file handle before calling ->disconnect().
* Updated Database->query() to return an empty hash reference instead of
'!!error!!'. Callers almost always do an array count on the result,
which triggered errors because '!!error!!' is not a hash reference.
Updated docs to reflect this.
Signed-off-by: Madison Kelly <mkelly@alteeve.com>
7 months ago
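
The return-value change described in 5495a82595 can be illustrated with a minimal Python sketch. The project itself is Perl; `run_query`, `count_rows`, and the `fail` flag are hypothetical stand-ins for illustration, not the Anvil! API. The idea shown is only that an empty container on failure keeps "count the rows" callers working without a sentinel check.

```python
def run_query(rows, fail=False):
    """Hypothetical stand-in for a query call; not the real Database->query()."""
    if fail:
        # On a lost connection, return an empty result instead of a
        # sentinel string such as '!!error!!'.
        return []
    return list(rows)


def count_rows(result):
    # Typical caller pattern: count the rows without checking for a sentinel.
    return len(result)
```

With this shape, a failed query simply counts as zero rows, so loops over the result are skipped rather than crashing.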
Madison Kelly
c00fd62ea6
Removed the lock release in Database->reconnect().
...
Signed-off-by: Madison Kelly <mkelly@alteeve.com>
7 months ago
Madison Kelly
d3ddbd395f
Added logging for DB connection test bug
...
Signed-off-by: Madison Kelly <mkelly@alteeve.com>
7 months ago
Madison Kelly
52643885d2
Added a check to avoid deep recursions when testing DB access
...
Signed-off-by: Madison Kelly <mkelly@alteeve.com>
7 months ago
Madison Kelly
9cb2446bea
Cleaned up handling of lost DB access
...
* Updated Database->query() to track when a specific DB to read from is
passed. If so, and that is lost, return an error. If not, and another
DB is available, switch to it.
* Updated Database->write() to skip trying to write to a lost DB.
Signed-off-by: Madison Kelly <mkelly@alteeve.com>
7 months ago
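
The read-side rule described in 9cb2446bea (fail if a specifically requested DB is lost, otherwise fall back to any available DB) might look roughly like the following Python sketch. The function name `pick_database` and the `connected` mapping are assumptions made for illustration; the real logic lives in the Perl Database->query() method and may differ.

```python
def pick_database(connected, requested=None):
    """Return the UUID to read from, or None if the read cannot proceed.

    connected: dict mapping a DB UUID to True (reachable) or False (lost).
    requested: UUID the caller explicitly asked for, if any.
    """
    if requested is not None:
        # A specific DB was asked for: do not silently switch to another one.
        return requested if connected.get(requested) else None
    # No specific DB requested: use the first one that is still reachable.
    for uuid, alive in connected.items():
        if alive:
            return uuid
    return None
```

Returning None here stands in for the error case; the caller decides how to report it.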
Madison Kelly
9db9f81104
Reworked Database->_test_access to do a general reconnect
...
* Before, it would try to reconnect to just the lost DB, which could
trigger an error.
Signed-off-by: Madison Kelly <mkelly@alteeve.com>
7 months ago
Madison Kelly
574b2dccae
Updated Database->query to better handle a lost DB connection.
...
* Created Database->reconnect to clean up reconnecting to the DBs
Signed-off-by: Madison Kelly <mkelly@alteeve.com>
7 months ago
digimer
8c1c0597da
Updated anvil-daemon to run anvil-configure-host in the foreground.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
7 months ago
digimer
f7082c930b
Fixed a bug in parsing the fence agent for multi-device fence methods.
...
* Updated the fence_ipmilan timeouts to 30 seconds to help debug fence
config failures.
Signed-off-by: digimer <mkelly@alteeve.ca>
7 months ago
digimer
25a0454dce
Better handling of lost DB connections.
...
* Added a sync call to Tools->nice_exit() to ensure logs are flushed.
* Updated Database->quote() to be in an eval block to better handle
cases where the DB handle is lost.
* Added an hourly check to anvil-daemon and moved the memory in use
check to run only once per hour.
Signed-off-by: digimer <mkelly@alteeve.ca>
7 months ago
digimer
b86493fff4
More logging to debug apparent hang
...
* Added an explicit 'sync' call when writing to logs. TO BE REMOVED!
* Disabled anvil-monitor-daemons and anvil-monitor-performance in case
this is somehow triggering program exits.
* Converted prints to Log->entry calls in anvil-change-password
* Added PID state info logging for running jobs.
Signed-off-by: digimer <mkelly@alteeve.ca>
7 months ago
digimer
4766ceff70
Added logging to debug network config issue.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
7 months ago
digimer
8dc3a8262f
Updated pod on requiring 'new' for manifest_uuid when creating new manifests.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
7 months ago
digimer
566887462e
Fixed parameter names being sent to Striker->generate_manifest().
...
Signed-off-by: digimer <mkelly@alteeve.ca>
7 months ago
digimer
3c52d1e28e
Changed how parameters are picked up in Striker->generate_manifest
...
Signed-off-by: digimer <mkelly@alteeve.ca>
7 months ago
digimer
a3ac5cf7f8
Fixed a bug that prevented install manifests from being saved.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
7 months ago
digimer
f08df75384
Made resync checks happen on any striker running for less than two hours.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
7 months ago
digimer
d6c5aa3903
Added a timeout to Database->query() calls.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
7 months ago
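
Commit d6c5aa3903 does not say how the query timeout is implemented. The later SIGALRM commits suggest an alarm-based approach, so the Python sketch below assumes one. `QueryTimeout` and `call_with_timeout` are hypothetical names, and the technique works only on Unix, in the main thread, with whole-second timeouts.

```python
import signal


class QueryTimeout(Exception):
    """Raised when the wrapped call exceeds its time budget."""


def _on_alarm(signum, frame):
    raise QueryTimeout("query timed out")


def call_with_timeout(func, seconds):
    """Run func(), raising QueryTimeout if it runs longer than `seconds`."""
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)
    try:
        return func()
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler
```

If nothing catches the alarm, the default action for SIGALRM is to terminate the process, which is one way a program can appear to exit without explanation.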
digimer
368673eac2
Added a flag for when NM is changed and, if set, NM is restarted.
...
* Also bumped nmcli sleeps to 5s.
Signed-off-by: digimer <mkelly@alteeve.ca>
7 months ago
digimer
acf30229ef
Added code to restart NetworkManager if needed
...
Signed-off-by: digimer <mkelly@alteeve.ca>
7 months ago
digimer
b990d21dc3
Fixed a bug where migrations would needlessly fail memory checks.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
7 months ago
digimer
ab33c716cb
Created a specific check that there's a hosts entry for each DB
...
* This is meant to deal with a case where, when a DB is added to
anvil.conf but that new entry is not yet in hosts, the program crashes
because of a duplicate key when calling insert_or_update_hosts for all
DBs.
Signed-off-by: digimer <mkelly@alteeve.ca>
7 months ago
digimer
3d50f45984
Added a 1 second delay to nmcli calls
...
* Also fixed a bug in Database->get_storage_group_data() by adding a
missing column when adding members.
Signed-off-by: digimer <mkelly@alteeve.ca>
7 months ago
digimer
033052f449
Shortened the time to reboot when no DBs come back after net reconfig
...
* Also updated to directly call a reboot.
Signed-off-by: digimer <mkelly@alteeve.ca>
7 months ago
digimer
8e53993f67
Shortened the anvil-daemon job start up delay.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
7 months ago
digimer
6826b12188
Added a start for configured interfaces found to be down after boot.
...
* Added the 'up' parameter to Network->collect_data() that will bring up
an interface we configured that is down.
* Updated scan-network to call Network->collect_data() with 'up' if the
uptime is less than ten minutes.
Signed-off-by: digimer <mkelly@alteeve.ca>
8 months ago
digimer
6d121dc0c0
Mapped each interface name in match.interface-name to a UUID lookup.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
8 months ago
digimer
7925a3f42c
* Added more man pages.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
8 months ago
Fabio M. Di Nitto
9cfadcf096
Merge pull request #648 from ClusterLabs/fix-fence-opts-parsing
...
fence: do not load switches for deprecated agents options
8 months ago
Fabio M. Di Nitto
ef8bb19e60
fence: do not load switches for deprecated agents options
...
loading deprecated options causes switches to be overwritten during
xml parsing, generating incorrect pacemaker configs
Closes: https://github.com/ClusterLabs/anvil/issues/636
Signed-off-by: Fabio M. Di Nitto <fabbione@fabbione.net>
8 months ago
Fabio M. Di Nitto
494e538257
Merge pull request #647 from ClusterLabs/fix-distcheck-as-user
...
build: fix make distcheck as user vs root
8 months ago
Fabio M. Di Nitto
def90f2daa
build: fix make distcheck as user vs root
...
use proper autotool way to install / uninstall files
Signed-off-by: Fabio M. Di Nitto <fabbione@fabbione.net>
8 months ago
Digimer
e4bd962715
Merge pull request #644 from ClusterLabs/upgrade-tools
...
Upgrade tools
8 months ago
digimer
5c3d1860c8
Made the host_key check conditional on an available DB
...
Signed-off-by: digimer <mkelly@alteeve.ca>
8 months ago
digimer
9775612de7
Added an explicit check that IPs for a hostname are added in known_hosts
...
Signed-off-by: digimer <mkelly@alteeve.ca>
8 months ago
digimer
1152c50f3a
Added pcsd config, and -y support.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
8 months ago
digimer
3e63b726d3
Added node 2 joining an Anvil! node if not started by node 1.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
8 months ago
digimer
e00dec7cba
Added loading existing corosync/authkey from peer during rebuild.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
8 months ago
digimer
ec6acdd6d8
Reworked host validation to avoid warnings in logs.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
8 months ago
digimer
bd2e4c46ae
Updated Network->load_ips() to use the device_name when available.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
8 months ago
digimer
45e3a1e8a9
Updated Remote->_check_known_hosts_for_target() to replace updated keys
...
Signed-off-by: digimer <mkelly@alteeve.ca>
8 months ago
digimer
9999d6f522
Fixed a bug where nics were not being found by their NM device name
...
Signed-off-by: digimer <mkelly@alteeve.ca>
8 months ago
digimer
7ecd0a4d70
Starting work on rejoining a replacement subnode to an Anvil! node
...
Signed-off-by: digimer <mkelly@alteeve.ca>
8 months ago
Digimer
84e321ff7d
Merge pull request #635 from ClusterLabs/tools-dev
...
Tools dev
9 months ago
digimer
863a7b1b07
Added recording of missing data in the crm_mon parser
...
Signed-off-by: digimer <mkelly@alteeve.ca>
9 months ago
digimer
014136ddd0
Added manual parsing of crm_mon XML when parsing resource states.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
9 months ago
digimer
44aa0fb8d9
Bumped logging to debug periodic strike init resync failure
...
Signed-off-by: digimer <mkelly@alteeve.ca>
9 months ago