digimer
0e57836c8f
This commit addresses (hopefully) issue #329.
...
* Updated DRBD->get_status() to attempt to recompile the drbd kernel module if the drbdsetup status fails. If it continues to fail, it exits gracefully now.
* Updated ocf:alteeve:server to test access over a given IP before calling Server->find to avoid timeouts when the peer is down. Also updated it to set the constraints to keep the server on the new host when the old host returns to the cluster.
* Fixed a bug in scan-cluster so that a server that is FAILED but not running is now properly recovered.
Signed-off-by: digimer <mkelly@alteeve.ca>
1 year ago
Fabio M. Di Nitto
7cee742b67
Merge pull request #331 from ClusterLabs/digimer-patch-1
...
Delete notes
1 year ago
Digimer
711322f273
Delete notes
1 year ago
Digimer
91de6bb30e
Merge pull request #330 from ClusterLabs/anvil-tools-dev
...
* Fixes issue #329 ; When multiple attributes exist when checking if w…
2 years ago
digimer
284a2957d6
* Fixes issue #329; when multiple attributes exist while checking if we're in maintenance mode in fence_pacemaker, the expected hash reference was actually an array reference.
...
* Fixed a bug in anvil-version-changes where update_file_location_ready() needed to be called before update_file_locations().
* Added a bit more logging for future debugging.
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
Digimer
f3a65fc04d
Merge pull request #328 from ClusterLabs/anvil-tools-dev
...
This should resolve issue #271.
2 years ago
digimer
8f375c58a9
* Fixed a typo in anvil-daemon that prevented compiling.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
digimer
110dceb55e
* Added a check to make sure files were ready before provisioning a server.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
digimer
c50a1936c0
* This adds the new 'file_locations' -> 'file_location_ready' column and associated methods. This is set to TRUE/1 when the file referenced is found on disk and it is the expected size and md5sum. This is meant to allow programs to wait/watch for a file to be ready if they need to use it. Files are now checked periodically via anvil-daemon.
...
* Removed hard-coded log levels in anvil-provision-server and anvil-manage-storage-groups.
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
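The readiness test described in this commit (file on disk, expected size, expected md5sum) could be sketched roughly as follows. This is an illustrative Python sketch, not the project's code; the function name `file_is_ready` and its parameters are hypothetical.

```python
import hashlib
import os


def file_is_ready(path, expected_size, expected_md5):
    """Return True only if the file exists with the expected size and md5sum.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    if not os.path.isfile(path):
        return False
    # Cheap check first: a size mismatch means the copy is incomplete or wrong.
    if os.path.getsize(path) != expected_size:
        return False
    # Hash in chunks so large files are not read into memory at once.
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5
```

Checking the size before hashing keeps a periodic check cheap while a file is still being copied.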
Digimer
dfc0c2c492
Merge pull request #326 from ClusterLabs/anvil-tools-dev
...
* Fixed a bug where, when DRBD->gather_data() calls 'drbdadm dump-xml…
2 years ago
digimer
26fa3c7e32
Fixed a bug where Get->available_resources() was missing LVM/storage group data in some cases.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
digimer
510db70253
Another attempt to resolve the storage group race condition. This moves the check for auto-assembly to scan-lvm. It only works for the first assembly; after that the user can/should use anvil-manage-storage-groups.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
digimer
e483840ceb
Second attempt to fix the storage group race condition. This time, we only let node 1 assemble storage groups.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
digimer
d64044c7d1
Test fix for storage group race condition.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
digimer
1bba56a5b1
Hard-coded anvil-provision-server to log level 2 while chasing a race condition in storage groups.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
digimer
9a58f4d1ff
* This is a small commit to increase logging while chasing down a race condition issue with assembling storage groups.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
digimer
895f1ec262
This fixes a race condition when multiple servers are provisioned at (nearly) the same time.
...
* In DRBD->get_next_resource(), implemented a "hold" system where the DRBD minor and TCP port(s) returned are marked as being held for one minute, so subsequent calls won't use the same numbers.
* In anvil-daemon, added a check in run_jobs() where only one instance of a given job command will be started per 2-second loop. This should help reduce the chance of simultaneous race conditions in general.
* Removed from anvil-provision-server and most other tools the call to Job->get_job_uuid(). If the program is called without the job_uuid, don't try to find it. This allows a human (or script) to make repeated calls to a program without one of those calls running a pending job instead.
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
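The "hold" idea in this commit, where a returned number is reserved for one minute so a near-simultaneous caller gets a different one, might look roughly like this. It is a Python sketch under assumptions, not the DRBD->get_next_resource() implementation; the class `ResourceHolds`, its methods, and the in-memory dictionary are all hypothetical.

```python
import time


class ResourceHolds:
    """Hand out the next free number and hold it briefly (hypothetical sketch)."""

    def __init__(self, hold_seconds=60, clock=time.monotonic):
        self.hold_seconds = hold_seconds
        self.clock = clock
        self.held = {}  # number -> time at which the hold expires

    def _is_free(self, number, in_use):
        expiry = self.held.get(number)
        if expiry is not None and expiry <= self.clock():
            # The hold has lapsed, so forget it.
            del self.held[number]
            expiry = None
        return expiry is None and number not in in_use

    def next_free(self, start, in_use):
        """Return the lowest number >= start that is neither in use nor held."""
        number = start
        while not self._is_free(number, in_use):
            number += 1
        # Mark the number as held so a concurrent caller skips it.
        self.held[number] = self.clock() + self.hold_seconds
        return number
```

With minors 0 and 1 already in use, two back-to-back calls would return 2 and then 3; once the hold lapses, 2 becomes available again unless it has actually been put to use.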
digimer
e7537b0ca3
* Fixed a bug where, when DRBD->gather_data() calls 'drbdadm dump-xml' and the output includes usage data, it breaks XML parsing.
...
* Fixed a bug in Get->available_resources() where DELETED servers were being counted in the used resources math.
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
digimer-bot
cc32d5b606
Merge pull request #320 from ClusterLabs/anvil-tools-dev
...
Anvil tools dev
2 years ago
digimer
c11be1ad1a
Added a skip to ignore dot files when looking at new files.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
digimer
dc7b909bfc
More logging to debug storage group race condition
...
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
digimer
bd575c6a7d
Bumped logging for storage group management.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
digimer
0874ad571a
Updated anvil-safe-start to not give up on starting corosync/pacemaker if it fails on the first try.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
digimer
8ba613952c
Typo fix.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
digimer
83a527f4fa
* Moved enabling anvil-safe-start out of the RPM and into anvil-join-anvil.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
digimer
89eae7098e
NOTE: This updates the reserved RAM to 8 GiB from 4 GiB!
...
* Adds support for 'anvil_resources::ram::reserved' that can be set to a number of MiB to override the default 8192.
* Adds support for 'anvil::<anvil_uuid>::resources::ram::reserved' to allow a per-Anvil! node override of the reserved RAM default; it also takes precedence over the 'anvil_resources::ram::reserved' option.
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
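The precedence described in this commit (per-Anvil! override, then the global option, then the 8192 MiB default) can be sketched as below. This is an illustrative Python sketch: it assumes the keys are spelled with `::ram::` and that the configuration is available as a flat dictionary, and the function name `reserved_ram_mib` is hypothetical.

```python
DEFAULT_RESERVED_MIB = 8192  # default reserved RAM stated in the commit message


def reserved_ram_mib(config, anvil_uuid):
    """Resolve reserved RAM in MiB: per-Anvil! key, then global key, then default.

    'config' is assumed to be a flat dict of option name -> value.
    """
    per_anvil_key = f"anvil::{anvil_uuid}::resources::ram::reserved"
    global_key = "anvil_resources::ram::reserved"
    for key in (per_anvil_key, global_key):
        value = config.get(key)
        # Only accept plain non-negative integers; ignore anything else.
        if value is not None and str(value).strip().isdigit():
            return int(value)
    return DEFAULT_RESERVED_MIB
```

Ignoring non-numeric values rather than failing is a design choice of this sketch, not something the commit message specifies.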
digimer
f086c1be39
Fixed a bug where the total RAM was shown instead of the free RAM.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
digimer
fdf49c696f
Updated anvil-report-usage to ignore deleted servers. Also added a check to load hosts if they are not already loaded.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
digimer
c956f75406
Enabled anvil-safe-start in '%post node'.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
digimer
025c2a6f54
* Updated Email->get_next_server() to ignore DELETED mail servers, and it now loads mail servers if not yet in memory.
...
This resolves issue #306.
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
digimer
fb70836126
This moves the call of anvil-safe-start out of scancore and into a new, dedicated systemd unit that runs on boot only.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
Digimer
6bce292969
Merge pull request #319 from ClusterLabs/anvil-tools-dev
...
Anvil tools dev
2 years ago
digimer
83aa4e6a5f
Updated scan-cluster to check for FAILED resources (servers) and, if any are found, attempt to recover them.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
digimer
1afa7ce09e
* Created Cluster->recover_server() that uses crm_resource to try to recover a server that has entered a FAILED state.
...
* Updated (but not yet completed) scan-cluster's check_resources() function to check if a FAILED server is ready to try to recover.
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
digimer
f9689a7106
Updated ocf:alteeve:server to look for '/tmp/<resource>.fail' and, if that file exists, exit with rc:1. This is done to allow for testing.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
digimer
9bf0f50084
Added a check to see if the server's UUID exists, looping if not, to prevent uninitialized variable warnings.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
Digimer
660f38ac16
Merge branch 'main' into anvil-tools-dev
2 years ago
digimer
cf73d8ed36
* Updated System->configure_ipmi() to auto-configure DR hosts once they've been assigned a BCN IP address.
...
Signed-off-by: digimer <mkelly@alteeve.ca>
2 years ago
Tsu-ba-me
567abff9de
fix(striker-ui): add manage UPS tab
2 years ago
Tsu-ba-me
759cd6f58a
fix(striker-ui): add form validation and message in ManageUpsPanel
2 years ago
Tsu-ba-me
2f84f52090
fix(striker-ui): passthrough input validation in EditUpsInputGroup
2 years ago
Tsu-ba-me
aa5aad4689
fix(striker-ui): add input validation to AddUpsInputGroup
2 years ago
Tsu-ba-me
d3894081f6
fix(striker-ui): add input tests to CommonUpsInputGroup
2 years ago
Tsu-ba-me
afdd376759
fix(striker-ui): correct validity test on first render in InputWithRef
2 years ago
Tsu-ba-me
0c1ec5a88a
fix(striker-ui): expose blur and focus event handler slots in SelectWithLabel
2 years ago
Tsu-ba-me
36f9938767
fix(striker-ui): organize types in useFormUtils hook
2 years ago
Tsu-ba-me
26881c0436
fix(striker-ui): expose isRequired in build test batch functions
2 years ago
Tsu-ba-me
737850f9d0
fix(striker-ui): add hook useFormUtils
2 years ago
Tsu-ba-me
442427cf63
fix(striker-ui): add arbitrary slot before action area in ConfirmDialog
2 years ago
Tsu-ba-me
4400bf6645
fix(striker-ui): make buildMapToMessageSetter() handle array ids
2 years ago