anvil

Commit Graph

Author	SHA1	Message	Date
digimer	518fddfa82	More progress on the new NM version of anvil-configure-host * It's technically done, but I know bugs remain. * Updated Jobs->update_progress() to take 'file' and 'line' to make it easier in the logs to see the origin of the message, when logging the update. * Created Network->modify_connection() to update network manager variables. Created ->reset_connection() to take an interface down and bring it back up again. * Fixed a bug in scan-network where the device_to_uuid hash wasn't being stored. Signed-off-by: digimer <mkelly@alteeve.ca>	11 months ago
digimer	83057d0b45	Fixed several bugs around renaming interfaces * Also fixed problems with scan-network related to the new network naming / NM system. * Updated Database->insert_or_update_network_interfaces() to better search for a network_interface_uuid when not specified. * Updated Network->collect_data() to take the new 'start' parameter which, when set, brings up unconfigured connections/devices. Signed-off-by: digimer <mkelly@alteeve.ca>	11 months ago
digimer	cad524db9d	Removed anvil-update-states * Created new anvil-monitor-network daemon to trigger scan-server via anvil-monitor-network on network events. * Moved functionality into scan-network Signed-off-by: digimer <mkelly@alteeve.ca>	11 months ago
digimer	a27773a69d	scan-network now records interfaces, bonds and bridges! * Much testing still needed, but this is a significant milestone. Signed-off-by: digimer <mkelly@alteeve.ca>	11 months ago
digimer	9c67b97fdd	Fixed a bug in initializing DROP'ed DBs. * Got more work done on adding network_interfaces to the database in scan-server. Signed-off-by: digimer <mkelly@alteeve.ca>	11 months ago
digimer	ec11335197	Fixed DB initialization bugs. * More work done on the new network stack also. Signed-off-by: digimer <mkelly@alteeve.ca>	11 months ago
digimer	52e7875252	Bumped logging to find '!!error!!' related parsing errors. Signed-off-by: digimer <mkelly@alteeve.ca>	11 months ago
digimer	c5e72797fd	Made the match for the partition 'swap' more flexible. Signed-off-by: digimer <mkelly@alteeve.ca>	11 months ago
digimer	e4fc831284	Added a missing variable to an alert. Signed-off-by: digimer <mkelly@alteeve.ca>	11 months ago
digimer	822854f0c3	Fixed a bug in scan-ipmitool that was causing duplicate history entries * Increased logging to scan-ipmitool and scan-network to help trace a duplicate DB entry bug. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	51978e1609	Update scan-server to only alert on large boot time changes Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	a11b87458e	Gracefully handle errors from changed node host names in scan-cluster. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	5ec395c53a	Reworked DB resync logic. With this new system, a 'primary_db' is chosen (first connected DB UUID when sorted) and only it does resyncs. Further, resyncs have been pulled from all tools except anvil-daemon. So with this new system, the chances of duplicate, simultaneous resyncs should be removed (hopefully for real this time). * Database->check_agent_data() no longer calls a resync after loading a schema. * Removed the Database->connect() 'all' parameter * The database used to read from is now always the same as the primary, even if there is a local DB. * Database->connect() 'check_for_resync' parameter can now be set to '2', which means "check for resync _if_ I am primary", where '1' still checks for resync no matter what. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	b1f89c2723	Finished initial version of striker-show-jobs * Updated Database->get_jobs() to take 'job_host_uuid = all' to allow loading jobs from all cluster machines. Also updated it to record the 'job_host_uuid' and the unix timestamp version of 'modified_date'. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	829ae546a2	Beginning work on new Server->locate() method to find servers across an Anvil! cluster. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	122816255d	* Fixed a bug where a sensor value of '0' was being interpreted as the value not existing. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	580980717d	This commit covers the conversion of 'virsh' shell calls to using 'Sys::Virt' module, and fixes several small bugs related to scan-server; * Switched all calls to virsh to use Sys::Virt to deal with contention of simultaneous virsh calls. * Removed collecting screenshots from scan-server. * Fixed a bad variable substitution in an alert. * Fixed a bug where a server's boot time wasn't being recorded properly. * Reworked how we determine which server definition was most recently updated and propagated. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	a81a110261	* Remove forced log level and secure logging. This addresses issue #386 Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	a0cb791f47	This contains fixes needed for beta from additional testing. * Updated the pcs wrapper to flock anything but status calls. * Updated scan-apc-pdu to purge regardless of the host it's called on. * Fixed a bug in striker-purge-target that wouldn't purge anvil nodes in various cases. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	6ee2ad75db	* Updated anvil-delete-server to actively check for and delete any drbd-fenced attributes left over in the CIB after a server is deleted. This addresses issue #374 . Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	6a7c9923ad	* Fixed second variable replacement bug, re issue #338 . Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	9ebe192306	This fixes a variable substitution bug, addressing issue #338 . Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	7258781712	* Updated scan-cluster to detect stale drbd-fenced attributes in the CIB, generally left after a server is deleted. This addresses issue #374 . Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
Tsu-ba-me	dac247f66e	fix(scancore-agents): get screenshot of server(s) running on local node in scan-server	1 year ago
digimer	e0316da88b	* Got anvil-manage-server-storage working enough to grow existing disk's hard drive sizes, and to insert/eject optical disks. * Hit a bug where a server's definition file was written to disk while not being valid. Added logging in case it happens again, and additional safe-guards to help avoid it from recurring. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	dda0fbd7d5	* Updated DRBD->allow_two_primaries() to be more careful at evaluating peer-node-id. * Updated DRBD->manage_resource() to set allow-two-primaries=no when up'ing a resource (as no migration can be in progress during an up command). * Updated scan-drbd to look for StandAlone resources and call DRBD->manage_resource({task = 'up'}) if a connection to a peer node is StandAlone or if the local disk state is detached. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	929806cef7	Fixed variable substitution names in scan-server. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	b03587967b	* Updated Cluster->add_server() to batch the creation of the server and the location constraints in one commit to the CIB. * Updated scan-lvm to look for and delete duplicate entries. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	b7abc481e6	Updated scan-cluster to check to see that migrate_to and migrate_from are given a timeout of 600s and an on-fail of "block". Updated Cluster->add_server() to set migrate_from to timeout=600s and on-fail=block as well. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	bc3d04ad2e	* Updated Cluster->add_server() to wait up to 15 seconds for a server to appear to ensure that the pcs call to add the server with the right requested running state. * Updated Cluster->recover_server() to set the desired recovery state before calling the crm_resource refresh. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	0e57836c8f	This commit addresses (hopefully) issue #329 . * Updated DRBD->get_status() to attempt to recompile the drbd kernel module if the drbdsetup status fails. If it continues to fail, it exits gracefully now. * Updated ocf:alteeve:server to test access over a given IP before calling Server->find to avoid timeouts when the peer is down. Also updated it to set the constraints to keep the server on the new host when the old host returns to the cluster. * Fixed a bug in scan-cluster where a server that is FAILED but not running is now properly recovered. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	510db70253	Another attempt to resolve the storage group race condition. This moves the check for auto-assembly to scan-lvm. It only works for the first assemble, after that the user can/should use anvil-manage-storage-groups. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	83aa4e6a5f	Updated scan-cluster to check for FAILED resources (servers) and, if found, attempt to recover it. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	1afa7ce09e	* Created Cluster->recover_server() that uses crm_resource to try to recover a server that has entered a FAILED state. * Updated (not yet completed) scan-cluster's check_resources() function to check if a FAILED server is ready to try to recover. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	c7a923fdfb	* Fixed a bug in scan-server where DELETED servers were being set to 'shut off'. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	bf2e3e25fb	* Added a check for undefined variable/value pairs in cachevault data that was causing SQL UPDATE errors. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
Deezzir	7d5f18b20d	fix: introduced optional arg for clean_spaces	2 years ago
Deezzir	deac1fc6a8	fix: introduced optional arg for clean_spaces	2 years ago
digimer	efebd135eb	* Removed more references to 'dr1_host_uuid' from the old way of linking DR hosts to Anvil! nodes. * Fixed a bug where servers protected by DR hosts aren't deleted when the server itself is deleted. * Updated DRBD->delete_resource() to remove the server's XML file if the host is a DR host. * Updated anvil-version-change and anvil.sql to enable update_audits and the audits table. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	fea10e5bb1	* Prefixed all 'virsh' calls with 'setsid --wait' to help prevent future hangs if the call happens without a shell. * Updated anvil-manage-server-storage to the point where it can now insert and eject optical disks! * Updated System->call to log parameters if 'shell_call' isn't set. * Fixed a bug in anvil-manage-server process_interactive where an $anvil->data reference was being scoped. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	7710d9d109	* Created the new anvil-manage-server-storage tool which will specifically handle managing a server's disks. * Created DRBD->parse_resource() to pass a specific DRBD resource's XML data. * Fixed a bug in Get->available_resources() so that if the threads is lower than CPU cores, the cores are used as the total available to VMs. * Fixed bugs in Get->server_from_switch() where it just wasn't working properly. * Updated scan_drbd to not reset a resource's size to 0-bytes when a resource goes offline. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	76c8088aee	* Updated scan-apc-pdu to only run on the active striker DB (as set during Database->connect()) to prevent contention from simultaneous scan agent runs from different machines. Signed-off-by: digimer <digimer@gravitar.alteeve.com>	2 years ago
digimer	0fa6ddebc5	Updated scan-network to see an interface state of 'activated' as up (used to check specifically for 'active'). Signed-off-by: digimer <digimer@gravitar.alteeve.com>	2 years ago
Digimer	eae2ab4d9f	* Undid the #!no_value!# -> !!no_value!! change as it broke language processing. * Fixed a bug in scan-apc-pdu that was preventing it from compiling. Signed-off-by: Digimer <digimer@alteeve.ca>	2 years ago
Digimer	4528f07508	* Fixed a bug where fence-handler was repeatedly added by scan-drbd. Signed-off-by: Digimer <digimer@alteeve.ca>	2 years ago
Digimer	4ba1982183	This is the start of a set of changes needed to rework how we handle DRBD fence requests, so that they create location constraints instead of triggering a full stonith fence. * In Cluster->parse_cib(), added parsers for node attributes and resource rules. Also stored the existence of and details of each under the server resources for easier referencing. * Updated scan-server to check for / add DRBD fence rules as needed. Scancore APC agent bugs; * For clarity, converted all '#!no_value!#' and '#!no_connection!#' to use '!!' instead in APC scan agents. * Fixed a bug to set/clear alerts related to phases disappearing to deal with concurrent logins from different hosts triggering false phase loss alerts. * Fixed missing variables not being passed to alerts/log entries. Started more work on anvil-manage-server, but on hold again while the DRBD fencing work is completed. Signed-off-by: Digimer <digimer@alteeve.ca>	2 years ago
Digimer	13b0f5bdcc	Bumped 'Exhaust Temp' jump threshold to 30c in scan-ipmitool. Adjusted some logging. Signed-off-by: Digimer <digimer@alteeve.ca>	2 years ago
Digimer	a4ef93404c	* Fixed a bug in DRBD->gather_data() to remove trailing commas for existing TCP ports. * Added the missing 'clear-mapping' switch to Get->switches in anvil-daemon. Signed-off-by: Digimer <digimer@alteeve.ca>	2 years ago
Digimer	ac8135709a	Fixed a bug where scan-server faulted with a divide by zero error when the host had no swap. Signed-off-by: Digimer <digimer@alteeve.ca>	2 years ago
Digimer	2fab7bc1b7	This adds support (testing needed) for "Long-Throw" DR; which is a wrapper for using 'drbd-proxy' to provide larger transmit buffers for slow/high-latency DR hosts. * Created DRBD->check_proxy_license() to do (some level of) sanity checks on the DRBD proxy license file. * Updated DRBD->gather_data() to parse out the inside and outside ports for resource configs using proxy. * Reworked DRBD->get_next_resource() to return 1, 3 or 7 TCP ports depending, with the new long_throw_ports parameter triggering the 7 ports. * Added 'tcpdump' to the anvil-core requires list. * Reworked scan-drbd to record the ports used in proxy configs. This required adding a check to change the 'scan_drbd_peer_tcp_port' column type to 'text' to support CSVs. * Reworked anvil-manage-dr (needs testing!) to support "long-throw" DR configs. * Updated anvil-safe-stop to check if the nodes are in the cluster before trying to migrate. Signed-off-by: Digimer <digimer@alteeve.ca>	2 years ago

182 Commits (518fddfa8286966098773b100e52f53aa7632b04)