anvil

Commit Graph

Author	SHA1	Message	Date
Tsu-ba-me	b3f2644d07	fix: allow parameter to overwrite cgi input in Account->login	1 year ago
Tsu-ba-me	226c423af0	fix: allow param override in generate_manifest in Striker.pm	1 year ago
digimer	156a0ca201	Updated anvil-daemon's new job launching logic to allow the restart of a running job that failed out early. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	cc15eca6fb	* Added anvil-watch-power to git. * Added a check to cleanup size input to Convert->human_readable_to_bytes() when passed pre-processed strings. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	47f7a35df3	The main purpose of this commit is to add serial execution of similar jobs to help reduce race conditions for scripted jobs, like multiple server creation. * Fixed a small logging bug in DRBD->allow_two_primaries(). * Updated Database->get_jobs() to record jobs sorted by modified_date so that jobs can be run in the order they were recorded. * Updated anvil-daemon to track which commands need to be run, and when two or more of the same command need to be run, they're run serially, with each subsequent run starting after the previous one completes. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	dda0fbd7d5	* Updated DRBD->allow_two_primaries() to be more careful at evaluating peer-node-id. * Updated DRBD->manage_resource() to set allow-two-primaries=no when up'ing a resource (as no migration can be in progress during an up command). * Updated scan-drbd to look for StandAlone resources and call DRBD->manage_resource({task = 'up'}) if a connection to a peer node is StandAlone or if the local disk state is detached. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	b6a249d5e7	* Updated Cluster->add_server() to set the preferred host based first on if the server is running on a node, and if not, on the primary node (where before it defaulted to node 1). * Updated DRBD->delete_resource() to call scan-drbd and scan-lvm to ensure that the database is updated with the newly freed resources. * Updated anvil-delete-server and anvil-provision-server to call select scan agents to ensure freed resources are immediately recorded. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	b03587967b	* Updated Cluster->add_server() to batch the creation of the server and the location constraints in one commit to the CIB. * Updated scan-lvm to look for and delete duplicate entries. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	b7abc481e6	Updated scan-cluster to check to see that migrate_to and migrate_from are given a timeout of 600s and an on-fail of "block". Updated Cluster->add_server() to set migrate_from to timeout=600s and on-fail=block as well. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	c82bd9d73a	* Created the new anvil-watch-power tool that shows the status of UPSes known on the system, including their "on battery" state, charge percentage, estimated hold up time, etc. * Updated Database->get_power() and ->get_upses() to store both the time stamp and unix time stamps. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	bc3d04ad2e	* Updated Cluster->add_server() to wait up to 15 seconds for a server to appear to ensure that the pcs call to add the server with the right requested running state. * Updated Cluster->recover_server() to set the desired recovery state before calling the crm_resource refresh. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	0e57836c8f	This commit addresses (hopefully) issue #329 . * Updated DRBD->get_status() to attempt to recompile the drbd kernel module if the drbdsetup status fails. If it continues to fail, it exits gracefully now. * Updated ocf:alteeve:server to test access over a given IP before calling Server->find to avoid timeouts when the peer is down. Also updated it to set the constraints to keep the server on the new host when the old host returns to the cluster. * Fixed a bug in scan-cluster where a server that is FAILED but not running is now properly recovered. Signed-off-by: digimer <mkelly@alteeve.ca>	1 year ago
digimer	c50a1936c0	* This adds the new 'file_locations' -> 'file_location_ready' column and associated methods. This is set to TRUE/1 when the file referenced is found on disk and it is the expected size and md5sum. This is meant to allow programs to wait/watch for a file to be ready if they need to use it. Files are now checked periodically via anvil-daemon. * Removed hard-coded log levels in anvil-provision-server and anvil-manage-storage-groups. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	26fa3c7e32	Fixed a bug where Get->available_resources() was missing LVM/storage group data in some cases. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	510db70253	Another attempt to resolve the storage group race condition. This moves the check for auto-assembly to scan-lvm. It only works for the first assemble, after that the user can/should use anvil-manage-storage-groups. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	e483840ceb	Second attempt to fix the storage group race condition. This time, we only let node 1 assemble storage groups. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	d64044c7d1	Test fix for storage group race condition. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	9a58f4d1ff	* This is a small commit to increase logging while chasing down a race condition issue with assembling storage groups. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	895f1ec262	This fixes a race condition when multiple servers are provisioned at (nearly) the same time. * In DRBD->get_next_resource(), implemented a "hold" system where the DRBD minor and TCP port(s) returned are marked as being held for one minute. So subsequent calls won't use the same numbers. * In anvil-daemon, added a check in run_jobs() where only one instance of a given job command will be started per 2-second loop. This should help reduce the chance of simultaneous race conditions in general. * Removed from anvil-provision-server and most other tools the call to Job->get_job_uuid(). If the program is called without the job_uuid, don't try to find it. This allows a human (or script) to make repeated calls to a program without one of those calls running a pending job instead. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	e7537b0ca3	* Fixed a bug where, when DRBD->gather_data() calls 'drbdadm dump-xml' and the output includes usage data, it breaks XML parsing. * Fixed a bug in Get->available_resources() where DELETED servers were being counted in the used resources math. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	dc7b909bfc	More logging to debug storage group race condition Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	bd575c6a7d	Bumped logging for storage group management. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	89eae7098e	NOTE: This updates the reserved RAM to 8 GiB from 4 GiB! * Adds support for 'anvil_resources::ram::reserved' that can be set to a number of MiB to override the default 8192. * Adds support for 'anvil::<anvil_uuid>::resources::ram::reserved' to allow for per-Anvil! node override on the reserved RAM default, and over the 'anvil_resources::ram::reserved' option. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	025c2a6f54	* Updated Email->get_next_server() to ignore DELETED mail servers, and it now loads mail servers if not yet in memory. This resolves issue #306. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	1afa7ce09e	* Created Cluster->recover_server() that uses crm_resource to try to recover a server that has entered a FAILED state. * Updated (but not yet completed) scan-cluster's check_resources() function to check if a FAILED server is ready to try to recover. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	f9689a7106	Updated ocf:alteeve:server to look for '/tmp/<resource>.fail' and, if that file exists, exits with rc:1. This is done to allow for testing. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	cf73d8ed36	* Updated System->configure_ipmi() to auto-configure DR hosts once they've been assigned a BCN IP address. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	1c274ba58d	* Fixed a bug in anvil-delete-server that was preventing the complete deletion of a server if the DRBD resource had already been removed. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
Deezzir	109aa1ba3d	docs: added annotation for the new arg	2 years ago
Deezzir	7d5f18b20d	fix: introduced optional arg for clean_spaces	2 years ago
Deezzir	9241b5ef6a	docs: added annotation for the new arg	2 years ago
Deezzir	deac1fc6a8	fix: introduced optional arg for clean_spaces	2 years ago
digimer	ddc6965b60	* Fixed a bug where references to files on Anvil! nodes were broken in anvil-provision-server and anvil-manage-files. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	efebd135eb	* Removed more references to 'dr1_host_uuid' from the old way of linking DR hosts to Anvil! nodes. * Fixed a bug where servers protected by DR hosts aren't deleted when the server itself is deleted. * Updated DRBD->delete_resource() to remove the server's XML file if the host is a DR host. * Updated anvil-version-change and anvil.sql to enable update_audits and the audits table. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	41fb8baeda	* Fixed a bug in Database->get_storage_group_data() that was deleting DR host storage group members. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	8ff40ec42c	* Fixed a SQL query bug in Database->get_drbd_data(). * Got more work done on anvil-manage-server-storage; Now shows DRBD resource size, backing LV and size, and calculates/displays metadata size. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	040bc02e26	* This adds the new Database->get_drbd_data() that, like ->get_lvm_data, collates the DRBD data collected by scan-drbd into a more readily parsable data structure. * Updated DRBD->parse_resource() to add references to a resource name and volume for a given backing disk. * Continued work on anvil-manage-server-storage. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	8e0e51544c	* Continued work on anvil-manage-server-storage. * Created the new Database->get_lvm_data to compile LVM data from scan-lvm * Updated DRBD->parse_resource to call Database->get_lvm_data if needed, and to track backing devices to Storage Groups. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	b144976853	This resolves Issue #310 . * Updated Database->get_file_locations() to record files available on Anvil! nodes by tracking hosts in Anvil! systems (needed after reworking how DR hosts are linked). * Updated Get->available_resources() to call Database->get_files() and ->get_file_locations() to restore tracking files available on Anvil! nodes. * Fixed a couple display bugs in anvil-provision-server when called with --ci-test --options. * Continued work on anvil-manage-server-storage. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	fea10e5bb1	* Prefixed all 'virsh' calls with 'setsid --wait' to help prevent future hangs if the call happens without a shell. * Updated anvil-manage-server-storage to the point where it can now insert and eject optical disks! * Updated System->call to log parameters if 'shell_call' isn't set. * Fixed a bug in anvil-manage-server process_interactive where an $anvil->data reference was being scoped. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	7891c9b2b1	* Fixed a bug in Network->load_ips() where interfaces were being marked as type 'bridge' or 'bond'. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	5dbdd20d7e	* Fixed a bug in Network->load_ips() where the IP address on a bridge or bond was having the device name recorded incorrectly. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	ab3e8afe6e	Fixed a bug in Storage->push_file() where file path wasn't updated from incoming to files, preventing the push to other hosts from working. Also fixed a minor issue where the file size was sometimes 0, making transfer calculations useless. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	254f7ef4e2	This should fix the tracking of what files belong where, using the new DR links system. It also should finish (though testing is still needed) the serial rsync issue. * Created Database->track_files() as a dedicated method as trying to verify the existence of file_locations during Database->load_anvils() was fragile and prone to recursive loops. * Updated Database->insert_or_update_file_locations() to take an anvil_uuid and recursively call for each host, to maintain compatibility with the old ways, and make it simpler to add an entry for both sub-nodes in an Anvil!. * Created Storage->push_file() that takes a file and rsync's it to all other machines, or creates a job for the file to be pulled if the target can't be accessed. * Updated anvil-manage-files and anvil-sync-shared to use the new Storage->push_files and Database->track_files methods. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	645f54ab89	This commit has more changes than I would normally like, but it's all linked to changing file uploads to rsync serially. * To update file handling for the new DR host linking mechanism, file_locations -> file_location_anvil_uuid was changed to file_location_host_uuid. This required a fair number of changes elsewhere to handle this, with a particular noted change to Database->get_anvils() to look at host_uuid's for the subnodes in an Anvil! and, if either is marked as needing a file, make sure the peer is as well. Similarly, any linked DRs are set to have the file as well. * Created a new Network->find_access that simply takes a target host name or UUID, and it returns a list of networks and IPs that the target can be accessed by. * Updated Network->load_ips() to find the network interface being used for traffic so that things like the interface speed can be recorded, even when an IP is on a bridge or bond. Unrelated, but in this commit, is a restoration of calling scan agents with a timeout now that the virsh hang issue has been resolved. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	7710d9d109	* Created the new anvil-manage-server-storage tool which will specifically handle managing a server's disks. * Created DRBD->parse_resource() to pass a specific DRBD resource's XML data. * Fixed a bug in Get->available_resources() so that if the threads is lower than CPU cores, the cores are used as the total available to VMs. * Fixed bugs in Get->server_from_switch() where it just wasn't working properly. * Updated scan_drbd to not reset a resource's size to 0-bytes when a resource goes offline. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	9751c883cb	* Updated Cluster->assemble_storage_groups() to remove references to anvil_dr1_host_uuid. Also added the logic for auto-adding DR host's VGs to a storage group. Commented it out though as, for now, this might be a bad idea. Needs more thought. * Fixed a bug in Database->get_storage_group_data() to load hosts data when needed. Also fixed a bug where new members didn't return the new storage_group_member_uuid. * Updated anvil-manage-host to use the new switch handler. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	7773e5f9b8	* Updated logging in DRBD->get_devices(). * Added a check and exit if anvil-manage-dr is asked to protect a server on a machine that doesn't know about that server. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	d88fde7733	Updated DRBD->delete_resource() to use '--force' instead of 'echo Yes' (which no longer works). Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago
digimer	e012d6016c	The major point of this commit is to add the new 'anvil-manage-storage-groups' program that, well, manages storage groups. * Updated the storage_group_members table to add the 'storage_group_member_note' that can be set to 'DELETED' to track when a member is deleted. Updated anvil-version-changes to check for and add this column as needed. Updated the anvil.sql schema for the same. * Updated Cluster->insert_or_update_storage_group_members to add the new column. Signed-off-by: digimer <mkelly@alteeve.ca>	2 years ago

797 Commits (997c501d6a88a3780812149d988addc58b622e04)