Commit Graph

3194 Commits

Author SHA1 Message Date
Digimer
5fea8ff46a * Adds the anvil-boot-server man page.
Signed-off-by: Digimer <digimer@alteeve.ca>
2022-08-02 19:09:57 -04:00
Digimer
d8f31d9d84 * Added the anvil-boot-server man page.
Signed-off-by: Digimer <digimer@alteeve.ca>
2022-08-02 17:25:28 -04:00
Digimer
b3b185a43c * Added the alteeve-repo-setup man page and updated it to show that when called with '-h'.
* Updated scancore to use the new Get->switches() list parameter.

Signed-off-by: Digimer <digimer@alteeve.ca>
2022-08-02 14:31:46 -04:00
Digimer
d9910fc951 Finished the man page for anvil-daemon.
Signed-off-by: Digimer <digimer@alteeve.ca>
2022-07-29 17:43:53 -04:00
Digimer
be612ff878 * Updated Get->switches() to take 'list' and 'man' parameters. With list, the passed in switches can be checked to ensure they're valid. With 'man', if set to the name of a man page (usually $THIS_FILE) will be displayed if --help, -h or -? are used.
* Disabled striker-parse-oui until it can be reworked to store the the OUI data in a flat file instead of in the database.

Signed-off-by: Digimer <digimer@alteeve.ca>
2022-07-29 16:56:40 -04:00
Digimer
cd220e97dc Disabled striker-prep-databas and set Database->configure_pgsql() calls to use debug => 2.
Signed-off-by: Digimer <digimer@alteeve.ca>
2022-07-20 20:32:18 -04:00
Digimer
cf8198ac9a Fixed a typo causing a compilation error.
Signed-off-by: Digimer <digimer@alteeve.ca>
2022-07-20 15:05:27 -04:00
Digimer
3343ecaf9f * Added check to disable/stop firewalld if running when anvil-daemon starts.
Signed-off-by: Digimer <digimer@alteeve.ca>
2022-07-19 21:21:48 -04:00
Digimer
508e278359 Added the new 'anvil-network-profiler' tool.
Signed-off-by: Digimer <digimer@alteeve.ca>
2022-07-19 21:16:46 -04:00
Digimer
e025f5b927 Fixed line wraps
Signed-off-by: Digimer <digimer@alteeve.ca>
2022-07-09 19:53:58 -04:00
Digimer
7fd6185445 * Disabled firewalling for now. There appears to be an issue starting up with DRBD.
* Updated Convert->time() to return whatever was passed in instead of '#!error!#'.

Signed-off-by: Digimer <digimer@alteeve.ca>
2022-07-09 19:46:38 -04:00
Digimer
171ea74000 * There is a fix in this commit to resolve a race condition where, when reconfiguring the network, the request to set a job to reboot would fail because the connections to all Strikers could be lost, causing Database->_test_access() would error out, blocking the reboot. When restarted, the network would not be changed, so no reboot would be requested, leaving the machine in an innaccesible state.
* Updated anvil-boot-server when called with '--all' to honour boot ordering, delays and condtions.
* Updated Database->get_servers() to collect the server's XML as well as data from the 'servers' table.
* Updated anvil-provision-server to make a new DRBD resource 'secondary' after forcing it to primary to begin the initial sync.

Signed-off-by: Digimer <digimer@alteeve.ca>
2022-07-06 19:22:28 -04:00
Digimer
64458198b3
Merge pull request #235 from ClusterLabs/anvil-tools-dev
Anvil tools dev
2022-07-06 09:11:58 -04:00
Digimer
0b3d282a2c * Updated Network->manage_firewall() to restore a debug level set to 1 for testing.
* Added firewall config rules for DR hosts.

Signed-off-by: Digimer <digimer@alteeve.ca>
2022-07-02 18:23:20 -04:00
Digimer
bce9e2caaf This is the first attempt at enabling firewalld completely. There is a decent chance that problems exist, so it won't be a surprise if a few more commits are needed to this branch before things work.
* Added multiple new private methods to Network that help in managing the firewall.
* Updated Server->boot_server to manage the firewall after the server boots. Updated ->migrate_server to create a job, if a database connection exists, for the migration target to update it's firewall as soon after the server appears as possible.
* Updated ocf:server:alteeve to manage the firewall when called post-migration, in case there was no DB connection and the job above didn't run. Fixed a bug where the disk state wasn't being evaluated properly.
* Updated scan-server to check that the firewall is managed when a server state has changed.
* Updated anvil-daemon to run Network->manage_firewall on startup.
* Heavily reworked 'anvil-manage-server' to either just run 'Network->manage_firewall', or if passed '--server X', to wait for the server to appear for up to 1 minute, then to check that the firewall is managed (to capture servers being migrated to the host.)
* Removed firewall management from striker-prep-database.

Signed-off-by: Digimer <digimer@alteeve.ca>
2022-07-02 17:06:04 -04:00
Digimer
9028611afb * Fixed a bug in Get->bridges were the bridge was not marked as found when parsing json output, breaking ocf:alteeve:server
Signed-off-by: Digimer <digimer@alteeve.ca>
2022-06-30 14:09:12 -04:00
Digimer
f55d605270 * Disabled the new firewall management to prepare for a new merge with main. Needed to resolve in-field client issues.
Signed-off-by: Digimer <digimer@alteeve.ca>
2022-06-30 10:04:55 -04:00
Digimer
b2ea4f9adc * Moved System->manage_firewall() to Network->manage_firewall(). Started working on actually implementing it, which involves basically fully rewritting it.
* Updated tools/Makefile.am and scancore-agents/Makefile.am to add missing files.

Signed-off-by: Digimer <digimer@alteeve.ca>
2022-06-30 00:01:50 -04:00
Digimer
f2d06fa9b1 * Updated striker-parse-oui to only run if/when the system has been running for at least one hour.
Signed-off-by: Digimer <digimer@alteeve.ca>
2022-06-23 21:21:38 -04:00
Digimer
b4362eb643
Merge pull request #233 from ClusterLabs/anvil-tools-dev
Anvil tools dev
2022-06-22 14:18:35 -04:00
Digimer
5c7b174683
Merge branch 'main' into anvil-tools-dev 2022-06-22 08:41:48 -04:00
Fabio M. Di Nitto
5e5fc7fcdc
Merge pull request #234 from ClusterLabs/centos9
[spec] drop unnecessary Requires
2022-06-22 07:35:10 +02:00
Fabio M. Di Nitto
476b35d712 [spec] drop unnecessary Requires
fix installation on c9s and does not break earlier versions

Signed-off-by: Fabio M. Di Nitto <fabbione@fabbione.net>
2022-06-22 04:56:07 +02:00
Digimer
ab9b00a2f7 * Updated anvil-daemon, in its daily checks, to disable ksm and ksmtuned daemons.
* Updated scan-drbd to purge peer records that no longer have corresponding LVM data.
* Updated System->{en,dis}able-service to take the 'now' paramter which, when passed, causes the action to take immediate effect.

Signed-off-by: Digimer <digimer@alteeve.ca>
2022-06-21 22:25:07 -04:00
Digimer
1580ffbb24 Added the 'oui' table to the resync list again.
Signed-off-by: Digimer <digimer@alteeve.ca>
2022-06-20 22:49:01 -04:00
Digimer
b154ec816a * Added network_interfaces, bonds, bridges and ip_addresses tables to the age-out list.
* Confirmed that striker-purge-target works again.

Signed-off-by: Digimer <digimer@alteeve.ca>
2022-06-20 21:21:44 -04:00
Digimer
b77bb81343 * Found a bug where, if a record was deleted from the public schema but not from the history schema, and then later a resync was performed, the record would be added to the peer database's public schema (while still not existing locally). This condition should never occur as data in history should only exist to track the public record. This update checks for this condition and purges those records prior to resync'ng a database table.
* Continued work on fixing issues with striker-purge-target (which led the the discovery of the above bug). Added expliit checks to purge file_location and storage_group data when purging an sub-anvil from the database.

Signed-off-by: Digimer <digimer@alteeve.ca>
2022-06-20 20:52:07 -04:00
Digimer
6ea8365067
Merge pull request #232 from ClusterLabs/anvil-tools-dev
* Added a missing modified_date to ip_addresses in Database->get_ip_a…
2022-06-18 23:44:09 -04:00
Digimer
3caf43ed42 Updated striker-purge-target to check for problems on write of DELETEs.
Signed-off-by: Digimer <digimer@alteeve.ca>
2022-06-18 10:57:50 -04:00
Digimer
cdc23ad490 Small logging fix to striker-auto-initialize-all.
Signed-off-by: Digimer <digimer@alteeve.ca>
2022-06-17 16:43:50 -04:00
Digimer
6c5f48e8ca * Fixed a bug (I think) where initial synchronization was failing because the new locking system tried to register a lock against the peer striker before the peer striker was in the DB.
* Added an 'eval' wrapper around 'Database->write()' where it calls the given DB so that failures log properly instead of crash the program.
* Updated Database->_find_column() to no longer restrict to 'not null' calumn types.
* Fixed a couple typos in Database->read_state().

Signed-off-by: Digimer <digimer@alteeve.ca>
2022-06-17 13:41:26 -04:00
Digimer
911f7cfb6a This is another big commit with a lot of DB work. Getting closer to sorting out the frequent resyncs.
* Changes Database->connect to always use the first DB connected to, not the local one if that applies. This treats the first DB (sorted by UUID) as "primary" and the second (or third...) as more of a backup.
* Moved db_in_use and lock_request to use the 'states' table instead of the variables table. These are set and removed so often that it was messing up things with resync's when the data is transient anyway. Fixed multiple bugs with both to better set and clear properly.
* Created Database->read_state() to assist with the above changes.
* Updated Database->refresh_timestamp() to specifically check that the returned time stamp differs from the previously used one, looping until they differ if needed.
* Disabled striker-manage-install-target when called to update the repos, as the Install Target function doesn't work at this point.

Signed-off-by: Digimer <digimer@alteeve.ca>
2022-06-16 20:10:43 -04:00
Digimer
24f5d39dff This is a set of changes all stemming from trying to debug frequent resyncs. More bugs still to be fixed.
* Updated Database->get_host_from_uuid() to cache results.
* Fixed a bug in Database->get_storage_group_data where a DELETE wasn't deleting from the history schema as well.
* In Database->resync_databases(), references to the old 'host_uuid' that we used to use to resync just the local host's data was removed. Added also a check where two or more entries in a given history schema had the same modified_date and, when found, the newest entry is preserved and the rest are deleted. Before this, a resync where two+ records had the same modified_time would only sync the last record, leaving a mismatch in history schema entries triggering repeated resyncs.
* Fixed a bug in Email->send_alerts() where the 'alerts' table was being updated without a modified_date being set.
* Fixed a bug in System->test_ipmi() where the 'hosts' table was being updated without a modified_date being set.
* Updated scan-network to clear up old deleted ip_addresses, bonds and bridges. Also fixed bugs where public schema records were being deleted without history records being deleted.
* Updated anvil-update-states to fix bugs where DELETEs were happening without setting the modified_date.

Signed-off-by: Digimer <digimer@alteeve.ca>
2022-06-12 23:14:49 -04:00
Digimer
1770e9e0e0 * Fixed a bug where Database resync's where trying to resync tables without history schema entries.
* Updated fence_delay to move the log filehandle close to a saner spot.

Signed-off-by: Digimer <digimer@alteeve.ca>
2022-06-08 21:27:41 -04:00
Digimer
e6dcff1cf1 * Added a missing modified_date to ip_addresses in Database->get_ip_addresses().
* Updated scan-network to purge old historical ip_addresses when clearing duplicates now.

Signed-off-by: Digimer <digimer@alteeve.ca>
2022-05-21 15:52:25 -04:00
Digimer
aa64b0f7fa
Merge pull request #231 from ClusterLabs/anvil-tools-dev
* Updated anvil-provision-server to handle human-readable sizes for d…
2022-05-20 15:14:05 -04:00
Digimer
1b70b49cf8 * Updated Network->find_matches() to try to populate the first and second parameters if they're not passed in.
* Updated Network->load_ips() to load extra information about the interfaces.
* Updated ocf:alteeve:server to not check libvirtd daemon state on server start.
* Updated scan-hardware to check for duplicate entries and purge if found.
* Updated scan-network to check for the 'default' virbr0 interface by checking if the config file exists instead of calling virsh.
* Updated scan-server to have better logging.
* Created the new (and incomplete) anvil-test-alerts tool
* Updated scancore to support --purge to pass to all agents and then exit.
* Updated ScanCore->call_scan_agents() to no longer use 'timeout' as it was causing issues with virsh calls.

Signed-off-by: Digimer <digimer@alteeve.ca>
2022-05-20 10:28:21 -04:00
Digimer
572167d034 * Updated Database->get_storage_group_data() to record the VG name for a given host's VG in a given storage group.
* Updated anvil-provision-server to fix more bugs related to --ci-test.

Signed-off-by: Digimer <digimer@alteeve.ca>
2022-05-10 18:00:37 -04:00
Digimer
d26a16e711 * Updated anvil-provision-server to handle human-readable sizes for disk and ram.
* Updated Database->get_anvils() to make it possible to translate a file name to a file UUID.
* Updated System->test_ipmi() to quote passwords properly. Also dropped the timeouts to 2 seconds.
* Updated anvil-provision-server to support pure CLI switch server provisioning using the --ci-test (and optional --options {--machine}) to allow CI tests.
* Continued work of anvil-manage-server.
* Fixed a bug in striker-prep-database to fix a bug in writing the pg_hba.conf file.

Signed-off-by: Digimer <digimer@alteeve.ca>
2022-05-10 00:42:40 -04:00
Digimer
d59ce5e677
Merge pull request #230 from ClusterLabs/anvil-tools-dev
Created the new anvil-show-local-ips that shows the IPs on the host i…
2022-04-26 23:02:59 -04:00
Digimer
929544bb90 Removed the '--' gateway holder to make it more consistent with the rest of the output.
Signed-off-by: Digimer <digimer@alteeve.ca>
2022-04-26 20:29:37 -04:00
Digimer
7ec4cee143 Created the new anvil-show-local-ips that shows the IPs on the host in an easier to read format, compared to 'ip addr list'.
Signed-off-by: Digimer <digimer@alteeve.ca>
2022-04-26 20:24:15 -04:00
Digimer
240c9b5061
Merge pull request #229 from ClusterLabs/anvil-tools-dev
Updated resync to no longer be tied to a host_uuid.
2022-04-26 11:07:49 -04:00
Digimer
e9a9e0dd4b * Finished (but needs more testing) the new 'anvil-report-usage' tool.
* Updated System->_check_anvil_conf() to create the 'admin' user in a more normal way (old way caused the 'admin' group to be a system GID.

Signed-off-by: Digimer <digimer@alteeve.ca>
2022-04-25 23:56:58 -04:00
Digimer
d2973e603b Updated anvil-update-states to make the speed of links to 10000 when they are virtio interfaces.
Signed-off-by: Digimer <digimer@alteeve.ca>
2022-04-13 10:55:35 -04:00
Digimer
1dbca79dde * Created Network->get_ip_from_mac() which takes a MAC address and returns an IP address.
* Updated ocf:alteeve:server to always try to bring up the peer's DRBD resource, even when the local resource is up.
* Fixed a bug in scan-network where purging duplicate bridges failed in some cases.

Signed-off-by: Digimer <digimer@alteeve.ca>
2022-04-08 23:09:34 -04:00
Digimer
142be7674e * Fixed a bug in striker-scan-network where the scan wasn't running properly when no network was specifically given.
* Updated DRBD->get_devices() to store information about the nodes for each resource.
* Got more work done on anvil-report-usage.

Signed-off-by: Digimer <digimer@alteeve.ca>
2022-04-08 18:54:55 -04:00
Digimer
4751c6e747 Updated DRBD->get_devices() and Server->parse_definition() to take 'anvil_uuid' so that server data can be parsed from anywhere.
Created, but not finished, tools/anvil-report-usage that will print a report of server resource allocation and Anvil! resource availability.

Signed-off-by: Digimer <digimer@alteeve.ca>
2022-04-06 00:16:32 -04:00
Digimer
3ad1793c23 Made scan-network more robust at determining when an interface is a virtio device.
Signed-off-by: Digimer <digimer@alteeve.ca>
2022-04-04 18:51:28 -04:00
Digimer
c9633aa3b0 * Updated Database->_find_behind_databases() to not run unless it's on a Striker.
* Updated scan_network to properly mark virtio network interfaces as being full duplex. Also updated it to purge interfaces flagged as 'DELETED'.
* Updated the VM OS list.

Signed-off-by: Digimer <digimer@alteeve.ca>
2022-04-04 16:38:14 -04:00