Commit Graph

124 Commits

Author SHA1 Message Date
Digimer
c7c6c8dee5 * Reworked the attempt to repair the network in anvil-daemon to not touch the network until the machine has been running for at least two minutes.
Signed-off-by: Digimer <digimer@alteeve.ca>
2021-06-15 12:04:27 -04:00
Digimer
1e7847d4dd * Added a call to Network->check_bonds() to be called while non-Striker machines wait to connect to a database.
Signed-off-by: Digimer <digimer@alteeve.ca>
2021-06-13 14:14:37 -04:00
Digimer
3f32a56d0c * Created Network->check_bonds() that checks to see if any bonds are down, or if any interfaces configured to be in a bond are not actually in it. It accepts a 'heal' parameter that, by default, will bring up a bond with no active links, but leaves degraded bonds alone. It call also take 'all' and will try to bring up any missing interfaces. This distinction exists so that if a link is flaky and someone takes it down manually until it can be repaired, it doesn't get turned back on.
* Updated anvil-daemon to call Network->check_bonds() with 'all' on startup, then woth 'down_only' once per minute to try to heal down'ed bonds.
* Updated anvil-watch-bonds to take a 'run-once' switch and exit after one report, if set.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-06-13 13:33:51 -04:00
Digimer
19c41c9171 * Added more logging while chasing a function test bug.
Signed-off-by: Digimer <digimer@alteeve.ca>
2021-06-08 23:56:48 -04:00
Digimer
daca6c887b * This contains a fairly major change to how time stamps are handled. All INSERT and UPDATE calls now generate a new timestamp via Database->refresh_timestamp, instead of using 'sys::database::timestamp'. This was done in responce to finding a bug where tables in a database differed in both counts of public and private schemas (ip_addresses table, specifically) that failed to resync because the timestamps were re-used too often.
* WIP - Continuing work on the new anvil-manage-server tool.
* Updated Database->get_anvils() to load information on the files available on each Anvil! system.
* Updated Database->insert_or_update_network_interfaces() to no longer take the 'timestamp' parameter.
* Removed all logging from Database->refresh_timestamp() to speed it up, given how often it will be called now.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-06-08 15:23:15 -04:00
Digimer
96fffb0b96 * Finished updating ocf:alteeve:server to no longer require a database connection. To do this, and still be able to track live migration times, the Server->migrate_virsh() method now writes out the server name and migration time to a /tmp/anvil/migration-duration.<server_name>.<unix_time> file. This file is checked for by the scan-server resource agent and, when found, is parsed and the migration duration is recorded, then the file is purged.
* Updated anvil-daemon to have a new function called "handle_special_cases" called during startup that does any weird bug mitigation required. For now, this is used to mitigate against rhbz#1961562, though certainly it will be used for other reasons later.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-06-06 00:01:11 -04:00
Digimer
24ec17f8f7 * Added a new parameter called 'sensitive' to Database->connect() that returns after connections before any ancilliary checks are done, minimizing connect time.
* Fixed a problem with Database->insert_or_update_variables() where variable_source_uuid being set to an empty string wasn't converted to NULL.
* Fixed Database->locking() where the way the lock variable was set was rather broken.
* Created Striker->check_httpd_conf() which configured apache to handle the integration of the new WebUI for Anvil! management with the existing WebUI.
* Updated System->update_hosts() to specifically set the 127.0.0.1 and ::1 lines to handle how cloud-init overrides /etc/hosts and breaks CI/CD tests.
* Removed the old index.html as it's now used for the new WebUI.
* Began work on removing DB connection requirements from ocf:alteeve:server.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-06-03 22:25:36 -04:00
Digimer
4dcd505753 * Biggest change in this commit; scan-apc-pdu and scan-apc-ups now only run on Striker dashboards! This was because we found that if two machines ran their agents at the same time, the reponce time from SNMP read requests grew a lot. This meant it was likely a third, fourth and so on machne would also then have their scan agent runs while the existing runs were still trying to process, causing the SNMP reads to get slower still until timeouts popped.
* Bumped scancore's scan delay from 30 seconds to 60.
* Shorted the age-out time to 24 hours and again boosted the archive thresholds. As we get a feel for the amount of data collected on multi-Anvil! systems over time, we may continue to tune this.l
* Moved Database->archive_database() to be called daily by anvil-daemon, instead of during '->connect' calls.
* Added locking to Database->_age_out_data to avoid resyncs mid-purge. Also moved the power, temperature and ip_address columns into the same 'to_clean' hash as it was duplicate logic.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-05-31 13:34:49 -04:00
Digimer
8807915bb7 The theme of this commit is database cleanup and fixes.
* Updated Database->_age_out_data() to check for certain scan agent tables and, for those found, purge out old records. This should go a long way to keeping the database data responsive.
* Fixed a bug in Jobs->update_progress() where the 'job_picked_up_by' column was being set to '0' instead of '$$' when clearing the job.
* Fixed a bug in System->update_hosts() where '127.0.0.1' would be used in hosts for the actual host name.
* Updated the default trigger, count and division values in anvil.conf to 100,000, 50,000 and 75,000 respectively. In combination with the aging of data, this should go a long way to minimizing database sizes and overheads.
* Updated anvil-daemon to call $anvil->Database->_age_out_data(); in it's daily tasks.
* Updated various striker-X tools to specifically request a DB resync on Database->connect calls.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-05-30 15:16:25 -04:00
Digimer
6abe06f125 The theme of these commits is improving DB responsiveness.
* Created Database->_age_out_data() to delete records from the database that are old enough to no longer be useful. This is designed to significantly reduce the size of the database, allowing a better focus on performance.
* Changed Database->connect() to default to NOT check for resync, reworking the old 'no_resync' to 'check_for_resync', so that resync checks happen on demard, instead of by default.
* Updated get_tables_from_schema() to now allow 'schema_file' to be set to 'all', which then loads the schema files of all scan agents as well as the core anvil schema file. Fixed a bug where commented out tables were being counted.
* Re-enabled triggering resyncs on 'last_updated' differences.
* Fixed a bug in scan-ipmitool where the history_id column in history.scan_ipmitool_value was incorrect.
* Created a new tool called striker-show-db-counts that shows the number of records in all public and history schema tables for all databases.
* Updated anvil-update-states to detect when a libvirtd NAT'ed bridge exists and to delete it when found.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-05-29 23:34:22 -04:00
Digimer
ff65712fd9 * Created the function check_daemons() in anvil-daemon to check that needed daemons are running when it starts. This was specifically added to address a periodic issue with machines booting without NetworkManager running.
Signed-off-by: Digimer <digimer@alteeve.ca>
2021-05-24 15:27:10 -04:00
Digimer
41cd1e0319 * Several bugs fixed and enhancements;
* DRBD is now configured to a ping-timeout of 3 seconds.
* Created Log->switches() that returnes the command line switches used by Anvil! tool command line calls based on the active log levels / secure logging. Appended this to all invocations of our tools.
* Updated Database->resync_databases() to now only skip 'jobs' and 'variables' tables with less than 10 record differences. All other differences will trigger a resync.
* Created System->_check_anvil_conf() that, as you might guess, checks in anvil.conf exists and created it (using defaults), if not. It also checks to see if the 'admin' group and user exists and creates them, if not.
* Updated anvil-daemon to check anvil.conf on start up and in each loop. Created the function check_journald() that checks (and sets, if needed) that journald logging is persistent.
* Made striker-manage-peers to check_if_configured on the Database->connect() when updating anvil.conf and the target UUID is the local machine. Also created a loop to make the reconnection a lot more robust.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-05-24 00:09:32 -04:00
Digimer
a846f9ecbc * Fix to the database resync logic. The previous change to only resync if 10+ lines differed broke striker-manage-peers as the difference in host counts is what triggered the pairing of strikers.
Signed-off-by: Digimer <digimer@alteeve.ca>
2021-05-22 12:25:29 -04:00
Digimer
fc0954d0c8 * Started work on, but not at all finished, anvil-manage-server which will allow manipulation of a server's resources.
* Changed the alteeve repo RPM to the new cimmunity/enterprise repo
* Fixed a bug where 'fence_data::updated' was causing the fences web page to break.
* Fixed a bug in Database->insert_or_update_network_interfaces() where certain interfaces were being repeatedly added to the database.
* Fixed a bug in Database->_find_behind_databases() was marking DBs as behind even though they had less than 10 columns off.
* Fixed a bug in Get->host_name() where, if the host name was changed on disk but the environment variable was still the old name, it would cause the hostname to waffle back and forth and cause constant updated to /etc/hosts.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-05-20 00:16:09 -04:00
Digimer
3fb81c1a0a * Updated Convert->time() to silently return if the given time was '--'.
* Added a new parameter to Database->connect() called 'no_resync' that, if set, prevents a resync check being performed. Updated ->resync_databases() to find a uuid_column where the table name ends in 'ies' and the UUID column is 'y_uuid'. Updated ->resync_databases() to not fire on updated table age anymore, and to trigger only if the number of rows differ in a given table by more than 10.
* Updated Log->entry() to prefix a tool's name, when the new 'log::scan_agent' value is set. Also set this value in ScanCore->agent_startup(), to help differentiate log entries.
* Fixed a bug in scancore's main loop where it logged the sleep message at the start of the run.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-05-04 12:33:31 -04:00
Digimer
ca7052dd53 The core logic is done!!!! Still need to finish end-points for the WebUI to hook into, but the core of M3 is complete! Many, many bugs are expected, of course. :)
* Created DRBD->check_if_syncsource() and ->check_if_synctarget() that return '1' if the target host is currently SyncSource or SyncTarget for any resource, respectively.
* Updated DRBD->update_global_common() to return the unified-format diff if any changes were made to global-common.conf.
* Created ScanCore->check_health() that returns the health score for a host. Created ->count_servers() that returns the number of servers on a host, how much RAM is used by those servers and, if available, the estimated migration time of the servers. Updated ->check_temperature() to set/clear/return the time that a host has been in a warning or critical temperature state.
* Finished ScanCore->post_scan_analysis_node()!!! It certainly has bugs, and much testing is needed, but the logic is all in place! Oh what a slog that was... It should be far more intelligent than M2 though, once flushed out and tested.
* Created Server->active_migrations() that returns '1' if any servers are in a migration on an Anvil! system. Updated ->migrate_virsh() to record how long a migration took in the "server::migration_duration" variable, which is averaged by ScanCore->count_servers() to estimate migration times.
* Updated scan-drbd to check/update the global-common.conf file's config at the end of a scan.
* Updated ScanCore itself to not scan when in maintenance mode. Also updated it to call 'anvil-safe-start' when ScanCore starts, so long as it is within ten minutes of the host booting.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-04-30 22:58:01 -04:00
Digimer
5f0b7740e2 * Fixed a typo that broke compiling anvil-daemon in the last commit. Yay for CI/CD!
Signed-off-by: Digimer <digimer@alteeve.ca>
2021-04-10 01:28:12 -04:00
Digimer
fb0836f912 * THe get_cpu endpoint was completed.
* The get_mmeory endpoint was completed.
* The get_replicated_storage endpoint was completed, though it requires testing and likely has issues.

To prepare for the get_status endpoint work, I needed to update ScanCore and modules to track the host_status. This commit contains the work needed for this.
* Updated ScanCore->post_scan_analysis_striker() to use configured fence devices (except PDUs) to check if a target host is off or on, in there is no host_ipmi interface. In all cases, if a machine can be confirmed on or off, the host_status is now updated.
* To support the above fence based power checks, updated scan-cluster to store the on-disk CIB in the new scan_cluster -> scan_cluster_cib colume.
* Updated ScanCore->parse_cib() to map stonith primitive IDs to fence agents. Updated ->parse_crm_mon() to not call if the executable doesn't exist to avoid unhelpful error messages in the logs when called from a Striker.
* Update DRBD->gather_data() to get the size data from /sys/block/drbd<minor>/size' x '/sys/block/drbd<minor>/queue/logical_block_size so it works when a device is Secondary (and can't be promoted).
* Updated Database->get_hosts_info() to record the short host name as well as the stored host name. Created ->update_host_status() as a wrapper to ->insert_or_update_hosts() that only updates the host status.
* Updated anvil-join-anvil to disabled ksm and ksmtuned daemons.
* Updated scancore and anvil-daemon to set the host_status to 'online' on startup.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-04-09 20:51:29 -04:00
Digimer
15fd0e5ce8 * Updated anvil-daemon (and Database->insert_or_update_jobs) to now recognize jobs with the job_status of 'scancore_startup' to run only when ScanCore starts.
* Finished initial Striker setup in tools/striker-auto-initialize-all. Started working on peering.
* Cleaned up the handling of converting UIDs to user names in Remote->add_target_to_known_hosts() and ->_call_ssh_keyscan().
* Did a bunch of white-space/alignment cleanup.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-03-04 01:41:33 -05:00
Digimer
45a9cb04b0 * Fixed a bug introduced in the last commit that made Get->os_type() fail when called locally.
* Made the error reported by Remote->call() more verbose when called without 'target' being set.
* Updated anvil-daemon to not call jobs more that once per minute.
* Started work on striker-auto-initialize-all, still very far from complete.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-02-23 01:56:12 -05:00
Digimer
1b65f53faa * Remove host-health from the 'hosts' table as it wasn't needed, given the 'health' table. Bumped the SQL version to 0.0.2
* Updated Get->os_type() to use 'cat' instead of Storage->read_file() because 'rsync' may not be available when it is called during striker-initialize-host calls.
* Updated Database methods to skip 'oui' and 'state' during resync.
* Updatedb striker-initialize-host to detect when it's initializing a CentOS Stream Node / DR Host and enable the HA repo.
* Created the tools/striker-auto-initialize-all tool, which is very much incomplete, that will allow for the rapid creation of a full Anvil! from freshly installed machines autonomously.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-02-22 19:22:47 -05:00
Digimer
1a520b03d5 * Cleaned up a lot of logging in anvil-daemon and tools it calls.
* Deleted anvil-jobs as it never ended up being used.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-02-08 13:39:34 -05:00
Digimer
d9d347ce63 * Updated .spec for the new source location.
* Created a log disable flag to avoid deep recursion when logging at level 3.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-01-22 00:37:30 -05:00
Digimer
cda51e562d * Finished porting scan-hpacucli, the last M2 scan agent!
* Updated Database->insert_or_update_temperature() to accespt a 'delete' parameter (which, surprise, deletes a record).
* Updated (but not yet tested) the RPM .spec to require that 'core' and the other three packages are required to be the same version.
* Updated scan-ipmitool and scan-storcli scan agents to now delete temperature data belonging to lost sensors.
* Fixed tools/striker-manage-install-target to add multiple missing packages.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-11-25 19:06:21 -05:00
Digimer
d677d19ca0 * Moved Database->check_condition_age to Alert.
* Created (but not finished) scan-apc-pdu
* Added support to tracking maintenance-mode for nodes in Cluster->parse_cib
* Created Remote->read_snmp_oid().
* Created Server->get_definition.
* Updated Server->get_status() to write-out server XML files on-demand.
* Finished scan-cluster.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-10-23 01:28:21 -04:00
Digimer
1a1fa7ce88 * Created Cluster->get_anvil_uuid() that returns the 'anvil_uuid' of a given 'host_uuid'.
* Renamed the 'defitintions' table to 'server_definitions' to clarify the purpose, and made all the 'server' columns have then 'not null' constraint.
* Created Database->insert_or_update_servers(), ->get_servers(), ->insert_or_update_server_definitions() and ->get_server_definitions().
* Updated scancore, anvil-daemon, and scan agents to not run unless they're run with root privs.
* Got scan-server to update the servers / server_definition tables and the on-disk file when needed.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-09-28 00:20:13 -04:00
Digimer
dc5ec9c264 * Added checking the email server config to anvil-daemon. Email works now!
Signed-off-by: Digimer <digimer@alteeve.ca>
2020-09-07 00:39:43 -04:00
Digimer
82acb4e104 * Fixed a resync bug where bridges needed to sync before bonds
* Re-enabled user-selected BCN subnet ranges (needs more testing).

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-09-03 22:52:38 -04:00
Digimer
49682a01d7 * Fixed a bug in Database->disconnect() where the database idenitification number wasn't being removed, so connecting again triggered the duplicate DB connection check.
* Fixed a bug in Tools->_set_defaults where the order the tables were sync'ed it caused primary/foreign keys would trigger DB errors when resync'ing in some cases.
* Created Database->log_connections to make it easier to log which databases are actively in use and other data about the connections.
* Fixed bugs in striker-manage-peers that (partly because of the above bugs) failed to connect to new peers properly.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-09-03 01:36:11 -04:00
Digimer
b2c7fd95fb * Renamed the ScanCore unit file to scancore.
* Added support to parsing location contraints to Cluster->parse_cib

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-08-24 12:54:31 -04:00
Digimer
c27cc7507f * Renamed striker-parse-fence-agents to anvil-parse-fence-agents and changed anvil-daemon to run it on all machines.
* Cleaned up a lot of logging.
* Updated Cluster->parse_cib() to track if a stonith device has 'delay' set.
* Got a lot more work done on anvil-join-anvil's stonith processing, but it still isn't complete. Updated it to change shell user passwords as well.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-08-06 01:20:53 -04:00
Digimer
1bf71f8428 * Updated Database->get_hosts() to run host_ipmi the Log->is_secure if the string contains 'passw'.
* Fixed Database->get_ip_addresses() to clear stale IP addresses.
* Finished (for now, more testing needed) System->configure_ipmi! Also created System->test_ipmi() that handles trying lanplus and various password lengths, updating hosts -> host_ipmi on successful check.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-07-24 16:22:27 -04:00
Digimer
597d9413a5 * Created the skeleton Cluster.pm.
* Got anvil-join-anvil to the point where is initializes and starts the cluster.
* Deleted the old ssh key handling logic in anvil-daemon.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-07-10 00:49:30 -04:00
Digimer
453f5c6223 * Fixed a bug where $anvil->nice_exit() was being passed 'exit' instead of 'exit_code' as a parameter.
* Update striker manifest run to add an entry into the 'anvils' table, and pass the anvil_uuid to the jobs rather than the various host_uuid's.
* Fixed a bug in the 'anvils' SQL procedure that copied data into the history schema (a few columns were missing).
* Updated anvil-configure-host to reboot when finished to be certain network changes have taken effect. Also updated the handling of virsh bridges to delete the autostart symlinks if libvirtd daemon isn't running.
* Added some logic to anvil-daemon to call 'anvil-update-states' with the -v{1,3} flag depending on the active debug level.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-06-24 00:39:56 -04:00
Digimer
4489111a65 * Fixed a bug in Job->clear() where it was not doing it's one job right.
* Updated System->generate_state_json() where when the full host name was short, it wouldn't set the short host name properly.
* Fixed a bug in 'tools/anvil-manage-power' where the node wouldn't mark the reboot as complete. Resolves issue #11.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-06-11 16:08:06 -04:00
Digimer
726a4374d1 * Renamed the database table 'host_keys' to 'ssh_keys' to better represent what it stores.
* Updated 'variables' -> 'variable_source_uuid' to type 'uuid' and removed the 'not null' constraint.
* Updated Database->insert_or_update_variables() to check/update 'variables_source_table' and 'variables_source_uuid'.
* Created the 'trusts' database table which will, when done, tell anvil-daemon which users@machines to trust (setup passwordkess SSH).
* Created (but not finished) System->manage_authorized_keys() and moved the logic over to it from anvil-daemon.
* Changed the host types "dashboard" to "striker".
* Moved the following methods from 'System' to 'Get';
** System->get_host_type to Get->host_type
** System->get_bridges to Get->bridges
** System->get_free_memory to Get->free_memory
** System->get_os_type to Get->os_type
** System->get_uptime to Get->uptime
* Updated striker to include the host_uuid for the 'node1', 'node2' and (if chosen) 'dr1' when running a job manifest.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-06-10 18:26:50 -04:00
Digimer
f71c16484a * Got the fence config confirmation screen working.
Signed-off-by: Digimer <digimer@alteeve.ca>
2020-03-12 00:25:17 -04:00
Digimer
818ef23634 * Moved the fences_unified_metadata file from /tmp, which apache can not read, to /var/www/html/.
* Fixed a bug (well, made a work-around for an issue without a known reproducer) where, on some occassion, a record will end up in the public table without being copied into the history schema. When this happens, the next resync would crash out because the resynd reads in the history table only. Now, when about to INSERT a record into the public schema during a resync, an explicit check is made to see if the record alread
y exists. If it does, the INSERT is instead redirected to the history schema.
* Cleaned up the fence agent metadata when displaying to a user, converting the shell codes to underline a string with square brackets instead. We also now replace newlines with <br /> tags. Lastly, to help fence_azure_arm's metadata description to display cleanly, a check is made to format the table correctly.
* Began work on the Striker menu for handling fence device management

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-01-20 23:41:01 -05:00
Digimer
7df405afcb * Created the manifest database table and Database->insert_or_update_manifests().
Signed-off-by: Digimer <digimer@alteeve.ca>
2020-01-10 15:57:11 -05:00
Digimer
76e9352717 * Added a flag that tells anvil-daemon when a node is having it's network mapped. When this happens, open ssh connections are closed each loop and only tasks related to mapping the network run. This improves responsiveness in Striker when reporting which network links have come up or gone down.
* Fixed a bug in Database->insert_or_update_variables() where, if 'update_value_only' was set but not variable_uuid was passed or could be found, an (incomplete) INSERT would be attempted.
* Added support for generating module metadata when setting up local repos on Striker.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-01-09 01:52:04 -05:00
Digimer
6d81e03fb2 * Created Network->match_gateway() to check if a gateway applies to a given network.
* Got more work done on confirming the user's request to setup the network of a node or DR host.
* Reworked network select boxes to sort by the network name instead of the MAC address.

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-12-08 00:19:13 -05:00
Digimer
86af67ecda * Created a cachine mechanism for anvil-update-states so that it can record network interface link state changes when it loses contact with all databases, as can happen when cycling NICs to map a newly build DR host or node.
* Updated Database->insert_or_update_network_interfaces() to take the new 'link_only' and 'timestamp' parameters to support flushing out the cache file above.
* Updated anvil-daemon to run anvil-update-states when the database connection is lost. Also moved the 'handle_periodic_tasks()' function call to be conditional on there being a database connection.

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-12-05 01:59:41 -05:00
Digimer
530d379f59 * Started work on caching network state change in tools/anvil-update-states.
* Fixed a bug where ip_addresses could break resync when 2+ machines had the same IP (ie: 192.168.122.1).
* Updated logging of DB transactions to show the DB host's IP instead of the UUID.
* Updated Get->date_and_time to take a 'use_utc' parameter to return the time using GMT time instead of the host's TZ.
* Updated anvil-daemon to periodically call tools/anvil-update-states. Also upadted anvil-daemon to delay daily jobs by 2 hours except for the dashboard with the highest sorted UUID to minimize dual runs of tasks that only need to run once per day per cluster.

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-12-04 00:02:19 -05:00
Digimer
e8d15112da * Fixed a bug in anvil.js where the state of a link always said 'up', even when it was down.
* Fixed a couple logging bugs in System->call().
* Fixed a bug in anvil-daemon where it was trying to setup setuid-C wrappers on non-dashboards.

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-12-03 01:23:31 -05:00
Digimer
90f5bf49d5 * Updated Network->load_interfces() to only assign a 'changed_order' to real interfaces.
* Fixed a bug in System->generate_state_json() where interfaces connected to a bridge were constantly having their 'network_interface_bond_uuid' cleared and reset.
* Finished (for now) the jquery code to update the network interface list when preparing the network interface configuration of a new node or DR host.

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-12-03 00:30:26 -05:00
Digimer
e3a8c1a01d * Created System->generate_state_json() that reads, parse and writes out the network status of all known machines on a given Striker database.
Signed-off-by: Digimer <digimer@alteeve.ca>
2019-11-14 23:58:39 -05:00
Digimer
387c03aa7d * Added more text for the pending JSON state file generation.
Signed-off-by: Digimer <digimer@alteeve.ca>
2019-11-13 21:29:51 -05:00
Digimer
628f7faa45 * Updated the RPM spec file to generate '/etc/anvil/type.X' files to directly indicate the machine type. Updated System->get_host_type() to check for these files directly, falling back to parsing the host name if they don't exist.
* Created Database->get_hosts_info() (though it's not at all finished) that will write out a unified JSON file contain all data known about all hosts/Anvil! systems. This will be later used to create the WebUI parts.
* Also created, but also not finished, Network->load_interfces() that will work sort of like ->load_ups, but include all interfaces regardless of if they have an IP or not.
* Fixed a bug where the new bridge_interface_note parameter didn't exist in the Database->insert_or_update_bridge_interfaces() method.
* Updated anvil-update-states() to only write out the JSON/XML files if it's running on a dashboard. For nodes and DR hosts, it just needs to update the database.
* Created a new hook in anvil-daemon that will call tasks on a machine that is configured.
* As per RHEL 8.1 release notes, changed the package 'dnf-utils' to 'yum-utils' in the packages to load for install target repos.

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-11-06 11:26:21 -05:00
Digimer
077977ad9c * Fixed tools/anvil-update-states so that it properly removed ip_addresses no longer assigned to a host. Also merged 'network::interfaces::by_name::${interface}' with 'network::local::interface::${interface}' for storing discovered interfaces.'
* Added 'ip_address_note' to the 'ip_addresses' table as there was no column convenient for flagging as DELETEd.
* Added 'uuid' to Database->insert_or_update_file_locations() and ->insert_or_update_files(), and actually used it in all ->inser_or_update_X() methods.
* Added 'delete' as a parameter to Database->insert_or_update_ip_addresses() to allow simple deletion of a referenced IP address.
* Addressed a few 'undefined variable' errors.

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-11-02 00:02:36 -04:00
Digimer
af6e2c076d * Fixed a tricky deep recursion bug in Network->is_local when the passed in host was an empty string. Also created a cache system where a host name that has been checked before is immediately returned, without needing to run through the logic in 'is_local', which gets called quite frequently.
* Updated the loop detection logic in Log->entry where processing large strings was triggering it when it shouldn't.

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-10-20 22:55:49 -04:00