Commit Graph

149 Commits

Author SHA1 Message Date
digimer
a81a110261 * Remove forced log level and secure logging. This addresses issue #386
Signed-off-by: digimer <mkelly@alteeve.ca>
2023-08-09 18:20:14 -04:00
digimer
64bb5ab8e1 * Updated striker to only complain about unconfigured networks on nodes, not DR hosts.
* Updated anvil-configure-host to ignore gracefully unconfigured networks.

Signed-off-by: digimer <digimer@gravitar.alteeve.com>
2023-01-15 01:41:55 -05:00
digimer
b8b4352117 * Added support for Migration Network configs in old striker and anvil-configure-host
Signed-off-by: digimer <digimer@gravitar.alteeve.com>
2023-01-15 01:24:26 -05:00
digimer
b27a43eaf7 * Updated striker to only require 6 interfaces when configuring a node.
Signed-off-by: digimer <digimer@gravitar.alteeve.com>
2023-01-14 16:59:41 -05:00
Digimer
3fd0db15bf * This rather heavily reworks how database shutdown works. It adds a much more intelligent shutdown: tracking who is using the database, marking a database as "offline", and waiting for users of the database to disconnect before it shuts down.
* Also removed the variables for the database name and DB user name, setting them statically now.
* Created Database->shutdown() to more kindly stop a local database server.
* Added 'check_db_in_use_states()' to anvil-daemon to clean any stale entries marking a database as in use.

Signed-off-by: Digimer <digimer@alteeve.ca>
2022-03-14 16:41:37 -04:00
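The shutdown flow described in 3fd0db15bf (track users, mark the database offline, wait for users to disconnect) could be sketched roughly as below. This is a Python illustration only, not the project's Perl code; the class `DatabaseUsageTracker`, its methods, and the timeout values are all hypothetical names chosen for the example.

```python
import time


class DatabaseUsageTracker:
    """Hypothetical in-memory stand-in for the 'database in use' state."""

    def __init__(self):
        self.users = set()    # callers currently using the database
        self.offline = False  # set once a shutdown has been requested

    def connect(self, user):
        # New users are refused once the database is marked offline.
        if self.offline:
            return False
        self.users.add(user)
        return True

    def disconnect(self, user):
        self.users.discard(user)

    def shutdown(self, timeout=5.0, poll=0.01):
        # Mark offline first so no new users arrive, then wait for the
        # existing users to leave, giving up when the timeout expires.
        self.offline = True
        deadline = time.monotonic() + timeout
        while self.users and time.monotonic() < deadline:
            time.sleep(poll)
        return not self.users  # True when it is safe to stop the server
```

Marking the database offline before waiting is the design point: it keeps the set of users from growing while the shutdown is pending.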
Digimer
763821a21d Fixed a variable substitution bug.
Signed-off-by: Digimer <digimer@alteeve.ca>
2022-02-24 22:25:37 -05:00
Digimer
dc989f0950 Added more logging to track when and how reboots happen in systems.
Signed-off-by: Digimer <digimer@alteeve.ca>
2022-02-16 21:55:33 -05:00
Digimer
831abd1981 Updated Striker to allow the DR host to not have an IP assigned.
Signed-off-by: Digimer <digimer@alteeve.ca>
2022-01-21 16:57:10 -05:00
Digimer
a633ab7f63 Added a periodic check to ensure all users can ping. This fixes a bug where a local striker dashboard whose DB was stopped wouldn't work.
Signed-off-by: Digimer <digimer@alteeve.ca>
2022-01-21 14:27:58 -05:00
Digimer
3811b9b2da Minor update to make the fence device name marked as a mandatory field.
Signed-off-by: Digimer <digimer@alteeve.ca>
2022-01-04 21:02:24 -05:00
Digimer
32f29861a4 * Fixed a bug (maybe) that was causing users to get immediately logged out of the WebUI
* Fixed a bug (maybe) that was breaking initial DB setup on Strikers.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-11-15 01:01:23 -05:00
Digimer
80bdac8e34 * Updated the pacemaker server config to drop the stop timeout to 5 minutes and the migration timeout to 10 minutes. This will avoid blocking the entire cluster when a stop or migrate operation times out. Will update scan-server to clean these up when they happen.
* Updated Database->archive_table() and ->_find_behind_databases() to loop through connected databases, instead of configured databases.
* Updated Network->get_ips() to only record the real MAC addresses on network interfaces (not bonds or bridges) in the "network::${host}::interface::${in_iface}::mac_address" hash. This should help avoid reboot loops caused by anvil-configure-host thinking the network needs to be reconfigured when it doesn't actually need to be.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-06-11 03:17:07 -04:00
Digimer
daca6c887b * This contains a fairly major change to how time stamps are handled. All INSERT and UPDATE calls now generate a new timestamp via Database->refresh_timestamp, instead of using 'sys::database::timestamp'. This was done in response to finding a bug where tables in a database differed in both counts of public and private schemas (ip_addresses table, specifically) that failed to resync because the timestamps were re-used too often.
* WIP - Continuing work on the new anvil-manage-server tool.
* Updated Database->get_anvils() to load information on the files available on each Anvil! system.
* Updated Database->insert_or_update_network_interfaces() to no longer take the 'timestamp' parameter.
* Removed all logging from Database->refresh_timestamp() to speed it up, given how often it will be called now.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-06-08 15:23:15 -04:00
Digimer
e15c1651ed * Fixed a bug with deleting bad keys where jobs to delete keys on non-dashboard machines weren't being assigned to the proper target machine.
* Fixed a bug with anvil-manage-keys where a state_uuid entry recorded on one database may not be read from a machine reading from another database.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-06-05 19:07:25 -04:00
Digimer
41cd1e0319 * Several bugs fixed and enhancements;
* DRBD is now configured to a ping-timeout of 3 seconds.
* Created Log->switches() that returns the command line switches used by Anvil! tool command line calls based on the active log levels / secure logging. Appended this to all invocations of our tools.
* Updated Database->resync_databases() to now only skip 'jobs' and 'variables' tables with less than 10 record differences. All other differences will trigger a resync.
* Created System->_check_anvil_conf() that, as you might guess, checks if anvil.conf exists and creates it (using defaults), if not. It also checks to see if the 'admin' group and user exist and creates them, if not.
* Updated anvil-daemon to check anvil.conf on start up and in each loop. Created the function check_journald() that checks (and sets, if needed) that journald logging is persistent.
* Made striker-manage-peers to check_if_configured on the Database->connect() when updating anvil.conf and the target UUID is the local machine. Also created a loop to make the reconnection a lot more robust.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-05-24 00:09:32 -04:00
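The resync rule described in 41cd1e0319 (skip 'jobs' and 'variables' when fewer than 10 records differ, resync on any other difference) can be illustrated with a short Python sketch. It is not the project's Perl code, and `needs_resync`, `NOISY_TABLES` and `THRESHOLD` are hypothetical names.

```python
# Tables whose small differences are tolerated, per the commit message.
NOISY_TABLES = {"jobs", "variables"}
THRESHOLD = 10  # differences below this are ignored for the tables above


def needs_resync(table, record_difference):
    """Return True when a difference in record counts should trigger a resync."""
    diff = abs(record_difference)
    if diff == 0:
        return False  # nothing differs, nothing to do
    if table in NOISY_TABLES and diff < THRESHOLD:
        return False  # small drift in busy tables is skipped
    return True       # every other difference triggers a resync
```

For example, `needs_resync("jobs", 9)` is skipped, while `needs_resync("hosts", 1)` triggers a resync.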
Digimer
fc0954d0c8 * Started work on, but not at all finished, anvil-manage-server which will allow manipulation of a server's resources.
* Changed the alteeve repo RPM to the new community/enterprise repo
* Fixed a bug where 'fence_data::updated' was causing the fences web page to break.
* Fixed a bug in Database->insert_or_update_network_interfaces() where certain interfaces were being repeatedly added to the database.
* Fixed a bug where Database->_find_behind_databases() was marking DBs as behind even though they had less than 10 columns off.
* Fixed a bug in Get->host_name() where, if the host name was changed on disk but the environment variable was still the old name, it would cause the hostname to waffle back and forth and cause constant updates to /etc/hosts.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-05-20 00:16:09 -04:00
Digimer
798518ba5e * While working on the boot/shutdown server tools, ran into and fixed a bug where files uploaded before an Anvil! was added could not be sync'ed. This was fixed through the new Database->check_file_locations() method.
Signed-off-by: Digimer <digimer@alteeve.ca>
2021-04-14 22:56:18 -04:00
Digimer
0fb191c00f * Made more progress on tools/striker-auto-initialize-all, now to the point where it loads the variables needed to initialize Striker dashboard.
* Cleaned up / added some logging in various locations.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-03-02 01:18:18 -05:00
Digimer
68ea6da1d3 * Finished the web interface components of the Anvil! File Manager! Files can be purged, sync'ed or removed from specific Anvil! systems, renamed and their file types changed (and setting/removing the executable bits) as needed.
* Fixed a bug in Database->insert_or_update_jobs() where the 'job_host_uuid' being set to 'all' only translated to a job for the running host.

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-01-06 01:31:09 -05:00
Digimer
6da2b3b17b * Got more work done on file management. A file name is now clickable and that loads a menu to rename, change the file type, purge (delete from everywhere) and select which Anvil! systems the file belongs on. Got the code done to purge a file, but it's not tested yet.
Signed-off-by: Digimer <digimer@alteeve.ca>
2021-01-04 02:05:05 -05:00
Digimer
ea84ba68eb * Fixed a bug in tools/anvil-update-states that was causing deleted interfaces to update the network_interfaces every pass, growing the DB excessively.
* Cleaned up the file manager;
** Got the jquery file uploader JS to be sane and altered it to be more useful.
** Got the list of existing files to be displayed (links clickable but not working yet).

Signed-off-by: Digimer <digimer@alteeve.ca>
2021-01-03 01:14:04 -05:00
Digimer
713f77bc78 * Finally finished scan-apc-ups! Proved way harder than anticipated... (over a solid week of work!) In M3, this agent is no longer host-bound, and the UPSes to scan are based on entries in 'upses' using this scan agent.
* Fixed a bug in Database->insert_or_update_power() where the check to see if 'power_ups_uuid' was passed in was reversed. Also fixed a bug where the conversion of the value to TRUE/FALSE for the old value wasn't being set correctly.
* Updated Server->get_definition() to only translate the host name to a uuid if the host uuid wasn't passed in. Added a sanity check on the UUID as well.
* Cleaned up how existing UPSes are displayed in Striker when managing UPSes. Also renamed the form's scan agents to match the real agent names.
* Fixed alert sorting in scan-apc-pdu.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-11-12 00:35:51 -05:00
Digimer
0f7267eae1 * Moved the '_host_name', '_short_host_name', and '_domain_name' private methods in Tools.pm over to Get.pm (removing the leading '_' in the method names).
* Created 'Cluster->which_node' that returns 'node1' or 'node2' to indicate which node a host is.
* Continued working on scan_cluster; decided to make it not host-dependent.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-09-20 00:27:36 -04:00
Digimer
925664762a * Created Database->check_for_schema() (not finished) that will check/add a schema for a scan agent.
* Renamed the scan-network skeleton scan agent to scan-hardware and started work on it based on the M2 version.
* Updated Database->get_recipients() to take the 'include_deleted' parameter, and changed the default behaviour to only return active records.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-09-08 01:07:23 -04:00
Digimer
fe7cdb18fb * Updated all methods to add (or fix) logging the method entry.
* More work done on Email->send_email() to, well, actually send email (which it isn't doing yet, but it's close).
* Updated Words->key() to include the bad key name when no entry for the requested key exists in the words.xml file.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-09-06 01:52:03 -04:00
Digimer
911523dfce * Got a lot of work done in generating emails. Doesn't work yet, but the code to generate emails for recipients using their preferred language and alert level is done (though limited testing so far).
* Dropped support for imperial measurements in generated emails.
* Created Database->get_alerts() to read in alert data and ->get_recipients() to get the list of alert recipients.
* SQL Schema changes;
** Added 'alert_processed' to 'alerts' to track what alerts have been processed.
** Changed 'recipient_new_level' to 'recipient_level' now that we're only using 'notifications' as a per-host override for user/hosts alert levels.
** Removed 'recipient_units' as we're no longer supporting non-metric values.
* Updated Alert->register() to take strings for the alert level (which gets translated to integers).
* Created Email->get_current_server() to return the mail_server_uuid of the active mail server (if any). Created ->send_alerts() to process unprocessed alerts and send emails to recipients.
* Updated Words->parse_banged_string() to take the 'language' parameter.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-09-05 01:18:42 -04:00
Digimer
f8a466a963 * Fixed a bug in Striker->load_manifest where when a single fence device or UPS was defined, the XML would fail to parse.
* Added a null entry when editing a manifest when no UPSes have been defined.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-09-04 00:41:25 -04:00
Digimer
82acb4e104 * Fixed a resync bug where bridges needed to sync before bonds
* Re-enabled user-selected BCN subnet ranges (needs more testing).

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-09-03 22:52:38 -04:00
Digimer
767148b538 * Updated Database->get_mail_servers() to clear old stored data, and to pull out the list of when a mail server was last used.
* Got email server configuration under way. A mail server can now be configured via Email->_configure_for_server(), but more work is needed on when to switch between configs.
* Fixed some logging of passwords that wasn't being checked to see if secure logging was enabled or not.
* Fixed a bug in Striker where the back arrow in email config sub-sections wasn't going back to the main email menu.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-08-27 02:09:21 -04:00
Digimer
1498e1b53c * Got server migration working using ocf:alteeve:server in a test environment!
* Converted most 'eval { }' calls to localize $@ and test the output of the eval, instead of checking to see if $@ was set.
* Converted all 'local' hash references to instead use the short host name of the local machine as a new standard.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-08-19 18:54:09 -04:00
Madison Kelly
30f2b3fa8e * Switched all hash 'local' keys to be the host's short host name. Untested, likely bugs to be fixed in the next commit.
Signed-off-by: Madison Kelly <mkelly@alteeve.ca>
2020-08-18 19:34:08 -04:00
Digimer
d2d5d7b460 * Fixed a bug in Striker->load_manifest() where fences were parsed twice, the second time missing a hash reference.
* Updated striker to now only offer gateway for IFN networks. EL8 seems to ignore 'GATEWAY="x"' in interface configs which caused anvil-join-anvil to always think an interface needs to be updated. Updated as well to remove DNS entries set in interfaces that are not the default gateway.
* Fixed a bug where DNS entries were being missed, causing entries to be repeatedly added to the interface that was the gateway interface.
* In anvil-update-states, added Get->switches() so that verbosity switches are used.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-07-28 00:59:02 -04:00
Digimer
62d0a2aa39 * Created Cluster->parse_cib() that parses pacemaker's CIB (cluster information base) XML. This also switches to the XML::LibXML, starting the replacement of XML::Simple. It's far from finished, but parses out basic node data and fence data.
Signed-off-by: Digimer <digimer@alteeve.ca>
2020-07-11 23:26:19 -04:00
Digimer
de43ea3ac1 * Renamed all Validate->is_X to Validate->X. Also created Validate->ipv6() to validate IPv6 addresses using Data::Validate::IP (and added it as a requirement to the .spec base RPM).
* Added the fix from the last commit for System->call to handle returned data without an ending newline to Remote->call.
* Got more work done on System->update_hosts(). It's able to add new hosts, but misses the short and FQDN host names. Need to fix that and then verify existing / manual entries aren't molested.

Signed-off-by: digimer <digimer@pulsar.alteeve.com>
2020-07-07 01:18:38 -04:00
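As an illustration of the kind of check Validate->ipv6() performs in de43ea3ac1, here is a minimal Python sketch. The commit uses Perl's Data::Validate::IP; this example swaps in Python's standard `ipaddress` module, and the function name `is_ipv6` is hypothetical.

```python
import ipaddress


def is_ipv6(value):
    """Return True when 'value' is a syntactically valid IPv6 address string."""
    try:
        ipaddress.IPv6Address(value)
    except ValueError:
        # AddressValueError is a subclass of ValueError.
        return False
    return True
```

An IPv4 address such as "192.168.1.1" is rejected here, since only IPv6 syntax is accepted.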
Digimer
453f5c6223 * Fixed a bug where $anvil->nice_exit() was being passed 'exit' instead of 'exit_code' as a parameter.
* Update striker manifest run to add an entry into the 'anvils' table, and pass the anvil_uuid to the jobs rather than the various host_uuid's.
* Fixed a bug in the 'anvils' SQL procedure that copied data into the history schema (a few columns were missing).
* Updated anvil-configure-host to reboot when finished to be certain network changes have taken effect. Also updated the handling of virsh bridges to delete the autostart symlinks if libvirtd daemon isn't running.
* Added some logic to anvil-daemon to call 'anvil-update-states' with the -v{1,3} flag depending on the active debug level.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-06-24 00:39:56 -04:00
Digimer
726a4374d1 * Renamed the database table 'host_keys' to 'ssh_keys' to better represent what it stores.
* Updated 'variables' -> 'variable_source_uuid' to type 'uuid' and removed the 'not null' constraint.
* Updated Database->insert_or_update_variables() to check/update 'variables_source_table' and 'variables_source_uuid'.
* Created the 'trusts' database table which will, when done, tell anvil-daemon which users@machines to trust (set up passwordless SSH).
* Created (but not finished) System->manage_authorized_keys() and moved the logic over to it from anvil-daemon.
* Changed the host types "dashboard" to "striker".
* Moved the following methods from 'System' to 'Get';
** System->get_host_type to Get->host_type
** System->get_bridges to Get->bridges
** System->get_free_memory to Get->free_memory
** System->get_os_type to Get->os_type
** System->get_uptime to Get->uptime
* Updated striker to include the host_uuid for the 'node1', 'node2' and (if chosen) 'dr1' when running a job manifest.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-06-10 18:26:50 -04:00
Digimer
613a7f0c58 * Created the new anvil-join-anvil tool that will run on nodes and the DR host to pick up the job to join an Anvil! system.
* Finished the saving of a "run manifest" job menu. Included filtering out potential machines already in other Anvil! systems from the select box and updating the password fields to not trigger a browser to save/auto-complete the field.
* Fixed a bug in Database->get_hosts() caused by the attempt to immediately return with a 0 if it had been called before. Now a check is made in ->insert_or_update_manifests() where the recursive loop was possible.
* Updated the RPM spec to v.33 after releasing .32 after the last commit. Also added the core requirement for perl-Data-Validate-Domain.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-05-27 22:39:00 -04:00
Digimer
22e7e8b03e * Finished the "run manifest" form / page. Actually saving / pushing the job to nodes is not yet done though.
Signed-off-by: Digimer <digimer@alteeve.ca>
2020-05-26 21:37:02 -04:00
Digimer
e99bfac9be * Created Database->get_anvils() and ->insert_or_update_anvils().
* Updated the anvils database table to have columns for tracking which hosts are used as node 1, node 2 and the DR host.
* Updated Database->get_hosts() to check which, if any, Anvil! a given host is attached to.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-05-25 00:02:52 -04:00
Digimer
c5c75f1ddf * Created Database->get_manifests() that loads manifests into a hash by UUID and name.
* Got deleting of manifests done.
* Fixed a couple bugs from the _network to _ip variable rename.
* Fixed a bug where fence variables with double-quotes in them weren't being handled properly.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-05-23 23:54:36 -04:00
Digimer
0dbb07dfb7 * Fixed a bug where fence option values with double-quotes were not being stored correctly.
Signed-off-by: Digimer <digimer@alteeve.ca>
2020-05-22 13:45:44 -04:00
Digimer
1e8982704e * Finished the ability to load an install manifest to edit it.
* Renamed to clarify the IP address field from ..._network to _ip.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-05-18 23:20:33 -04:00
Digimer
099bc1401a * Finished the menus to save a new Install Manifest and got the create page showing the existing manifests.
* Updated Database->insert_or_update_manifests() so that not passing in 'manifest_last_ran' is OK.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-05-14 17:52:14 -04:00
Digimer
97bcab86d6 * Got the new Striker->generate_manifest() generating the manifest XML.
Signed-off-by: Digimer <digimer@alteeve.ca>
2020-05-13 23:02:05 -04:00
Digimer
23c17dbb98 * Renamed Network->match_gateway() to ->is_ip_in_network(), and changed the parameters 'gateway -> ip' and 'ip_address -> network'. This is to clarify its use as a general-purpose "does this IP fall in the given network range?" check.
* Created the shell of Striker->generate_manifest() that will pull out CGI data to generate the install manifest itself and save/update it.
* Got step 3 menu finished along with the corresponding sanity checks.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-05-12 18:50:11 -04:00
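The question that ->is_ip_in_network() answers in 23c17dbb98 ("does this IP fall in the given network range?") can be sketched in Python with the standard `ipaddress` module. This is an illustration, not the project's Perl method; the parameter names 'ip' and 'network' follow the commit message, while the CIDR string format for 'network' is an assumption.

```python
import ipaddress


def is_ip_in_network(ip, network):
    """Return True when 'ip' falls inside 'network' (assumed CIDR notation)."""
    try:
        address = ipaddress.ip_address(ip)
        # strict=False accepts a host address with a prefix, e.g. 10.201.4.1/16.
        net = ipaddress.ip_network(network, strict=False)
    except ValueError:
        return False  # unparsable input is treated as "not in the network"
    return address in net
```

For example, `is_ip_in_network("10.201.4.1", "10.201.0.0/16")` is true, while an address in 10.202.0.0/16 is not.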
Digimer
e54aaad807 * Added MTU, NTP and DNS fields to install manifest step 2.
* Got the first BCN part of step 3 working.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-05-07 11:17:56 -04:00
Digimer
7edfd0cb2d * Added gateway support to install manifest creation step 2.
* Finished sanity checking install manifest step2.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-05-05 23:11:47 -04:00
Digimer
5835e49ef4 * Got the install manifest form step 1 sanity checks done. Got the form for step 2 done.
* Updated Template->select_form() to add the 'style' parameter (for CSS style options), and added the ability to set the 'options' to the special value 'subnet' that builds an options box showing usable CIDR subnets.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-05-04 22:56:23 -04:00
Digimer
1351889b4d * Continued work on creating Install Manifests. Got the frame of step 1 done.
Signed-off-by: Digimer <digimer@alteeve.ca>
2020-05-03 23:01:44 -04:00
Digimer
e66bc32693 * Added the ability to store, edit and delete UPSes
** Created Database->get_upses() and ->insert_or_update_upses().
** Created Striker->get_ups_data(). This parses the special 'ups_XXXX' strings.
* Updated Validate->is_domain() and added ->is_host_name() to use the Data::Validate::Domain module (which is now required in the core RPM).
* Started work on manifest handling.
* Sorted the language keys alphabetically.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-05-02 23:05:58 -04:00