* Updated Cluster->add_server() to now set failure timeouts to actual numbers instead of INFINITY after discovering that INFINITY doesn't work in those cases.
* Updated Databsae->get_hosts to now check if other entries have the same host name, and if so, to set their host_key to 'DELETED'. This should make it easier to handle when a hardware machine is replaced by new hardware but uses the same host_name.
* Updated Email->check_queue() to start and enable postfix.service if it's found to not be running.
* Updated Get->available_resources() to return '!!no_data!!' when a given host hasn't got any data in scan_lvm_vgs. Now use this in anvil-provision-server to exit if a node or dr host hasn't run scancore yet.
* Fixed a bug in scan-lvm where the pvs_uuid wasn't being loaded properly, preventing lost PVs, VGs and LVs from being flagged as deleted.
* Started work on anvil-migate-server, though it's far from complete.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in Cluster->shutdown_server() where the wrong variable was being evaluated when checking the server state.
* Created DRBD->delete_resource() that deletes a resource's backing device and configuration. Note that this wipes the DRBD MD and and FS signatures before removing the LV. Updated DRBD->gather_data() to record the backing devices for volumes.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in Cluster->parse_cib() when a server that is off wasn't setting 'status'.
* Renamed 'server::location::<server>::host' to '...::host_name' in several places.
* Got more work done on anvil-delete-server, up to the point where it calls the new Cluster->delete_server() method.
* Updated fence_pacemaker to call 'drbdadm adjust all' to dampen an issue where in-memory fence configs seem to change, preventing reconnection of the peer after it reboots from the fence. More testing needed on this issue.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Removed the exit-if-no-DB check in ocf:alteeve:server so that (hopefully, needs testing), running servers won't be impacted if the nodes lost contact with both/all strikers.
* Updated scan-server to make an explicit check for missing XML definition files on startup and write them if needed.
* Very beginning work on anvil-delete-server has been started.
* Updated anvil-provision-server to wait when it's running in peer mode until the new XML definition is in the DB and then write it out to disk before exiting. Also updated it to add the new server to pacemaker before exiting.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created a new tools/striker-parse-os-list tool that parses 'osinfo-query os' and prints out entries for words.xml for any new OSes.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Finished DRBD->get_next_resource() that returns the next available minor and the next free TCP port (with two free ports available after it).
* Created Storage->get_storage_group_details() that pulls together the LVM, storage group members and storage groups into one block of data.
* Made more progress on tools/anvil-provision-server. It now gets up to the point of creating LVs, creating DRBD resource files, loading them, creating metadata and up'ing the resource. It doesn't yet (successfully) force a new resource to primary.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in scan-drbd that was still looking for the scan_drbd_resource_uuid from the resource config file. Also added a check to see if 'scan-drbd::resource_status' directory exists before trying to read the files in it.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created the new method Cluster->get_primary_host_uuid() that returns the 'host_uuid' of the primary node in the given cluster. This is useful for external programs to figure out which node is primary. Example is provisioning a new server being assigned to the active node. Also created ->is_primary() that is a similar test to see if the active node is the primary node or not.
* Updated Cluster->parse_cib() and ->parse_crm_mon() to work on remote hosts.
* Updated Database->get_hosts() to store the short host names.
* Created Get->host_from_ip_address() that translates an IP address to a host_uuid and host_name, if it's an IP assigned currently to a known host.
* Created Network->find_target_ip() that simplifies finding which IP address to use when the caller wants to connect to a target host.
* Reworked the anvil-join-anvil to parse fence_arguments in a way that handles passwords with spaces in them.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in Database->insert_or_update_jobs() where the 'job_host_uuid' being set to 'all' only translated to a job for the running host.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Cleaned up the file manager;
** Got the jquery file uploader JS to be sane and altered it to be more useful.
** Got the list of existing files to be displayed (links clickable but not working yet).
Signed-off-by: Digimer <digimer@alteeve.ca>
* Cleaned up upload.pl now that it isn't responsible for loading the file details into the database. It only sets a job for the local Striker to process the file and move it into /mnt/shared/files, copy it to peer dashboards, then load jobs for Anvil! members to sync the new file.
* Created Database->get_files() and ->get_file_locations() to load the respective data.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Got tools/striker-sync-shared to pick up 'upload::move_incoming' jobs, move the uploaded file to /mnt/shared/files/, copies it to peer dashboards, adds it to the 'files' table and adds it to 'file_locations'.
* Reworked the 'file_locations' table to now map files to Anvil! systems, not hosts. It simply tracks if a given file should be on Anvil! members or not. Later, striker-sync-shared on the Anvil! members will pull the file down.
* Updated Storage->get_file_stats() to record the file's mimetype.
* Fixed up a few issues in cgi-bin/upload.pl.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created Cluster->assemble_storage_groups() and moved the logic to auto-assemble groups out of Get->available_resources().
* Created Cluster->get_anvil_name() that will return an Anvil! name for a given anvil_uuid, or the name of the Anvil! if the host is a member of an Anvil!.
* Updated Cluster->get_anvil_uuid() to return the 'anvil_uuid' if passed a specific 'anvil_name'.
* Updated Jobs->clear() to use 'switches::job-uuid' when a job_uuid is not passed but the value exists in 'switches::job-uuid'.
Signed-off-by: Digimer <digimer@alteeve.ca>
ing or create new servers.
* Created Database->insert_or_update_storage_groups() and ->insert_or_update_storage_group_members() to manage the new associated tables. Together, these tables create storage groups and track the VGs that are members of the group.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Started work on Get->available_resources() that will take an 'anvil_uuid' and figure out what resources are still available for use by new servers or that can be added to existing servers.
* Fixed a bug in ScanCore->agent_startup() where tables weren't being generated properly from the agent's SQL file.
* Made Storage->change_mode() return silently if it's called without a mode being passed. This happens frequently and is harmless so it's not worth filling the logs with errors.
* Renamed the 'start_time' key to 'at_start' when recording files' MD5 sums in Storage->record_md5sums and ->check_md5sums.
* When we moved the directory scan logic out of the 'scancore' daemon and into 'Storage->scan_directory', the logic to record scan agent names in 'scancore::agent::<file>' was removed. This broke a few things and, so, it was restored when it was found that a file starts with 'scan-' and the directory matches the scancore agent directory.
* Moved the 'scancore' daemon's 'load_agent_strings' to 'Words'
* Updated Words->parse_banged_string() to look for variables in the format 'value=X:units=Y' and translate it properly.
* Fixed a bug in scan-ipmitool where discovered sensor INSERT SQL queries were queued, but not committed.
* Fixed a bug in scan-storcli where a while loop was broken, preventing execution.
* Fixed a bug in the 'scancore' daemon where it wouldn't exit if sums changed. Fixed a bug where alerts weren't being sent between loops. Fixed a bug where command-line log level wasn't surviving inside the main loop.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created Database->get_power() that loads data from the special 'power' table.
* Fixed a bug in calls to Network->ping() where some weren't formatted properly for receiving two string variables.
* Updated Database->get_anvils() to record the machine types when recording host information.
* Updated Database->get_hosts_info() to also load the 'host_ipmi' column.
* Updated Database->get_upses() to store the link to the 'power' -> 'power_uuid', when available.
* Created ScanCore->call_scan_agents() that does the work of actually calling scan agents, moving the logic out from the scancore daemon.
* Created ScanCore->check_power() that takes a host and the anvil it is in and returns if it's on batteries or not. If it is, the time on batteries and estimate hold-up time is returned. If not, the highest charge percentage is returned.
* Created ScanCore->post_scan_analysis() that is a wrapper for calling the new ->post_scan_analysis_dr(), ->post_scan_analysis_node() and ->post_scan_analysis_striker(). Of which, _dr and _node are still empty, but _striker is complete.
** ->post_scan_analysis_striker() is complete. It now boots a node after a power loss if the UPSes powering it are OK (at least one has mains power, and the main-powered UPS(es) have reached the minimum charge percentage). If it's thermal, IPMI is called and so long as at least one thermal sensor is found and it/they are all OK, it is booted. For now, M2's thermal reboot delay logic hasn't been replicated, as it added a lot of complexity and didn't prove practically useful.
* Created System->collect_ipmi_data() and moved 'scan_ipmitool's ipmitool call and parse into that method. This was done to allow ScanCore->post_scan_analysis_striker() to also call IPMI on a remote machine during thermal down events without reimplementing the logic.
* Updated scan-ipmitool to only record temperature data for data collected locally. Also renamed 'machine' variables and hash keys to 'host_name' to clarify what is being stored.
* Updated scancore to clear the 'system::stop_reason' variable.
* Added missing packages to striker-manage-install-target.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Database->insert_or_update_temperature() to accespt a 'delete' parameter (which, surprise, deletes a record).
* Updated (but not yet tested) the RPM .spec to require that 'core' and the other three packages are required to be the same version.
* Updated scan-ipmitool and scan-storcli scan agents to now delete temperature data belonging to lost sensors.
* Fixed tools/striker-manage-install-target to add multiple missing packages.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Get->os_type() to work on local and remote hosts.
* Fixed a but in tools/striker-initialize-host where calls to Network->find_matches() where being checked properly for success/failure.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in Database->insert_or_update_power() where the check to see if 'power_ups_uuid' was passed in was reversed. Also fixed a bug where the convertion of the value to TRUE/FALSE for the old value wasn't being set correctly.
* Updated Server->get_definition() to only translate the host name to a uuid if the host uuid wasn't passed in. Added a sanity check on the UUID as well.
* Cleaned up how existing UPSes are displayed in Striker when managing UPSes. Also renamed the form's scan agents to match the real agent names.
* Fixed alert sorting in scan-apc-pdu.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created (but not finished) scan-apc-pdu
* Added support to tracking maintenance-mode for nodes in Cluster->parse_cib
* Created Remote->read_snmp_oid().
* Created Server->get_definition.
* Updated Server->get_status() to write-out server XML files on-demand.
* Finished scan-cluster.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updates servers -> server_host_uuid to drop the 'NOT NULL' constraint.
* Created the new Get->server_uuid_from_name() that does what it says on the tin. Fixed a bug in ->host_uuid_from_name() where the host name was being returned instead of the UUID.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated the servers table to remove the 'not null' constraints on the server_start_after_server_uuid, server_pre_migration_file_uuid and server_post_migration_file_uuid columns.
* Updated ScanCore->agent_startup() to connect to the database(s) when there isn't a table list.
* Updated Server->migrate_virsh() to test for DB access before making DB calls (to allow ocf:alteeve:server to function even if all ScanCore DBs are offline).
* Updated ocf:alteeve:server to connect to the databases (though work without it), and changed '$FILE_NAME' to be 'ocf:alteeve:server' (to make logging more legible)
* Created the skeleton for 'scan-storage'.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Renamed servers -> 'server_clean_stop' to 'server_user_stop' to make it clearer what the column represents.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Server->migrate_virsh() to set 'servers' -> 'server_state' to 'migrating' and clear it again once the migation completes. Also added support for cold (frozen) versus live migrations.
* Updated Cluster->parse_cib() to check if a server with the server_state set to 'migrating' isn't actually migrating anymore and, if not, to clear that state. This is needed as scan-server will blindly ignore/skip any migrating server, and if a migration call is interrupted, the state could get stuck.
* Updated the 'servers' database table (and associated Database methods) to add columns for;
** server_ram_in_use - tracking RAM used by a running server
** server_configured_ram - RAM allocated to a running server (used with the above to alert a user and track _currently_ available RAM)
** server_updated_by_user - To be set by Striker tools to indicate when the user made a change that needs to push out to nodes / running server.
** server_boot_time - Tracks the unixtime when the server booted (to track uptime even if the server migrates across nodes).
* Created Get->anvil_name_from_uuid() to easily convert an Anvil! UUID into a name. Also created ->host_uuid_from_name() to translate a host name into a host UUID.
* Created Server->get_runtime() that translates a server name into a process ID and then uses that to determine how long (in seconds) it has been running. This is used when a server transitions from 'shut off' to 'running' to determine exactly when the server booted (current time - runtime).
* Renamed all 'Server->parse_definition' calls that used 'from_memory' to 'from_virsh' to clarify the data source.
* Made scan-hardware smarter about RAM change alerts.
* Updated scancore to load agent strings on startup so that processing pending alerts works properly.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Renamed the 'defitintions' table to 'server_definitions' to clarify the purpose, and made all the 'server' columns have then 'not null' constraint.
* Created Database->insert_or_update_servers(), ->get_servers(), ->insert_or_update_server_definitions() and ->get_server_definitions().
* Updated scancore, anvil-daemon, and scan agents to not run unless they're run with root privs.
* Got scan-server to update the servers / server_definition tables and the on-disk file when needed.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Began (but haven't finished) Database->insert_or_update_servers().
* Created Storage->get_file_stats() to collect the (l)stat information for a file.
* Got more work done on scan-server.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added support in anvil.conf to disable scan agents with 'scancore::<agent_name>::disable', and added handling this to agents. Also allowed for '--force' to override this setting.
* Updated ScanCore->agent_startup() to allow for empty scan agent table lists.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Did more work on parsing server data out of the CIB. There is still an issue with determining which node currently hosts a resource, however.
* Renamed Server->boot to ->boot_virsh, ->shutdown to ->shutdown_virsh and ->migrate to ->migrate_virsh to clarify that these methods work on the raw virsh calls, outside of pacemaker (indeed, they are what the pacemaker RA uses to do what pacemaker asks).
* Got more work done on the scan-cluster SA.
* Created the empty files for the pending scan-server SA.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Alert->check_alert_sent() and the 'alert_sent' table to remove 'alert_name' as it really isn't helpful given record_locator.
* Created 'Database->purge_data' that takes an array of tables and, in reverse, purges them from the database(s). This also disables archiving and resync functions now.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created Database->insert_or_update_updated() to manage entries in the special 'updated' table.
* Updated Alert->check_alert_sent() to not use the 'type' parameter, but instead use the 'clear' parameter, to be more consistent with other methods.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Alert->register to take a hash reference for message variables to simplify when a caller plans to log and register an alert at the same time.
* Updated Convert->bytes_to_human_readable() to name the 'size' variable used internally for 'bytes' to actually be 'bytes' for better consistency.
* Created multiple new Database methods;
** ->check_condition_age() is meant to be used by scan agents to see how long a given condition has been in play (ie: how long ago power was lost to a UPS or a sensor became unreadable).
** ->insert_or_update_health() handles recording data to the new 'health' table, used for determining ideal hosts for servers between nodes.
** ->insert_or_update_power() handles recording data to the new 'power' table, used for determining how power events are handled.
** ->insert_or_update_temperature() handles recording temperature data to the new 'temperature' table, used to determine how thermal events are handled.
* Got a lot more done on the scan-hardware scan agent. Only part left now is post-scan health processing.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created Dataase->check_for_schema() that scancore agents can use to check/load the SQL schema
* Updated Database->write to take the 'transaction' parameter.
* Used scan-hardware as the test mule for scan agent SQL handling.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Renamed the scan-network skeleton scan agent to scan-hardware and started work on it based on the M2 version.
* Updated Database->get_recipients() to take the 'include_deleted' parameter, and changed the default behaviour to only return active records.
Signed-off-by: Digimer <digimer@alteeve.ca>
* More work done on Email->send_email() to, well, actually send email (which it isn't doing yet, but it's close).
* Updated Words->key() to include the bad key name when no entry for the requested key exists in the words.xml file.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Dropped support for supporting imperial measurements in generated emails.
* Created Database->get_alerts() to read in alert data and ->get_recipients() to get the list of alert recipients.
* SQL Schema changes;
** Added 'alert_processed' to 'alerts' to track what alerts have been processed.
** Changed 'recipient_new_level' to 'recipient_level' now that we're only using 'notifications' as a per-host override for user/hosts alert levels.
** Removed 'recipient_units' as we're no longer supporting non-metric values.
* Updated Alert->register() to take strings for the alert level (which gets translated to integers).
* Created Email->get_current_server() to returned the mail_server_uuid of the active mail server (if any). Created ->send_alerts() to process unprocessed alerts and send emails to recipients.
* Updated Words->parse_banged_string() to take the 'language' parameter.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in Tools->_set_defaults where the order the tables were sync'ed it caused primary/foreign keys would trigger DB errors when resync'ing in some cases.
* Created Database->log_connections to make it easier to log which databases are actively in use and other data about the connections.
* Fixed bugs in striker-manage-peers that (partly because of the above bugs) failed to connect to new peers properly.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Got email server configuration under way. A mail server can now be configured via Email->_configure_for_server(), but more work is needed on when to switch between configs.
* Fixed some logging of passwords that wasn't being checked to see if secure logging was enabled or not.
* Fixed a bug in Striker where the back arrow in email config sub-sections weren't going back to the main email menu.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in Server->get_status() where the call to Storage->rsync's returned output checked for '!!errer!!' instead of '!!error!!'.
* Fixed a bug in Storage->rsync where, when no port was passed in, it would try to specify an empty port and fail.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Converted most 'eval { }' calls to localize $@ and test the output of the eval, instead of checking to see if $@ was set.
* Converted all 'local' hash references to instead use the short host name of the local machine as a new standard.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added the anvil.conf option 'sys::privacy::strong' that controls if the Anvil! ever "calls home". Initially, this controls DRBD's usage flag.
* Updated DRBD->get_devices() to track resources by their 'by-res' names as well and by the normal '/dev/drbdX' devices.
* To mitigate https://bugzilla.redhat.com/show_bug.cgi?id=1868467, updated Get->bridges() to parse the normal (non-JSON) data if we get invalid JSON output.
* Updated anvil-join-anvil to not disable, and in fact enable, libvirtd on boot. With DRBD 9, the original fear of a user accidentally booting a VM that's running on the peer no longer is an issue. By enabling it and leaving it on, Striker dashboard users won't lose their virtual machine manager link unless the node powers off. Also enabled actually updating the job progress, completing this tool!
Signed-off-by: Digimer <digimer@alteeve.ca>
* Cleaned up a lot of logging.
* Updated Cluster->parse_cib() to track if a stonith device has 'delay' set.
* Got a lot more work done on anvil-join-anvil's stonith processing, but it still isn't complete. Updated it to change shell user passwords as well.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated striker to now only offer gateway for IFN networks. EL8 seems to ignore 'GATEWAY="x"' in interface configs which caused anvil-join-anvil to always think an interface needs to be updated. Updated as well to remove DNS entries set in interfaces that are not the default gateway.
* Fixed a bug where DNS entries were being missed, causing entries to be repeatedly added to the interface that was the gateway interface.
* In anvil-update-states, added Get->switches() so that verbosity switches are used.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed Database->get_ip_addresses() to clear stale IP addresses.
* Finished (for now, more testing needed) System->configure_ipmi! Also created System->test_ipmi() that handles trying lanplus and various password lengths, updating hosts -> host_ipmi on successful check.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Got more work done to System->configure_ipmi() to warm reset HP IPMI BMCs. It also now finds the IPMI user have started the password management.
* Created Words->shorten_string() that shortens a string to a number of bytes (as opposed to shortening to a character length).
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added calling 'debug => $debug' in System->X methods.
* Got more work done on System->configure_ipmi(). It should now determine if a BMC exists and pull the OEM and network details automatically.
* Updated anvil-configure-host to log more data in an attempt to find a reproducer for an odd bug where (apparently) a host was picking up the wrong job data meant for another host.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in Get->host_type() the type wasn't being set for nodes and dr hosts.
* Fixed a bug in Validate->host_name() where the wrong method was being called.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created Cluster->get_peers() that figures out who the peer node (and DR host, if applicable) are.
* Updated Cluster->parse_cib() to dig out more information.
* Created Cluster->start_cluster() to start pacemaker (via pcsd) locally or on all (both) nodes.
* Started working on ocf:alteeve:server to start/stop the libvirtd/drbd daemons as needed, instead of having pacemaker do it.
* Got more work done on anvil-join-anvil. Node 2 now waits for the cluster to start, and node 1 will do setup as needed, then wait for the cluster to start as well.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Got anvil-join-anvil to the point where is initializes and starts the cluster.
* Deleted the old ssh key handling logic in anvil-daemon.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added the fix from the last commit for System->call to handle returned data without an ending newline to Remote->call.
* Got more work done on System->update_hosts(). It's able to add new hosts, but misses the short and FQDN host names. Need to fix that and the verify existing / manual entries aren't molested.
Signed-off-by: digimer <digimer@pulsar.alteeve.com>
* Created Get->trusted_hosts() that finds the dashboards the host uses and, if the host is in an Anvil!, the peers in the same anvil.
* Created (but not finished yet) System->update_hosts() that will add and edit entries for all IPs to trusted hosts.
* Fixed a logging bug in Striker->load_manifest().
* Fixed a bug in System->call where, the the output from the shell call didn't end in a new-line, it would not parse the return code and lease the return code string appended to the shell output.
* Fixed a big in System->change_shell_user_password() where a new-line (\n) meant for the shell call wasn't escaped properly. There was also a duplicate 'return_code' variable preventing the actual return code from being read.
* Got more work done on anvil-join-anvil to update the hacluster password (when needed).
Signed-off-by: Digimer <digimer@alteeve.ca>
* Update striker manifest run to add an entry into the 'anvils' table, and pass the anvil_uuid to the jobs rather than the various host_uuid's.
* Fixed a bug in the 'anvils' SQL procedure that copied data into the history schema (a few columns were missing).
* Updated anvil-configure-host to reboot when finished to be certain network changes have taken effect. Also updated the handling of virsh bridges to delete the autostart symlinks if libvirtd daemon isn't running.
* Added some logic to anvil-daemon to call 'anvil-update-states' with the -v{1,3} flag depending on the active debug level.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added a global switch --resync-db which takes a UUID and forces that DB to be marked as needing a resync.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Database->get_hosts() to store 'host_key' and 'host_uuid' data.
* Created Database->get_ssh_keys().
* Fixed a couple bugs where Get->host_type() now returns 'striker' but tests checked for 'dashboard'.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated 'variables' -> 'variable_source_uuid' to type 'uuid' and removed the 'not null' constraint.
* Updated Database->insert_or_update_variables() to check/update 'variables_source_table' and 'variables_source_uuid'.
* Created the 'trusts' database table which will, when done, tell anvil-daemon which users@machines to trust (setup passwordkess SSH).
* Created (but not finished) System->manage_authorized_keys() and moved the logic over to it from anvil-daemon.
* Changed the host types "dashboard" to "striker".
* Moved the following methods from 'System' to 'Get';
** System->get_host_type to Get->host_type
** System->get_bridges to Get->bridges
** System->get_free_memory to Get->free_memory
** System->get_os_type to Get->os_type
** System->get_uptime to Get->uptime
* Updated striker to include the host_uuid for the 'node1', 'node2' and (if chosen) 'dr1' when running a job manifest.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Finished the saving of a "run manifest" job menu. Included filtering out potential machines already in other Anvil! systems from the select box and updating the password fields to not trigger a browser to save/auto-complete the field.
* Fixed a bug in Database->get_hosts() caused by the attempt to immediately return with a 0 if it had been called before. Now a check is made in ->insert_or_update_manifests() where the recursive loop was possible.
* Updated the RPM spec to v.33 after releasing .32 after the last commit. Also added the core requirement for perl-Data-Validate-Domain.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated the anvils database table to have columns for traching which hosts are used as node 1, node 2 and the DR host.
* Updated Database->get_hosts() to check which, if any, Anvil! a given hosts is attached to.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Got deleting of manifests done.
* Fixed a couple bugs from the _network to _ip variable rename.
* Fixed a bug where fence variables with double-quotes in them weren't being handled properly.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created the shell of Striker->generate_manifest() that will pull out CGI data to generate the install manifest itself and save/update it.
* Got step 3 menu finished along with the corresponding sanity checks.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Template->select_form() to add the 'style' paramter (for CSS style options), and added the ability to set the 'options' to the special value 'subnet' that builds an options box showing usable CIDR subnets.
Signed-off-by: Digimer <digimer@alteeve.ca>
** Created Database->get_upses() and ->insert_or_update_upses().
** Created Striker->get_ups_data(). This parses the special 'ups_XXXX' strings.
* Updated Validate->is_domain() and added ->is_host_name() to use the Data::Validate::Domain module (which is now required in the core RPM).
* Started work on manifest handling.
* Sorted the language keys alphabetically.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Removed fences -> fence_type from the database schema, wasn't needed.
* Updated Striker->get_fence_data() to store data in the 'fence_data' parent hash so to not conflict with 'fences' used in Database->get_fences().
* Got the saving and loading of fence devices working.
Signed-off-by: Digimer <digimer@alteeve.ca>
** Needed to add a couple more packages to CentOS's package list.
** Changed the PXE kickstart template to create a dedicated '/boot' partition (raw partition or on RAID 1). This seems to be required now on 8.1.
** Added PXE's UEFI support to the template system (untested, but it's at least generated now).
* Filtered out 'debug' and 'verbose' options when configuring fence devices.
* Added an internet test to tools/striker-manage-install-target and skipped attempting to download packages when there's no internet. Also made loading the host OS info into a small function.
* Started creating the man pages.
Signed-off-by: Madison Kelly <mkelly@alteeve.ca>
* Added fonts to Striker's RPM list and to the anvil-striker RPM dependency list so that the terminal is actually useful.
Signed-off-by: Madison Kelly <digimer@neutron.digimer.ca>
* Added filters Striker->get_fence_data() for parameters. Manually change 'action' entries from 'string' to 'select' and use the data in the 'actions' element to populate it, with actions that don't make sense filtered out.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug (well, made a work-around for an issue without a known reproducer) where, on some occassion, a record will end up in the public table without being copied into the history schema. When this happens, the next resync would crash out because the resynd reads in the history table only. Now, when about to INSERT a record into the public schema during a resync, an explicit check is made to see if the record alread
y exists. If it does, the INSERT is instead redirected to the history schema.
* Cleaned up the fence agent metadata when displaying to a user, converting the shell codes to underline a string with square brackets instead. We also now replace newlines with <br /> tags. Lastly, to help fence_azure_arm's metadata description to display cleanly, a check is made to format the table correctly.
* Began work on the Striker menu for handling fence device management
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created Striker->get_fence_data() that reads/parses the unified fence metadata file created by tools/striker-parse-fence-agents.
* Created the new 'fences' database table and Database->insert_or_update_fences() to handle it.
* Added hosts -> host_ipmi that will, later, store information on how to access the host's IPMI interface, when available.
* Sketched out how the new Install Manifests are going to work.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in Database->insert_or_update_variables() where, if 'update_value_only' was set but not variable_uuid was passed or could be found, an (incomplete) INSERT would be attempted.
* Added support for generating module metadata when setting up local repos on Striker.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created Database->get_recipients() (from what used to be get_alert_recipients), as well as ->get_mail_servers() and ->insert_or_update_notifications().
* Renamed 'recipients -> notification_anvil_uuid' to 'notification_host_uuid'.
* Started work on scancore -> check_email.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Finished (but not yet tested) the menu to manage alert recipients.
* Created Words->language_list() that creates a hash reference of available languages.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Started work on Striker's notification recipient management page. Cleaned up the variable names in the mail_server management function.
* Added recipients -> recipient_units column to the sql schema.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Got Striker to the point where it can save mail servers (not load existing or delete yet, though).
* Added a check to striker-parse-oui so that it only runs once per day.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created Database->insert_or_update_mail_servers() to insert/update entries in the 'mail_servers' table.
* Got more work done on the email server menu. It now shows the entered values, but not yet save the data.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in Accounts->read_cookies where, when a user's hash had expired, the logged error message didn't show the user's name.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added 'no_ping' to Database->connect() to disable pinging before connection, regardless of the anvil.conf setting.
* Created Network->read_nmcli() that reads, parses and stores the verbose output from 'nmcli'.
I can not properly explain in this commit message how much getting network manager working tripped me up. omg the complexity of what used to be such a simple process...
Signed-off-by: Digimer <digimer@alteeve.ca>
* Got more work done on confirming the user's request to setup the network of a node or DR host.
* Reworked network select boxes to sort by the network name instead of the MAC address.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Database->insert_or_update_network_interfaces() to take the new 'link_only' and 'timestamp' parameters to support flushing out the cache file above.
* Updated anvil-daemon to run anvil-update-states when the database connection is lost. Also moved the 'handle_periodic_tasks()' function call to be conditional on there being a database connection.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug where ip_addresses could break resync when 2+ machines had the same IP (ie: 192.168.122.1).
* Updated logging of DB transactions to show the DB host's IP instead of the UUID.
* Updated Get->date_and_time to take a 'use_utc' parameter to return the time using GMT time instead of the host's TZ.
* Updated anvil-daemon to periodically call tools/anvil-update-states. Also upadted anvil-daemon to delay daily jobs by 2 hours except for the dashboard with the highest sorted UUID to minimize dual runs of tasks that only need to run once per day per cluster.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a couple logging bugs in System->call().
* Fixed a bug in anvil-daemon where it was trying to setup setuid-C wrappers on non-dashboards.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Got the menu for mapping a host's network displaying (much work still to be done).
* Updated the anvil.js funtion to run dependent on the page being shown. For the main menu, the json is now properly reread and display updated as json content changes.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed unclear error logging in Network->find_matches().
* Updated System->generate_state_json() Striker->parse_all_status_json() to determine if another machine can be reached from the local dashboard. If it can be reached, the first matching interface and IP are recorded.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added missing foreign key references to the SQL schema.
* Added support to tools/anvil-update-states to connect bonds to bridges, as appropriate.
* Finished the logic in test.pl to pull the network data (with connections between bridges, bonds and interfaces) needed for the WebUI.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added recording the last-change order for network interfaces in System->generate_state_json() so that the most recently unplugged and plugged back in interfaces can be tracked.
* Worked out a faster way to ping scan subnets with nmap in striker-scan-network. Dropped average scan time from 35 minutes to 4~5 minutes for a /16 subnet.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in tools/anvil-update-states where the MAC of an interface that is the backup in an active-backup bond would be the MAC of the active member instead of its real MAC.
* Fixed a bug in Convert->add_commas() where a passed in value of '0' returned an empty string.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Renamed 'bridges' -> 'bridge_mac' to 'bridge_mac_address' to be consstent with other MAC address column names.
* Finished Network->load_interfces().
* Updated anvil-update-states to check for interfaces under bridges that are missing their 'network_interface_bridge_uuid' reference UUID.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created Database->get_hosts_info() (though it's not at all finished) that will write out a unified JSON file contain all data known about all hosts/Anvil! systems. This will be later used to create the WebUI parts.
* Also created, but also not finished, Network->load_interfces() that will work sort of like ->load_ups, but include all interfaces regardless of if they have an IP or not.
* Fixed a bug where the new bridge_interface_note parameter didn't exist in the Database->insert_or_update_bridge_interfaces() method.
* Updated anvil-update-states() to only write out the JSON/XML files if it's running on a dashboard. For nodes and DR hosts, it just needs to update the database.
* Created a new hook in anvil-daemon that will call tasks on a machine that is configured.
* As per RHEL 8.1 release notes, changed the package 'dnf-utils' to 'yum-utils' in the packages to load for install target repos.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added 'ip_address_note' to the 'ip_addresses' table as there was no column convenient for flagging as DELETEd.
* Added 'uuid' to Database->insert_or_update_file_locations() and ->insert_or_update_files(), and actually used it in all ->inser_or_update_X() methods.
* Added 'delete' as a parameter to Database->insert_or_update_ip_addresses() to allow simple deletion of a referenced IP address.
* Addressed a few 'undefined variable' errors.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created Database->get_host_from_uuid() that takes a host UUID and returns the host's name.
* Reworked Network->find_matches() to return both the match's IP and subnet.
* Finished getting tools/striker-initialize-host to add all known peers to the target, using IPs on the target's available subnet.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated the loop detection logic in Log->entry where processing large strings was triggering it when it shouldn't.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated striker-scan-network to only run once per day unless --force or a given --network is used. This avoids repeated scans when the anvil-daemon restarts frequently for whatever reason.
* Fixed (for real this time) Convert->time's handling of the 'long' parameter.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added job parsing to tools/striker-parse-oui and tools/striker-scan-network, and enabled them in anvil-daemon.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in Convert->time() where the suffix was long when it should have been short, and vice-versa.
* Updated Network->download() to check if the target file exists and, if so, to abort unless 'overwrite' is given or the existing file is 0-bytes long. Also updated it to not exit on immediate error after the wget call and instead check to see if a zero-byte file exists and remove it, if so.
* Created Validate->is_hex() to check hexadecimal strings.
* Updated Words->clean_spaces() to remove MS-DOS-style ^M cr/lf characters.
* Updated anvil-daemon to have a section for periodic tasks that run daily, and added striker-parse-oui as well as moved striker-manage-install-target refresh to that check. Also made those tools run on dashboards only.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created Database->insert_or_update_bridge_interfaces() to handle updating the new bridge_interfaces database table that records which network interfaces are connected to which bridges.
* Added 'bridge_mac' and 'bridge_mtu' to the 'bridges' table.
* Started Server->map_network which will, eventually, try to map MAC addresses to IPs and record server -> vnetX -> bridge data. Made getting a server status target-dependent.
* Worked on anvil-update-states to parse and record bridge data.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Finished the detection of and handling of initialization of a host when the host has no Internet access.
* Disabled (for now) anvil-daemon's check_ssh_keys function.
* Fixed a couple small bugs elsewhere.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Moved System->is_local to Network->is_local, and System->ping to Network->ping.
* Added a check to tools/striker-get-peer-data that will report if the target has Internet access or not.
* Cleaned up the form that prompts the user to enter their Red Hat credentials.
* Updated tools/anvil-manage-keys (and related code) to no longer distinguish by user. If a target is flagged as changed, it is removed from the root and all user's known_hosts files.
* Updated Storage->write_file() and ->update_file() to accept the 'backup' parameter to control if an file that exists is backed up before being updated/replaced.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated System->host_name to work locally and on remote targets.
* Renamed all 'hostname' instances to 'host_name' to standardize on a spelling throughout the program.
* Removed use of and dependency on 'hostname'.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in Get->host_uuid() where the call to get the host UUID from dmidecode was broken.
* Updated striker -> Initialize host to allow the user to set the host name of a node or host being initialized, allowing it to be registered with Red Hat under the proper name and make it easier to track which machine is which during initial Anvil! build.
* Fixed a few minor bugs with variable insertions into translated strings.
* Updated striker-initialize-host to use a progressive progress value rather than statically assigned steps.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated the pxe.txt file to now write a caller for anvil-update-issue in /etc/NetworkManager/dispatcher.d/ifup-local to have the /etc/issue file is updated as soon as the network is brought up, before the GDM login prompt is shown.
* Fixed a couple bugs in tools/anvil-manage-keys, including to ensure that the permissions are retained when a file is updated.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added a check to Remote->call() where, when a connect attempt fails because of a changed/bad key, it is reported as such to the user/logs and an entry is recorded in the state file.
* Started adding a Striker menu function showing users a list of bad keys in known_hosts files and the ability to remove old keys.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Remote->call() to detect when a connection fails because the target's known_hosts entry has changed. Still need to add the function to report this to the user.
* Fixed a bug where new-lines in Words->parse_banged_string() where a double-banged word string's variable value would cause an infinite loop.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created Tools->refresh to reload anvil.conf in one call.
* Created Anvil::Tools::Network to hold network-related tasks.
** Created Network->is_remote() that tests to see if a string (containing a target) refers to the remote machine (versus a local machine). Updated all previous checks to use this new method.
** Moved Get->network_details() and Get->network() to the new Network module. Renamed Get->network() to Network->get_network().
** Made Network->get_ips() work locally and remotely.
** Created Network->find_matches() that compares two scanned machines IPs (via two previous calls to Network->get_ips())
* Created Database->manage_anvil_conf() that will add, update or remove a given database connection in a local or remote anvil.conf file.
* Fixed bugs in Storage->backup() where the bash calls were quite broken. I'm not sure how it ever worked before... x_x
* Updated anvil-daemon to not initialize a database unless it's running on dashboard. Also added a check at the startup of anvil-daemon where it will go into a loop waiting for a database to become available, re-reading anvil.conf each loop.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in Remote->call where a shell call that ended in a newline would work, but throw an error and not get the return code.
* Created Database->get_job_details() which takes a job_uuid and returns the job details, if found.
* Fixed a bug in Jobs->update_progress() where 'clear' wasn't removing the old job_progress data.
* Added the parameters 'no_files' to skip stat'ing/recording non-directories, and 'search_for' which will set the parent directory in 'scan::searched' and stop scanning if found. This allows this method to act as a directory tree scanner and as a search engine.
* Created Striker->get_local_repo() that builds a repo file body suitable for adding to peers, nodes and DR hosts.
* Fixed bugs in the Striker WebUI related to initializing a target node / DR host.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated striker to no longer try to SSH to a remote machine. To enable this, we'd have to give apache a shell and an SSH key, which is dumb and dangerous when considered.
* Created tools/striker-get-peer-data which is meant to be invoked as the 'admin' user (via a setuid c-wrapper). It collects basic data about a target machine and reports what it finds on STDOUT. It gets the password for the target via the database.
* Updated anvil-daemon to check/create/update setuid c-wrapper(s), which for now is limited to call_striker-initialize-host.
* Created Anvil/Tools/Striker.pm to store Striker web-specific methods, including get_peer_data() which calls tools/striker-initialize-host via the setuid admin call_striker-initialize-host c-wrapper.
* In order to allow striker via apache to read a peer's anvil.version, which it can no longer do over SSH, any connection to a peer where the anvil.version is read is cached as /etc/anvil/anvil.<peer>.version. When Get->anvil_version is called as 'apache', this file is read instead.
* Updated Database->resync_databases() and ->_find_behind_databases() to ignore the 'states' table.
* Created tools/striker-initialize-host which will be called as a job to initialize a node/dr host.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in Storage->read_file() where a remote read, where the remote user wasn't specified, would cause the call to hange.
* Cleaned up striker->add_sync_peer() to use more clear variable names.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Cleaned up the striker->add_sync_peer() function to more clearly differentiate the ssh port from the pgsql port.
* Improved the HTML form to not have the browser treat host login fields as credentials to autofill or save.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Switched the icons for prep'ing a node or DR host and building an Anvil!.
* Started work on the node/dr host initial setup webUI.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created tools/anvil-check-memory to report how much RAM is used by a given program.
* Added documentation for some previously undocumented methods.
* Updated Database->archive_database() to take the 'tables' parameter.
* Updated Storage->scan_directory() to record a directory's mode and type, even when recursive isn't used.
* Finished System->check_memory().
* Updated ocf:alteeve:server to now NOT stop a DRBD resource unless 'stop_drbd_resources'.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created DRBD->allow_two_primaries() and ->reload_defaults() that enables (and resets/disables) dual-primary operation (allow-two-primaries=yes), used to enable live migration.
* Created Remote->test_access() that simply verifies that a remote target can be accessed (as a given user).
* Created Server->migrate() that actually migrates a server. It can push already, and pull will be added next.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Modified Server->boot() to now only work locally. Also updated it to optionally take the XML definition path for the server to boot.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created Server->boot() that starts a server and verifies that it did actually start.
* Got ocf:anvil:server smarter about starting DRBD resources, properly handing resources where auto-promote isn't enabled. The 'start' process is now complete (baring bugs).
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated DRBD->get_devices() to store data all under 'drbd::config::<host>::x'.
* Created Server->find() that takes a target and collects the servers running on it.
* Updated System->check_storage() to redirect all calls STDERR to /dev/null to supress errors about failing to open /dev/drbdX when LVM's filter isn't setup.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created System->active_lv() that, surprise, activates an inactive logical volume. Also created ->check_storage() that parses out the LVM data.
* Fixed a bug in tools/fence_pacemaker that was preventing it from compiling and running.
* Updated ocf:alteeve:server to validate the target server's storage.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug where Get->host_uuid() wasn't reading from the host.uuid file.
* Updated Remote->call() to record a target's fingerprint when needed.
* The ocf:alteeve:server resource agent now properly stopps a server and the corresponding DRBD resource.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated package list to fix changed dependencies from RHEL 8 beta to final.
* Changed anvil_daemon to only check DHCP once per minute instead of every loop.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Replaced 'screen' with 'tmux' in the spec file.
* Fixed a couple typos in the SQL schema that prevented it from loading.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Database->insert_or_update_jobs() to use job_data when looking for a job uuid. Also fixed logging and adapted 'jobs::X' variable feeding to prevent them being undefined. Also made the search find jobs with the program name anywhere in the string, instead of just the start of the strong.
* Started work on Jobs->update_progress() to handle updating downloading file information.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added to Convert->time the 'translate' parameter that controls if the returned string is already translated or not.
* Updated Storage->change_mode and ->change_owner to rename the parameter 'target' to 'path' to help prevent future confusion woth most other instances of the 'target' parameter meaning a target machine.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added a 'timeout' parameter to Remote->call() to limit the time that a command on a remote host can run, with a default of '10' (seconds).
Signed-off-by: Digimer <digimer@alteeve.ca>
* Started working on Convert->time().
* Changed anvil-manage-files skip /mnt/shared/temp when looking for files to add to the database.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated the files table to add the file_mtime. In the future, if two versions of the same file exists on different machines, the one with the more recent mtime will be copied over the others.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Finished getting anvil-manage-files to find and process new files in /mnt/shared/incoming. Created a 'convert_mimetype' function to translate returned mimetype to a file type we care about.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed some bugs in Database->insert_or_update_files() and the 'files' table/procedure in the SQL schema.
* Got more work done on anvil-manage-files.
* Created Job->get_job_details().
* Added an executable check for files in Storage->scan_directory().
* Cleaned up some logging and switched to Job->get_job_details() in anvil-update-states and striker-configure-host.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Get->uuid to take the new 'short' parameter that, when passed, asks for just the first 8 bytes of the UUID string.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Started work of "Files" (replacement for the media library), including database tables, planned sync flow and web UI.
* Added a check for the /mnt/shared directories and create them as needed in the periodic anvil-daemon checks.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Renamed a couple Striker-only tools to use the 'striker' prefix instead of 'anvil'.
* Updated the core_tables list.
* Renamed 'sys::log::main' to 'sys::log::file'.
* Fixed some "Back" and "Refresh" links.
* Started planning out the file sync system.
* Started work on the Anvil! setup / host prep system.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created Job->html_list() to return the list of running or recently completed jobs, and used it instead of the code formarly used in 'striker' when Striker was unavailable.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added a 'test' parameter to Log->entry, Storage->make_directory and Words->key to help debug in places that Log->x may not be usable.
* Converted many $anvil->Log->x calls to print if $test to help prevent recursive loops, but not all fixed yet.
* Added the new 'host_keys' database table to the schema for a possible new feature of removing passwords in favour if machines adding peers' public keys to their authorized_hosts file.
* Cleaned up the opening calls to $anvil->Tools->new() in most tools.
* Cleaned up some variables in tools/anvil-update-states after reading their values from files (clean trailing newlines).
Signed-off-by: Digimer <digimer@alteeve.ca>
* The 'notes' file has a lot of RHEL8 migration notes added, including RPM build orders.
* The anvil.spec file has switched the source from 'master.tar.gz' to 'anvil-3.0b.tar.gz' and moved the source to our webserver. Updated the dependencies as well.
* Updated anvil.sql to add the 'anvils' table and fixed some SQL schema problems.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated anvil.sql to add the new tables needed for alert mail delivery.
* Update anvil.sql and Database->initialize to now default the user to 'admin' and swap that out if needed, instead of using the #!variable!user!#' replacement variable.
* Started updating anvil.spec for EL8.
* Added support for 'striker::repo::extra-packages' which users can use to add additional packages to the Striker repositories.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Renamed Alert->register_alert() to ->register() and updated it to take 'clear_alert' and used it and the alert level to set the title automatically if not set by the user.
* Updated Log->_adjust_log_level() to record when the user set the log level at the command line so that invoked child processes get called with the same log level switch.
* Got the framework for actually calling scan agents in scancore in place. Untested so far.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated anvil-daemon to re-read the main words file on each loop.
* Updated scancore to read and purge each scan agent's words file between invocations.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Got scancore scanning the agents directory, and properly holding on startup until at least one database is available (instead of exiting), and holding on startup until the local system is configured.
* Created the skeleton of the first scan agent; scan-network.
* Fixed a bug in Storage->check_md5sums() where dynamically loaded modules, loaded after the initial md5sum calcs, would cause the calling daemon to exit (possibly on every invocation).
* Created the scancore.README that will eventually be the main scan agent guide / API document.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Started expanding Alert->register_alert() to actually implement it.
* Improved handling errors in Words->key().
* Started work on Striker's "Anvil!" menu section. Also cleaned up the power handling.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Log->entry() to accept 'print => [0|1]' to send a log message to STDOUT (minus prefix) to avoid tools that were repeatedly calling print and Log->entry back to back.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in Words->parse_banged_string where some variable strings were not being cleared, causing infinite loops.
* Added job progress reporting in striker-manage-install-target, and made it only refresh the RPM repo when '--refresh' is specified (with --force now forcing the issue). This was done to allow adding it into anvil-daemon in such a way that it would only update the RPM repo once a day.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in Get->anvil_version where the version of local systems and remote systems differed in closing new lines.
* Fixed a bug in Database->insert_or_update_variables() where the 'debug' parameter wasn't working.
* Renamed System->determine_host_type -> System->get_host_type.
* Fixed a bug in System->get_uptime where there was a newline after the uptime integer.
* Updated anvil-daemon to track and record the state of the Install Target feature on Striker dashboards.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Work on anvil-manage-install-target to enable/disable dhcpd. Also to only refresh the RPM repo periodically. Fixed a bug where it always reported that the kickstart files were not updated, even when they were.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Got anvil-manage-install-target finished creating config files and enabling daemons needed for PXE. Still untested in function though.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in Storage->read_file() where the last newline wasn't always being faithfully recorded.
* Created System->restart_daemon (as opposed to ->reload_daemon).
* Got creating/updating dhcpd.conf / dhcpd working in tools/anvil-manage-install-target.
Signed-off-by: Digimer <digimer@alteeve.ca>
* After much time wasted chasing a dnf bug (https://bugzilla.redhat.com/show_bug.cgi?id=1641947), tools/anvil-manage-install-target now populates the <DOCROOT>/<os_type>/<os_arch>/os/Packages/ directory with needed RPMs.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added 'sys::database::failed_connection_log_level' to allow silencing of log messages when a Striker peer database is not available.
* Started updating the .spec for the new release to add supported packages needed for PXE/dhcp/tftpboot.
* Added to repo tftpboot files as pulling them out of the packages and moving them into the right place relative to the modest size of adding them directly to our source wasn't justified.
* Created the still very very early 'tools/anvil-manage-firewall' tool.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Jobs->update_progress to take 'picked_up_by' as an optional parameter, defaulting to '$$' (the caller's PID).
* Created System->get_uptime() to return the current uptime in seconds.
* Added a delay to anvil-manage-power to not proceed with a reboot if the uptime is less than 600 seconds. This way, if any future bug causes an infinite reboot, there will be more time to determine what's wrong and debug the system between reboots.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Cleaned up some logging.
* Made the "Reload" buttons work more sensibly and cleaned up some webui display stuff.
* Got deleting peers mostly working (well, it works, but then it goes into a loop thinking it needs to resync the now-gone database until the daemon restarts).
* Fixed a race condition bug where if a job exited between the time that anvil-daemon got a list of PIDs and when it checked to see if that specific pid was alive, a job that actually completed could be restarted.
* Added a loop check to anvil-manage-striker-peers where it would hold until a database connection to the newly added peer was available, preventing a condition where re-adding a peer (and so the host_uuid is in hosts) cause the job belonging to the peer to be recorded locally and then never synced to the peer.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added 'sys::log_date' which controls if the date and time is pre-pended to log entries.
* Created Get->host_name() which takes a host UUID and returns the 'host_name' from the 'hosts' table, if found.
* Cleaned up some HTML templates and logging.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Database methods that can take the caller's file or line to append the local file and line.
* Updated Jobs methods to call Database->insert_or_update_jobs().
* Updated the remaining UPDATEs of 'jobs' table to use 'Database->refresh_timestamp()' for the 'modified_date'.
Signed-off-by: Digimer <digimer@alteeve.ca>
The resync of the databases was originally designed (on m2) with the expextation that any given column would have only one change per 'modified_date' time. That was never a great approach, but it worked in m2 and just bit me on m3. With job processing, for an example, the job_progress will change repeatedly in one pass, all with the same 'modified_date'. So only one record per run would resync. To fix this, the plan is to drop 'history_id' (and the procedure/trigger in pgsql to copy INSERT and UPDATEs to the history schema). The new plan is to use 'change_uuid' with a per-transaction UUID created in Database so that the per-DB 'history_id' is replaced with a per-update/insert UUID in 'change_uuid'. This will become the unique record used to sync databases, instead or 'modified_date'. To keep things consistent, 'modified_date' was renamed to 'change_date' to match 'change_uuid'. This work is very much "in progress" and not finished.
This commit also changes Get->uuid to use UUID::Tiny to create v4 UUIDs instead of making making a system call to 'uuidgen'. This sped up UUID generation by almost 100x.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated the Database module to not sort or reorder the 'core_tables' array, and reordered them in the hash they're declared in.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added a check to all 'Database->insert_or_update_*' methods to check if the passed-in reference UUID was found and return an empty string if not.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Database->insert_or_update_jobs() to also use the job_command when looking for an existing job (when a specific job_uuid was not included).
* Fixed a bug with a missing ? in striker->add_sync_peer function. Also updated it to not try to record the peer's job as it is unlikely the peer will be in hosts. Instead, the job_command to add the peer is appended to the local job't job_data and the updated anvil-manage-striker-peers looks for that at the end of the add and sync, and records the job once the peer's UUID is in 'hosts'.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Bumped the RPM spec file to 15, though haven't actually rolled the new RPM yet. Also added 'htop' as an anvil-core dependencies.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated anvil-configure-striker to use Job methods and reboot using anvil-manage-power. Also updated it to set/clear maintenance mode and mark a reboot required at the end of it's run just prior to reboot.
* Lots of log cleanup.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created Job->clear() to clear the job_picked_up_by column. Created Job->get_job_uuid() to return the job_uuid of an unfinished job matching a given job_command string (if any found).
* Updated striker->process_power to log the user out after confirming a poweroff or reboot action.
* Added anvil-daemon --startup-only to not enter the main loop and exit.
* Finished getting poweroff and reboot working (though more testing needed).
Signed-off-by: Digimer <digimer@alteeve.ca>
* Got the webui portion of requesting a poweroff and reboot done, but still working on finishing anvil-manage-power (work on which lead to the above improvement).
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created the Anvil::Tools::Jobs module to handle general job processing task. Moved 'update_progress' from tools/anvil-update-system to it and generalized it.
* Added some missing CDATA wrappers to the words XML file strings with '>' in it.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added '--refresh-json' to anvil-daemon that auto-selects '--run-once', '--main-loop-only' and '--no-start'.
* Updated anvil-update-system to not go more than a second between updates to the progress (save for when we're holding on data from 'dnf').
Signed-off-by: Digimer <digimer@alteeve.ca>
* Finished (for now) adding support for monitoring jobs while a node is in maintenance mode!
* Cleaned up the display of job data and redid how buttons (real and classed links) are displayed to be consistent.
* Fixed a bug in anvil-daemon where a disconnect wasn't being called between loops, causing DB connections to pile up.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Started adding the display of running and recently finished jobs to Striker when in maintenance mode. Still lots to do.
* Started working on the logic for what will soon be Words->decypher_string in anvil-daemon to process strings stored as '<key>,!!<name1>!<value1>!!,...,!!<nameN>!<valueN>!!'.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Renamed 'database::locking' to 'sys::database::locking' to avoid collisions with 'database' keys.
* Fixed a problem with System->call where reidrects were missing the Proc::Simple method name.
* Updated anvil-daemon to check if there is no database connections on start-up, run prep-database if not, and try connecting again. If it still fails, exit. Also updated the main loop to reconnect to the database(s) and skip if non are available. Did more work on the keep_running() function.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added a reboot button to managing the Striker. This button and the text associated with it reflects the need for the reboot.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Renamed tools/anvil-clear-reboot to tools/anvil-reboot-needed and changed it to behave like anvil-maintenance-mode.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created tools/anvil-clear-reboot to clear the "reboot needed" flag. Also created, but not yet using (and may not use) units/anvil-boot-time.service.
* Started work on having jobs show their data via JSON / jquery.
* Updated anvil-update-system to record messages indicating the progress so far.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Database->connect to take a specific UUID to attempt a connection to.
* Renamed some old 'sys::x' variables related to the database to 'sys::database::x' to conform better to coding standards.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated the RPM to .13 to disable postun's disabling of postgres, which breaks Anvil! software using the database during RPM updates.
* Fixed a logging bug where the number of DB connections was not inserting the number properly.
* Fixed exits in tools/anvil-prep-database.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Started adding support for updating the striker (and later, all) systems. This will be handled by the in-progress tools/anvil-update-system program.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Got anvil-manage-striker-peers working properly (so far).
* Updated anvil-prep-database to call anvil-manage-striker-peers, but testing still needed.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Update Anvil::Tools->new() to access the parameters 'log_level', 'log_secure' and 'debug', streamlining the frequent calls to $anvil->Log->level and ->secure in program startup, and allowing the values to take effect during the ->new constructor.
* Passed 'debug' to child method calls in more places (still more to do though).
* Fixed a bug where 'test_table' wasn't set in the right place, causing the database to try to initialize repeatedly.
* Made Database->archive_database only run if called with root access.
* Now the number of database connections are stored in 'sys::db_connections' instead of checking the returned number, and that is cleared on disconnect.
* Started working more on 'anvil-daemon', including adding support for System->call being taking 'background', 'stderr_file' and 'stdout_file' paramters which, when set, used Proc::Simple to background the process.
* Did some more work on database archiving, though still far from done.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug where "form::error_massage" was default to ' ' which caused simple checks to thing there was an error when all was fine.
* Got the "add new sync peer" form confirmation box displaying and cleaned up the CSS a bit.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Get->users_home() to default to return the hore directory for the user running the program.
* Updated Remote->call() to start working on handling timeouts.
* Updated Storage->change_owner(), ->make_directory() and ->write_file() to default the the user and group running the program.
* Fixed a bug in home reporting the MAC address of NICs when confirming configuration of Striker. Also changed showing the domain to the hostname.
* Got more work done on sync peers.
* Updated the RPM spec file to install on Fedora 28.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Set Log->entry to chmod the log file to 666 when the file is opened to ensure apache can write to it.
* Fixed a string replacement variable name.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added another check and better error handling to Template->get() to print a more useful message if a template is found but fails to parse.
* Moved some strings into the words file.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Moved templates around different files to clean things up.
* Moved the "back" and "refresh" icons over to the right by the logo, and added a new icon for handling mail, alerts, install targets and manifests.
* Started work on the Striker configuration sub-page.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug where the old style '#!replace!...!#' replacement variables were not being escaped when processing variable insertions into strings.
* Made the body variable be stored in 'form::body' instead of passing around the '$body' variable.
* Created a set of new icons.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added error messages to Striker configuration forms.
* Fixed a bug in home->get_network_details() function to handle single IPs in network.xml.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Renamed tools/anvil-configure-network to tools/anvil-configure-striker given that it will also now update system passwords.
* Started working on tools/anvil-update-states to properly handle a Striker with already-configured networking.
* Cleaned up tools/anvil-change-password.
* Fixed a bug in Storage->update_config to set the ownership of anvil.conf to 'apache:apache' so that the web server can read it.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added user_algorithm and user_hash_count to the new users database table so that we can remember how a hash was generated, should it be changed down the road.
* Made the salt length configurable by the user (as well as the algorithm and loop count).
Signed-off-by: Digimer <digimer@alteeve.ca>
* Changed 'database:❌:...' so that 'x' is now the database host's UUID instead of a simple integer. This will simplify sync'ing configs. Also removed default entries, and made it so that anvil-prep-database injects the local config during first setup. Renamed Database->get_local_id to get_local_uuid and changed the 'id' parameter to 'uuid'. Changed Database->initialize's 'id' parameter to 'host_uuid'. The Database->query, Database->write, Database->_mark_database_as_behind and Database->_find_behind_databases methods had their 'id' parameter changed to 'uuid'.
* Added the 'remote_user' parameter to Get->anvil_version, System->ping and System->change_shell_user_password for conencting to remote targets.
* Added the 'remote_user' parameter to all internal Remote->call uses.
* Updated Storage->backup, Storage->copy_file, Storage->make_directory,
Signed-off-by: Digimer <digimer@alteeve.ca>
* Made logging between journald and a traditional file configurable via 'sys::log_file'. Also made the file handle unbuffered when logging to a file.
* Fixed a bug with loading the anvil.conf config file in a few locations.
* Created System->stty_echo() to handle enabling/disabling shell echo, and added restoring the echo to Tools->catch_sig.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug with handling ssh fingerprints (and removed comments going to the known_hosts file).
* Added more nested debug parameter passing when methods call other methods (though more work is needed to catch up)
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Storage->read_file and Storage->write_file to support reading and writing on remote systems (untested though)
* Created System->change_shell_user_password() that changes a shell user's password by manually generating an sha512 salted hash of the given password and uses the resulting hash to modify the target user's password, so the password should never be visible in the process list. Works on both local and remote systems, though it still needs testing.
* Created Storage->rsync() to handle moving files between the local and a remote system.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug where some '$anvil->{}' variables should have been '$anvil->data->{}'.
* Started merging message keys on 'error_xxxx', 'warning_xxxx', etc.
* The anvil-configure-network now configures the network. Commented out, the tool can reconfigure the entire network without a reboot, but a current issue with the post-configured system refusing to use the allocated interface as the default gateway is to be reviewed at a future time. For now, a closing reboot will be issued.
* Started creating 'anvil-change-password' that will update passwords, including apache (and configure .htpasswd when needed).
Signed-off-by: Digimer <digimer@alteeve.ca>
* Got anvil-configure-network writing out the new network config properly, but renaming already-active interfaces isn't working yet.
* Updated System->get_ips() to record the interface name of a given network by MAC address using 'sys::mac::<mac_address>::iface'.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Got anvil-configure-network setting the new hostname.
* Updated anvil-configure-network to exit only if the job was picked up by a still-running PID.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed some logging in Get->cgi() and generally cleaned up logging levels.
* Got striker to the point where the job to reconfigure the network is saved in the database and the dashboard goes offline until it is done.
* Created the start of the new anvil-configure-network tool.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added 'file' and 'line' arguments to the Database->insert_or_update_X methods to allow for the original caller's file and line number to be recorded in the SQL call logs.
* Cleaned up how logging to 'anvil.log' logging is handled.
* Updated anvil-update-states to ignore libvirt bridges and to manually set the speed and duplex of virtio network based interfaces.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug with resync, but others remain as resync is incomplete (at least for network_interfaces).
* Currently, tools/anvil-update-states is broken while working on the above issue.
* Reworked the jobs table and removed the units/anvil-jobs.service unit. Jobs will be invoked and backgrounded in all calls.
* Started adding missing hidden form fields.
* Updated the 'server' OCF resource agent version and metadata.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added Get->anvil_version() to check the local or remote Anvil! version.
* Added a check in Database->connect() to see if a database server's Anvil! version matches the local version. If the versions don't match, the database is not used.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added the 'debug' parameter to System->call(). Also added a check to make sure that the called executable exists.
* Added the 'debug' parameter to System->start_daemon().
Signed-off-by: Digimer <digimer@alteeve.ca>
* Moved all executables to /usr/sbin/
* Made /root/anvil-backups/ the backup directory.
* Started debuging anvil-prep-database
Signed-off-by: Digimer <digimer@alteeve.ca>