* Updated Database->insert_or_update_states() to switch to an active UUID if the passed in UUID is not an active handle.
* Updated Database->query() to swutch to 'sys::database::read_uuid' if the passed in 'uuid' is not an active handle.
* Updated Database->_test_access() to return immediately if the passed in uuid is not an active handle.
* Started working on a Storage->get_storage_group_from_path() bug where the storage group isn't being returned.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated scan-filesystems to set swap usage alerts to notice level only.
* Updated scan-network to pull the permanent MAC address from an 'ethtool -P <iface>' call to deal with the fact that wireless interfaces don't have their real MAC in the sysfs address file.
* Updated anvil-provision-server to set the rtc_tickpolicy to catchup.
Signed-off-by: Digimer <digimer@alteeve.ca>
Fixed a bug in System->update_hosts() that was causing hosts to be constantly rewritten. (Well, I hope fixed, this has been a notoriously buggy part of the program...)
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created Database->backup_database() that creates a pg_dump of the active database.
* Created Database->load_database() that loads the database from a flat file, optionally creating a backup before doing so, and using iptables to block access during the process.
* Updated Database->configure_pgsql() to not start the postgresql daemon unless it just initialized the DB.
* Much work, not yet complete, to Database->connect() to stop after the first successful connection. Added logic that, if not connection was established and the host is a Striker, to load a peer's backup, if it exists, and then start the local daemon.
* Updated anvil-daemon to now have a section to run tasks on a ten minute cycle, which will later be used for the primary Striker to dump / copy its database to peer(s).
Signed-off-by: Madison Kelly <mkelly@alteeve.ca>
Created Storage->get_vg_name() to assist with anvil-manage-dr, which is still a WIP.
Continued work on anvil-manage-dr (which exposed the issue that required the update to Database->get_storage_group_data().
Signed-off-by: Digimer <digimer@alteeve.ca>
Created Storage->get_storage_group_from_path() that takes a block device path and tried to find the Storage Group it belongs to.
Updated Storage->get_storage_group_data() to make it possible to look up a storage group UUID using the SG's name.
Updated DRBD->gather_data() to take a pre-generated XML via the new 'xml' parameter.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Finished scan-filesystems!
* Realized that filesystem UUIDs are not always actual UUIDs, and so created an additonal column in filesystems -> filesystem_internal_uuid and created a normal filesystem_uuid table.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Storage->write_file() to add a short UUID suffix to the temp file before rsync'ing to the target to help avoid source temp file name collisions in parallel running copies to different targets.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created Storage->parse_df that, shock!, parses 'df' output. Finished the long-ago started ->parse_lsblk as well.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Cluster->add_server() to now set failure timeouts to actual numbers instead of INFINITY after discovering that INFINITY doesn't work in those cases.
* Updated Databsae->get_hosts to now check if other entries have the same host name, and if so, to set their host_key to 'DELETED'. This should make it easier to handle when a hardware machine is replaced by new hardware but uses the same host_name.
* Updated Email->check_queue() to start and enable postfix.service if it's found to not be running.
* Updated Get->available_resources() to return '!!no_data!!' when a given host hasn't got any data in scan_lvm_vgs. Now use this in anvil-provision-server to exit if a node or dr host hasn't run scancore yet.
* Fixed a bug in scan-lvm where the pvs_uuid wasn't being loaded properly, preventing lost PVs, VGs and LVs from being flagged as deleted.
* Started work on anvil-migate-server, though it's far from complete.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Finished DRBD->get_next_resource() that returns the next available minor and the next free TCP port (with two free ports available after it).
* Created Storage->get_storage_group_details() that pulls together the LVM, storage group members and storage groups into one block of data.
* Made more progress on tools/anvil-provision-server. It now gets up to the point of creating LVs, creating DRBD resource files, loading them, creating metadata and up'ing the resource. It doesn't yet (successfully) force a new resource to primary.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in Database->insert_or_update_jobs() where the 'job_host_uuid' being set to 'all' only translated to a job for the running host.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Cleaned up upload.pl now that it isn't responsible for loading the file details into the database. It only sets a job for the local Striker to process the file and move it into /mnt/shared/files, copy it to peer dashboards, then load jobs for Anvil! members to sync the new file.
* Created Database->get_files() and ->get_file_locations() to load the respective data.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Got tools/striker-sync-shared to pick up 'upload::move_incoming' jobs, move the uploaded file to /mnt/shared/files/, copies it to peer dashboards, adds it to the 'files' table and adds it to 'file_locations'.
* Reworked the 'file_locations' table to now map files to Anvil! systems, not hosts. It simply tracks if a given file should be on Anvil! members or not. Later, striker-sync-shared on the Anvil! members will pull the file down.
* Updated Storage->get_file_stats() to record the file's mimetype.
* Fixed up a few issues in cgi-bin/upload.pl.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created Cluster->assemble_storage_groups() and moved the logic to auto-assemble groups out of Get->available_resources().
* Created Cluster->get_anvil_name() that will return an Anvil! name for a given anvil_uuid, or the name of the Anvil! if the host is a member of an Anvil!.
* Updated Cluster->get_anvil_uuid() to return the 'anvil_uuid' if passed a specific 'anvil_name'.
* Updated Jobs->clear() to use 'switches::job-uuid' when a job_uuid is not passed but the value exists in 'switches::job-uuid'.
Signed-off-by: Digimer <digimer@alteeve.ca>
ing or create new servers.
* Created Database->insert_or_update_storage_groups() and ->insert_or_update_storage_group_members() to manage the new associated tables. Together, these tables create storage groups and track the VGs that are members of the group.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Started work on Get->available_resources() that will take an 'anvil_uuid' and figure out what resources are still available for use by new servers or that can be added to existing servers.
* Fixed a bug in ScanCore->agent_startup() where tables weren't being generated properly from the agent's SQL file.
* Made Storage->change_mode() return silently if it's called without a mode being passed. This happens frequently and is harmless so it's not worth filling the logs with errors.
* Renamed the 'start_time' key to 'at_start' when recording files' MD5 sums in Storage->record_md5sums and ->check_md5sums.
* When we moved the directory scan logic out of the 'scancore' daemon and into 'Storage->scan_directory', the logic to record scan agent names in 'scancore::agent::<file>' was removed. This broke a few things and, so, it was restored when it was found that a file starts with 'scan-' and the directory matches the scancore agent directory.
* Moved the 'scancore' daemon's 'load_agent_strings' to 'Words'
* Updated Words->parse_banged_string() to look for variables in the format 'value=X:units=Y' and translate it properly.
* Fixed a bug in scan-ipmitool where discovered sensor INSERT SQL queries were queued, but not committed.
* Fixed a bug in scan-storcli where a while loop was broken, preventing execution.
* Fixed a bug in the 'scancore' daemon where it wouldn't exit if sums changed. Fixed a bug where alerts weren't being sent between loops. Fixed a bug where command-line log level wasn't surviving inside the main loop.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in Database->check_agent_data() where the list of tables wasn't passed in, and thus the table list wasn't then passed on to Database->_find_behind_databases().
* Started work on a new method called Storage->parse_lsblk().
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Get->os_type() to work on local and remote hosts.
* Fixed a but in tools/striker-initialize-host where calls to Network->find_matches() where being checked properly for success/failure.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Began (but haven't finished) Database->insert_or_update_servers().
* Created Storage->get_file_stats() to collect the (l)stat information for a file.
* Got more work done on scan-server.
Signed-off-by: Digimer <digimer@alteeve.ca>
* More work done on Email->send_email() to, well, actually send email (which it isn't doing yet, but it's close).
* Updated Words->key() to include the bad key name when no entry for the requested key exists in the words.xml file.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in Server->get_status() where the call to Storage->rsync's returned output checked for '!!errer!!' instead of '!!error!!'.
* Fixed a bug in Storage->rsync where, when no port was passed in, it would try to specify an empty port and fail.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added the anvil.conf option 'sys::privacy::strong' that controls if the Anvil! ever "calls home". Initially, this controls DRBD's usage flag.
* Updated DRBD->get_devices() to track resources by their 'by-res' names as well and by the normal '/dev/drbdX' devices.
* To mitigate https://bugzilla.redhat.com/show_bug.cgi?id=1868467, updated Get->bridges() to parse the normal (non-JSON) data if we get invalid JSON output.
* Updated anvil-join-anvil to not disable, and in fact enable, libvirtd on boot. With DRBD 9, the original fear of a user accidentally booting a VM that's running on the peer no longer is an issue. By enabling it and leaving it on, Striker dashboard users won't lose their virtual machine manager link unless the node powers off. Also enabled actually updating the job progress, completing this tool!
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Storage->rsync() to only create the rsync wrapper if a password was given, allowing for rsync to work to/from a remote system when passwordless SSH is enabled.
* Updated anvil-join-anvil to disable/stop drbd.service, and to properly wait until both nodes are online.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Update striker manifest run to add an entry into the 'anvils' table, and pass the anvil_uuid to the jobs rather than the various host_uuid's.
* Fixed a bug in the 'anvils' SQL procedure that copied data into the history schema (a few columns were missing).
* Updated anvil-configure-host to reboot when finished to be certain network changes have taken effect. Also updated the handling of virsh bridges to delete the autostart symlinks if libvirtd daemon isn't running.
* Added some logic to anvil-daemon to call 'anvil-update-states' with the -v{1,3} flag depending on the active debug level.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated the loop detection logic in Log->entry where processing large strings was triggering it when it shouldn't.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug where '$target' being preset to 'local' was causing bad calls to 'Remote->call'.
* Updated Storage->change_mode and -> change_owner to work locally and on remote hosts.
* Barely started work on striker->process_anvil_menu().
Signed-off-by: Digimer <digimer@alteeve.ca>
* Finished the detection of and handling of initialization of a host when the host has no Internet access.
* Disabled (for now) anvil-daemon's check_ssh_keys function.
* Fixed a couple small bugs elsewhere.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Moved System->is_local to Network->is_local, and System->ping to Network->ping.
* Added a check to tools/striker-get-peer-data that will report if the target has Internet access or not.
* Cleaned up the form that prompts the user to enter their Red Hat credentials.
* Updated tools/anvil-manage-keys (and related code) to no longer distinguish by user. If a target is flagged as changed, it is removed from the root and all user's known_hosts files.
* Updated Storage->write_file() and ->update_file() to accept the 'backup' parameter to control if an file that exists is backed up before being updated/replaced.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated System->host_name to work locally and on remote targets.
* Renamed all 'hostname' instances to 'host_name' to standardize on a spelling throughout the program.
* Removed use of and dependency on 'hostname'.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Remote->call() to detect when a connection fails because the target's known_hosts entry has changed. Still need to add the function to report this to the user.
* Fixed a bug where new-lines in Words->parse_banged_string() where a double-banged word string's variable value would cause an infinite loop.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created Tools->refresh to reload anvil.conf in one call.
* Created Anvil::Tools::Network to hold network-related tasks.
** Created Network->is_remote() that tests to see if a string (containing a target) refers to the remote machine (versus a local machine). Updated all previous checks to use this new method.
** Moved Get->network_details() and Get->network() to the new Network module. Renamed Get->network() to Network->get_network().
** Made Network->get_ips() work locally and remotely.
** Created Network->find_matches() that compares two scanned machines IPs (via two previous calls to Network->get_ips())
* Created Database->manage_anvil_conf() that will add, update or remove a given database connection in a local or remote anvil.conf file.
* Fixed bugs in Storage->backup() where the bash calls were quite broken. I'm not sure how it ever worked before... x_x
* Updated anvil-daemon to not initialize a database unless it's running on dashboard. Also added a check at the startup of anvil-daemon where it will go into a loop waiting for a database to become available, re-reading anvil.conf each loop.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in Remote->call where a shell call that ended in a newline would work, but throw an error and not get the return code.
* Created Database->get_job_details() which takes a job_uuid and returns the job details, if found.
* Fixed a bug in Jobs->update_progress() where 'clear' wasn't removing the old job_progress data.
* Added the parameters 'no_files' to skip stat'ing/recording non-directories, and 'search_for' which will set the parent directory in 'scan::searched' and stop scanning if found. This allows this method to act as a directory tree scanner and as a search engine.
* Created Striker->get_local_repo() that builds a repo file body suitable for adding to peers, nodes and DR hosts.
* Fixed bugs in the Striker WebUI related to initializing a target node / DR host.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated striker to no longer try to SSH to a remote machine. To enable this, we'd have to give apache a shell and an SSH key, which is dumb and dangerous when considered.
* Created tools/striker-get-peer-data which is meant to be invoked as the 'admin' user (via a setuid c-wrapper). It collects basic data about a target machine and reports what it finds on STDOUT. It gets the password for the target via the database.
* Updated anvil-daemon to check/create/update setuid c-wrapper(s), which for now is limited to call_striker-initialize-host.
* Created Anvil/Tools/Striker.pm to store Striker web-specific methods, including get_peer_data() which calls tools/striker-initialize-host via the setuid admin call_striker-initialize-host c-wrapper.
* In order to allow striker via apache to read a peer's anvil.version, which it can no longer do over SSH, any connection to a peer where the anvil.version is read is cached as /etc/anvil/anvil.<peer>.version. When Get->anvil_version is called as 'apache', this file is read instead.
* Updated Database->resync_databases() and ->_find_behind_databases() to ignore the 'states' table.
* Created tools/striker-initialize-host which will be called as a job to initialize a node/dr host.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in Storage->read_file() where a remote read, where the remote user wasn't specified, would cause the call to hange.
* Cleaned up striker->add_sync_peer() to use more clear variable names.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Cleaned up the striker->add_sync_peer() function to more clearly differentiate the ssh port from the pgsql port.
* Improved the HTML form to not have the browser treat host login fields as credentials to autofill or save.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created tools/anvil-check-memory to report how much RAM is used by a given program.
* Added documentation for some previously undocumented methods.
* Updated Database->archive_database() to take the 'tables' parameter.
* Updated Storage->scan_directory() to record a directory's mode and type, even when recursive isn't used.
* Finished System->check_memory().
* Updated ocf:alteeve:server to now NOT stop a DRBD resource unless 'stop_drbd_resources'.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created System->active_lv() that, surprise, activates an inactive logical volume. Also created ->check_storage() that parses out the LVM data.
* Fixed a bug in tools/fence_pacemaker that was preventing it from compiling and running.
* Updated ocf:alteeve:server to validate the target server's storage.
Signed-off-by: Digimer <digimer@alteeve.ca>