* This adds a check where anvil-join-anvil waits until both subnodes are
marked as configured and not in maintenance mode.
* Should address issue #479 (maybe, this shouldn't trigger reboots, but
it was certainly a race condition found while investigating).
Signed-off-by: digimer <mkelly@alteeve.ca>
* Updated anvil-manage-server-storage to the point where it can now insert and eject optical disks!
* Updated System->call to log parameters if 'shell_call' isn't set.
* Fixed a bug in anvil-manage-server process_interactive where an $anvil->data reference was being scoped.
Signed-off-by: digimer <mkelly@alteeve.ca>
* Updated anvil-boot-server when called with '--all' to honour boot ordering, delays and condtions.
* Updated Database->get_servers() to collect the server's XML as well as data from the 'servers' table.
* Updated anvil-provision-server to make a new DRBD resource 'secondary' after forcing it to primary to begin the initial sync.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Renamed the special job status 'scancore_startup' to 'anvil_startup', given it's handled by anvil-daemon.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated anvil-configure-host to not reboot on network chane (will verify when this commit is function tested).
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added the 'print' parameter to Log->variables() to allow printing to STDOUT when set.
* Renamed Network->check_bonds() to Network->check_networks() in anticipation of adding bridge monitoring / repair to it later.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Upped the aging of jobs and alerts data from 2 to 24 hours. Also added a check to prevent deleting a job of any age that is incomplete.
* Major update to anvil-configure-host to not touch the network unless something has actually changed. Not yet tested on a fresh system, will verify nothing broke in the CI tests this commit will trigger. Also changed it so that, if after reconfiguring the network it times out trying to reconnect to a database, it calls a reboot instead of simply exiting. Further, a reboot is now not called on exit unless something changed to require it.
* Updated Network->check_bonds() to return '1' if anything was done to heal a bond.
* Updated anvil-update-states to be more careful about clearing virsh bridges. Specifically, it checks to see if virsh is running and that the returned bridges aren't actually error codes.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Database->archive_table() and ->_find_behind_databases() to loop through connected databases, instead of configured databases.
* Updated Network->get_ips() to only record the real MAC addresses on network interfaces (not bonds or bridges) in the "network::${host}::interface::${in_iface}::mac_address" hash. This should help avoid reboot loops caused by anvil-configure-host thinking the network needs to be reconfigured when it doesn't actually need to be.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a problem with Database->insert_or_update_variables() where variable_source_uuid being set to an empty string wasn't converted to NULL.
* Fixed Database->locking() where the way the lock variable was set was rather broken.
* Created Striker->check_httpd_conf() which configured apache to handle the integration of the new WebUI for Anvil! management with the existing WebUI.
* Updated System->update_hosts() to specifically set the 127.0.0.1 and ::1 lines to handle how cloud-init overrides /etc/hosts and breaks CI/CD tests.
* Removed the old index.html as it's now used for the new WebUI.
* Began work on removing DB connection requirements from ocf:alteeve:server.
Signed-off-by: Digimer <digimer@alteeve.ca>
* DRBD is now configured to a ping-timeout of 3 seconds.
* Created Log->switches() that returnes the command line switches used by Anvil! tool command line calls based on the active log levels / secure logging. Appended this to all invocations of our tools.
* Updated Database->resync_databases() to now only skip 'jobs' and 'variables' tables with less than 10 record differences. All other differences will trigger a resync.
* Created System->_check_anvil_conf() that, as you might guess, checks in anvil.conf exists and created it (using defaults), if not. It also checks to see if the 'admin' group and user exists and creates them, if not.
* Updated anvil-daemon to check anvil.conf on start up and in each loop. Created the function check_journald() that checks (and sets, if needed) that journald logging is persistent.
* Made striker-manage-peers to check_if_configured on the Database->connect() when updating anvil.conf and the target UUID is the local machine. Also created a loop to make the reconnection a lot more robust.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Database->insert_or_update_storage_group_members() to use the host_uuid when trying to find existing members.
* Added the skeleton of a bunch of new json endpoints for the new UI features.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Remote->test_access() to not used cached SSH access.
* Updated anvil-configure-host to abort if the host is in a cluster.
* Updated anvil-join-anvil to clean up some variable checks to help avoid unitialized variable messages.
* Updated striker-initialize-host to check if an anvil RPM is installed and, if so, not install the Anvil! repo.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Remote->call() to return ('!!error!!', '!!error!!', 9999) when an error hits. Made Remote->test_access() explicitely check for '1' to be returned in order to confirm access, fixing a bug where bad target value caused false positives. Updated ->_check_known_hosts_for_target() to no longer explicitely check for 'ssh-rsa' so that machine keys using different cyphers are detected as being in known_hosts properly.
* Updated striker-auto-initialize-all to initialize nodes and DR hosts networks before trying to form them into an Anvil!. Fixed several other bugs as well. More testing is needed, but it works now.
* Updated striker-initialize-host to check for the alteeve repo and, it not found, check for accress to alteeve.com. If access, it will install our repo now.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created 'Cluster->which_node' that returns 'node1' or 'node2' to indicate which node a host is.
* Continued working on scan_cluster; decided to make it not host-dependent.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added calling 'debug => $debug' in System->X methods.
* Got more work done on System->configure_ipmi(). It should now determine if a BMC exists and pull the OEM and network details automatically.
* Updated anvil-configure-host to log more data in an attempt to find a reproducer for an odd bug where (apparently) a host was picking up the wrong job data meant for another host.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added the fix from the last commit for System->call to handle returned data without an ending newline to Remote->call.
* Got more work done on System->update_hosts(). It's able to add new hosts, but misses the short and FQDN host names. Need to fix that and the verify existing / manual entries aren't molested.
Signed-off-by: digimer <digimer@pulsar.alteeve.com>
* Added the fix from the last commit for System->call to handle returned data without an ending newline to Remote->call.
* Got more work done on System->update_hosts(). It's able to add new hosts, but misses the short and FQDN host names. Need to fix that and the verify existing / manual entries aren't molested.
Signed-off-by: digimer <digimer@pulsar.alteeve.com>
* Update striker manifest run to add an entry into the 'anvils' table, and pass the anvil_uuid to the jobs rather than the various host_uuid's.
* Fixed a bug in the 'anvils' SQL procedure that copied data into the history schema (a few columns were missing).
* Updated anvil-configure-host to reboot when finished to be certain network changes have taken effect. Also updated the handling of virsh bridges to delete the autostart symlinks if libvirtd daemon isn't running.
* Added some logic to anvil-daemon to call 'anvil-update-states' with the -v{1,3} flag depending on the active debug level.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated 'variables' -> 'variable_source_uuid' to type 'uuid' and removed the 'not null' constraint.
* Updated Database->insert_or_update_variables() to check/update 'variables_source_table' and 'variables_source_uuid'.
* Created the 'trusts' database table which will, when done, tell anvil-daemon which users@machines to trust (setup passwordkess SSH).
* Created (but not finished) System->manage_authorized_keys() and moved the logic over to it from anvil-daemon.
* Changed the host types "dashboard" to "striker".
* Moved the following methods from 'System' to 'Get';
** System->get_host_type to Get->host_type
** System->get_bridges to Get->bridges
** System->get_free_memory to Get->free_memory
** System->get_os_type to Get->os_type
** System->get_uptime to Get->uptime
* Updated striker to include the host_uuid for the 'node1', 'node2' and (if chosen) 'dr1' when running a job manifest.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added 'no_ping' to Database->connect() to disable pinging before connection, regardless of the anvil.conf setting.
* Created Network->read_nmcli() that reads, parses and stores the verbose output from 'nmcli'.
I can not properly explain in this commit message how much getting network manager working tripped me up. omg the complexity of what used to be such a simple process...
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a couple more bugs in tools/anvil-configure-host, getting it now to the point where it writes out the network config files.
Signed-off-by: Digimer <digimer@alteeve.ca>