* Updated anvil-daemon->prep_database() to only run if the database dump file doesn't exist. (If it does, it's clearly configured).
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Database->archive_database() to return the full path to the dump file.
* Disabled enabling the postgresql daemon.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created Database->backup_database() that creates a pg_dump of the active database.
* Created Database->load_database() that loads the database from a flat file, optionally creating a backup before doing so, and using iptables to block access during the process.
* Updated Database->configure_pgsql() to not start the postgresql daemon unless it just initialized the DB.
* Much work, not yet complete, to Database->connect() to stop after the first successful connection. Added logic that, if not connection was established and the host is a Striker, to load a peer's backup, if it exists, and then start the local daemon.
* Updated anvil-daemon to now have a section to run tasks on a ten minute cycle, which will later be used for the primary Striker to dump / copy its database to peer(s).
Signed-off-by: Madison Kelly <mkelly@alteeve.ca>
* More bugs fixed in anvil-manage-dr, tested repeatedly as a job and so far, so good. Other functionality still to come.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Got anvil-manage-dr to the point where it writes the updated resource configuration to enable DR support. (untexted)
Signed-off-by: Digimer <digimer@alteeve.ca>
Created Storage->get_vg_name() to assist with anvil-manage-dr, which is still a WIP.
Continued work on anvil-manage-dr (which exposed the issue that required the update to Database->get_storage_group_data().
Signed-off-by: Digimer <digimer@alteeve.ca>
* Created ScanCore->agent_shutdown() that writes out the time the scan agent last ran, and how many databases were available when it last ran.
* Updated ScanCore->agent_startup() to read the the last run data created above.
* Updated Database->connect() to set 'sys::database::last_db_count' to the scan agent's recorded last DB count.
* Updated all agents to call ScanCore->agent_shutdown() at the end of their run.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug where anvil-safe-stop was not recording the stop-reason. Also made '--poweroff' an alias for '--power-off'.
Signed-off-by: Digimer <digimer@alteeve.ca>
Created Storage->get_storage_group_from_path() that takes a block device path and tried to find the Storage Group it belongs to.
Updated Storage->get_storage_group_data() to make it possible to look up a storage group UUID using the SG's name.
Updated DRBD->gather_data() to take a pre-generated XML via the new 'xml' parameter.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Server->parse_definition() to call DRBD->get_devices() so that referenced LVs can be loaded properly.
* Continued WIP in anvil-manage-server
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in Cluster->parse_cib() where the local machine's ready state was being set to the node name.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Alert-register so that, if 'sort_position' is not set (or set to 9999), an internal counter for each alert level is created and used so that alert entries sort naturally by the order they're registered.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Fixed a bug in anvil-delete-server where, if a server was off already, the server would not be removed from pacemaker.
* WIP - continuing on scan-network
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated Jobs->get_job_uuid() to accept the new 'incomplete' parameter that, when set, will look for jobs whose progress is > 1 and < 100.
* Updated ScanCore-agent_startup() to take the new 'no_db_ok' parameter which returns with '0' if no DB is available and that parameter is set to '1'.
* Fixed a logging bug in 'anvil-join-anvil'.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated DRBD->allow_two_primaries() to take the 'set_to' parameter which can be 'yes' to all and 'no' to disallow dual-primary.
* Updated ocf:alteeve:server to call allow_two_primaries() with 'set_to' = 'no' instead of calling 'adjust' after a migration completes.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated scan-cluster to get the CIB from pcs instead of reading the CIB from disk.
* Updated anvil-daemon to always call striker-prep-database at log level 2 while trying to find the cause of rare postgres config failures. Also updated striker-prep-database to use the new method of initializing the DB.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Renamed the special job status 'scancore_startup' to 'anvil_startup', given it's handled by anvil-daemon.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated scan-cluster to check / set which node should be preferred if a netsplit causes a fence race.
* Fixed a bug in Server->shutdown_virsh() where a shutdown timeout would go into a loop.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated anvil-configure-host to not reboot on network chane (will verify when this commit is function tested).
Signed-off-by: Digimer <digimer@alteeve.ca>
* Added the 'print' parameter to Log->variables() to allow printing to STDOUT when set.
* Renamed Network->check_bonds() to Network->check_networks() in anticipation of adding bridge monitoring / repair to it later.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Upped the aging of jobs and alerts data from 2 to 24 hours. Also added a check to prevent deleting a job of any age that is incomplete.
* Major update to anvil-configure-host to not touch the network unless something has actually changed. Not yet tested on a fresh system, will verify nothing broke in the CI tests this commit will trigger. Also changed it so that, if after reconfiguring the network it times out trying to reconnect to a database, it calls a reboot instead of simply exiting. Further, a reboot is now not called on exit unless something changed to require it.
* Updated Network->check_bonds() to return '1' if anything was done to heal a bond.
* Updated anvil-update-states to be more careful about clearing virsh bridges. Specifically, it checks to see if virsh is running and that the returned bridges aren't actually error codes.
Signed-off-by: Digimer <digimer@alteeve.ca>
* Updated anvil-daemon to call Network->check_bonds() with 'all' on startup, then woth 'down_only' once per minute to try to heal down'ed bonds.
* Updated anvil-watch-bonds to take a 'run-once' switch and exit after one report, if set.
Signed-off-by: Digimer <digimer@alteeve.ca>