Commit Graph

45 Commits

Author SHA1 Message Date
Digimer
726a4374d1 * Renamed the database table 'host_keys' to 'ssh_keys' to better represent what it stores.
* Updated 'variables' -> 'variable_source_uuid' to type 'uuid' and removed the 'not null' constraint.
* Updated Database->insert_or_update_variables() to check/update 'variables_source_table' and 'variables_source_uuid'.
* Created the 'trusts' database table which will, when done, tell anvil-daemon which users@machines to trust (to set up passwordless SSH).
* Created (but not finished) System->manage_authorized_keys() and moved the logic over to it from anvil-daemon.
* Changed the host types "dashboard" to "striker".
* Moved the following methods from 'System' to 'Get';
** System->get_host_type to Get->host_type
** System->get_bridges to Get->bridges
** System->get_free_memory to Get->free_memory
** System->get_os_type to Get->os_type
** System->get_uptime to Get->uptime
* Updated striker to include the host_uuid for the 'node1', 'node2' and (if chosen) 'dr1' when running a job manifest.

Signed-off-by: Digimer <digimer@alteeve.ca>
2020-06-10 18:26:50 -04:00
Digimer
934c9b1286 * Updated logging to now log anything with 'priority' set to a new 'anvil.alert.log' file (while still also logging as normal to anvil.log). This should make it easier to watch for alert messages.
Signed-off-by: Digimer <digimer@alteeve.ca>
2019-10-22 21:17:30 -04:00
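The alert-log routing described in the commit above can be sketched roughly as follows. This is only an illustration in Python using the standard `logging` module, not the project's own code: the `HasPriority` filter, the `build_logger` helper and the logger name are hypothetical, and in-memory streams stand in for the anvil.log and anvil.alert.log files.

```python
import io
import logging


class HasPriority(logging.Filter):
    """Pass only records that carry a 'priority' attribute (hypothetical name)."""

    def filter(self, record):
        return getattr(record, "priority", None) is not None


def build_logger(main_stream, alert_stream):
    """Log everything to main_stream and copy prioritised entries to alert_stream."""
    logger = logging.getLogger("anvil_sketch")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False

    # The main handler receives every entry, as the normal log would.
    main_handler = logging.StreamHandler(main_stream)

    # The alert handler receives only entries that have a priority set.
    alert_handler = logging.StreamHandler(alert_stream)
    alert_handler.addFilter(HasPriority())

    logger.addHandler(main_handler)
    logger.addHandler(alert_handler)
    return logger


if __name__ == "__main__":
    main_log, alert_log = io.StringIO(), io.StringIO()
    log = build_logger(main_log, alert_log)
    log.info("routine entry")
    log.info("disk failing", extra={"priority": "alert"})
    print(alert_log.getvalue(), end="")
```

Attaching the filter to the handler rather than to the logger keeps the main log complete, so alert entries appear in both places.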
Digimer
7cdd2f60e9 * Created Network->download() to handle downloading a file on the local system. Created ->bridge_info() to parse 'bridge' output. Created ->load_ips() to load IP address information from the database (as opposed to ->get_ips() which queries a system).
* Created Database->insert_or_update_bridge_interfaces() to handle updating the new bridge_interfaces database table that records which network interfaces are connected to which bridges.
* Added 'bridge_mac' and 'bridge_mtu' to the 'bridges' table.
* Started Server->map_network which will, eventually, try to map MAC addresses to IPs and record server -> vnetX -> bridge data. Made getting a server status target-dependent.
* Worked on anvil-update-states to parse and record bridge data.

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-10-12 00:32:32 -04:00
Digimer
3a86bed694 * Fixed tools/striker-initialize-host so that it sets the hostname on the target, not locally.
* Updated System->host_name to work locally and on remote targets.
* Renamed all 'hostname' instances to 'host_name' to standardize on a spelling throughout the program.
* Removed use of and dependency on 'hostname'.

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-10-02 15:39:21 -04:00
Digimer
bc341809ca * Finished (for now) ocf:alteeve:server! It can boot, migrate and stop a server cleanly. It still checks to see if DRBD needs to be started and does so when needed, but it won't stop it anymore.
* Fixed a couple typos in tools/anvil-check-memory that prevented it from running.

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-09-04 18:58:43 -04:00
Digimer
8a2c86d088 * Renamed striker-configure-host (back) to anvil-configure-host, and started updating it to work on any machine type.
* Created tools/anvil-check-memory to report how much RAM is used by a given program.
* Added documentation for some previously undocumented methods.
* Updated Database->archive_database() to take the 'tables' parameter.
* Updated Storage->scan_directory() to record a directory's mode and type, even when recursive isn't used.
* Finished System->check_memory().
* Updated ocf:alteeve:server to now NOT stop a DRBD resource unless 'stop_drbd_resources'.

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-09-03 14:07:39 -04:00
Digimer
c0dd34334e * Fixed another bug in making ocf:alteeve:server work in pacemaker.
Signed-off-by: Digimer <digimer@alteeve.ca>
2019-08-21 01:14:10 -04:00
Digimer
ed2e83a1a4 * Fixed a few more bugs in 'ocf:alteeve:anvil', but it's still failing when invoked by pacemaker.
Signed-off-by: Digimer <digimer@alteeve.ca>
2019-08-20 23:46:59 -04:00
Digimer
f5caec52dc * Made DRBD->allow_two_primaries() smarter about finding the 'target_node_id' when it wasn't passed.
* Fixed a couple of bugs, and now ocf:alteeve:server can properly pull and push servers between nodes.

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-08-20 00:59:41 -04:00
Digimer
113a44ecc6 * Got 'migrate_to' working in ocf:alteeve:server. 'migrate_from' still needs work.
* Created DRBD->allow_two_primaries() and ->reload_defaults() that enables (and resets/disables) dual-primary operation (allow-two-primaries=yes), used to enable live migration.
* Created Remote->test_access() that simply verifies that a remote target can be accessed (as a given user).
* Created Server->migrate() that actually migrates a server. It can push already, and pull will be added next.

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-08-16 01:41:47 -04:00
Digimer
7db542b9b0 * Fixed a bug with definition files that used '<source file='X'/>' instead of '<source dev='X'/>' for the backing block device for disks.
Signed-off-by: Digimer <digimer@alteeve.ca>
2019-08-14 01:48:36 -04:00
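To illustrate the two forms mentioned in the commit above, here is a minimal Python sketch that accepts either attribute when reading disk sources from a libvirt-style definition. It is not the project's parser: the `disk_sources` helper and the sample XML (including the device and file paths) are made up for the example.

```python
import xml.etree.ElementTree as ET


def disk_sources(definition_xml):
    """Return the backing path of each disk, from either 'dev' or 'file'."""
    root = ET.fromstring(definition_xml)
    paths = []
    for source in root.findall("./devices/disk/source"):
        # Block-backed disks use 'dev'; file-backed disks use 'file'.
        path = source.get("dev") or source.get("file")
        if path:
            paths.append(path)
    return paths


# Hypothetical definition with one disk of each form.
SAMPLE = """
<domain type='kvm'>
  <name>example</name>
  <devices>
    <disk type='block' device='disk'>
      <source dev='/dev/drbd0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <source file='/tmp/install.iso'/>
    </disk>
  </devices>
</domain>
"""

if __name__ == "__main__":
    for path in disk_sources(SAMPLE):
        print(path)
```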
Digimer
873ed3e2b0 * Fixed some typo bugs.
* Added stop_drbd_resources() to ocf:alteeve:server.

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-08-12 23:15:13 -04:00
Digimer
dff74102db * Created (but not yet tested) Server->shutdown() to, well, shut down servers.
* Modified Server->boot() to now only work locally. Also updated it to optionally take the XML definition path for the server to boot.

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-08-10 01:35:08 -04:00
Digimer
d224be9344 * Created DRBD->manage_resource() that allows for up/down/primary/secondary'ing a resource on a local or remote system.
* Created Server->boot() that starts a server and verifies that it did actually start.
* Got ocf:anvil:server smarter about starting DRBD resources, properly handling resources where auto-promote isn't enabled. The 'start' process is now complete (barring bugs).

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-08-08 22:58:21 -04:00
Digimer
b1ddf945e2 * Got ocf:alteeve:server working again to boot servers. It's now smarter, knowing when the server is running locally already (success), running on the other node (hard error) and running on DR (fatal error).
* Updated DRBD->get_devices() to store data all under 'drbd::config::<host>::x'.
* Created Server->find() that takes a target and collects the servers running on it.
* Updated System->check_storage() to redirect all calls' STDERR to /dev/null to suppress errors about failing to open /dev/drbdX when LVM's filter isn't set up.

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-08-08 01:10:38 -04:00
Digimer
324ef351fe * Updated DRBD->get_devices() to properly identify the peer node, when run on an actual node in the cluster (not DR or Striker).
* Created System->active_lv() that, surprise, activates an inactive logical volume. Also created ->check_storage() that parses out the LVM data.
* Fixed a bug in tools/fence_pacemaker that was preventing it from compiling and running.
* Updated ocf:alteeve:server to validate the target server's storage.

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-08-06 23:31:35 -04:00
Digimer
16f79ca244 * Created System->get_bridges() that gets a list of bridges (and connected interfaces, and data). Also created ->get_free_memory() that returns the amount of available RAM.
* Got ocf:alteeve:server validating bridges.

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-08-03 01:38:12 -04:00
Digimer
4a93682447 * Started rebuilding ocf:alteeve:server using the new module methods.
Signed-off-by: Digimer <digimer@alteeve.ca>
2019-08-02 00:06:06 -04:00
Digimer
7a7e3db0c1 * Created DRBD->get_devices() that finds and maps the /dev/drbdX devices to their resources and backing LVs.
* Got Server->get_status() and ->_parse_definition() pulling out all but the device bus data from the XML files. Under devices, started parsing, with 'channel' devices now being parsed.
* Fixed a bug in several Storage->X calls where the test to see if a 'target' was the local host or not wasn't smart enough.
* Purged 'ocf:alteeve:server' to prepare for the rewrite where a lot of the logic is being moved into the Anvil::Tools modules as their functionality will be needed elsewhere anyway.

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-07-30 02:10:04 -04:00
Digimer
312c949648 * Actually added the new Anvil::Tools::DRBD module, as well as a new ::Server module that will handle anything related to the virtual servers.
* Started a re-write of the ocf:alteeve:server resource agent. It has bugs and is missing logic, and trying to clean it up given most of the bugs come from trying to clean up the original agent doesn't make sense. Moving much of the logic into module methods as the functions will be needed elsewhere anyway.

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-07-25 01:12:25 -04:00
Digimer
0f873c45b5 * Created the new Anvil::Tools::DRBD module to hold all DRBD related stuff. Started working on ->get_status, still very much a work in progress.
* Started working on ocf:alteeve:server to make it smarter (and more patient) when bringing a DRBD resource up so we don't get false failures when we hit a race.

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-07-23 02:37:14 -04:00
Digimer
f294d48777 * Updated notes with working pcs config example (enable, disable and migrate all work!)
* Minor fix to ocf/alteeve/server to not print anything except the XML meta information when requested

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-07-20 01:36:02 -04:00
Digimer
a5761df894 * Got migration (push and pull) working.
Signed-off-by: Digimer <digimer@alteeve.ca>
2019-07-19 01:05:30 -04:00
Digimer
0dc2e5d2e9 * Fixed up an issue with down'ing resources under a server after it's shut down.
Signed-off-by: Digimer <digimer@alteeve.ca>
2019-07-18 03:18:38 -04:00
Digimer
3cf1ad2ff8 * Fairly heavy rework of ocf:alteeve:server to update how it handles storage during server start. It now properly handles the new "one resource, multiple volumes" and "start resource, not daemon" approach.
Signed-off-by: Digimer <digimer@alteeve.ca>
2019-07-18 02:53:05 -04:00
Digimer
7e4a170382 * Fixed a bug where Tools.pm->_anvil_version() and Get->host_uuid() were storing values in the wrong $anvil hash.
* Fixed a bug where Get->host_uuid() wasn't reading from the host.uuid file.
* Updated Remote->call() to record a target's fingerprint when needed.
* The ocf:alteeve:server resource agent now properly stops a server and the corresponding DRBD resource.

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-07-17 02:41:05 -04:00
Digimer
b56fbf923c * Finished the initial conversion of ocf:alteeve:server to use Anvil::Tools.
Signed-off-by: Digimer <digimer@alteeve.ca>
2019-07-16 00:14:13 -04:00
Digimer
c0220c9635 * Continued work on migrating ocf:alteeve:server to use Anvil::Tools.
Signed-off-by: Digimer <digimer@alteeve.ca>
2019-07-14 01:01:50 -04:00
Digimer
9c0f6b8f79 * Added automatic 'echo return_code:$?' to System->call and Remote->call which is parsed out and returned automatically on all calls.
* Started porting ocf:alteeve:server to use the Anvil::Tools module and updating it for RHEL 8.

Signed-off-by: Digimer <digimer@alteeve.ca>
2019-07-13 04:16:03 -04:00
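The 'echo return_code:$?' approach from the commit above can be sketched like this. It is an illustration in Python under stated assumptions, not the actual System->call or Remote->call code: the `call` function name, the use of /bin/sh and the fallback behaviour are all assumptions made for the example.

```python
import subprocess


def call(shell_call):
    """Run a shell command; return (output, return_code) parsed from the output."""
    # Append the marker so the exit status travels with the output stream.
    wrapped = f"{shell_call}; echo return_code:$?"
    result = subprocess.run(
        ["/bin/sh", "-c", wrapped], capture_output=True, text=True
    )

    # Split on the last marker, so output containing the same text is kept.
    output, marker, code = result.stdout.rpartition("return_code:")
    if not marker:
        # The marker never printed (e.g. the command called 'exit' itself);
        # fall back to the shell's own exit status.
        return result.stdout, result.returncode
    return output, int(code.strip())


if __name__ == "__main__":
    output, return_code = call("echo hello; false")
    print(repr(output), return_code)
```

Carrying the status in the output is mainly useful when the transport (an SSH channel, for instance) does not hand back the remote exit status in a convenient way, which is presumably why the same marker is used for both local and remote calls.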
Digimer
eafd4fd3f7 * Fixed a couple bugs to get System->change_shell_user_password() working.
* Made logging between journald and a traditional file configurable via 'sys::log_file'. Also made the file handle unbuffered when logging to a file.
* Fixed a bug with loading the anvil.conf config file in a few locations.
* Created System->stty_echo() to handle enabling/disabling shell echo, and added restoring the echo to Tools->catch_sig.

Signed-off-by: Digimer <digimer@alteeve.ca>
2018-04-26 12:41:03 -04:00
Digimer
c21b326f1a * Changed all methods to take a 'debug' argument for setting log level on calls.
* Fixed a bug with resync, but others remain as resync is incomplete (at least for network_interfaces).
* Currently, tools/anvil-update-states is broken while working on the above issue.
* Reworked the jobs table and removed the units/anvil-jobs.service unit. Jobs will be invoked and backgrounded in all calls.
* Started adding missing hidden form fields.
* Updated the 'server' OCF resource agent version and metadata.

Signed-off-by: Digimer <digimer@alteeve.ca>
2018-03-07 03:11:55 -05:00
Digimer
92a1e29082 * The new resource agent works!
** Fixed a bug so that when the agent is invoked on the target node after a migration, it just does a quick check to see if the server is running and exits 0 if so.

Signed-off-by: Digimer <digimer@alteeve.ca>
2018-02-21 23:24:53 -05:00
Digimer
3b0659c5bf * Looks like the RA is done, though more testing is needed.
Signed-off-by: Digimer <digimer@alteeve.ca>
2018-02-21 13:54:49 -05:00
Digimer
f52d8196f6 * Migration is now sort of working. There is still an issue to sort out with enabling drbd dual-primary, but servers can move in some cases now.
* Changed fence_pacemaker to exit with '1' on generic error as per LINBIT's comments.

Signed-off-by: Digimer <digimer@alteeve.ca>
2018-02-21 02:06:00 -05:00
Digimer
4e5dc9f1c2 * Started work on migration handling.
* Fixed a bug where a stop operation on a server already in shutdown would exit immediately instead of waiting for the server to actually shut off.

Signed-off-by: Digimer <digimer@alteeve.ca>
2018-02-20 02:14:59 -05:00
Digimer
f2079da183 * The agent can now boot and stop a server. Migration is up next.
Signed-off-by: Digimer <digimer@alteeve.ca>
2018-02-19 02:34:55 -05:00
Digimer
e755a708dd * The resource agent now properly checks (and starts, if needed) the DRBD resources under the server being asked to start. It probably needs optimization still, but the logic is in place.
Signed-off-by: Digimer <digimer@alteeve.ca>
2018-02-18 03:06:09 -05:00
Digimer
36c0d3b921 * Started parsing drbdsetup JSON data.
* Fixed a bug with how we stored data from the drbdadm dump-xml data.

Signed-off-by: Digimer <digimer@alteeve.ca>
2018-02-16 18:47:39 -05:00
Digimer
bb4b5b1778 * Got the RA to the point where it identifies the local DRBD devices and backing LVs.
Signed-off-by: Digimer <digimer@alteeve.ca>
2018-02-16 02:09:28 -05:00
Digimer
fe65718811 * Finished validating optical media.
* Added initial parsing of 'drbdadm dump-xml'.

Signed-off-by: Digimer <digimer@alteeve.ca>
2018-02-15 03:35:31 -05:00
Digimer
dd0bdec839 * Broke up the validation steps into their own functions.
* Finished bridge validation.

Signed-off-by: Digimer <digimer@alteeve.ca>
2018-02-15 00:49:19 -05:00
Digimer
8aa2d28103 * Got the server start function to the point where all data that we need to sanity check is gathered. It already verifies that the emulator exists, that there is enough RAM and that the server's name matches the name we expect.
Signed-off-by: Digimer <digimer@alteeve.ca>
2018-02-14 02:24:15 -05:00
Digimer
81534cddbc * Moved ocf:alteeve:server along... It now can properly check and report the server's status on a monitor call.
Signed-off-by: Digimer <digimer@alteeve.ca>
2018-02-13 23:59:07 -05:00
Digimer
4dcaa524c5 * Made Get->switches take a bare word as a valid switch.
* Framed up the new ocf:alteeve:server agent. It only handles metadata at this point, but it's a start.

Signed-off-by: Digimer <digimer@alteeve.ca>
2018-02-12 21:58:37 -05:00
Digimer
14763136f2 * Added what will (might?) become the resource agent for managing Anvil!-hosted servers on m3.
Signed-off-by: Digimer <digimer@alteeve.ca>
2018-02-12 01:10:48 -05:00