* Added logic to tools/anvil-manage-install-target to only update the local RPM repo on a periodic basis.

* Updated tools/anvil-configure-striker to be consistent about naming firewall zones using upper-case.

Signed-off-by: Digimer <digimer@alteeve.ca>
main
Digimer 6 years ago
parent 7f5ac528ab
commit 8468215831
  1. anvil.conf (10 lines changed)
  2. notes (126 lines changed)
  3. share/words.xml (3 lines changed)
  4. tools/anvil-configure-striker (8 lines changed)
  5. tools/anvil-manage-firewall (14 lines changed)
  6. tools/anvil-manage-install-target (156 lines changed)

@ -153,3 +153,13 @@
# you might want to change this is if you spend time working directly on the various Anvil! cluster machines.
#kickstart::timezone = Etc/GMT --isUtc
# If this is set to '0', the packages used to build machines via the Install Target feature will not
# auto-update. The default is '1' (auto-update enabled).
install-manifest::refresh-packages = 1
# This controls how often the local RPM repository is checked for updates. The default is '86400' seconds
# (one day). If anything, you might want to increase this. Common values;
# 86400 = Once per day
# 604800 = Once per week
# 2419200 = Once per month (well, 4 weeks)
install-manifest::refresh-period = 86400
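
Note on 'install-manifest::refresh-period': the tool strips commas from the value and falls back to '86400' when anything non-numeric remains. A minimal sketch of that normalisation follows. The helper name 'normalise_refresh_period' is hypothetical, and treating an empty value as invalid is an assumption that the committed code does not make.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the Anvil! tools): turn a raw
# 'install-manifest::refresh-period' value into a whole number of seconds.
sub normalise_refresh_period
{
	my ($value) = @_;
	my $default = 86400;    # One day, the default documented in anvil.conf.

	# Unset means "use the default".
	return($default) if not defined $value;

	# Allow values written with thousands separators, like '604,800'.
	$value =~ s/,//g;

	# Assumption: an empty value is treated like a non-numeric one.
	return($default) if (($value eq "") or ($value =~ /\D/));
	return($value);
}

print normalise_refresh_period("604,800")."\n";
print normalise_refresh_period("weekly")."\n";
print normalise_refresh_period(undef)."\n";
```

With this shape, a typo in anvil.conf degrades to the daily default rather than disabling or breaking the refresh check.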

126
notes

@ -1,28 +1,34 @@
NEXT; - Copy os/isolinux/* stuff from syslinux-nonlinux
NEXT; -
DOCS; -
- Explanation of 'comps.xml' (package grouping) - https://pagure.io/fedora-comps
- Firewalld
- https://www.digitalocean.com/community/tutorials/how-to-set-up-a-firewall-using-firewalld-on-centos-7
- PXE;
- https://docs.fedoraproject.org/en-US/fedora/f28/install-guide/advanced/Network_based_Installations/
- https://docs.fedoraproject.org/en-US/Fedora/26/html/Installation_Guide/chap-pxe-server-setup.html
- UEFI PXE notes - https://www.syslinux.org/wiki/index.php?title=PXELINUX#UEFI
- How to write a NetworkManager dispatcher script to apply ethtool commands? - https://access.redhat.com/solutions/2841131
- Setup nodes to log to striker? - https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/networking_guide/sec-configuring_netconsole
- Pacemaker can be monitored via SNMP - https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/high_availability_add-on_reference/s1-snmpandpacemaker-HAAR
- corosync.conf - https://access.redhat.com/articles/3185291
Explanation of 'comps.xml' (package grouping) - https://pagure.io/fedora-comps
dnf -y install perl-Net-Netmask perl-Text-Diff dnf-utils hdparm lsscsi createrepo dhcp kernel-core syslinux tftp-server
Firewall config stuff.
====
---- Files
[root@f28-striker01 zones]# cat /etc/firewalld/zones/BCN1.xml
### Create our zones (skip SN on striker)
# /etc/firewalld/zones/BCN1.xml
<?xml version="1.0" encoding="utf-8"?>
<zone>
<short>BCN1</short>
<description>Back-Channel Network #1 - Used for all inter-machine communication in the Anvil!, as well as communication for foundation pack devices. Should be VLAN-isolated from the IFN and, thus, trusted.</description>
<service name="ssh"/>
<service name="dhcpv6-client"/>
<service name="cockpit"/>
<service name="postgresql"/>
<service name="http"/>
<service name="https"/>
<port port="80" protocol="tcp"/>
<port port="443" protocol="tcp"/>
</zone>
[root@f28-striker01 zones]# cat /etc/firewalld/zones/IFN1.xml
# /etc/firewalld/zones/IFN1.xml
<?xml version="1.0" encoding="utf-8"?>
<zone>
<short>IFN1</short>
@ -31,66 +37,80 @@ Firewall config stuff.
<service name="postgresql"/>
<service name="http"/>
<service name="https"/>
<port port="80" protocol="tcp"/>
<port port="443" protocol="tcp"/>
</zone>
[root@f28-striker01 zones]# cat /etc/firewalld/zones/SN1.xml
### NOTE: Only on nodes and DR, not on Striker
# /etc/firewalld/zones/SN1.xml
<?xml version="1.0" encoding="utf-8"?>
<zone>
<short>SN1</short>
<description>Storage Network #1 - Used for DRBD communication between nodes and DR hosts. Should be VLAN-isolated from the IFN and, thus, trusted.</description>
<service name="ssh"/>
</zone>
----
Reload;
firewall-cmd --reload
### These are permanent
# Put the interfaces under the appropriate zones.
firewall-cmd --zone=IFN1 --change-interface=ifn1_bond1
firewall-cmd --zone=BCN1 --change-interface=bcn1_bond1
# Set the IFN as the default zone (as that is what will most likely be edited by a user)
firewall-cmd --set-default-zone=IFN1
### These are temporary unless --permanent is used
# Allow routing/masq'ing through the IFN
# Allow routing/masq'ing through the IFN1 (provide net access to the BCN)
firewall-cmd --zone=IFN1 --add-masquerade
# Check
firewall-cmd --zone=IFN1 --query-masquerade
yes
[yes|no]
# Disable
# NOTE: Doesn't break existing connections
firewall-cmd --zone=IFN1 --remove-masquerade
- Notes;
### Example shell calls
# Firewall
firewall-cmd --permanent --add-service=http
firewall-cmd --permanent --add-service=postgresql
firewall-cmd --reload
firewall-cmd --state [running (rc: 0),not running (rc:252)]
Ports we care about
Proto  Number       Used by   Nets  Description
TCP    2224         pcsd      bcn   It is crucial to open port 2224 in such a way that pcs from any node can talk to all nodes in the cluster, including itself.
UDP    5404         corosync  bcn   Required on corosync nodes if corosync is configured for multicast UDP
UDP    5405         corosync  bcn   Required on all corosync nodes (needed by corosync)
TCP    7788+        drbd      sn    1 port per resource
TCP    49152-49215  virsh     bcn   live migration - migration_port_min and migration_port_max attributes in the /etc/libvirt/qemu.conf
NOTE: DHCP listens to raw sockets and ignores firewalld rules. We need to stop dhcpd directly - https://kb.isc.org/docs/aa-00378
* After all changes;
firewall-cmd --zone=public --add-port=49152-49215/tcp --permanent
firewall-cmd --reload
- Paths
If we want to create services or helpers later, look under - /usr/lib/firewalld/
Core firewalld configs, including defaults zones, etc - /etc/firewalld/
- https://www.digitalocean.com/community/tutorials/how-to-set-up-a-firewall-using-firewalld-on-centos-7
* Zones are meant to deal with dynamic environments and aren't that useful in mostly static server environments
** Seem to be pre-configured sets of what is/isn't allowed. 'public' for IFN, 'work' for SN/BCN? 'external/internal' are for routing
** Configured in /etc/firewalld/zones/<zone>.xml - Create 'BCN', 'SN' and 'IFN'?
* Use 'firewall-cmd' WITHOUT '--permanent' for things like enabling the VNC port for a server. Use '--permanent' for everything else.
====
Striker as PXE server
====
https://docs.fedoraproject.org/en-US/fedora/f28/install-guide/advanced/Network_based_Installations/
UEFI PXE notes - https://www.syslinux.org/wiki/index.php?title=PXELINUX#UEFI
dnf install dhcp tftp-server syslinux kernel-core
# NOTE: We DON'T enable DHCP. We'll turn it on as needed.
# NOTE: Apache needs to show dot-files! (anaconda looks for .treeinfo)
@ -151,23 +171,6 @@ FXSAVE_STATE - 000000007FF7E310
====
Firewalld Router config
====
# Allow routing/masq'ing through the IFN
firewall-cmd --zone=IFN --add-masquerade
success
# Check
firewall-cmd --zone=IFN --query-masquerade
yes
# Disable
# NOTE: Doesn't break existing connections
firewall-cmd --zone=IFN --remove-masquerade
success
====
DB stuff;
Dump;
@ -181,17 +184,9 @@ su - postgres -c "dropdb anvil" && su - postgres -c "createdb --owner admin anvi
su - postgres -c "psql anvil"
All systems have a UUID, even VMs. Use that for system UUID in the future.
https://access.redhat.com/solutions/2841131 - How to write a NetworkManager dispatcher script to apply ethtool commands?
Setup nodes to log to striker?
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/networking_guide/sec-configuring_netconsole
* Pacemaker can be monitored via SNMP; https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/high_availability_add-on_reference/s1-snmpandpacemaker-HAAR
* corosync.conf; https://access.redhat.com/articles/3185291
Changes made using tools such as nmcli do not require a reload but do require the associated interface to be put down and then up again. That can be done by using commands in the following format:
* nmcli dev disconnect interface-name
@ -293,11 +288,6 @@ systemctl start httpd.service
# Post install
systemctl daemon-reload
# Firewall
firewall-cmd --permanent --add-service=http
firewall-cmd --permanent --add-service=postgresql
firewall-cmd --reload
# SELinux
restorecon -rv /var/www
@ -374,20 +364,6 @@ pcs resource clone drbd clone-max=2 notify="false"
stonith_admin --fence m3-a01n02 --verbose; crm_error $?
==== (configured via https)
Ports we care about
Proto  Number       Used by   Nets  Description
TCP    2224         pcsd      bcn   It is crucial to open port 2224 in such a way that pcs from any node can talk to all nodes in the cluster, including itself.
UDP    5404         corosync  bcn   Required on corosync nodes if corosync is configured for multicast UDP
UDP    5405         corosync  bcn   Required on all corosync nodes (needed by corosync)
TCP    7788+        drbd      sn    1 port per resource
TCP    49152-49215  virsh     bcn   live migration - migration_port_min and migration_port_max attributes in the /etc/libvirt/qemu.conf
* After all changes;
firewall-cmd --zone=public --add-port=49152-49215/tcp --permanent
firewall-cmd --reload
==== DRBD notes

@ -497,6 +497,7 @@ The body of the file: [#!variable!file!#] does not match the new body. The file
<key name="log_0232">The file: [#!variable!file!#] will now be updated.</key>
<key name="log_0233">There was a problem updating file: [#!variable!file!#], expected the write to return '0' but got: [#!variable!return!#]. Please check the logs for details.</key>
<key name="log_0234">Failed to backup the file: [#!variable!source!#] to: [#!variable!destination!#]. Details may be found in the logs above.</key>
<key name="log_0235">Not updating the local repository on this run. Use '#!variable!program!# --refresh' to force a refresh of the local repository.</key>
<!-- Test words. Do NOT change unless you update 't/Words.t' or tests will needlessly fail. -->
<key name="t_0000">Test</key>
@ -641,6 +642,7 @@ Here we will inject 't_0006', which injects 't_0001' which has a variable: [#!st
<key name="striker_0103">The peer will be added to the local configuration shortly, and we will be added to their configuration as well. Expect slight performance impacts if there is a lot of data to synchronize.</key>
<key name="striker_0104">The peer will be removed from the local configuration shortly. Any existing data will remain but no further data will be shared.</key>
<key name="striker_0105"><![CDATA[Are you sure that you want to remove the peer: [<span class="code">#!variable!peer!#</span>]? If so, no further data from this system will be written to the peer. Do note that any existing data will remain and will be reused if you add the peer back again.]]></key>
<key name="striker_0106">Indicates the last time the host system's RPM repository was refreshed. If the last refresh failed, this will be incremented by one day before another attempt is made (regardless of the 'install-manifest::refresh-period' setting).</key>
<!-- Strings used by jobs -->
<key name="job_0001">Configure Network</key>
@ -724,6 +726,7 @@ The update appears to have not completed successfully. The output was:
#!variable!error!#
====
</key>
<key name="error_0046">This Striker system is not configured yet. This tool will not be available until it is.</key>
<!-- These are units, words and so on used when displaying information. -->
<key name="unit_0001">Yes</key>

@ -393,7 +393,7 @@ sub reconfigure_network
}
}
$bond_config .= "DEFROUTE=\"".$say_defroute."\"\n";
$bond_config .= "ZONE=\"".$say_interface."\"";
$bond_config .= "ZONE=\"".uc($say_interface)."\"";
my $link1_config = "# $say_network - Link 1\n";
$link1_config .= "HWADDR=\"".uc($link1_mac)."\"\n";
@ -409,7 +409,7 @@ sub reconfigure_network
$link1_config .= "NM_CONTROLLED=\"yes\"\n";
$link1_config .= "SLAVE=\"yes\"\n";
$link1_config .= "MASTER=\"".$say_interface."_bond1\"\n";
$link1_config .= "ZONE=\"".$say_interface."\"";
$link1_config .= "ZONE=\"".uc($say_interface)."\"";
my $link2_config = "# $say_network - Link 2\n";
$link2_config .= "HWADDR=\"".uc($link2_mac)."\"\n";
@ -425,7 +425,7 @@ sub reconfigure_network
$link2_config .= "NM_CONTROLLED=\"yes\"\n";
$link2_config .= "SLAVE=\"yes\"\n";
$link2_config .= "MASTER=\"".$say_interface."_bond1\"\n";
$link2_config .= "ZONE=\"".$say_interface."\"";
$link2_config .= "ZONE=\"".uc($say_interface)."\"";
$anvil->Log->variables({source => $THIS_FILE, line => __LINE__, level => 2, list => {
bond_config => $bond_config,
@ -598,7 +598,7 @@ sub reconfigure_network
$link1_config .= "USERCTL=\"no\"\n";
$link1_config .= "MTU=\"1500\"\n"; # TODO: Make the MTU user-adjustable
$link1_config .= "NM_CONTROLLED=\"yes\"\n";
$link1_config .= "ZONE=\"".$say_interface."\"";
$link1_config .= "ZONE=\"".uc($say_interface)."\"";
# Backup the existing link1 file, if it exists.
if (-e $old_link1_file)
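
Note on the 'uc($say_interface)' change: the zone files in the notes of this commit are named 'BCN1.xml', 'IFN1.xml' and 'SN1.xml', while interfaces are named like 'bcn1_bond1'. A small sketch of the mapping follows, assuming firewalld matches zone names case-sensitively. The helper name 'zone_line' is hypothetical and does not exist in the tool.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build the ifcfg 'ZONE' line for a network prefix such
# as 'bcn1', using the upper-case zone name ('BCN1') that the zone files use.
sub zone_line
{
	my ($say_interface) = @_;
	return("ZONE=\"".uc($say_interface)."\"");
}

# The interface names stay lower-case; only the zone name is upper-cased.
foreach my $network ("bcn1", "ifn1", "sn1")
{
	print $network."_bond1 -> ".zone_line($network)."\n";
}
```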

@ -1,8 +1,20 @@
#!/usr/bin/perl
#
# This manages the firewall on the host.
# This does two main tasks;
# - On Striker dashboards, it enables and disables the "Install Target" feature.
# - On nodes and DR hosts, it opens and closes VNC ports as servers are booted or stopped.
#
# Examples;
# - Call without any arguments and it will check for running servers and, for each it will open the VNC port
# on both the BCN and IFN. Before doing so, it will check 'servers::<server_name>::vnc::<network>'. If that
# is found and set to '0', the port will not be opened on the network. For example, to prevent a server
# named 'foo' from being accessible over 'ifn1', set 'servers::foo::vnc::ifn1 = 0'. The network name comes
# from the first part of the interface name with the IP address. So for 'ifn1_bond1', the network is
# 'ifn1'.
# - Call it with '--enable-install-target' or '--disable-install-target' to enable or disable the "Install
# Target" feature. This also enables or disables the dhcpd daemon. This can also be called as an Anvil! job
# to enable or disable the Install Target by setting the 'job_data' to 'install-target::enable' or
# 'install-target::disable'.
#
#
# Exit codes;
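
Note on 'servers::<server_name>::vnc::<network>': the header above says the network name is the first part of the interface name, and that a value of '0' keeps the VNC port closed on that network. A minimal sketch of that lookup follows. The '$conf' hash and both helper names are hypothetical stand-ins, not the tool's real data structures.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for parsed anvil.conf data, equivalent to setting
# 'servers::foo::vnc::ifn1 = 0'.
my $conf = { servers => { foo => { vnc => { ifn1 => 0 } } } };

# The network is the part of the interface name before the first underscore,
# so 'ifn1_bond1' is on the 'ifn1' network.
sub network_from_interface
{
	my ($interface) = @_;
	my ($network)   = split(/_/, $interface, 2);
	return($network);
}

# Returns '1' if the VNC port may be opened, '0' if it was explicitly blocked.
sub vnc_allowed
{
	my ($conf, $server, $interface) = @_;
	my $network = network_from_interface($interface);
	my $setting = $conf->{servers}{$server}{vnc}{$network};

	# Only an explicit '0' blocks the port; an unset key means "open it".
	return(((defined $setting) and ($setting eq "0")) ? 0 : 1);
}

foreach my $interface ("bcn1_bond1", "ifn1_bond1")
{
	print "foo on ".$interface.": ".vnc_allowed($conf, "foo", $interface)."\n";
}
```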

@ -8,7 +8,9 @@
# Alternatively, /mnt/files/ will be checked for a
#
# Examples;
#
# - Call with '--enable' to enable the install target.
# - Call with '--disable' to disable the install target.
# - Call with '--refresh' to check for updates from upstream repositories.
#
# Exit codes;
# 0 = Normal exit.
@ -19,6 +21,8 @@
# 5 = Not running as the 'root' user.
# 6 = The 'comps.xml' file provided by anvil-striker-extra was not found.
# 7 = A package failed to download to our repo
# 8 = No database connection available.
# 9 = The system isn't configured yet.
#
# TODO:
# - Support building the install target by mounting the ISO and checking /mnt/shared/incoming for needed
@ -59,6 +63,28 @@ if (($< != 0) && ($> != 0))
$anvil->nice_exit({exit_code => 5});
}
# Connect to the database(s).
$anvil->Database->connect;
$anvil->Log->entry({source => $THIS_FILE, line => __LINE__, level => 3, secure => 0, key => "log_0132"});
if (not $anvil->data->{sys}{database}{connections})
{
# No databases, exit.
print $anvil->Words->string({key => "error_0003"})."\n";
$anvil->Log->entry({source => $THIS_FILE, line => __LINE__, level => 0, secure => 0, key => "error_0003"});
$anvil->nice_exit({exit_code => 8});
}
# Exit if we're not configured yet
my $configured = $anvil->System->check_if_configured;
$anvil->Log->variables({source => $THIS_FILE, line => __LINE__, level => 3, list => { configured => $configured }});
if (not $configured)
{
print $anvil->Words->string({key => "error_0046"})."\n";
$anvil->Log->entry({source => $THIS_FILE, line => __LINE__, level => 0, secure => 0, key => "error_0046"});
$anvil->nice_exit({exit_code => 9});
}
# Figure out what this machine is
my ($os_type, $os_arch) = $anvil->System->get_os_type();
$anvil->data->{host_os}{os_type} = $os_type;
$anvil->data->{host_os}{os_arch} = $os_arch;
@ -94,6 +120,101 @@ $anvil->nice_exit({exit_code => 0});
# Private functions. #
#############################################################################################################
sub check_refresh
{
my ($anvil) = @_;
# Setup the packages directory
$anvil->data->{path}{directories}{packages} = "/var/www/html/".$anvil->data->{host_os}{os_type}."/".$anvil->data->{host_os}{os_arch}."/os/Packages";
$anvil->Log->variables({source => $THIS_FILE, line => __LINE__, level => 2, list => { "path::directories::packages" => $anvil->data->{path}{directories}{packages} }});
# Default to 'no'
$anvil->data->{switches}{refresh} = 0 if not defined $anvil->data->{switches}{refresh};
$anvil->Log->variables({source => $THIS_FILE, line => __LINE__, level => 2, list => { "switches::refresh" => $anvil->data->{switches}{refresh} }});
# If it's set, the user asked for it.
if ($anvil->data->{switches}{refresh})
{
return(0);
}
# If it's been disabled in anvil.conf, exit.
$anvil->data->{'install-manifest'}{'refresh-packages'} = 1 if not defined $anvil->data->{'install-manifest'}{'refresh-packages'};
$anvil->Log->variables({source => $THIS_FILE, line => __LINE__, level => 2, list => { "install-manifest::refresh-packages" => $anvil->data->{'install-manifest'}{'refresh-packages'} }});
if (not $anvil->data->{'install-manifest'}{'refresh-packages'})
{
# We're out.
return(0);
}
# If the Packages directory doesn't exist, we'll need to refresh.
if (not -e $anvil->data->{path}{directories}{packages})
{
# Source isn't configured, set it up
$anvil->data->{switches}{refresh} = 1;
$anvil->Log->variables({source => $THIS_FILE, line => __LINE__, level => 2, list => { "switches::refresh" => $anvil->data->{switches}{refresh} }});
return(0);
}
# See when the last time we refreshed the local package cache
my ($unixtime, $variable_uuid, $modified_date) = $anvil->Database->read_variable({
variable_name => "install-target::refreshed",
variable_source_uuid => $anvil->Get->host_uuid,
variable_source_table => "hosts",
});
$anvil->Log->variables({source => $THIS_FILE, line => __LINE__, level => 2, list => {
unixtime => $unixtime,
variable_uuid => $variable_uuid,
modified_date => $modified_date,
}});
### TODO: Allow the user to set a "refresh time" that will wait until the local time is after a
### certain time before refreshing. Also, allow the user to disable auto-refresh entirely.
# If the database variable 'install-target::refreshed' is not set, or is older than
# 'install-manifest::refresh-period' seconds, refresh.
$anvil->data->{'install-manifest'}{'refresh-period'} = 86400 if not defined $anvil->data->{'install-manifest'}{'refresh-period'};
$anvil->data->{'install-manifest'}{'refresh-period'} =~ s/,//g;
$anvil->Log->variables({source => $THIS_FILE, line => __LINE__, level => 2, list => {
'install-manifest::refresh-period' => $anvil->data->{'install-manifest'}{'refresh-period'},
}});
if ($anvil->data->{'install-manifest'}{'refresh-period'} =~ /\D/)
{
$anvil->data->{'install-manifest'}{'refresh-period'} = 86400;
$anvil->Log->variables({source => $THIS_FILE, line => __LINE__, level => 2, list => {
'install-manifest::refresh-period' => $anvil->data->{'install-manifest'}{'refresh-period'},
}});
}
# Work out the retry time before deciding, so that it is set even when we return early below. If the
# refresh fails, the stored time becomes the last refresh time plus 24 hours, so that we can try
# again. We intentionally ignore 'install-manifest::refresh-period', that's only used when the
# refresh succeeded.
if ($unixtime =~ /^\d+$/)
{
$anvil->data->{sys}{retry_time} = $unixtime + 86400;
$anvil->Log->variables({source => $THIS_FILE, line => __LINE__, level => 2, list => { "sys::retry_time" => $anvil->data->{sys}{retry_time} }});
}
else
{
# No previous time, so start with the current time
$anvil->data->{sys}{retry_time} = time;
$anvil->Log->variables({source => $THIS_FILE, line => __LINE__, level => 2, list => { "sys::retry_time" => $anvil->data->{sys}{retry_time} }});
}
my $time_now = time;
my $next_scan = $unixtime + $anvil->data->{'install-manifest'}{'refresh-period'};
$anvil->Log->variables({source => $THIS_FILE, line => __LINE__, level => 2, list => {
's1:time_now' => $time_now,
's2:next_scan' => $next_scan,
}});
# Refresh if there is no stored time, the stored time is not a number, or the period has elapsed.
if ((not $variable_uuid) or ($unixtime !~ /^\d+$/) or ($time_now > $next_scan))
{
$anvil->data->{switches}{refresh} = 1;
$anvil->Log->variables({source => $THIS_FILE, line => __LINE__, level => 2, list => { "switches::refresh" => $anvil->data->{switches}{refresh} }});
return(0);
}
return(0);
}
}
sub setup_boot_environment
{
my ($anvil) = @_;
@ -538,10 +659,16 @@ sub update_install_source
{
my ($anvil) = @_;
### TODO: Make this only run once every 24 hours (track in the database;
### 'system::install-target::source-updated').
# Should we refresh the local repo?
check_refresh($anvil);
if (not $anvil->data->{switches}{refresh})
{
$anvil->Log->entry({source => $THIS_FILE, line => __LINE__, level => 1, secure => 0, key => "log_0235", variables => { program => $THIS_FILE }});
return(0);
}
# Clear the dnf cache
my $success = 1;
my $output = $anvil->System->call({debug => 2, shell_call => $anvil->data->{path}{exe}{dnf}." clean expire-cache" });
$anvil->Log->variables({source => $THIS_FILE, line => __LINE__, level => 2, list => { output => $output }});
print $anvil->Words->string({key => "message_0077", variables => {
@ -635,6 +762,17 @@ sub update_install_source
$anvil->Log->variables({source => $THIS_FILE, line => __LINE__, level => 2, list => { error_out => $error_out }});
if ($error_out)
{
# Bump the last successful time by 24 hours.
$anvil->Database->insert_or_update_variables({
variable_name => "install-target::refreshed",
variable_value => $anvil->data->{sys}{retry_time},
variable_default => "",
variable_description => "striker_0106",
variable_section => "system",
variable_source_uuid => $anvil->Get->host_uuid,
variable_source_table => "hosts",
});
# Something went wrong, exit.
print $anvil->Words->string({key => "error_0045", variables => { error => $error_out }})."\n";
$anvil->nice_exit({exit_code => 7});
@ -668,6 +806,18 @@ sub update_install_source
$output = $anvil->System->call({debug => 2, shell_call => $anvil->data->{path}{exe}{createrepo}." -g ".$comps_xml." ".$repo_path });
$anvil->Log->variables({source => $THIS_FILE, line => __LINE__, level => 2, list => { output => $output }});
# Update the refresh time to now.
$anvil->Database->insert_or_update_variables({
variable_name => "install-target::refreshed",
variable_value => time,
variable_default => "",
variable_description => "striker_0106",
variable_section => "system",
variable_source_uuid => $anvil->Get->host_uuid,
variable_source_table => "hosts",
});
return(0);
}
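
Note on the refresh decision: the checks in 'check_refresh' can be read as a single ordered rule. A forced '--refresh' wins, then the 'install-manifest::refresh-packages' switch, then the age of the last recorded refresh. The sketch below restates that rule as a side-effect-free function so the ordering is easy to test. The function name 'refresh_due' and its argument names are hypothetical, and the check for a missing Packages directory is left out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical, side-effect-free restatement of the refresh decision.
# Returns '1' if the local repository should be refreshed, '0' otherwise.
sub refresh_due
{
	my ($args) = @_;
	my $forced  = $args->{forced}  // 0;        # '--refresh' was passed.
	my $enabled = $args->{enabled} // 1;        # 'install-manifest::refresh-packages'
	my $period  = $args->{period}  // 86400;    # 'install-manifest::refresh-period'
	my $now     = $args->{now}     // time;
	my $last    = $args->{last_refresh};        # Stored 'install-target::refreshed' value.

	# An explicit request always wins.
	return(1) if $forced;

	# Auto-refresh has been disabled in the configuration.
	return(0) if not $enabled;

	# No usable record of a previous refresh, so do one now.
	return(1) if ((not defined $last) or ($last !~ /^\d+$/));

	# Otherwise, refresh only once the period has elapsed.
	return(($now > ($last + $period)) ? 1 : 0);
}

print "Refresh due: ".refresh_due({ last_refresh => time - 90000 })."\n";
print "Refresh due: ".refresh_due({ last_refresh => time - 60 })."\n";
```

Keeping the decision separate from the database read and the logging makes the three outcomes (forced, disabled, due by age) checkable without a running Striker.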
