Required configuration

The cluster configuration file can be generated by using the clusterctl generate cluster command. This command uses the template file and replaces the values surrounded by ${} with the corresponding environment variables. You have to set all required environment variables in advance. The following sections explain in more detail what should be configured.

Note: You can also use the template file directly by manually replacing its values.

Note: By default the command creates a highly available control plane with 3 control plane nodes. If you wish to create a single control plane without a load balancer, use the without-lb flavor. For example,

# Using 'without-lb' flavor
clusterctl generate cluster capi-quickstart \
  --flavor without-lb \
  --kubernetes-version v1.24.2 \
  --control-plane-machine-count=1 \
  --worker-machine-count=1 \
  > capi-quickstart.yaml

OpenStack version

We currently require at least OpenStack Pike.

Operating system image

cloud-init based images

We currently depend on an up-to-date version of cloud-init; otherwise the operating system choice is yours. The kubeadm bootstrap provider we're using also depends on some pre-installed software like a container runtime, kubelet, kubeadm, etc. For examples of how to build such an image, take a look at image-builder (openstack).

The image can be referenced by exposing it as an environment variable OPENSTACK_IMAGE_NAME.
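
For example, assuming you have already uploaded an image built with image-builder (the image name below is illustrative):

# Example only: use the name of the image you uploaded to your OpenStack project
export OPENSTACK_IMAGE_NAME=ubuntu-2204-kube-v1.24.2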

Ignition based images

Some operating systems, such as Fedora CoreOS or Flatcar, do not use cloud-init but Ignition to provision the instance. You need to enable the Ignition experimental feature:

export EXP_KUBEADM_BOOTSTRAP_FORMAT_IGNITION=true

Flatcar comes in two flavor variants:

  • flatcar

    This variant relies on a Flatcar image built using the image-builder project: the Kubernetes version is bound to the Flatcar version and a rebuild of the image is required for each Kubernetes or Flatcar upgrade.

    To build and use a Flatcar image:

    • Build the image with the image-builder: make OEM_ID=openstack build-qemu-flatcar
    • Upload the image
    • Export the name of the uploaded image: export OPENSTACK_FLATCAR_IMAGE_NAME=flatcar-stable-3374.2.5-kube-v1.25.6
    • When generating the cluster configuration, use the following Cluster API flavor: --flavor flatcar (NOTE: Don’t forget to refer to the external-cloud-provider section)
  • flatcar-sysext

    This variant relies on a plain Flatcar image and leverages the systemd-sysext feature to install and update Kubernetes components: the Kubernetes version is not bound to the Flatcar version (i.e. Flatcar can be upgraded independently from Kubernetes and vice versa).

    The template comes with a systemd-sysupdate configuration file that will download each new patch version of Kubernetes (i.e. if you start with Kubernetes 1.x.y, systemd-sysupdate will automatically pull 1.x.y+1 but not 1.x+1.y). Please note that this behavior is disabled by default. To enable the Kubernetes auto-update you can:

    • Update the template to enable the systemd-sysupdate.timer
    • Or run the following command on the nodes: sudo systemctl enable --now systemd-sysupdate.timer

    When a Kubernetes release reaches end-of-life, it will no longer receive updates. To switch to a new major version, run sudo rm /etc/sysupdate.kubernetes.d/kubernetes-*.conf and download the new update config into the folder with cd /etc/sysupdate.kubernetes.d && sudo wget https://github.com/flatcar/sysext-bakery/releases/download/latest/kubernetes-${KUBERNETES_VERSION%.*}.conf.

    To coordinate the node reboots, we recommend using Kured. Note that running kubeadm upgrade apply on the first controller node and kubeadm upgrade node on all other nodes is not automated (yet); see the docs.

    To use a Flatcar image:

    • Upload an image to OpenStack from the Flatcar release servers (e.g. for Stable, you might use this image: https://stable.release.flatcar-linux.net/amd64-usr/current/flatcar_production_openstack_image.img)
    • Export the name of the uploaded image: export FLATCAR_IMAGE_NAME=flatcar_production_openstack_image
    • When generating the cluster configuration, use the following Cluster API flavor: --flavor flatcar-sysext (NOTE: Don’t forget to refer to the external-cloud-provider section)

SSH key pair

An SSH key pair is required. You can create one using,

openstack keypair create [--public-key <file> | --private-key <file>] <name>

The key pair name must be exposed as an environment variable OPENSTACK_SSH_KEY_NAME.
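
For example (the key pair name below is illustrative):

# Example only: use the name of your own key pair
export OPENSTACK_SSH_KEY_NAME=capi-keypair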

In order to access cluster nodes via SSH, you must either access nodes through the bastion host or configure custom security groups with rules allowing ingress for port 22.

OpenStack credential

Generate credentials

The env.rc script sets the environment variables related to credentials. It is highly recommended to avoid using admin credentials.

source env.rc <path/to/clouds.yaml> <cloud>

The following variables are set.

Variable                            Meaning
OPENSTACK_CLOUD                     The cloud name, used as the second argument
OPENSTACK_CLOUD_YAML_B64            The secret used by Cluster API Provider OpenStack to access OpenStack
OPENSTACK_CLOUD_PROVIDER_CONF_B64   The content of cloud.conf, used by the OpenStack cloud provider
OPENSTACK_CLOUD_CACERT_B64          The content of your custom CA file, which can be specified in your clouds.yaml by ca-file; mandatory when the OpenStack endpoint is https

Note: Only the external cloud provider supports Application Credentials.

Note: you need to set the clusterctl.cluster.x-k8s.io/move label on the secret created from OPENSTACK_CLOUD_YAML_B64 in order to successfully move objects from the bootstrap cluster to the target cluster. See bug 626 for further information.
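
As a sketch, assuming the cloud-config secret follows the <cluster-name>-cloud-config naming used by the default templates (check the actual secret name in your namespace):

# Example only: replace with the actual name of the secret created from OPENSTACK_CLOUD_YAML_B64
kubectl label secret <cluster-name>-cloud-config clusterctl.cluster.x-k8s.io/move=""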

CA certificates

When using an https OpenStack endpoint, providing CA certificates is required unless verification is explicitly disabled. You can choose to provide your CA certificates per cluster or globally using a specific CAPO flag.

Per cluster

To use a per-cluster CA certificate, you can use the OPENSTACK_CLOUD_CACERT_B64 environment variable. The generator will set the cacert key to the variable's content in the cluster's cloud-config secret.
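
For example, a minimal sketch assuming your CA bundle lives at /path/to/ca.crt and GNU base64 is available:

# Example only: -w0 disables line wrapping (GNU coreutils base64)
export OPENSTACK_CLOUD_CACERT_B64=$(base64 -w0 /path/to/ca.crt)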

Global configuration

To use the same CA certificate for all clusters, you can use the --ca-certs flag. When reconciling a cluster, if no cacert is set in the cluster's cloud-config secret, CAPO will use the certificates provided with this flag.

For instance, to use the CA certificates shipped in CAPO's docker image:

kubectl patch deployment capo-controller-manager -n capo-system \
  --type='json' \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--ca-certs=/etc/ssl/certs/ca-certificates.crt"}]'

Availability zone

The availability zone names must be exposed as an environment variable OPENSTACK_FAILURE_DOMAIN.

By default, if no availability zone is given, all availability zones defined in OpenStack are candidates to provision from. If administrator credentials are used, the internal availability zone (an internal-only availability zone inside nova) will also be returned, which can cause problems; see PR 1165 for further information. We therefore highly recommend setting the availability zone explicitly.
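
For example (the zone name below is illustrative; replace it with a zone that exists in your cloud):

export OPENSTACK_FAILURE_DOMAIN=nova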

DNS server

The DNS servers must be exposed as an environment variable OPENSTACK_DNS_NAMESERVERS.
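
For example (the address below is illustrative):

# Example only: replace with a DNS server reachable from the cluster network
export OPENSTACK_DNS_NAMESERVERS=8.8.8.8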

Machine flavor

The flavors for control plane and worker node machines must be exposed as environment variables OPENSTACK_CONTROL_PLANE_MACHINE_FLAVOR and OPENSTACK_NODE_MACHINE_FLAVOR respectively.

The recommended minimum number of vCPUs is 2 for the control plane flavor and 1 for the worker node flavor.
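
For example (the flavor names below are illustrative; pick flavors that exist in your cloud and meet the vCPU recommendations above):

export OPENSTACK_CONTROL_PLANE_MACHINE_FLAVOR=m1.medium
export OPENSTACK_NODE_MACHINE_FLAVOR=m1.small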

CNI security group rules

Depending on the CNI that will be deployed on the cluster, you may need to add specific security group rules to the control plane and worker nodes. For example, if you are using Calico with BGP, you will need to add the following security group rules to the control plane and worker nodes:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackCluster
metadata:
  name: <cluster-name>
  namespace: <cluster-namespace>
spec:
   ...
   managedSecurityGroups: 
     allNodesSecurityGroupRules:
     - remoteManagedGroups:
       - controlplane
       - worker
       direction: ingress
       etherType: IPv4
       name: BGP (Calico)
       portRangeMin: 179
       portRangeMax: 179
       protocol: tcp
       description: "Allow BGP between control plane and workers"
     - remoteManagedGroups:
       - controlplane
       - worker
       direction: ingress
       etherType: IPv4
       name: IP-in-IP (Calico)
       protocol: 4
       description: "Allow IP-in-IP between control plane and workers"
     allowAllInClusterTraffic: false

Optional Configuration

Log level

When running CAPO with --v=6 the gophercloud client logs its requests to the OpenStack API. This can be helpful during debugging.
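
As a sketch, you can raise the verbosity of an already deployed CAPO controller in the same way as the --ca-certs patch shown above (assuming the default deployment name and namespace):

kubectl patch deployment capo-controller-manager -n capo-system \
  --type='json' \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--v=6"}]'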

External network

If there is only a single external network it will be detected automatically. If there is more than one external network you can specify which one the cluster should use by setting the environment variable OPENSTACK_EXTERNAL_NETWORK_ID.

The public network id can be obtained by using the command,

openstack network list --external
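
For example (the ID below is illustrative):

# Example only: replace with an ID returned by 'openstack network list --external'
export OPENSTACK_EXTERNAL_NETWORK_ID=0686143b-f0a7-481a-86f5-cc1f8ccde692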

Note: If your OpenStack does not already have a public network, you should contact your cloud service provider. We will not review how to troubleshoot this here.

Use existing router

You can use a pre-existing router instead of creating a new one. When a cluster is deleted, a pre-existing router will not be deleted.

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackCluster
metadata:
  name: <cluster-name>
  namespace: <cluster-namespace>
spec:
  ...
  router: 
     id: <Router id>

API server floating IP

Unless explicitly disabled, a floating IP is automatically created and associated with the load balancer or controller node. If required, you can specify the floating IP explicitly via spec.apiServerFloatingIP of OpenStackCluster.
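
For example, a minimal sketch (the address below is illustrative and must be a floating IP you have already created):

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackCluster
metadata:
  name: <cluster-name>
  namespace: <cluster-namespace>
spec:
  # Example only: a pre-created floating IP from your external network
  apiServerFloatingIP: 172.16.111.200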

You have to be able to create a floating IP in your OpenStack in advance. You can create one using,

openstack floating ip create <public network>

Note: Only users with the admin role can create a floating IP with a specific IP address.

Note: When associating a floating IP with a cluster that has more than one controller node, the floating IP will be associated with the first controller node, and the other controller nodes have no floating IP assigned. When that controller node's floating IP status is down, CAPO will NOT automatically associate the floating IP with another controller node. We therefore recommend using only one controller node when a floating IP is needed, or consider using a load balancer instead; see issue #1265 for further information.

Disabling the API server floating IP

It is possible to provision a cluster without a floating IP for the API server by setting OpenStackCluster.spec.disableAPIServerFloatingIP: true (the default is false). This will prevent a floating IP from being allocated.
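
A minimal sketch:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackCluster
metadata:
  name: <cluster-name>
  namespace: <cluster-namespace>
spec:
  disableAPIServerFloatingIP: true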

WARNING

If the API server does not have a floating IP, workload clusters will only deploy successfully when the management cluster and workload cluster control plane nodes are on the same network. This can be a project-specific network, if the management cluster lives in the same project as the workload cluster, or a network that is shared across multiple projects.

In particular, this means that the cluster cannot use OpenStackCluster.spec.managedSubnets to provision a new network for the cluster. Instead, use OpenStackCluster.spec.network to explicitly specify the same network as the management cluster is on.

When the API server floating IP is disabled, it is not possible to provision a cluster without a load balancer without additional configuration (an advanced use-case that is not documented here). This is because the API server must still have a virtual IP that is not associated with a particular control plane node in order to allow the nodes to change underneath, e.g. during an upgrade. When the API server has a floating IP, this role is fulfilled by the floating IP even if there is no load balancer. When the API server does not have a floating IP, the load balancer virtual IP on the cluster network is used.

Restrict Access to the API server

NOTE

This requires “amphora” as the load balancer provider, in version >= v2.12.

It is possible to restrict access to the Kubernetes API server at the network level. If required, you can specify the allowed CIDRs via spec.apiServerLoadBalancer.allowedCIDRs of OpenStackCluster.

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackCluster
metadata:
  name: <cluster-name>
  namespace: <cluster-namespace>
spec:
  apiServerLoadBalancer:
    allowedCIDRs:
    - 192.168.10.0/24
    - 10.10.0.0/16

All known IPs of the target cluster will be discovered dynamically (e.g. you don't have to take care of the target cluster's own router IP, internal CIDRs or any bastion host IP). Note: Please ensure that at least the outgoing IP of your management cluster is added to the list of allowed CIDRs. Otherwise CAPO can't reconcile the target cluster correctly.

All applied CIDRs (user defined + dynamically discovered) are written back into status.network.apiServerLoadBalancer.allowedCIDRs

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackCluster
metadata:
  name: <cluster-name>
  namespace: <cluster-namespace>
status:
  network:
    apiServerLoadBalancer:
      allowedCIDRs:
        - 10.6.0.0/24 # openStackCluster.Status.Network.Subnet.CIDR
        - 10.6.0.90/32 # bastion Host internal IP
        - 10.10.0.0/16 # user defined
        - 192.168.10.0/24 # user defined
        - 172.16.111.100/32 # bastion host floating IP
        - 172.16.111.85/32 # router IP
      internalIP: 10.6.0.144
      ip: 172.16.111.159
      name: k8s-clusterapi-cluster-<cluster-namespace>-<cluster-name>

If you have locked yourself or the CAPO management cluster out, you can easily clear the allowed_cidrs field on the OpenStack side via

openstack loadbalancer listener unset --allowed-cidrs <listener ID>

Network Filters

If you have a complex query that you want to use to look up a network, you can do this by using a network filter. More details about the filter can be found in NetworkParam.

When using filters to look up a network, please note that it is possible to get multiple networks as a result. This should not be a problem; however, please test your filters with openstack network list to be certain that they return the networks you want. Please refer to the following usage example:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackMachineTemplate
metadata:
  name: <cluster-name>-controlplane
  namespace: <cluster-name>
spec:
  template:
    spec:
      ports:
        - network:
            name: <network-name>

Multiple Networks

You can specify multiple networks (or subnets) to connect your server to. To do this, simply add another entry in the ports array. The following example connects the server to 3 different networks:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackMachineTemplate
metadata:
  name: <cluster-name>-controlplane
  namespace: <cluster-name>
spec:
  template:
    spec:
      ports:
        - network:
            name: myNetwork
            tags: myTag
        - network:
            id: your_network_id
        - fixedIPs:
            - subnet:
                id: your_subnet_id

Subnet Filters

Rather than just using a network, you have the option of specifying a specific subnet to connect your server to. The following is an example of how to specify a specific subnet of a network to use for your server.

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackMachineTemplate
metadata:
  name: <cluster-name>-controlplane
  namespace: <cluster-name>
spec:
  template:
    spec:
      ports:
        - network:
            name: <network-name>
          fixedIPs:
            - subnet:
                name: <subnet-name>

Ports

A server can also be connected to networks by describing what ports to create. Describing a server’s connection with ports allows for finer and more advanced configuration. For example, you can specify per-port security groups, fixed IPs, VNIC type or profile.

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackMachineTemplate
metadata:
  name: <cluster-name>-controlplane
  namespace: <cluster-name>
spec:
  template:
    spec:
      ports:
        - network:
            id: <your-network-id>
          fixedIPs:
            - subnet:
                id: <your-subnet-id>
              ipAddress: <your-fixed-ip>
            - subnet:
                name: <your-subnet-name>
                tags:
                  - tag1
                  - tag2
          nameSuffix: <your-port-name>
          description: <your-custom-port-description>
          vnicType: normal
          securityGroups:
            - <your-security-group-id>
          profile:
            capabilities:
              - <capability>

Any such ports are created in addition to ports used for connections to networks or subnets.

Port network and IP addresses

Together, network and fixedIPs define the network a port will be created on, and the addresses which will be assigned to the port on that network.

network is a filter which uniquely describes the Neutron network the port will be created on. Machine creation will fail if the result is empty or not unique. If a network id is specified in the filter then no separate OpenStack query is required. This has the advantage of being both faster and unambiguous in all circumstances, so it is the preferred way to specify a network where possible.

The available fields are described in the CRD.

If network is not specified at all, it may be possible to infer the network from any uniquely defined subnets in fixedIPs. As this may result in additional OpenStack queries and the potential for ambiguity is greater, this is not recommended.

fixedIPs describes a list of addresses from the target network which will be allocated to the port. A fixedIP is either a specific ipAddress, a subnet from which an ip address will be allocated, or both. If only ipAddress is specified, it must be valid in at least one of the subnets defined in the current network. If both are defined, ipAddress must be valid in the specified subnet.

subnet is a filter which uniquely describes the Neutron subnet an address will be allocated from. Its operation is analogous to network, described above.

fixedIPs, including all fields available in the subnet filter, are described in the CRD.

If no fixedIPs are specified, the port will get an address from every subnet in the network.

Examples

A single explicit network with a single explicit subnet.

ports:
- tags:
  - control-plane
  network:
    id: 0686143b-f0a7-481a-86f5-cc1f8ccde692
  fixedIPs:
  - subnet:
      id: a5e50a9c-58f9-4b6f-b8ee-2e7b4e4414ee

No network or fixed IPs: the port will be created on the cluster default network, and will get a single address from the cluster default subnet.

ports:
- tags:
  - control-plane

Network and subnet are specified by filter. They will be looked up. Note that this is not as efficient or reliable as specifying the network by id.

ports:
- tags:
  - storage
  network:
    name: storage-network
  fixedIPs:
  - subnet:
      name: storage-subnet

No network, but a fixed IP with a subnet. The network will be inferred from the network of the subnet. Note that this is not as efficient or reliable as specifying the network explicitly.

ports:
- tags:
  - control-plane
  fixedIPs:
  - subnet:
      id: a5e50a9c-58f9-4b6f-b8ee-2e7b4e4414ee

Port Security

Port security can be applied to a specific port to enable or disable port security on that port. When not set, it takes the value of the corresponding field at the network level.

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackMachineTemplate
metadata:
  name: <cluster-name>-controlplane
  namespace: <cluster-name>
spec:
  template:
    spec:
      ports:
      - network:
          id: <your-network-id>
        ...
        disablePortSecurity: true
        ...

Security groups

Security groups are used to determine which ports of the cluster nodes are accessible from where.

If spec.managedSecurityGroups of OpenStackCluster is set to a non-nil value (e.g. {}), two security groups named k8s-cluster-${NAMESPACE}-${CLUSTER_NAME}-secgroup-controlplane and k8s-cluster-${NAMESPACE}-${CLUSTER_NAME}-secgroup-worker will be created and added to the control plane and worker nodes respectively.

Example of spec.managedSecurityGroups in the OpenStackCluster spec when you want to enable the managed security groups:

managedSecurityGroups: {}

These security groups include rules to allow the following traffic:

  • Control plane nodes
    • API server traffic from anywhere
    • Etcd traffic from other control plane nodes
    • Kubelet traffic from other cluster nodes
  • Worker nodes
    • Node port traffic from anywhere
    • Kubelet traffic from other cluster nodes

When the flag OpenStackCluster.spec.managedSecurityGroups.allowAllInClusterTraffic is set to true, the rules for the managed security groups permit all traffic between cluster nodes on all ports and protocols (API server and node port traffic is still permitted from anywhere, as with the default rules).
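
For example, to permit all in-cluster traffic while keeping the managed security groups:

managedSecurityGroups:
  allowAllInClusterTraffic: true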

We can add security group rules that authorize traffic from all nodes via allNodesSecurityGroupRules. It takes a list of security group rules that should be applied to the selected nodes. The following rule fields are mutually exclusive: remoteManagedGroups, remoteGroupID and remoteIPPrefix. If none of these fields are set, the rule will have a remote IP prefix of 0.0.0.0/0, per the Neutron default.

Valid values for remoteManagedGroups are controlplane, worker and bastion.

To apply a security group rule that will allow BGP between the control plane and workers, you can follow this example:

managedSecurityGroups:
  allNodesSecurityGroupRules:
  - remoteManagedGroups:
    - controlplane
    - worker
    direction: ingress
    etherType: IPv4
    name: BGP (Calico)
    portRangeMin: 179
    portRangeMax: 179
    protocol: tcp
    description: "Allow BGP between control plane and workers"

If this is not flexible enough, pre-existing security groups can be added to the spec of an OpenStackMachineTemplate, e.g.:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackMachineTemplate
metadata:
  name: ${CLUSTER_NAME}-control-plane
spec:
  template:
    spec:
      securityGroups:
      - filter:
          name: allow-ssh

Tagging

You have the ability to tag all resources created by the cluster in the OpenStackCluster spec. Here is an example of how to configure tagging:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackCluster
metadata:
  name: <cluster-name>
  namespace: <cluster-name>
spec:
  tags:
  - cluster-tag

To tag resources specific to a machine, add a value to the tags field in the OpenStackMachineTemplate spec like this:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackMachineTemplate
metadata:
  name: <cluster-name>-controlplane
  namespace: <cluster-name>
spec:
  template:
    spec:
      tags:
      - machine-tag

Metadata

You also have the option to add metadata to instances. Here is a usage example:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackMachineTemplate
metadata:
  name: <cluster-name>-controlplane
  namespace: <cluster-name>
spec:
  template:
    spec:
      serverMetadata:
        name: bob
        nickname: bobbert

Boot From Volume

In OpenStackMachineTemplate, setting spec.rootVolume.sizeGiB to a value greater than 0 means the machine will boot from a volume. For example:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackMachineTemplate
metadata:
  name: <cluster-name>-controlplane
  namespace: <cluster-name>
spec:
  template:
    spec:
      ...
        rootVolume:
          sizeGiB: <image size>
          type: <a cinder volume type (*optional)>
          availabilityZone:
            name: <The cinder availability zone name>
      ...

If type is not specified, cinder will use the default volume type.

If availabilityZone is not specified, the volume will be created in the cinder availability zone specified in the MachineSpec’s failureDomain. This same value is also used as the nova availability zone when creating the server. Note that this will fail if cinder and nova do not have matching availability zones. In this case, cinder availabilityZone must be specified explicitly on rootVolume.

Timeout settings

The default timeout for instance creation is 5 minutes. If creating servers in your OpenStack takes a long time, you can increase the timeout. You can set a new value, in minutes, via the environment variable CLUSTER_API_OPENSTACK_INSTANCE_CREATE_TIMEOUT in your Cluster API Provider OpenStack controller deployment.
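
As a sketch, assuming the default deployment name and namespace, you could set the timeout to 10 minutes (an arbitrary example value) with:

kubectl set env deployment/capo-controller-manager -n capo-system \
  CLUSTER_API_OPENSTACK_INSTANCE_CREATE_TIMEOUT=10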

Custom pod network CIDR

If 192.168.0.0/16 is already in use within your network, you must select a different pod network CIDR. You have to replace the CIDR 192.168.0.0/16 with your own in the generated file.
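
For example, a sketch assuming the generated file is capi-quickstart.yaml and 10.244.0.0/16 is free in your environment:

# Example only: pick a CIDR that does not overlap with your existing networks (GNU sed syntax)
sed -i 's|192.168.0.0/16|10.244.0.0/16|g' capi-quickstart.yaml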

Accessing nodes through the bastion host via SSH

Enabling the bastion host

To configure the Cluster API Provider for OpenStack to create an SSH bastion host, add this to the OpenStackCluster spec after clusterctl generate cluster has been successfully executed:


spec:
  ...
  bastion:
    enabled: true
    spec:
      flavor: <Flavor name>
      image:  <Image name>
      sshKeyName: <Key pair name>

All parameters are mutable during the runtime of the bastion host. The bastion host will be re-created if it’s enabled and the instance spec has been changed. This is done by a simple checksum validation of the instance spec which is stored in the OpenStackCluster annotation infrastructure.cluster.x-k8s.io/bastion-hash.

A floating IP is created and associated with the bastion host automatically, but you can also specify the IP address explicitly:


spec:
  ...
  bastion:
    ...
    floatingIP: <Floating IP address>

If managedSecurityGroups is set to a non-nil value (e.g. {}), a security group rule opening 22/tcp is added to the security groups for the bastion, controller, and worker nodes respectively. Otherwise, you have to add securityGroups to the bastion in the OpenStackCluster spec and to the OpenStackMachineTemplate spec template respectively.
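
For example, a sketch of attaching a pre-existing security group to the bastion, mirroring the securityGroups example shown earlier (the group name is illustrative):

spec:
  ...
  bastion:
    ...
    spec:
      ...
      securityGroups:
      - filter:
          name: allow-ssh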

Making changes to the bastion host

Changes can be made to the bastion spec, for example changing the flavor, by modifying the OpenStackCluster.Spec.Bastion.Spec field. The bastion host will be re-created with the new spec.

Disabling the bastion

To disable the bastion host, set enabled: false in the OpenStackCluster.Spec.Bastion field. The bastion host will be deleted; you can check its status by running kubectl get openstackcluster and looking at the Bastion field in the status. Once it's gone, you can remove the OpenStackCluster.Spec.Bastion field from the OpenStackCluster spec.

Obtain floating IP address of the bastion node

Once the workload cluster is up and running after being configured for an SSH bastion host, you can use the kubectl get openstackcluster command to look up the floating IP address of the bastion host (make sure the kubectl context is set to the management cluster). The output will look something like this:

$ kubectl get openstackcluster
NAME    CLUSTER   READY   NETWORK                                SUBNET                                 BASTION
nonha   nonha     true    2e2a2fad-28c0-4159-8898-c0a2241a86a7   53cb77ab-86a6-4f2c-8d87-24f8411f15de   10.0.0.213