Ansible Roles

Simple and compatible on many platforms.

An overview of articles

Terraform loops

There are a couple of ways to loop in Terraform. Let’s have a look.

Count

This is the “oldest” method. It’s quite powerful.

resource "some_resource" "some_name" {
  count = 3
  name  = "my_resource_${count.index}"
}

As you can see, the resource some_resource is being created 3 (count = 3) times. The name should be unique, so the count.index variable is used. This variable is available when using count.

The variable count.index has these values:

iteration  count.index value
first      0
second     1
third      2

And so on.

Counting and looking up values

The parameter count sounds simple, but is actually quite powerful. Let’s have a look at this example.

Here is a sample .tfvars file:

virtual_machines = [
  {
    name = "first"
    size = 16
  },
  {
    name = "second"
    size = 32
  }
]

The above structure is a “list of maps”:

  • A list is indicated by the [ and ] characters.
  • A map is indicated by the { and } characters.

Now let’s loop over that list:

resource "fake_virtual_machine" "default" {
  count = length(var.virtual_machines)
  name  = var.virtual_machines[count.index].name
  size  = var.virtual_machines[count.index].size
}

A couple of tricks happen here:

  1. count is calculated by the length function. It counts how many maps there are in the list virtual_machines.
  2. name is looked up in the variable var.virtual_machines. In the first iteration, count.index is 0, so the first entry of var.virtual_machines is picked.
  3. Similarly, size is looked up.

This results in two resources:

resource "fake_virtual_machine" "default" {
  name  = "first"
  size  = 16
}

# NOTE: This code does not work; `default` may not be repeated. It's just to explain what happens.
resource "fake_virtual_machine" "default" {
  name  = "second"
  size  = 32
}
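
To make the lookup by count.index a bit more tangible, here is a small model of it in Python. This is only a sketch of the indexing idea, not how Terraform works internally; the function name expand_count is made up for this example.

```python
# Illustrative model of `count`; expand_count is a hypothetical helper, not a Terraform API.
virtual_machines = [
    {"name": "first", "size": 16},
    {"name": "second", "size": 32},
]


def expand_count(items):
    """Return one resource per list entry, keyed by its index (like count.index)."""
    resources = {}
    for index in range(len(items)):        # count = length(var.virtual_machines)
        resources[index] = {
            "name": items[index]["name"],  # var.virtual_machines[count.index].name
            "size": items[index]["size"],  # var.virtual_machines[count.index].size
        }
    return resources


print(expand_count(virtual_machines))
```

Each resulting resource is identified by its position in the list, which is why the order of the list matters when using count.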

For each

The looping mechanism for_each can also be used. Similar to the count example, let’s think of a data structure to make virtual machines:

virtual_machines = [
  {
    name = "first"
    size = 16
  },
  {
    name = "second"
    size = 32
  }
]

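And let’s use for_each to loop over this structure. Because for_each accepts a map or a set of strings, not a list of maps, the list is first turned into a map keyed by name:

resource "fake_virtual_machine" "default" {
  # for_each needs a map (or a set of strings), so build one keyed by the VM name.
  for_each = { for vm in var.virtual_machines : vm.name => vm }
  name     = each.value.name
  size     = each.value.size
}

This pattern creates the same two virtual machines as the count example. The difference is that each instance is addressed by its key (for example "first") instead of by an index.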
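
The practical difference with count is how instances are identified: for_each instances are tracked by a key, count instances by their position. Here is a rough Python model of that idea. The helper expand_for_each is hypothetical, and using name as the key is an assumption made for this sketch.

```python
# Illustrative model of `for_each`; expand_for_each is a hypothetical helper.
virtual_machines = [
    {"name": "first", "size": 16},
    {"name": "second", "size": 32},
]


def expand_for_each(items, key):
    """Return one resource per entry, keyed by a unique attribute instead of an index."""
    resources = {}
    for item in items:
        if item[key] in resources:
            raise ValueError(f"duplicate key: {item[key]}")
        resources[item[key]] = item
    return resources


resources = expand_for_each(virtual_machines, "name")
print(sorted(resources))

# Dropping the first entry does not change the key of the second one.
remaining = expand_for_each(virtual_machines[1:], "name")
print(sorted(remaining))
```

With an index, removing the first entry would move "second" from position 1 to position 0; with a key it simply stays "second".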

Dynamic block

Imagine there is some block in the fake_virtual_machine resource, like this:

resource "fake_virtual_machine" "default" {
  name = "example"
  disk {
    name = "os"
    size = 32
  }
  disk {
    name = "data"
    size = 128
  }
}

The variable structure we’ll use looks like this:

virtual_machines = [
  {
    name = "first"
    disks = [
      {
        name = "os"
        size = 32
      },
      {
        name = "data"
        size = 128
      }
    ]
  },
  {
    name = "second"
    disks = [
      {
        name = "os"
        size = 64
      },
      {
        name = "data"
        size = 256
      }
    ]
  }
]

As you can see:

  • A list of virtual_machines.
  • Each virtual_machine has a list of disks.

Now let’s introduce the dynamic block:

resource "fake_virtual_machine" "default" {
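  # for_each needs a map (or a set of strings), so key the list by name.
  for_each = { for vm in var.virtual_machines : vm.name => vm }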
  name = each.value.name
  dynamic "disk" {
    for_each = each.value.disks
    content {
      name = disk.value.name
      size = disk.value.size
    }
  }
}

Wow, that’s a lot to explain:

  1. The dynamic "disk" { starts a dynamic block. The name (“disk”) must match the name of the nested block in the resource, not just any name. Inside the block a new iterator object is available: disk.
  2. The for_each = each.value.disks loops the dynamic block. The loop uses the disks of the virtual machine that the outer for_each (over var.virtual_machines) is currently handling.
  3. The content { block is rendered by Terraform once for each disk.
  4. The name = disk.value.name uses the disk iterator (created by the dynamic block) to find the value in the current entry of the disks list.
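
The nesting described in the steps above is easier to see as two loops, one inside the other. Here is a sketch in Python; render is a made-up function, and the output shape is only an illustration of what ends up in each resource.

```python
# Illustrative model of a dynamic block inside for_each; render is a hypothetical helper.
virtual_machines = [
    {"name": "first", "disks": [{"name": "os", "size": 32}, {"name": "data", "size": 128}]},
    {"name": "second", "disks": [{"name": "os", "size": 64}, {"name": "data", "size": 256}]},
]


def render(vms):
    """Return one resource per virtual machine, each with one disk block per disk."""
    rendered = {}
    for vm in vms:  # outer loop: for_each over the virtual machines
        rendered[vm["name"]] = {
            "name": vm["name"],
            # inner loop: dynamic "disk" with for_each = each.value.disks
            "disk": [{"name": disk["name"], "size": disk["size"]} for disk in vm["disks"]],
        }
    return rendered


for name, resource in render(virtual_machines).items():
    print(name, resource["disk"])
```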

Hope that helps a bit when writing loops in Terraform!

Looping in Terraform, Ansible and Bash

Looping is quite an important mechanism in coding. (Thanks @Andreas for the word coding, a mix of scripting and programming.)

Looping allows you to write a piece of logic once, and reuse it as many times as required.

Looping is difficult to understand for people new to coding. It’s sometimes difficult for me too. This article will probably help me a bit as well!

A sequence

The simplest loop I know is repeating a piece of logic for a set of numbers or letters, like this:

  • 1 2 3

First off, here is how to generate a sequence:

bash      ansible                         terraform
seq 1 3   with_sequence: start=1 end=3    range(1, 4)
{1..3}

For all languages, this returns a list of numbers.

bash    ansible                    terraform
1 2 3   item: 1, item: 2, item: 3  [ 1, 2, 3, ]

So, a more complete example for the three languages:

Bash

for number in {1..3} ; do
  echo "number: ${number}"
done

The above script returns:

number: 1
number: 2
number: 3

Ansible

- name: show something
  debug:
    msg: "number: {{ item }}"
  with_sequence:
    start=1
    end=3

Note: See that = sign? I was expecting a :, so I made an issue.

The above script returns:

ok: [localhost] => (item=1) => {
    "msg": "number: 1"
}
ok: [localhost] => (item=2) => {
    "msg": "number: 2"
}
ok: [localhost] => (item=3) => {
    "msg": "number: 3"
}

Terraform

locals {
  numbers = range(1,4)
}

output "number" {
  value = local.numbers
}

The above script returns:

number = tolist([
  1,
  2,
  3,
])
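
Note that the Terraform range function stops before its second argument, which is why range(1, 4) is needed to get 1, 2 and 3. Python is not one of the three languages compared here, but its range follows the same convention, so it makes for a quick way to try this out:

```python
# The end value is excluded, so 4 is needed to include 3.
numbers = list(range(1, 4))
print(numbers)

for number in numbers:
    print(f"number: {number}")
```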

Ansible testing components

To test Ansible, I use quite a number of components. This page lists the components used, their versions, and where they are used.

Component               Used     Latest   Used where
ansible                 2.9      2.9.18   tox.ini
ansible                 2.10     2.10.7   tox.ini
molecule                >=3,<4   c        docker-github-action-molecule
tox                     latest   n.a.     docker-github-action-molecule
ansible-lint            latest   e        docker-github-action-molecule
pre-commit              2.9.3    v2.10.1  installed on development desktop.
molecule-action         2.6.16   g        .github/workflows/molecule.yml
github-action-molecule  3.0.6    h        .gitlab-ci.yml
ubuntu                  20.04    20.04    .github/workflows/galaxy.yml
ubuntu                  20.04    20.04    .github/workflows/molecule.yml
ubuntu                  20.04    20.04    .github/workflows/requirements2png.yml
ubuntu                  20.04    20.04    .github/workflows/todo.yml
galaxy-action           1.1.0    m        .github/workflows/galaxy.yml
graphviz-action         1.0.7    n        .github/workflows/requirements2png.yml
checkout                v2       o        .github/workflows/requirements2png.yml
checkout                v2       o        .github/workflows/molecule.yml
todo-to-issue           v2.3     p        .github/workflows/todo.yml
python                  3.9      3.9      .travis.yml
pre-commit-hooks        v3.4.0   r        .pre-commit-config.yaml
yamllint                v1.26.0  v1.26.0  .pre-commit-config.yaml
my pre-commit           v1.4.5   u        .pre-commit-config.yaml
fedora                  33       33       docker-github-action-molecule

Debugging GitLab builds

Now that Travis has become unusable, I’m moving stuff to GitLab. Some builds are breaking; this is how to reproduce the errors.

Start the dind container

export role=ansible-role-dns
cd Documents/github/buluma
docker run --rm --name gitlabci --volume $(pwd)/${role}:/${role}:z --privileged --tty --interactive docker:stable-dind

Login to the dind container

docker exec --tty --interactive gitlabci /bin/sh

Install software

The dind image is Alpine based and misses required software to run molecule or tox.

apk add --no-cache python3 python3-dev py3-pip gcc git curl build-base autoconf automake py3-cryptography linux-headers musl-dev libffi-dev openssl-dev openssh

Tox

GitLab CI tries to run tox (if tox.ini is found). To emulate GitLab CI, run:

python3 -m pip install tox --ignore-installed

And simply run tox to see the results.

tox

Molecule

For more in-depth troubleshooting, try installing molecule:

python3 -m pip install ansible "molecule[docker]" docker ansible-lint

Now you can run molecule:

cd ${role}
molecule test --destroy=never
molecule login

YAML Anchors and References

This is a post to help me remember how YAML Anchors and References work.

What?

In YAML you can make an Anchor:

- first_name: &me Michael

Now the Anchor me contains Michael. To refer to it, do something like this:

- first_name: &me Michael
  given_name: *me

The value for given_name has been set to Michael.

You can also anchor a whole mapping. Note the extra indentation, which makes name and family_name part of the anchored value:

people:
  - person: &me
      name: Michael
      family_name: Buluma

Now you may refer to the me anchor:

access:
  - person: *me

Now Michael has access.

Good for keeping things DRY (don’t repeat yourself).

Ansible alternatives for shell tricks

If you’re used to shells and their commands like bash, sed and grep, here are a few alternatives for Ansible.

Using these native alternatives has advantages: developers maintain the Ansible modules for you, the modules are idempotent, and they likely work on more distributions or platforms.

Grep

Imagine you need to know if a certain pattern is in a file. With a shell script you would use something like this:

grep "pattern" file.txt && do-something-if-pattern-is-found.sh

With Ansible you can achieve a similar result like this:

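---
- hosts: all
  gather_facts: no

  tasks:
    - name: check if pattern is found
      lineinfile:
        path: file.txt
        regexp: '.*pattern.*'
        line: 'whatever'
        # With backrefs, a file without a match is left unchanged, so the
        # task only reports "changed" (and notifies) when the pattern is found.
        backrefs: yes
      register: check_if_pattern_is_found
      check_mode: yes
      notify: do something

  handlers:
    - name: do something
      command: do-something-if-pattern-is-found.sh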

Hm, much longer than the bash example, but native Ansible!

Sed

So you like sed? So do I. It’s one of the most powerful tools I’ve used.

Let’s replace some pattern in a file:

sed 's/pattern_a/pattern_b/g' file.txt

This would replace all occurrences of pattern_a with pattern_b. Let’s see what Ansible looks like.

---
- name: replace something
  hosts: all
  gather_facts: no

  tasks:
    - name: replace patterns
      # replace changes every occurrence, like the "g" flag of sed;
      # lineinfile would only touch the last matching line.
      replace:
        path: file.txt
        regexp: 'pattern_a'
        replace: 'pattern_b'

Have a look at the replace module documentation for more details.

Find and remove

The find (UNIX) tool is really powerful too. Imagine this command:

find / -name some_file.txt -exec rm {} \;

That command would find all files (and directories) named some_file.txt and remove them. A bit dangerous, but powerful.

With Ansible:

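- name: find and remove
  hosts: all
  gather_facts: no

  tasks:
    - name: find files
      find:
        paths: /
        patterns: some_file.txt
        # find does not descend into directories unless recurse is set.
        recurse: yes
      register: found_files

    - name: remove files
      file:
        path: "{{ item.path }}"
        state: absent
      # The find module returns its matches in the "files" list.
      loop: "{{ found_files.files }}"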

Conclusion

Well, have fun with all non-shell solutions. You hardly need the shell or command modules when you get the hang of it.

Debugging services in Ansible

Sometimes services don’t start and give an error like:

Unable to start service my_service: A dependency job for my_service.service failed. See 'journalctl -xe' for details.

Well, if you’re testing in CI, you can’t really access the instance that has an issue. So how do you troubleshoot this?

I use this pattern frequently:

- name: debug start my_service
  block:
    - name: start my_service
      service:
        name: "my_service"
        state: started
  rescue:
    - name: collect information
      command: "{{ item }}"
      register: my_service_collect_information
      loop:
        - journalctl --unit my_service --no-pager
        - systemctl status my_service
    - name: show information
      debug:
        msg: "{{ item }}"
      loop: "{{ my_service_collect_information.results }}"

What’s happening here?

  • The block is a grouping of tasks.
  • The rescue runs when the initial task (start my_service) fails. It has two tasks.
  • The task collect information runs a few commands (add extra commands if you know more) and saves the results in my_service_collect_information.
  • The task show information displays the saved result. Because collect information has a loop, the variable has .results appended, which is a list that needs to be looped over.

Hope this helps you troubleshoot services in Travis or GitHub Actions.

5 times why

Why do I write all this code? What’s the purpose and where does it stop?

To answer this question, there is a method you can use: 5 times why. Answer the initial question and pose a new question: “Why?”. Repeat the process a couple of times until you get to the core of the reason. Let’s go.

1. Why do I write all this code?

Because it’s a great way to learn a language.

2. Why do I want to learn a new language?

Technology changes, this allows me to keep up to date.

3. Why do I need to be up to date?

Being up to date allows me to be relevant.

4. Why do I need to be relevant?

Being relevant allows me to steer my career.

5. Why do I need to steer my career?

I don’t want to depend on one single employer, and I want to have many options when choosing a job.

Conclusion

So, I write all this code to not depend on any one employer. Interesting conclusion, I tend to agree.

GitHub action to release to Galaxy

GitHub Actions is an approach to offering CI, using other people’s actions from the GitHub Action Marketplace.

I’m using 2 actions to:

  1. test the role using Molecule
  2. release the role to Galaxy

GitHub is offering 20 concurrent builds which is quite a lot, more than Travis’s limit of 5. The build could be 4 times faster. Faster CI, happier developers. ;-)

Here are a few examples. First: just release to Galaxy, no testing included. (Not a smart idea.)

---
name: GitHub Action

on:
  - push

jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - name: checkout
        uses: actions/checkout@v4
      - name: galaxy
        uses: buluma/galaxy-action@master
        with:
          galaxy_api_key: "$"

As you can see, 2 actions are used, checkout which gets the code and galaxy-action to push the role to Galaxy. Galaxy does lint-testing, but not functional testing. You can use the molecule-action to do that.

---
name: GitHub Action

on:
  - push

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - name: checkout
        uses: actions/checkout@v4
        with:
          path: "$"
      - name: molecule
        uses: buluma/molecule-action@1.0.0
        with:
          image: "$"
  release:
    needs:
      - test
    runs-on: ubuntu-latest
    steps:
      - name: galaxy
        uses: buluma/galaxy-action@master
        with:
          galaxy_api_key: $

The build is now split in 2 parts, test and release, and release needs test to be done first. You can also see that checkout is now called with a path, which allows Molecule to find the role itself. (ANSIBLE_ROLES_PATH: $ephemeral_directory/roles/:$project_directory/../)

Finally, you can add a matrix, so the test job runs once for every value listed in the matrix.

---
name: GitHub Action

on:
  - push

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        image:
          - alpine
          - amazonlinux
          - debian
          - centos
          - fedora
          - opensuse
          - ubuntu
    steps:
      - name: checkout
        uses: actions/checkout@v4
        with:
          path: "$"
      - name: molecule
        uses: buluma/molecule-action@1.0.0
        with:
          image: ${{ matrix.image }}
  release:
    needs:
      - test
    runs-on: ubuntu-latest
    steps:
      - name: galaxy
        uses: buluma/galaxy-action@master
        with:
          galaxy_api_key: $

Now your role is tested on the list of images specified.

Hope these actions make it easier to develop, test and release your roles. If you find problems, please make an issue for either the molecule or galaxy action.

GitHub action to run Molecule

GitHub Actions is an approach to offering CI, using other people’s actions from the GitHub Action Marketplace.

The intent is to let the developer of an action think about the ‘hard stuff’, so the user of the action can simply include another step in a workflow.

So I wrote a GitHub action to test an Ansible role with a single action.

Using the GitHub Action

Have a look at the Molecule action.

It boils down to adding this snippet to .github/workflows/molecule.yml:

---
on:
  - push

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: checkout
        uses: actions/checkout@v4
        with:
          path: "$"
      - name: molecule
        uses: buluma/molecule-action@master

How it works

You may want to write your own action, here is an overview of the required components.

+--- Repository with an Ansible role ---+
| - .github/workflows/molecule.yml      |
+-+-------------------------------------+
  |
  |    +-------- buluma/molecule-action --------+
  +--> | - image: buluma/github-action-molecule |
       +-+--------------------------------------+
         |
         |    +--- github-action-molecule ---+
         +--> | - pip install molecule       |
              | - pip install tox            |
              +------------------------------+

1. Create a container

First create a container that has all the tools you need installed and push it to Docker Hub. Here is the code for my container.

2. Create an action

Create a GitHub repository per action. It should at least contain an action.yml. Have a look at the documentation for Actions.

3. Integrate your action

Pick a repository, and add a file (likely with the name of the action) in .github/workflows/my_action.yml. The contents should refer to the action:

    steps:
      - name: checkout
        uses: actions/checkout@v4
        with:
          path: "$"
      - name: molecule
        uses: buluma/molecule-action@master
        with:
          image: $

A full example here.

The benefit is that you (or others) can reuse the action. Have fun making GitHub actions!

And, or and not

Today I spent a couple of hours on a condition that contained a mistake. Let me try to help myself and describe a few situations.

Condition?

A condition in Ansible can be described in a when statement. This is a simple example:

- name: do something only to virtual instances
  debug:
    msg: "Here is a message from a guest"
  when: ansible_virtualization_role == "guest"

And

It’s possible to describe multiple conditions. In Ansible, the when statement can be a string (see above) or a list:

- name: do something only to Red Hat virtual instances
  debug:
    msg: "Here is a message from a Red Hat guest"
  when:
    - ansible_virtualization_role == "guest"
    - ansible_os_family == "RedHat"

The above example will run when it’s both a virtual instance and it’s a Red Hat-like system.

Or

Instead of combining (‘and’) conditions, you can also allow multiple conditions where either is true:

- name: do something to either Red Hat or virtual instances
  debug:
    msg: "Here is a message from a Red Hat system or a guest"
  when:
    - ansible_virtualization_role == "guest" or ansible_os_family == "RedHat"

I like to keep lines short to increase readability:

  when:
    - ansible_virtualization_role == "guest" or
      ansible_os_family == "RedHat"

And & or

You can also combine and and or:

- name: do something to a Debian or Red Hat, if it's a virtual instance
  debug:
    msg: "Here is a message from a Red Hat or Debian guest"
  when:
    - ansible_virtualization_role == "guest"
    - ansible_os_family == "RedHat" or ansible_os_family == "Debian"

In

It’s also possible to check if some pattern is in a list:

- name: make some list
  set_fact:
    allergies:
      - apples
      - bananas

- name: Test for allergies
  debug:
    msg: "A match was found: {{ item }}"
  when: item in allergies
  loop:
    - pears
    - milk
    - nuts
    - apples

You can have multiple lists and check multiple times:

- name: make some list
  set_fact:
    fruit:
      - apples
      - bananas
    dairy:
      - milk
      - eggs

- name: Test for allergies
  debug:
    msg: "A match was found: {{ item }}"
  when:
    - item in fruit or
      item in dairy
  loop:
    - pears
    - milk
    - nuts
    - apples

Negate

It’s also possible to search in a list negatively, that is, to check that an item is not in a list. This is where it gets difficult (for me!):

- name: make some list
  set_fact:
    fruit:
      - apples
      - bananas
    dairy:
      - milk
      - eggs

- name: Test for allergies
  debug:
    msg: "No match was found: {{ item }}"
  when:
    - item not in fruit
    - item not in dairy
  loop:
    - pears
    - milk
    - nuts
    - apples

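The twist here is that the list under when means and: the message only shows for items that are in neither list. Using or between the two lines would match almost every item, because only an item that is in both lists would be skipped.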
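
Since this is plain boolean logic, it can be checked outside Ansible as well. Here is a small Python sketch with the same lists. The variable names are mine, and Python is only used because its in and not in behave the same way as in a when statement.

```python
fruit = ["apples", "bananas"]
dairy = ["milk", "eggs"]
items = ["pears", "milk", "nuts", "apples"]

# A list under `when` is an implicit "and": both lines must hold.
no_match = [item for item in items if item not in fruit and item not in dairy]

# De Morgan: "not A and not B" is the same as "not (A or B)".
same = [item for item in items if not (item in fruit or item in dairy)]

# The mistake: "not A or not B" only skips items that are in both lists.
mistake = [item for item in items if item not in fruit or item not in dairy]

print(no_match)
print(same)
print(mistake)
```

The first two lists are identical (pears and nuts); the last one contains all four items.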

Well, I’ll certainly run into some issue again in the future. I hope this helps you (and me) if you ever need a complex condition in Ansible.

Relations between container names, setup and Galaxy

It’s not easy to find the relation between container names, facts returned from setup (or gather_facts) and Ansible Galaxy platform names.

Here is an attempt to make life a little easier:

Alpine

containername: alpine
ansible_os_family: Alpine
ansible_distribution: Alpine
galaxy_platform: Alpine
galaxy_version  docker_tag  ansible_distribution_major_version
all             latest      3
all             edge        3

AmazonLinux

containername: amazonlinux
ansible_os_family: RedHat
ansible_distribution: Amazon
galaxy_platform: Amazon
galaxy_version  docker_tag  ansible_distribution_major_version
Candidate       latest      2
2018.03         1           2018

CentOS

containername: centos
ansible_os_family: RedHat
ansible_distribution: CentOS
galaxy_platform: EL
galaxy_version  docker_tag  ansible_distribution_major_version
8               latest      8
7               7           7

RockyLinux

containername: rockylinux
ansible_os_family: RedHat
ansible_distribution: Rocky
galaxy_platform: EL
galaxy_version  docker_tag  ansible_distribution_major_version
8               latest      8

Debian

containername: debian
ansible_os_family: Debian
ansible_distribution: Debian
galaxy_platform: Debian
galaxy_version  docker_tag  ansible_distribution_major_version
bullseye        latest      11
bookworm        bookworm    testing/12

Fedora

containername: fedora
ansible_os_family: RedHat
ansible_distribution: Fedora
galaxy_platform: Fedora
galaxy_version  docker_tag  ansible_distribution_major_version
32              32          32
33              latest      33
34              rawhide     34

OpenSUSE

containername: opensuse
ansible_os_family: Suse
ansible_distribution: OpenSUSE
galaxy_platform: opensuse
galaxy_version  docker_tag  ansible_distribution_major_version
all             latest      15

Ubuntu

containername: ubuntu
ansible_os_family: Debian
ansible_distribution: Ubuntu
galaxy_platform: Ubuntu
galaxy_version  docker_tag  ansible_distribution_major_version
focal           latest      20
bionic          bionic      18
xenial          xenial      16

Why would you write Ansible roles for multiple distributions?

During a discussion with the audience at DevOps Amsterdam, I got some feedback on two of my statements.

My statements are:

  • “Keep your code as simple as possible”
  • “Write roles for multiple distributions” (To improve logic.)

These two contradict each other: simplicity would mean 1 role for 1 (only my) distribution.

Hm, that’s a very fair point. Still I think writing for multiple operating systems is a good thing, for these reasons:

  1. You get a better understanding of all the operating systems. For example Ubuntu is (nearly) identical to Debian, SUSE is very similar to Red Hat.
  2. By writing for multiple distributions, the logic (in tasks/main.yml) becomes more stable.
  3. It’s just very useful to be able to switch distributions without switching roles.

Super important Ansible facts

There are some facts that I use very frequently, they are super important to me.

This is more a therapeutic post for me, than it’s a great read to you. ;-)

Sometimes, actually most of the time, each operating system or distribution needs something specific. For example, Apache httpd has a different package name for almost every distribution. This mapping (distro:packagename) can be done using this variable: ansible_os_family.

Try to select packages/service-names/directories/files/etc based on the most general level and work your way down to more specific when required. This results in a sort of priority list:

  1. General variable, not related to a distribution. For example: postfix_package.
  2. ansible_os_family variable, related to the type of distribution, for example: httpd_package which differs for Alpine, Archlinux, Debian, Suse and RedHat. (But it’s the same for docker images debian and ubuntu.)
  3. ansible_distribution variable when each distribution has differences. For example: reboot_requirements. CentOS needs yum-utils, but Fedora needs dnf-utils.
  4. ansible_distribution and ansible_distribution_major_version when there are differences per distribution release. For example firewall_packages. CentOS 6 and CentOS 7 need to have a different package.
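
The priority list above boils down to a lookup that prefers the most specific value and falls back to a more general one. Here is a sketch of that idea in Python. The function lookup, the "default" key and the key format "CentOS-7" are assumptions for this example, and the package names are only illustrative; in a role this would normally be done with Ansible variables.

```python
# Hypothetical data for illustration; only the fact names come from Ansible.
httpd_package = {"default": "apache2", "RedHat": "httpd"}
reboot_requirements = {"default": None, "CentOS": "yum-utils", "Fedora": "dnf-utils"}


def lookup(values, facts):
    """Return the most specific value available for a host, else the default."""
    distribution = facts["ansible_distribution"]
    major = facts["ansible_distribution_major_version"]
    keys = [
        f"{distribution}-{major}",   # 4. distribution and major version
        distribution,                # 3. distribution
        facts["ansible_os_family"],  # 2. os family
        "default",                   # 1. general value
    ]
    for key in keys:
        if key in values:
            return values[key]
    raise KeyError("no value found")


centos = {
    "ansible_os_family": "RedHat",
    "ansible_distribution": "CentOS",
    "ansible_distribution_major_version": "7",
}
print(lookup(httpd_package, centos))
print(lookup(reboot_requirements, centos))
```

A variable only needs an entry at the most general level where it still has one correct value, which keeps the maps short.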

Here is a list of containers and their ansible_os_family.

Container image  ansible_os_family
alpine           Alpine
archlinux/base   Archlinux
centos           RedHat
debian           Debian
fedora           RedHat
opensuse/leap    Suse
ubuntu           Debian

What I wish ansible collections would be

Ansible Collections is a way of:

  1. Packaging Ansible Content (modules/roles/playbooks).
  2. Distributing Ansible Content through Ansible Galaxy.
  3. Reducing the size of Ansible Engine.

All modules that are now in Ansible will move to Ansible Collections.

I’m not 100% sure how Ansible Collections will work in the future, but here is a guess.

From an Ansible role

I could imagine that requirements.yml will link depending modules and collections. Something like this:

---
- src: buluma.x
  type: role
- src: buluma.y
  type: collection

That structure would ensure that all modules required to run the role are going to be installed.

From an Ansible playbook repository

Identical to the role setup, I could imagine a requirements.yml that basically prepares the environment with all required dependencies, either roles or collections.

Loop dependencies

Ansible Collections can depend on other Ansible Collections.

Imagine my_collection’s requirements.yml:

---
- src: buluma.y
  type: collection

The Ansible Collection y could refer to my_collection.

my_collection ---> y
       ^           |
       |           |
       +-----------+

I’m not sure how that can be resolved or prevented.
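
I don’t know how Ansible Galaxy deals with this, but detecting such a loop is not hard in itself. Here is a sketch in Python of one way a tool could do it, using a depth-first walk over the dependencies. The function find_cycle and the dictionary layout are made up for this example.

```python
def find_cycle(dependencies):
    """Return one dependency cycle as a list of names, or None if there is none."""
    visiting = []  # the chain of collections currently being followed
    done = set()   # collections that are known to be cycle-free

    def visit(name):
        if name in done:
            return None
        if name in visiting:
            # Back at a collection that is already in the chain: a loop.
            return visiting[visiting.index(name):] + [name]
        visiting.append(name)
        for dependency in dependencies.get(name, []):
            cycle = visit(dependency)
            if cycle:
                return cycle
        visiting.pop()
        done.add(name)
        return None

    for name in dependencies:
        cycle = visit(name)
        if cycle:
            return cycle
    return None


requirements = {"my_collection": ["y"], "y": ["my_collection"]}
print(find_cycle(requirements))
```

For the example above this reports my_collection, y, my_collection. Whether an installer should refuse such a loop or simply install both collections once is a separate design choice.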

How many modules do you need?

Ansible Collections are coming. A big change in Ansible, so a stable version will likely be a good moment to go to Ansible 3. Listening to the developers, I think we can expect Ansible 3 in the spring of 2020.

Anyway, let’s get some stats:

How many modules am I using?

That was not so difficult to estimate: 97 modules.
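
How the number was produced is not described here. One way such a count could be made is by tallying the module key of every task. Here is a sketch in Python that assumes the tasks have already been parsed into dictionaries; the KEYWORDS set is deliberately incomplete and count_modules is a made-up name.

```python
from collections import Counter

# Keys that are task keywords rather than module names (not a complete list).
KEYWORDS = {"name", "when", "loop", "register", "notify", "with_items", "become", "tags"}


def count_modules(tasks):
    """Count how often each module is used in a list of parsed tasks."""
    counter = Counter()
    for task in tasks:
        for key in task:
            if key not in KEYWORDS:
                counter[key] += 1
    return counter


# Hypothetical tasks, as they would look after loading a tasks file.
tasks = [
    {"name": "install", "package": {"name": "httpd"}},
    {"name": "start", "service": {"name": "httpd", "state": "started"}, "when": "x"},
    {"name": "install more", "package": {"name": "git"}, "loop": []},
]
usage = count_modules(tasks)
print(usage.most_common())
print(len(usage), "different modules")
```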

What ‘weird’ modules?

A bit more difficult to answer, I’ve taken two approaches:

  1. Take the bottom of the list of “most used modules”.
  2. Walk through the 97 modules and discover the odd ones.
  • bigip_*: I’ve written a role for a workshop.
  • gem: Don’t know, weird.
  • debug: What, get it out of there!
  • include_vars: Why would that be?
  • fail: Let’s check that later.
  • set_fact: I’m not a big fan of set_fact, most facts can be “rendered” in vars/main.yml.

How many ‘vendor’ modules?

I expect some Ansible Collections will be maintained by the vendors: Google (GCP), Microsoft (Azure), F5 (BigIP), Red Hat (yum), etc. That’s why knowing this upfront is likely smart.

Module               Times used    Potential maintainer
pip                  17            PyPI
apt                  16            Canonical
yum                  9             Red Hat
apt_key              6             Canonical
apt_repository       5             Canonical
rpm_key              4             Red Hat
zypper               3             SUSE
yum_repository       3             Red Hat
dnf                  3             Red Hat/Fedora
zypper_repository    2             SUSE
zabbix_host          2             Zabbix
zabbix_group         2             Zabbix
apk                  2             Alpine
tower_*              7 (combined)  Red Hat
redhat_subscription  1             Red Hat
pacman               1             Arch Linux
bigip_*              6 (combined)  F5

How often do I use what modules?

Place  Module      Times used
1      package     138
2      service     137
3      command     73
4      template    64
5      file        62
6      meta        27
7      assert      26
8      unarchive   24
9      lineinfile  21
10     copy        20

Wow, I’m especially surprised by two modules:

  1. command - I’m going to review if there are modules that I can use instead of command. I know very well that command should be used as a last resort, not 73 times… Painful.
  2. assert - Most roles use it to see if variables meet the criteria. (If a variable is defined and the type is correct.) I’d rather wait for role spec.

Ansible Fest Atlanta 2019

Announcements on Ansible, AWX, Molecule, Galaxy, Ansible-lint and many other products are always made at Ansible Fest.

Here is what I picked up at Ansible Fest 2019 in Atlanta, Georgia.

Ansible Collections

Ansible is full of modules; “batteries included” is a common expression. This reduces velocity in adding modules, fixing issues with modules or adding features to modules. Ansible Collections is there to solve this issue.

Ansible will (in a couple of releases) only be the framework, without modules or plugins. Modules will have to be installed separately.

There are a few unknowns:

  • How to manage dependencies between collections and Ansible. For example, what collections work on which Ansible version.
  • The documentation of Ansible is very important, but how to keep the same central point of documentation while spreading all these collections.
  • How to deal with colliding module names? Imagine the file module is included in more than 1 collection, which one takes precedence?

Anyway, the big take-away: Start to learn to develop or use Ansible Collections, it’s going to be important.

Here is how to develop Ansible Collections and how to use them.

AWX

AWX is refactoring components to improve development velocity and the performance of the product itself.

  • New UI, based on React and PatternFly.
  • tower-cli will be replaced by awx, which exposes the available commands based on the capabilities of the AWX API. The version of awx will be the same as the AWX web/api-tool.

Data analysis

There are a few applications to analyse data and give insights on development and usage of Ansible:

There are many more perspectives, have a look.

Next Ansible Fest not in Europe

Spain seems to be the largest contributor of Ansible, but next Ansible Fest will be in San Diego.

The Contributors Summit will be in Europe though.

Why “hardening” is not a role

I see many developers writing an Ansible role for hardening. Although these roles can absolutely be useful, here is why I think there is a better way.

Roles are (not always, but frequently) product centric. Think of role names like:

  • auditd
  • sshd
  • users

A role for hardening your system has the potential to cover all kinds of topics that are covered in the product specific roles.

Besides that, in my opinion a role should be:

  1. Small
  2. Cover one function

A good indicator of a role that’s too big is having multiple task files in tasks.

So my suggestion is to not use a harden role, but rather to have each role that you compose a system out of use secure defaults.

Ansible Galaxy Collections are here!

As the documentation describes:

Collections are a new way to package and distribute Ansible related content.

I write a lot of roles, roles are nice, but it’s a bit like ingredients without a recipe: A role is only a part of the whole picture.

Collections allow you to package:

  • roles
  • actions
  • filters
  • lookup plugins
  • modules
  • strategies

So instead of upstreaming (https://en.wikipedia.org/wiki/Upstream_(software_development)) content to Ansible, you can publish or consume content yourself.

The whole process is documented and should not be difficult.

I’ve published my development_environment and only had to change these things:

1. Add galaxy.yml

namespace: "buluma"
name: "development_environment"
description: Install everything you need to develop Ansible roles.
version: "1.0.4"
readme: "README.md"
authors:
    - "Michael Buluma"
dependencies:
license:
    - "Apache-2.0"
tags:
    - development
    - molecule
    - ara
repository: "https://github.com/buluma/ansible-development-environment"
documentation: "https://github.com/buluma/ansible-development-environment/blob/master/README.md"
homepage: "https://buluma.nl"
issues: "https://github.com/buluma/ansible-development-environment/issues"

2. Enable Travis for the repository

Go to Travis and click Sync account. Wait a minute or so and enable the repository containing your collection.

3. Save a hidden variable in Travis

Under settings for a repository you can find Environment Variables. Add one, I called it galaxy_api_key. You’ll refer to this variable in .travis.yml later.

4. Add .travis.yml

---
language: python

install:
  - pip install mazer
  - release=$(mazer build | tail -n1 | awk '{print $NF}')

script:
  - mazer publish --api-key=${galaxy_api_key} ${release}

Bonus hint: Normally you don’t save roles, so you add something like roles/* to .gitignore, but in this case it is a part of the collection. So if you have requirements.yml, download all the roles locally using ansible-galaxy install -r roles/requirements.yml -f and include them in the commit.
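For reference, a roles/requirements.yml could look something like the sketch below. The two role names are only examples taken from elsewhere on this site; list whatever roles your collection actually needs.

```yaml
---
# roles/requirements.yml (illustrative; the role list is an assumption)
- src: buluma.bootstrap
- src: buluma.molecule
```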

Fedora 30 and above use python-3

Fedora 30 (and above) uses Python 3 and starts to deprecate Python 2 packages like python2-dnf.

Ansible 2.8 and above discover the python interpreter, but Ansible 2.7 and lower do not have this feature.

So for a while, you have to tell Ansible to use python 3. This can be done by setting the ansible_python_interpreter somewhere. Here are a few locations you could use:

1. inventory

This is quite a good location, because you could decide to give a single node this variable:

inventory/host_vars/my_host.yml:

---
ansible_python_interpreter: /usr/bin/python3

Or you could group hosts and apply a variable to it:

inventory/hosts:

[python3]
my_host

inventory/group_vars/python3.yml

---
ansible_python_interpreter: /usr/bin/python3

2. extra vars

You could start a playbook and set the ansible_python_interpreter once:

ansible-playbook my_playbook.yml --extra-vars "ansible_python_interpreter=/usr/bin/python3"

It’s not very persistent though.

3. playbook or role

You could save the variable in your playbook or role, but this makes re-using code more difficult; it will only work on machines with /usr/bin/python3:

---
- name: do something
  hosts: all

  vars:
    ansible_python_interpreter: /usr/bin/python3

  tasks:
    - name: do some task
      debug:
        msg: "Yes, it works."

4. molecule

The last case I can think of is to let Molecule set ansible_python_interpreter.

molecule/default/molecule.yml:

---
# Many parameters omitted.
provisioner:
  name: ansible
  inventory:
    group_vars:
      all:
        ansible_python_interpreter: /usr/bin/python3
# More parameters omitted.

Why you should use the Ansible set_fact module

So far it seems that the Ansible set_fact module is not required very often. I found 2 cases in the roles I write:

In the awx role:

- name: pick most recent tag
  set_fact:
    awx_version: ""
  with_items:
    - ""

In the zabbix_server role:

- name: find version of zabbix-server-mysql
  set_fact:
    zabbix_server_version: ""

In both cases a “complex” variable structure is saved into a simpler-to-call variable name.

Variables that are constructed of other variables can be set in vars/main.yml. For example the kernel role needs a version of the kernel in defaults/main.yml:

kernel_version: 5.0.3

And the rest can be calculated in vars/main.yml:

kernel_unarchive_src: https://cdn.kernel.org/pub/linux/kernel/v.x/linux-.tar.xz

So sometimes set_fact can be used to keep code simple, other (most) times vars/main.yml can help.
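A sketch of how such a derived variable could be written, assuming kernel.org keeps one directory per major version (the exact expression used in the kernel role is not shown here):

```yaml
---
# vars/main.yml (a sketch, not the original code of the kernel role)
# Assumes kernel_version is a string such as "5.0.3" from defaults/main.yml.
kernel_unarchive_src: "https://cdn.kernel.org/pub/linux/kernel/v{{ kernel_version.split('.')[0] }}.x/linux-{{ kernel_version }}.tar.xz"
```

With kernel_version set to 5.0.3 this would point at https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-5.0.3.tar.xz, without needing set_fact.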

For a moral compass South Park uses Brian Boitano, whereas my moral coding compass uses Jeff Geerling, who would say something like: “If your code is complex, it’s probably not good.”

Different methods to include roles

There are several ways to include roles from playbooks or roles.

Classic

The classic way:

---
- name: Build a machine
  hosts: all

  roles:
    - buluma.bootstrap
    - buluma.java
    - buluma.tomcat

Or a variation that allows per-role variables:

---
- name: Build a machine
  hosts: all

  roles:
    - role: buluma.bootstrap
    - role: buluma.java
      vars:
        java_version: 9
    - role: buluma.tomcat

Include role

The include_role way:

---
- name: Build a machine
  hosts: all

  tasks:
    - name: include bootstrap
      include_role:
        name: buluma.bootstrap

    - name: include java
      include_role:
        name: buluma.java

    - name: include tomcat
      include_role:
        name: buluma.tomcat

Or a with_items (since Ansible 2.3) variation:

---
- name: Build a machine
  hosts: all

  tasks:
    - name: include role
      include_role:
        name: "{{ item }}"
      with_items:
        - buluma.bootstrap
        - buluma.java
        - buluma.tomcat

Sometimes it can be required to call one role from another role. I’d personally use import_role like this:

---
- name: do something
  debug:
    msg: "Some task"

- name: call another role
  import_role:
    name: role.name

If the role (role.name in this example) requires variables, you can set them in vars/main.yml, like so:

variable_x_for_role_name: foo
variable_y_for_role_name: bar
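As an alternative sketch, variables can also be passed on the task that imports the role, using the vars keyword. The role and variable names are the same placeholders as above.

```yaml
---
- name: call another role with variables
  import_role:
    name: role.name
  vars:
    variable_x_for_role_name: foo
    variable_y_for_role_name: bar
```

This keeps the values next to the call, which can be easier to read when the role is imported only once.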

A real-life example: my buluma.artifactory role calls the buluma.service role to add a service. The code for the artifactory role contains:

# snippet
- name: create artifactory service
  import_role:
    name: buluma.service
# endsnippet

and the variables are set in [vars/main.yml](https://github.com/buluma/ansible-role-artifactory/blob/master/vars/main.yml):

service_list:
  - name: artifactory
    description: Start script for Artifactory
    start_command: "/bin/artifactory.sh start"
    stop_command: "/bin/artifactory.sh stop"
    type: forking
    status_pattern: artifactory

How to write and maintain many Ansible roles

It’s great to have many code nuggets around to help you set up an environment rapidly. Ansible roles are perfect to describe what you want to do on systems.

As soon as you start to write more roles, you start to develop a style and way of working. Here are the things I’ve learned managing many roles.

Use a skeleton for starting a new role

When you start to write a new role, you can start with pre-populated code:

ansible-galaxy init --role-skeleton=ansible-role-skeleton role_name

To explain what happens:

  • ansible-galaxy is a command. This may change to molecule in the future.
  • init tells ansible-galaxy to initialize a new role.
  • --role-skeleton=ansible-role-skeleton refers to a skeleton Ansible role. I use this repository.
  • role_name is the name of your new role. I use short names here, like nginx or postfix.

Use ansible-lint for quick feedback

Andrew has written a tool including many rules that help you write readable and consistent code.

There are times when I don’t agree with the rules, but the feedback is quickly processed.

There are also times where I initially think rules are useless, but after a while I’m convinced about the intent and change my code.

You can also describe your preferences and use ansible-lint to verify your code. Great for teams that need to agree on a style.
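Preferences can be stored in a .ansible-lint file in the root of the role. A minimal sketch, assuming you want to skip the line-length rule (204 in the ansible-lint versions of that time) and not lint the molecule directory:

```yaml
---
# .ansible-lint (illustrative preferences, adjust to what your team agrees on)
exclude_paths:
  - molecule/
skip_list:
  - '204'
```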

Use molecule on Travis to test

In my opinion the most important part of writing code is testing. I spend a lot of time on writing and executing tests. It helps you prove that certain scenarios work as intended.

Travis can help test your software. A typical commit takes some 30 to 45 minutes to test, but after that I know:

  1. It works on the platforms I want to support.
  2. When it works, the software is released to Galaxy.
  3. Pull requests are automatically tested.

It makes me less afraid of committing.
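A minimal .travis.yml to run Molecule could look like the sketch below. It assumes the docker driver and leaves out the release to Galaxy; the exact packages to install depend on your Molecule version.

```yaml
---
language: python

services:
  - docker

install:
  - pip install molecule docker

script:
  - molecule test
```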

Use versions or tags to release software

When I write some new functionality, I typically need a few iterations to make it work. Using GitHub releases helps me to capture (and release) a working version of a role.

You can play as much as you want in between releases, but when a release is done, the role should work.

Go forth and develop!

You can set up a machine yourself for developing Ansible roles. I’ve prepared a repository that may help.

The playbook in that repository looks something like this:

---
- name: setup an ansible development environment
  hosts: all
  become: yes
  gather_facts: no

  roles:
    - buluma.bootstrap
    - buluma.update
    - buluma.fail2ban
    - buluma.openssh
    - buluma.digitalocean_agent
    - buluma.common
    - buluma.users
    - buluma.postfix
    - buluma.docker
    - buluma.investigate
    - buluma.ansible
    - buluma.ansible_lint
    - buluma.buildtools
    - buluma.molecule
    - buluma.ara
    - buluma.ruby
    - buluma.travis

  tasks:
    - name: copy private key
      copy:
        src: id_rsa
        dest: /home/robertdb/.ssh/id_rsa
        mode: "0400"
        owner: robertdb
        group: robertdb

    - name: copy git configuration
      copy:
        src: gitconfig
        dest: /home/robertdb/.gitconfig

    - name: create repository_destination
      file:
        path: ""
        state: directory
        owner: robertdb
        group: robertdb

    - name: clone all roles
      git:
        repo: "/.git"
        dest: "/"
        accept_hostkey: yes
        key_file: /home/robertdb/.ssh/id_rsa
      with_items: ""
      become_user: robertdb

When is a role a role

Sometimes it’s not easy to see when Ansible code should be captured in an Ansible role, or when tasks can be used.

Here are some guidelines that help me decide when to write an Ansible role:

Don’t repeat yourself

When you start to see that you’re repeating blocks of code, it’s probably time to move those tasks into an Ansible role.

Repeating yourself may:

  • Introduce more errors
  • Be more difficult to maintain

Keep it simple

Over time Ansible roles tend to get more complex. Jeff Geerling tries to keep Ansible roles under 100 lines. That can be a challenge, but I agree with Jeff.

Whenever I open up somebody else’s Ansible role and the code keeps on scrolling, I tend to get demotivated:

  • Where can you find the error/issue/bug?
  • How can this be maintained?
  • There is probably no easy way to test this.
  • The code does many things and misses focus.

Cleanup your playbook

Another reason to put code in Ansible roles, is to keep your playbook easy to read. A long list of tasks is harder to read than a list of roles.

Take a look at this example:

- name: build the backend server
  hosts: backend
  become: yes
  gather_facts: no

  roles:
    - buluma.bootstrap
    - buluma.update
    - buluma.common
    - buluma.python_pip
    - buluma.php
    - buluma.mysql
    - buluma.phpmyadmin

This code is simple to read; anybody can understand what it does.

When input is required

Some roles have variables to change the installation. Imagine this variable:

httpd_port: 80

The role can assert variables, for example:

- name: test input
  assert:
    that:
      - httpd_port <= 65535
      - httpd_port >= 1
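A slightly stricter sketch also checks that the variable is set and is a number, and tells the user what went wrong. The message text is just an example.

```yaml
---
- name: test input
  assert:
    that:
      - httpd_port is defined
      - httpd_port is number
      - httpd_port >= 1
      - httpd_port <= 65535
    msg: "httpd_port must be a number between 1 and 65535"
```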

Check yourself

To verify that you’ve made the right decision:

Could you publish this role?

That means you did not put data in the role, except sane defaults.

Would anybody else be helped with your role?

That means you thought about the interface (defaults/main.yml).

Is there a simple way to test your role?

That means the role is focused and can do just a few things.

Was it easy to think of the title?

That means you knew what you were building.

Conclusion

Hope this helps you decide when a role is a role.

Testing CVE 2018-19788 with Ansible

So a very simple exploit on polkit has been found. There is no solution so far.

To test if your system is vulnerable, you can run this Ansible role.

A simple playbook that includes a few roles:

---
- name: test cve 2018 19788
  hosts: all
  gather_facts: no
  become: yes

  roles:
    - buluma.bootstrap
    - buluma.update
    - buluma.cve_2018_19788

And a piece of altered-for-readability code from the role:

- name: create a user
  user:
    name: cve_2018_19788
    uid: 2147483659

- name: execute a systemctl command as root
  service:
    name: chronyd
    state: started

In my tests these were the results: (snipped, only kept the interesting part)

TASK [ansible-role-cve_2018_19788 : test if user can manage service] ***********
    ok: [cve-2018-19788-debian] => {
        "changed": false, 
        "msg": "All assertions passed"
    }
    fatal: [cve-2018-19788-ubuntu-16]: FAILED! => {
        "assertion": "not execute_user.changed", 
        "changed": false, 
        "evaluated_to": false, 
        "msg": "users can manage services"
    }
    ...ignoring
    fatal: [cve-2018-19788-ubuntu-18]: FAILED! => {
        "assertion": "not execute_user.changed", 
        "changed": false, 
        "evaluated_to": false, 
        "msg": "users can manage services"
    }
    ...ignoring
    fatal: [cve-2018-19788-ubuntu-17]: FAILED! => {
        "assertion": "not execute_user.changed", 
        "changed": false, 
        "evaluated_to": false, 
        "msg": "users can manage services"
    }
    ...ignoring
    fatal: [cve-2018-19788-fedora]: FAILED! => {
        "assertion": "not execute_user.changed", 
        "changed": false, 
        "evaluated_to": false, 
        "msg": "users can manage services"
    }
    ...ignoring
    fatal: [cve-2018-19788-centos-7]: FAILED! => {
        "assertion": "not execute_user.changed", 
        "changed": false, 
        "evaluated_to": false, 
        "msg": "users can manage services"
    }
    ...ignoring
    ok: [cve-2018-19788-centos-6] => {
        "changed": false, 
        "msg": "All assertions passed"
    }

So for now these distributions seem vulnerable, even after an update:

  • Ubuntu 16
  • Ubuntu 17
  • Ubuntu 18
  • Fedora 28
  • Fedora 29
  • CentOS 7

Ansible on Fedora 30.

Fedora 30 (currently under development as rawhide) does not have python2-dnf anymore.

The Ansible module dnf tries to install python2-dnf if it is running in a Python 2 environment. It took me quite some time to figure out why this error appeared:

fatal: [bootstrap-fedora-rawhide]: FAILED! => {"attempts": 10, "changed": true, "msg": "non-zero return code", "rc": 1, "stderr": "Error: Unable to find a match\n", "stderr_lines": ["Error: Unable to find a match"], "stdout": "Last metadata expiration check: 0:01:33 ago on Thu Nov 29 20:16:32 2018.\nNo match for argument: python2-dnf\n", "stdout_lines": ["Last metadata expiration check: 0:01:33 ago on Thu Nov 29 20:16:32 2018.", "No match for argument: python2-dnf"]}

(I was not trying to install python2-dnf, so confusion…)

Hm; so I’ve tried these options to work around the problem:

  • Use alternatives to set /usr/bin/python to /usr/bin/python3. Does not work, the Ansible module dnf will still try to install python2-dnf.
  • Set ansible_python_interpreter for Fedora-30 hosts. Does not work, my bootstrap role does not have any facts, it does not know about ansible_distribution (Fedora), nor ansible_distribution_major_version (30).

So far the only reasonable option is to set ansible_python_interpreter as documented by Ansible.

provisioner:
  name: ansible
  inventory:
    group_vars:
      all:
        ansible_python_interpreter: /usr/bin/python3

This means all roles that use distributions that:

  • use dnf
  • don’t have python2-dnf

will need to be modified… Quite a change.

2 December 2018 update: I’ve created pull request 49202 to fix issue 49362.

TL;DR On Fedora 30 (and higher) you have to set ansible_python_interpreter to /usr/bin/python3.

Ansible Molecule testing on EC2

Molecule is great to test Ansible roles, but testing locally has its limitations:

  • Docker - Not everything is possible in Docker, like starting services, rebooting or working with block devices.
  • Vagrant - Nearly everything is possible, but it’s resource intensive, making testing slow.

I use my bus-ride time to develop Ansible Roles and the internet connection is limited, which means a lot of waiting. Using AWS EC2 would solve a lot of problems for me.

Here is how to add an EC2 scenario to an existing role.

Save AWS credentials

Edit ~/.aws/credentials using information downloaded from the AWS Console.

[default]
aws_access_key_id=ABC123
aws_secret_access_key=ABC123

Install extra software

On the node where you initiate the tests, a few extra pip modules are required.

pip install boto boto3

Add a scenario

If you already have a role and want to add a single scenario:

cd ansible-role-your-role
molecule init scenario --driver-name ec2 --role-name ansible-role-your-role --scenario-name ec2

Start testing

And simply start testing in a certain region.

export EC2_REGION=eu-central-1
molecule test --scenario-name ec2

The molecule.yml should look something like this:

---
dependency:
  name: galaxy
driver:
  name: ec2
lint:
  name: yamllint
platforms:
  - name: rhel-7
    image: ami-c86c3f23
    instance_type: t2.micro
    vpc_subnet_id: subnet-0e688067
  - name: sles-15
    image: ami-0a1886cf45f944eb1
    instance_type: t2.micro
    vpc_subnet_id: subnet-0e688067
  - name: amazon-linux-2
    image: ami-02ea8f348fa28c108
    instance_type: t2.micro
    vpc_subnet_id: subnet-0e688067
provisioner:
  name: ansible
  lint:
    name: ansible-lint
scenario:
  name: ec2

Weirdness

It feels as if the ec2 driver has had a little less attention than, for example, the vagrant or docker driver. Here are some strange things:

  • The region needs to be set using an environment variable, the credentials from a file. This may be my mistake, but now it’s a little strange. It would feel more logical to add region: to the platform section.
  • The vpc_subnet_id should be found by the user and put into molecule.yml.

Molecule and ARA

To test playbooks, molecule is really great. And since Ansible Fest 2018 (Austin, Texas) clearly communicated that Molecule will be a part of Ansible, I guess it’s safe to say that retr0h’s tool will be here to stay.

When testing, it’s even nicer to have great reports. That’s where ARA comes in. ARA collects job output as a callback_plugin, saves it and is able to display it.

Here is how to set it up.

Install molecule

pip install molecule

Install ara

pip install ara

Start ara

ara-manage runserver

Configure molecule to use ara

Edit molecule.yml, under provisioner:

provisioner:
  name: ansible
  config_options:
    defaults:
      callback_plugins: /usr/lib/python2.7/site-packages/ara/plugins/callbacks

Now point your browser to http://localhost:9191/ and run a molecule test:

molecule test

[204] Lines should be no longer than 120 chars

It seems Galaxy is going to use galaxy-lint-rules to star roles. One of the controls tests the length of the lines. Here are a few ways to pass those rules.

Spread over lines

In YAML you can use multi line to spread long lines.

Without new lines

The > character replaces newlines by spaces.

- name: demonstrate something
  debug:
    msg: >
      This will just
      be a single long
      line.

With new lines

The | character keeps newlines.

- name: demonstrate something
  debug:
    msg: |
      The following lines
      will be spread over
      multiple lines.
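The > style also works for long commands. A sketch with a made-up download command; the URL and paths are placeholders:

```yaml
---
- name: download an archive
  command: >
    curl --silent --location
    --output /tmp/example.tar.gz
    https://example.com/downloads/example.tar.gz
```

Ansible receives this as one line, so the task behaves the same as the single-line version while each line stays well under 120 characters.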

Move long lines to vars

Sometimes variables can get very long. You can save a longer variable in a shorter one.

For example, too long would be this task in main.yml:

- name: unarchive zabbix schema
  command: gunzip /usr/share/doc/zabbix-server-{{ zabbix_server_type }}-{{ zabbix_version_major }}.{{ zabbix_version_minor }}/create.sql.gz

Copy-paste that command to vars/main.yml:

gunzip_command: "gunzip /usr/share/doc/zabbix-server-{{ zabbix_server_type }}-{{ zabbix_version_major }}.{{ zabbix_version_minor }}/create.sql.gz"

And change main.yml to simply:

- name: unarchive zabbix schema
  command: "{{ gunzip_command }}"

Conclusion

Yes it’s annoying to have a limitation like this, but it does make the code more readable and it’s not difficult to change your roles to get 5 stars.

Ansible roles for clusters

Ansible can be used to configure clusters. It’s actually quite easy!

Typically a cluster has some master/primary/active node, where stuff needs to be done and other stuff needs to be done on the rest of the nodes.

Ansible can use run_once: yes on a task, which “automatically” selects a primary node. Take this example:

inventory:

host1
host2
host3

tasks/main.yml:

- name: do something on all nodes
  package:
    name: screen
    state: present

- name: select the master/primary/active node
  set_fact:
    master: ""
  run_once: yes

- name: do something to the master only
  command: id
  when:
    - inventory_hostname == master

- name: do something on the rest of the nodes
  command: id
  when:
    - inventory_hostname != master

It’s a simple and understandable solution. You can even tell Ansible that you would like to pin a master:

- name: select the master/primary/active node
  set_fact:
    master: ""
  run_once: yes
  when:
    - master is not defined

In the example above, a user who sets master somewhere (for example in the inventory) pins the master instead of relying on the “random” selection.
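To make this concrete, here is a sketch of pinning the master from the inventory. It assumes the selection task stores the inventory_hostname of the first host in master (the value is not shown in the task above), and the file name is only an example.

```yaml
---
# inventory/group_vars/all.yml (illustrative)
# Pin host2 as the master; the selection task is then skipped
# because master is already defined.
master: host2
```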

Hope it helps you!

Ansible Galaxy Lint

Galaxy currently is a dumping place for Ansible roles, anybody can submit any quality role there and it’s kept indefinitely.

For example Nginx is listed 1122 times. Happily Jeff Geerling’s role shows up among the top list, probably because it has the most downloads.

The Galaxy team has decided that checking for quality is one way to improve search results. It looks like roles will have a few criteria:

The rules are stored in galaxy-lint-rules. So far Andrew, House and Robert have contributed; feel free to propose new rules or improvements!

You can prepare your roles:

cd directory/to/save/the/rules
git clone https://github.com/ansible/galaxy-lint-rules.git
cd directory/to/your/role
ansible-lint -r directory/to/save/the/rules/galaxy-lint-rules/rules .

I’ve removed quite a few errors by using these rules:

  • Missing spaces after {{ or before }}.
  • Comparing booleans using == yes.
  • meta/main.yml mistakes
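The first two items look like this in practice. The task and variable names are made up for illustration:

```yaml
---
# Before: no spaces inside the braces and a boolean compared to yes.
- name: start the service
  service:
    name: "{{service_name}}"
    state: started
  when: service_enabled == yes

# After: spaces inside the braces and the bool filter.
- name: start the service
  service:
    name: "{{ service_name }}"
    state: started
  when: service_enabled | bool
```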

You can peek how your roles are scored on development.

Ansible 2.7

As announced, Ansible 2.7 is out. The changes look good; I’m testing my bootstrap role against it.

In 2.7 (actually since 2.3) package modules no longer need with_items: or loop:. This makes for simpler code.

- name: customize machine
  hosts: all

  vars:
    packages:
      - bash
      - screen
      - lsof

  tasks:
    - name: install packages
      package:
        name: "{{ packages }}"
        state: present

Wow, that’s simpler so better.

A reboot module has been introduced. Rebooting in Ansible is not easy, so this could make life much simpler.
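A minimal sketch of using it; the timeout value is just an example:

```yaml
---
- name: reboot the machine and wait for it to come back
  reboot:
    reboot_timeout: 600
```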