Automation · July 2026 · 8 min read

Getting Started with Ansible for Network Automation

The hardest part of network automation isn't the tooling — it's trusting a script with a box you'd normally handle like live ordnance. Here's the path I took from ad-hoc SSH sessions to playbooks running against production, and the lessons that came with it.

Why Ansible for networks

Most network automation journeys start the same way mine did: a task so repetitive it hurts. Mine was compliance changes across a fleet of switches — the same six lines of config, dozens of devices, a change window, and a human typing into each one. Every device a chance for a typo; every typo a potential incident.

Ansible fits networking well for three practical reasons: it's agentless (talks SSH/API to devices, nothing to install on a switch), it's declarative-ish (you describe state, modules work out the commands), and the inventory model maps naturally onto how network teams already think — sites, roles, device groups.
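
To make the inventory point concrete, here is a minimal sketch of how sites and roles can be expressed as overlapping groups. The group names and the spine hostname are hypothetical, chosen only for illustration:

```yaml
# Hypothetical inventory layout: one site group, one role group.
all:
  children:
    edinburgh:                 # site
      children:
        edinburgh_leafs:
          hosts:
            edi-leaf-01:
            edi-leaf-02:
        edinburgh_spines:
          hosts:
            edi-spine-01:      # illustrative hostname
    leafs:                     # role, reusing the group defined above
      children:
        edinburgh_leafs:
```

With a layout like this, a host pattern such as 'edinburgh:&leafs' selects only the leaf switches at one site.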

The minimum viable setup

Three files get you from zero to working automation. First the inventory:

inventory.yml
all:
  children:
    edinburgh_switches:
      hosts:
        edi-leaf-01:
          ansible_host: 10.10.0.11
        edi-leaf-02:
          ansible_host: 10.10.0.12
      vars:
        ansible_network_os: arista.eos.eos
        ansible_connection: ansible.netcommon.network_cli
        ansible_user: automation

Install the vendor collection (ansible-galaxy collection install arista.eos — or cisco.ios, junipernetworks.junos, etc.), then write the first playbook. Make it read-only:

backup.yml — the safest possible first playbook
---
- name: Back up running configs
  hosts: edinburgh_switches
  gather_facts: false
  tasks:
    - name: Pull running config
      arista.eos.eos_config:
        backup: true
        backup_options:
          dir_path: ./backups
          filename: "{{ inventory_hostname }}.cfg"

Run it nightly from cron or CI and you've already built something valuable: point-in-time config history for the whole estate, before you've automated a single change.
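
As one possible shape for the CI option, here is a sketch of a scheduled job. It assumes GitHub Actions and a self-hosted runner that can reach the management network; neither is something the setup above requires, and credential handling is deliberately left out:

```yaml
# Hypothetical workflow file: .github/workflows/nightly-backup.yml
name: nightly-config-backup
on:
  schedule:
    - cron: "0 2 * * *"        # 02:00 UTC every day
jobs:
  backup:
    runs-on: self-hosted       # assumption: runner sits inside the management network
    steps:
      - uses: actions/checkout@v4
      - name: Install Ansible and the Arista collection
        run: |
          python3 -m pip install ansible paramiko
          ansible-galaxy collection install arista.eos
      - name: Run the read-only backup playbook
        run: ansible-playbook -i inventory.yml backup.yml
```

A plain cron entry calling the same ansible-playbook command does the same job. The CI route mostly buys you run history and failure notifications.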

Your first change — and the two flags that make it safe

ntp.yml — enforcing standard NTP servers
---
- name: Standardise NTP
  hosts: edinburgh_switches
  gather_facts: false
  tasks:
    - name: Ensure NTP servers are configured
      arista.eos.eos_config:
        lines:
          - ntp server 10.0.0.100 prefer
          - ntp server 10.0.0.101
        save_when: changed

Before this ever touches a device, run it with --check --diff. In check mode the module reads the running config and works out what it would change without sending any configuration; diff shows you the exact lines. This pair of flags is the single biggest trust-builder for a network engineer starting out: you get the dry-run discipline of a change request, executed in seconds.
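
The same dry-run behaviour can also be pinned inside a playbook using the check_mode and diff play keywords, which is handy for a copy that must never write. A minimal sketch, assuming a separate file (the name ntp-dryrun.yml is hypothetical):

```yaml
---
# ntp-dryrun.yml (hypothetical): always previews, never pushes config
- name: Preview NTP standardisation
  hosts: edinburgh_switches
  gather_facts: false
  check_mode: true   # same effect as passing --check
  diff: true         # same effect as passing --diff
  tasks:
    - name: Show which NTP lines would be added
      arista.eos.eos_config:
        lines:
          - ntp server 10.0.0.100 prefer
          - ntp server 10.0.0.101
```

From the command line, the equivalent for the original playbook is ansible-playbook -i inventory.yml ntp.yml --check --diff.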

Idempotency is the point. Run the playbook twice; the second run should report changed=0 for every host in the play recap. Once changes are idempotent, "config drift" stops being an audit finding and becomes something you simply re-run away.
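
Idempotency also gives you a cheap drift detector: run the task in check mode and fail if it would have changed anything. This is a sketch rather than a full compliance framework, and the variable name ntp_check is an arbitrary choice:

```yaml
---
- name: Detect NTP drift without changing anything
  hosts: edinburgh_switches
  gather_facts: false
  tasks:
    - name: Compare declared NTP lines with the running config
      arista.eos.eos_config:
        lines:
          - ntp server 10.0.0.100 prefer
          - ntp server 10.0.0.101
      check_mode: true          # force a dry run for this task only
      register: ntp_check

    - name: Fail if the device has drifted from the standard
      ansible.builtin.assert:
        that:
          - not ntp_check.changed
        fail_msg: "NTP config on {{ inventory_hostname }} differs from the standard"
```

Bear in mind that the comparison is textual. If a device stores a line in a different form from the one you declared, it will show up as drift on every run.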

Lessons from production rollouts

Start read-only, earn write access

Backups, fact gathering, compliance reports. You build the inventory, credentials handling and muscle memory with zero blast radius — and you accumulate evidence that convinces the change board when you propose the first automated write.
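
A read-only fact-gathering play can be as small as the sketch below. It uses the eos_facts module from the same collection and only reads from the device:

```yaml
---
- name: Collect read-only facts
  hosts: edinburgh_switches
  gather_facts: false
  tasks:
    - name: Gather the minimal fact subset
      arista.eos.eos_facts:
        gather_subset: min

    - name: Report the software version per device
      ansible.builtin.debug:
        msg: "{{ inventory_hostname }} runs EOS {{ ansible_net_version }}"
```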

Git is not optional

Playbooks, inventory, and generated configs live in a repository. The moment two engineers edit the same playbook, or you need to answer "what changed last Tuesday?", version control is the difference between automation and a shared folder of scripts.

Vault your credentials from day one

No plaintext passwords in inventory, ever. ansible-vault is built in and takes ten minutes to adopt. Retrofitting secrets hygiene after credentials have leaked into a repo's history is a far worse afternoon.
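
One common way to lay this out is a plaintext vars file that points at a vaulted variable, plus a second file that holds the secret and gets encrypted. The file paths and the vault_ prefix below are conventions assumed for illustration, not requirements:

```yaml
# File 1: group_vars/edinburgh_switches/vars.yml (plaintext, readable in diffs)
ansible_password: "{{ vault_ansible_password }}"

---
# File 2: group_vars/edinburgh_switches/vault.yml (shown BEFORE encryption)
# Encrypt it with: ansible-vault encrypt group_vars/edinburgh_switches/vault.yml
vault_ansible_password: "replace-me"   # placeholder, not a real credential
```

Playbooks then run with --ask-vault-pass or --vault-password-file, and only the encrypted form of the second file is ever committed.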

Limit blast radius mechanically

Use --limit to run against one device, then a site, then the estate. Wire serial: 1 into playbooks that touch anything critical so a bad change stops at the first device, not the fiftieth. Structure your first production runs like you'd structure a manual change: canary, verify, proceed.
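
Here is a sketch of the NTP playbook with those guard rails wired in. The verification command is an assumption about what you would want to eyeball; swap in whatever check matches your own manual procedure:

```yaml
---
- name: Standardise NTP, one device at a time
  hosts: edinburgh_switches
  gather_facts: false
  serial: 1                 # batch size of one device
  max_fail_percentage: 0    # any failure in a batch aborts the play
  tasks:
    - name: Ensure NTP servers are configured
      arista.eos.eos_config:
        lines:
          - ntp server 10.0.0.100 prefer
          - ntp server 10.0.0.101
        save_when: changed

    - name: Capture NTP state after the change
      arista.eos.eos_command:
        commands:
          - show ntp associations
      register: ntp_state

    - name: Show NTP state before moving to the next device
      ansible.builtin.debug:
        var: ntp_state.stdout_lines
```

For the canary step, ansible-playbook -i inventory.yml ntp.yml --limit edi-leaf-01 restricts the same playbook to a single switch.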

Don't automate what you don't understand

Ansible executes your intent faster — including bad intent. If you can't articulate exactly what a change does on one device by hand, you're not ready to run it against a hundred. Automation amplifies competence and incompetence equally.

Where this leads

Ad-hoc playbooks are the gateway. The mature end state is network-as-code: your intended network described in YAML data models, configs generated and validated in CI, devices converged to the model rather than patched by hand. In the Arista world that's AVD; similar patterns exist for every major vendor. But every team I've seen get there started exactly where this post starts — one engineer, one inventory file, one backup playbook.
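
As a small taste of the data-model idea, the NTP servers can be moved out of the task and into structured data, with the config lines generated from it. The ntp_servers structure below is a made-up illustration and is not AVD's schema:

```yaml
---
- name: Standardise NTP from a data model
  hosts: edinburgh_switches
  gather_facts: false
  vars:
    # Hypothetical data model; in practice this would live in group_vars
    ntp_servers:
      - address: 10.0.0.100
        prefer: true
      - address: 10.0.0.101
  tasks:
    - name: Ensure each modelled NTP server is configured
      arista.eos.eos_config:
        lines:
          - "ntp server {{ item.address }}{{ ' prefer' if item.prefer | default(false) else '' }}"
        save_when: changed
      loop: "{{ ntp_servers }}"
```

Changing the NTP estate then becomes a reviewed edit to data rather than an edit to a playbook.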

Summary