Getting Started with Ansible for Network Automation
The hardest part of network automation isn't the tooling — it's trusting a script with a box you'd normally handle like live ordnance. Here's the path I took from ad-hoc SSH sessions to playbooks running against production, and the lessons that came with it.
Why Ansible for networks
Most network automation journeys start the same way mine did: a task so repetitive it hurts. Mine was compliance changes across a fleet of switches — the same six lines of config, dozens of devices, a change window, and a human typing into each one. Every device a chance for a typo; every typo a potential incident.
Ansible fits networking well for three practical reasons: it's agentless (talks SSH/API to devices, nothing to install on a switch), it's declarative-ish (you describe state, modules work out the commands), and the inventory model maps naturally onto how network teams already think — sites, roles, device groups.
The minimum viable setup
Three files get you from zero to a working automation. First the inventory:
all:
children:
edinburgh_switches:
hosts:
edi-leaf-01:
ansible_host: 10.10.0.11
edi-leaf-02:
ansible_host: 10.10.0.12
vars:
ansible_network_os: arista.eos.eos
ansible_connection: ansible.netcommon.network_cli
ansible_user: automation
Install the vendor collection (ansible-galaxy collection install arista.eos — or
cisco.ios, junipernetworks.junos, etc.), then write the first playbook.
Make it read-only:
---
- name: Back up running configs
hosts: edinburgh_switches
gather_facts: false
tasks:
- name: Pull running config
arista.eos.eos_config:
backup: true
backup_options:
dir_path: ./backups
filename: "{{ inventory_hostname }}.cfg"
Run it nightly from cron or CI and you've already built something valuable: point-in-time config history for the whole estate, before you've automated a single change.
Your first change — and the two flags that make it safe
---
- name: Standardise NTP
hosts: edinburgh_switches
gather_facts: false
tasks:
- name: Ensure NTP servers are configured
arista.eos.eos_config:
lines:
- ntp server 10.0.0.100 prefer
- ntp server 10.0.0.101
save_when: changed
Before this ever touches a device, run it with --check --diff. Check mode asks the device
what would change without changing it; diff shows you the exact lines. This pair of flags is
the single biggest trust-builder for a network engineer starting out — you get the dry-run
discipline of a change request, executed in seconds.
Idempotency is the point. Run the playbook twice; the second run should report
changed: 0. Once changes are idempotent, "config drift" stops being an audit finding
and becomes something you simply re-run away.
Lessons from production rollouts
Start read-only, earn write access
Backups, fact gathering, compliance reports. You build the inventory, credentials handling and muscle memory with zero blast radius — and you accumulate evidence that convinces the change board when you propose the first automated write.
Git is not optional
Playbooks, inventory, and generated configs live in a repository. The moment two engineers edit the same playbook, or you need to answer "what changed last Tuesday?", version control is the difference between automation and a shared folder of scripts.
Vault your credentials from day one
No plaintext passwords in inventory — ever. ansible-vault is built in and takes ten
minutes to adopt. Retro-fitting secrets hygiene after credentials have leaked into a repo's
history is a far worse afternoon.
Limit blast radius mechanically
Use --limit to run against one device, then a site, then the estate. Wire
serial: 1 into playbooks that touch anything critical so a bad change stops at the
first device, not the fiftieth. Structure your first production runs like you'd structure a
manual change: canary, verify, proceed.
Don't automate what you don't understand
Ansible executes your intent faster — including bad intent. If you can't articulate exactly what a change does on one device by hand, you're not ready to run it against a hundred. Automation amplifies competence and incompetence equally.
Where this leads
Ad-hoc playbooks are the gateway. The mature end state is network-as-code: your intended network described in YAML data models, configs generated and validated in CI, devices converged to the model rather than patched by hand. In the Arista world that's AVD; similar patterns exist for every major vendor. But every team I've seen get there started exactly where this post starts — one engineer, one inventory file, one backup playbook.
Summary
- Start with read-only playbooks: backups and facts, zero risk, immediate value
--check --diffbefore every write; idempotency as the acceptance test- Git for everything, ansible-vault for secrets,
--limitandserialfor blast radius - Automate only what you can do confidently by hand
- The destination is network-as-code — but the first playbook is the step that matters