Quick HOWTO : Ch1 : Network Backups With Rancid

From the Ubuntu Chinese wiki


Introduction

One of the most commonly overlooked aspects of network management is backing up network device configurations. Sadly, it is only viewed as a priority after disaster strikes. Fortunately, there is a Linux / Unix open source package called Rancid that can get the job done automatically for most devices that have a command-line method of configuration.

The product can be downloaded from the rancid website and has the added advantage of automatically archiving the older configuration versions in a Concurrent Versions System (CVS). This tutorial will show you how to quickly install and configure it for your network backup needs.

Rancid Installation

Under Fedora Linux, installation is relatively easy, but there are a large number of simple steps to follow. Let's begin:

1) Rancid uses the expect programming language to operate, so you will have to install it in advance. Use the rpm command with the -q qualifier to determine whether you have expect installed. In this case it isn't, so the yum command is used to install it.

[root@bigboy tmp]# rpm -q expect
package expect is not installed
[root@bigboy tmp]# yum -y install expect
Repository updates-released already added, not adding again
Repository base already added, not adding again
Setting up Install Process
...
...
...
[root@bigboy tmp]#

2) Create a Linux group named netadm which will eventually have access to the Rancid directory.

[root@bigboy tmp]# groupadd netadm

3) Create a user named rancid that will be used to run the network device backups every night. Here we make rancid a member of the netadm group and make /usr/local/rancid its home directory.

[root@bigboy tmp]# useradd -g netadm -c "Networking Backups" -d /usr/local/rancid rancid

4) Create a directory called /usr/local/rancid/tar and use the wget command to get the latest version of the Rancid tar file from its web site.

[root@bigboy tmp]# mkdir /usr/local/rancid/tar
[root@bigboy tmp]# cd /usr/local/rancid/tar
[root@bigboy tar]# wget ftp://ftp.shrubbery.net/pub/rancid/rancid-2.3.2a2.tar.gz
--01:14:26--   ftp://ftp.shrubbery.net/pub/rancid/rancid-2.3.2a2.tar.gz
                     => `rancid-2.3.2a2.tar.gz'
...
...
...
100%[==============================>] 280,435           153.28K/s
 
01:14:58 (152.78 KB/s) - `rancid-2.3.2a2.tar.gz' saved [280,435]
[root@bigboy tar]#

5) Rancid needs to be compiled, and as a pre-compilation step you will need to extract the files from the Rancid tar file. In this case the file is named rancid-2.3.2a2.tar.gz, so the extraction process will place all the preliminary files in a directory named rancid-2.3.2a2.

[root@bigboy tar]# tar -xvzf rancid-2.3.2a2.tar.gz
rancid-2.3.2a2/bin/Makefile.am
rancid-2.3.2a2/bin/Makefile.in
rancid-2.3.2a2/bin/alogin.in
...
...
...
rancid-2.3.2a2/man/lg.conf.5.in
rancid-2.3.2a2/man/rancid.conf.5.in
rancid-2.3.2a2/man/lg_intro.1.in
[root@bigboy tar]#

6) Enter the directory.

[root@bigboy tar]# cd rancid-2.3.2a2
[root@bigboy rancid-2.3.2a2]#

7) In this directory there is a README file with instructions on what to do next. You can view it using the less command to see the various configuration options offered. We will proceed in this example by using a very simple scenario.

[root@bigboy rancid-2.3.2a2]# less README

8) Prepare the Rancid package for compiling with the configure command. Here, the --prefix switch is used to set the default directory to match the /usr/local/rancid/ home directory of our rancid user.


[root@bigboy rancid-2.3.2a2]# ./configure --prefix=/usr/local/rancid/
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for gawk... gawk
...
...
...
config.status: creating include/config.h
config.status: include/config.h is unchanged
config.status: executing depfiles commands
[root@bigboy rancid-2.3.2a2]#

9) Install the package with the make command.

[root@bigboy rancid-2.3.2a2]# make install
Making install in .
gmake[1]: Entering directory `/usr/local/rancid/tar/rancid-2.3.2a2'
gmake[2]: Entering directory `/usr/local/rancid/tar/rancid-2.3.2a2'
gmake[2]: Nothing to be done for `install-exec-am'.
test -z "/usr/local/rancid//share/rancid" || mkdir -p -- "/usr/local/rancid//share/rancid"
...
...
...
/usr/bin/install -c 'downreport' '/usr/local/rancid//share/rancid/downreport'
gmake[2]: Leaving directory `/usr/local/rancid/tar/rancid-2.3.2a2/share'
gmake[1]: Leaving directory `/usr/local/rancid/tar/rancid-2.3.2a2/share'
[root@bigboy rancid-2.3.2a2]#

10) There is a sample password file named cloginrc.sample. You'll need to copy it to the /usr/local/rancid/ home directory as the hidden file /usr/local/rancid/.cloginrc.

[root@bigboy rancid-2.3.2a2]# cp cloginrc.sample /usr/local/rancid/.cloginrc
[root@bigboy rancid-2.3.2a2]#

11) Finally you will need to set the .cloginrc file permissions to be readable by the rancid user and the new netadm Linux group. You will also have to change the ownership and permissions of the home directory in a similar fashion.

[root@bigboy rancid-2.3.2a2]# chmod 0640 /usr/local/rancid/.cloginrc
[root@bigboy rancid-2.3.2a2]# chown -R rancid:netadm /usr/local/rancid/
[root@bigboy rancid-2.3.2a2]# chmod 770 /usr/local/rancid/
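The permissions set in the last step are easy to get wrong, so it can help to verify them afterwards. The following is a minimal sketch, not part of Rancid itself: check_mode is a hypothetical helper name, and it assumes the GNU coreutils stat command that ships with Fedora.

```shell
#!/bin/sh
# Sketch: report whether a file's permission bits match an expected octal mode.
# check_mode is a hypothetical helper, not a Rancid command.
# Assumes GNU coreutils "stat" (the -c option is not portable to BSD stat).
check_mode() {
    file=$1
    want=$2
    have=$(stat -c '%a' "$file") || return 2
    if [ "$have" = "$want" ]; then
        echo "OK: $file is mode $have"
    else
        echo "WARNING: $file is mode $have, expected $want"
        return 1
    fi
}

# Example, using the paths from the steps above:
#   check_mode /usr/local/rancid/.cloginrc 640
#   check_mode /usr/local/rancid 770
```

The function only reads file metadata, so it is safe to run as any user who can see the files.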

Now that the installation is complete, you'll need to do some initial configuration to get Rancid to work. Don't worry, it is fairly straightforward.


Initial Rancid Configuration

Initial configuration involves setting up Rancid to periodically back up your configurations and email status reports to the necessary users.

1) The rancid.conf file is used to determine where rancid stores its configurations and other general parameters. We'll need to edit it.

[root@bigboy rancid-2.3.2a2]# vi /usr/local/rancid/etc/rancid.conf

In this example, we'll create a Rancid device group called "networking". All files related to this group will be stored in a sub-directory of the same name under the var sub-directory of the Rancid home directory. In other words, /usr/local/rancid/var/networking.

By default Rancid filters out passwords and SNMP community strings. You may want to set the FILTER_PWDS and NOCOMMSTR variables to "NO" to prevent this.

#
# Sample rancid.conf
#
LIST_OF_GROUPS="networking"
FILTER_PWDS=NO; export FILTER_PWDS
NOCOMMSTR=NO; export NOCOMMSTR

2) Rancid will send status emails to mailing lists defined in the /etc/aliases file. The "networking" Rancid group will need to have groups named rancid-admin-networking and rancid-networking. A Rancid group named "alldevices" would have groups named rancid-admin-alldevices and rancid-alldevices.

In this example, the emails go to the noc mailing list made up of the addresses [email protected] and [email protected].

#
# Sample /etc/aliases
#

#
# Rancid email addresses
#
rancid-admin-networking:                 rancid-networking
rancid-networking:                       noc
noc:                                     [email protected]

3) The email aliases then need to be added to the sendmail alias database with the newaliases command.

[root@bigboy rancid-2.3.2a2]# newaliases
/etc/aliases: 82 aliases, longest 80 bytes, 983 bytes total
[root@bigboy rancid-2.3.2a2]#

4) The next few steps need to be done as the rancid user. Use the su command to become the rancid user.

[root@bigboy rancid-2.3.2a2]# su - rancid

5) The rancid-cvs command needs to be used to create the /usr/local/rancid/var/networking directory and its associated database and network device list files.

[rancid@bigboy ~]$ /usr/local/rancid/bin/rancid-cvs
No conflicts created by this import
cvs checkout: Updating networking
cvs checkout: Updating networking/configs
cvs add: scheduling file `router.db' for addition
cvs add: use 'cvs commit' to add this file permanently
RCS file: /usr/local/rancid//var/CVS/networking/router.db,v
done
Checking in router.db;
/usr/local/rancid//var/CVS/networking/router.db,v   <--   router.db
initial revision: 1.1
done
[rancid@bigboy ~]$

6) The README file will be useful, so copy it to the home directory before deleting the rancid sub-directory under the tar sub-directory.

[rancid@bigboy ~]$ cp tar/rancid-2.3.2a2/README .
[rancid@bigboy ~]$ rm -rf tar/rancid-2.3.2a2
[rancid@bigboy ~]$

7) Now edit the rancid user's crontab file to schedule regular backups using the /usr/local/rancid/bin/rancid-run file.

[rancid@bigboy ~]$ crontab -e

#
# Rancid user's crontab file
#

# Run config differ hourly
1 * * * * /usr/local/rancid/bin/rancid-run

# Clean out config differ logs
50 23 * * * /usr/bin/find /usr/local/rancid/var/logs -type f -mtime +2 -exec rm {} \;
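Before trusting a cron job that deletes files, it can be useful to preview what it would remove. The sketch below reuses the same find tests as the cleanup entry above but prints the matches instead of deleting them. The old_logs name is a hypothetical helper introduced here for illustration.

```shell
#!/bin/sh
# Sketch: list the log files the nightly cleanup job would delete,
# without deleting anything. old_logs is a hypothetical helper.
#   $1 = log directory
#   $2 = age threshold in days (the cron entry above uses 2)
old_logs() {
    # -mtime +N matches files last modified more than N full days ago
    find "$1" -type f -mtime +"$2" -print
}

# Example:
#   old_logs /usr/local/rancid/var/logs 2
```

Note that find rounds file ages down to whole days, so -mtime +2 only matches files that are at least three days old.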

The Rancid network device list and password files will now have to be edited before your configurations can be backed up, but first, let's review the most important file locations.

Rancid File Locations

Table 1-1 shows a list of important rancid file locations based on the configuration steps we've done so far. The following sections will review the most important ones in more detail.

Table 1-1 : Rancid File Locations

Location                                    Description
/usr/local/rancid                           Base Rancid directory location
/usr/local/rancid/var/logs                  Location of the Rancid backup log files. You can trace backup failures here.
/usr/local/rancid/bin                       Location of the executables
/usr/local/rancid/var/networking/configs    Backup location of all the configurations
/usr/local/rancid/var/networking/router.db  List of all devices that need to be backed up
/usr/local/rancid/tar                       Location of the original Rancid tar files
/usr/local/rancid/README                    General help file
/usr/local/rancid/.cloginrc                 Password file
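As a quick sanity check after installation, the locations in Table 1-1 can be tested for existence. This is only a sketch under the directory layout used in this tutorial: missing_paths is a hypothetical helper, and the list of paths is taken from the table rather than from Rancid itself.

```shell
#!/bin/sh
# Sketch: print each expected Rancid location that does not exist.
# missing_paths is a hypothetical helper, not a Rancid command.
#   $1 = Rancid base directory
#   $2 = Rancid device group name
missing_paths() {
    base=$1
    group=$2
    for p in bin var/logs "var/$group/configs" "var/$group/router.db" .cloginrc; do
        [ -e "$base/$p" ] || echo "$base/$p"
    done
    return 0
}

# Example, for the layout used in this tutorial:
#   missing_paths /usr/local/rancid networking
```

No output means every listed location was found.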


The Rancid router.db file

The router.db file is the device list rancid uses to do its backups. It has the format:

dns-name-or-ip-address:device-type:status

Here dns-name-or-ip-address is the hostname or IP address of the device, device-type is the type of operating system the device is expected to be running, and status (which can be up or down) determines whether the device should be backed up or not. This example is for a Cisco device with an IP address of 192.168.1.1.

192.168.1.1:cisco:up

Note: According to the Rancid help pages, "a '#' at the beginning of a line is considered as a comment and the entire line is ignored. If a device is deleted from the router.db file, then Rancid will clean up by removing the device's configuration file from the /usr/local/rancid/var/networking/configs directory. The CVS information for the device will be moved to the CVS Attic directory (using cvs delete)."
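To make the colon-separated format concrete, here is a minimal sketch that lists the devices Rancid would try to back up, that is, the uncommented entries whose status is up. The up_devices name is a hypothetical helper, and the sketch assumes the three-field format described above.

```shell
#!/bin/sh
# Sketch: print the devices in a router.db file that are marked "up".
# up_devices is a hypothetical helper, not a Rancid command.
#   $1 = path to a router.db file in name:type:status format
up_devices() {
    # Skip comment lines starting with '#', split the rest on ':',
    # and keep only entries whose third field is exactly "up".
    awk -F: '!/^#/ && $3 == "up" { print $1 " (" $2 ")" }' "$1"
}

# Example:
#   up_devices /usr/local/rancid/var/networking/router.db
```

For a file containing only the example entry above, this would print the address followed by the device type in parentheses.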

Table 1-2 shows some important device-types for the router.db file.


Table 1-2 : Various device types for Rancid

Device      Description
alteon      An Alteon WebOS switch.
baynet      A Bay Networks router.
cat5        A Cisco Catalyst 5000 or 4000 series switch (i.e., running the Catalyst OS, not IOS).
cisco       A Cisco router, PIX, or switch such as the 3500XL or 6000 running IOS (or IOS-like) OS.
css         A Cisco content services switch.
enterasys   An Enterasys NAS. This is currently an alias for the riverstone device type.
erx         A Juniper E-series edge router.
extreme     An Extreme switch.
ezt3        An ADC-Kentrox EZ-T3 mux.
force10     A Force10 router.
foundry     A Foundry router, switch, or router-switch. This includes HP Procurve switches that are OEMs of Foundry products, such as the HP9304M.
hitachi     A Hitachi router.
hp          An HP Procurve switch such as the 2524 or 4108. Also see the foundry type.
mrtd        A host running the (merit) MRTd daemon.
netscalar   A Netscalar load balancer.
netscreen   A Netscreen firewall.
redback     A Redback router, NAS, etc.
tnt         A Lucent TNT.
zebra       Zebra routing software.
riverstone  A Riverstone NAS or Cabletron (starting with version ~9.0.3) router.
juniper     A Juniper router.


The Rancid .cloginrc file

The .cloginrc file lists all the passwords Rancid will use. The one that comes with the Rancid installation kit has a lot of examples in it and is fairly self-explanatory. Unfortunately, some of the examples are not commented out, so you will have to do so yourself. Here is a sample snippet using some commonly encountered scenarios.

#
# Sample .cloginrc file
#
 
####################################################################
#
# Device 192.168.1.16 has a unique username and password, but
# logins do not get the enable prompt.
#
# If the device prompts for a username, Rancid will use the Linux
# "rancid" username and the first password in the list. If only a
# login password is requested, rancid uses the first password in the
# list. The second password is the "enable" password.
#
####################################################################

add password       192.168.1.16     {telnet-password}   {enable-password}

####################################################################
#
# Devices with DNS names ending in my-web-site.org in the router.db
# file or beginning with 172.16. have a different set of passwords.
#
# If the device prompts for a username, Rancid will use the Linux
# "rancid" username and the first password in the list. If only a
# login password is requested, rancid uses the first password in the
# list. The second password is the "enable" password.
#
####################################################################

add password *.my-web-site.org   {telnet-password}     {enable-password}
add password 172.16.*            {telnet-password}     {enable-password}

####################################################################
#
# Everything else uses these passwords. Rancid will attempt to use
# telnet then SSH for logins
#
####################################################################

add password   *    {telnet-password}     {enable-password}
add method     *    telnet ssh

Testing Rancid

Rancid has a number of scripts that can be run as part of a testing program and the logs they create are fairly detailed. Here are some examples. As a general rule, it is usually easiest to do testing as the rancid user.


Testing A Login for a Single Device

The clogin script in the bin directory can be used to read the .cloginrc file as part of an interactive test. In this example, we successfully log in to our 192.168.1.1 Cisco device and get an interactive enable prompt.

[rancid@bigboy ~]$ bin/clogin 192.168.1.1
192.168.1.1
spawn telnet 192.168.1.1
Trying 192.168.1.1...
Connected to (192.168.1.1).
Escape character is '^]'.
 
User Access Verification
 
Password:
Type help or '?' for a list of available commands.
pixfirewall> enable
Password: ********
pixfirewall#
pixfirewall# exit

Logoff

Connection closed by foreign host.
[rancid@bigboy ~]$

You can still test if you are not logged in as the rancid Linux user, as long as you are root or a member of the netadm group. Run the clogin command with the -f option pointing to the /usr/local/rancid/.cloginrc password file and the -u option specifying the username to log in with, as in the example below.

[root@bigboy tmp]# /usr/local/rancid/bin/clogin \
-f /usr/local/rancid/.cloginrc -u netadm 192.168.1.1


Testing For All Devices

The rancid-run script in the bin directory can be used as a complete test. It attempts to back up every device listed in the router.db file, using the passwords in the .cloginrc file.

[rancid@bigboy ~]$ bin/rancid-run
[rancid@bigboy ~]$


Troubleshooting Using the Rancid Log Files

The var/logs/ directory contains all the Rancid logs. Each file is named after the device group followed by the date and time of the run, as we can see here.

[rancid@bigboy ~]$ ls var/logs/
networking.20050721.020048   networking.20050721.020101
[rancid@bigboy ~]$


Successful Execution

When successful, the Rancid log file has an "All routers successfully completed" message near the end.

[rancid@bigboy ~]$ less var/logs/networking.20050721.020101
starting: Thu Jul 21 02:01:01 PDT 2005

Trying to get all of the configs.
All routers successfully completed.
 
cvs diff: Diffing .
cvs diff: Diffing configs
cvs commit: Examining .
cvs commit: Examining configs
 
ending: Thu Jul 21 02:01:06 PDT 2005
[rancid@bigboy ~]$

If the rancid-run script was used, you should now see a copy of your configuration in the var/networking/configs/ directory as seen here.

[rancid@bigboy ~]$ ls var/networking/configs/
192.168.1.1   CVS
[rancid@bigboy ~]$
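Checking for the success message by hand gets tedious once backups run every hour. The sketch below searches a log file for the message shown above. The run_succeeded name is a hypothetical helper, and the message text is taken from the sample log in this section, so it may need adjusting if your Rancid version words it differently.

```shell
#!/bin/sh
# Sketch: succeed only if a Rancid log contains the success message.
# run_succeeded is a hypothetical helper, not a Rancid command.
#   $1 = path to a Rancid log file
run_succeeded() {
    grep -q 'All routers successfully completed' "$1"
}

# Example: check the most recently modified log for the "networking" group.
#   latest=$(ls -t /usr/local/rancid/var/logs/networking.* | head -n 1)
#   if run_succeeded "$latest"; then
#       echo "backup OK"
#   else
#       echo "backup FAILED: see $latest"
#   fi
```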

Possible Reasons for Failure

From time to time, Rancid will fail, usually for configuration file or connectivity reasons. In these cases the log file entries will look like this with an "End of run not found" message at the end:

192.168.1.1: missed cmd(s): admin show diag,dir /all slavedisk2:,show rsp chassis-info,dir /all sec-slot2:,show diag,dir /all disk1:,show gsr chassis,dir /all sec-nvram:,dir /all disk2:,dir /all sec-bootflash:,show spe version,dir /all slaveslot2:,dir /all disk0:,show install active,show bootvar,dir /all slaveslot0:,dir
...
...
...
version,show redundancy secondary,show running-config,show c7200,dir /all slot1: 192.168.1.1: End of run not found

This could be due to any one of the following causes:

  1. The IP address or DNS name used in the router.db file is incorrect.
  2. The device type entry in the router.db file is incorrect.
  3. For Cisco devices, the login device prompt doesn't end in a ">".
  4. The device is inaccessible from the server running Rancid.
  5. The password information in the .cloginrc file is incorrect.
  6. A device accessible only by SSH was replaced and the SSH keys on the device were not regenerated. A telltale sign is that SSH sessions will get "connection refused" messages like this one:
[rancid@bigboy ~]$ ssh 192.168.1.1
ssh: connect to host 192.168.1.1 port 22: Connection refused
[rancid@bigboy ~]$
  7. The rancid-run command was previously run from the command line and was aborted using <CTRL-C>. This causes a lock file to be left behind. A new instance of Rancid will not run unless this file is deleted. In our case the file name is:
/tmp/.networking.run.lock
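When a log contains many devices, it helps to pull out just the ones that failed. The following sketch extracts the device name in front of each "End of run not found" message. The failed_devices name is a hypothetical helper, and the pattern is based only on the sample log line shown above.

```shell
#!/bin/sh
# Sketch: list devices whose collection did not finish, based on the
# "End of run not found" message in a Rancid log.
# failed_devices is a hypothetical helper, not a Rancid command.
#   $1 = path to a Rancid log file
failed_devices() {
    # Keep only "<device>: End of run not found", strip the message,
    # and remove duplicates.
    grep -o '[^ ]*: End of run not found' "$1" |
        sed 's/: End of run not found$//' |
        sort -u
}

# Example:
#   failed_devices /usr/local/rancid/var/logs/networking.20050721.020048
```

Each device it prints can then be tested individually with the clogin script and checked against the causes listed above.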


Getting Rancid Help

Configuration help can be found in the /usr/local/rancid/README file, but this is often insufficient. Better assistance can be obtained as seen in the following sections.

You can use the man -M /usr/local/rancid/man <filename> command to get help on the use of any file in the Rancid directory tree. In this example there is help on the router.db file.

[rancid@bigboy ~]$ man -M /usr/local/rancid/man router.db


Conclusion

Backing up network configuration files is an essential network engineering maintenance activity. Rancid is a very popular, reliable and effective application that should capably handle most of your needs.