Quick HOWTO : Ch23 : Advanced MRTG for Linux/zh

From Ubuntu中文 (the Ubuntu Chinese wiki)

Revision by Qiii2006 as of 20:28, 23 July 2010 (Friday): Conclusion


Introduction

In many cases using MRTG in a basic configuration to monitor the volume of network traffic to your server isn't enough. You may also want to see graphs of CPU, disk, and memory usage. This chapter explains how to find the values you want to monitor in the SNMP MIB files and then how to use this information to configure MRTG.

All the chapter's examples assume that the SNMP Read Only community string is craz33guy and that the net-snmp-utils RPM package is installed (see Chapter 22, "Monitoring Server Performance").

Locating and Viewing the Contents of Linux MIBs

Residing in memory, MIBs are data structures that are constantly updated via the SNMP daemon. The MIB configuration text files are located on your hard disk and loaded into memory each time SNMP restarts.

You can easily find your Fedora Linux MIBs by using the locate command and filtering the output to include only values with the word "snmp" in them. As you can see in this case, the MIBs are located in the /usr/share/snmp/mibs directory:

[root@bigboy tmp]# locate mib | grep snmp
/usr/share/doc/net-snmp-5.0.6/README.mib2c
/usr/share/snmp/mibs
/usr/share/snmp/mibs/DISMAN-SCHEDULE-MIB.txt
...
...
[root@bigboy tmp]#

Because the MIB configurations are text files, you can search them for keywords using the grep command. This example searches for the MIBs that keep track of TCP connections and returns the RFC1213 and TCP MIBs as the result.

[root@silent mibs]# grep -i tcp /usr/share/snmp/mibs/*.txt | grep connections
...
RFC1213-MIB.txt: "The limit on the total number of TCP connections
RFC1213-MIB.txt: "The number of times TCP connections have made a
...
TCP-MIB.txt:     "The number of times TCP connections have made a
...
...
[root@silent mibs]#

You can use the vi editor to look at the MIBs. Don't change them, because doing so could cause SNMP to fail. MIBs are very complicated, but fortunately the key sections are commented.

Each value tracked in a MIB is called an object and is often referred to by its object ID, or OID. In the snippet of the RFC1213-MIB.txt file that follows, you can see that querying the tcpActiveOpens object returns the number of active opens, that is, the number of times a TCP connection has moved from the CLOSED state to the SYN-SENT state. The SYNTAX field shows that this is a counter value.

MIBs usually track two types of values. Counter values are used for items that continuously increase as time passes, such as the number of packets passing through a NIC or the amount of time the CPU has been busy since boot time. Integer values change from instant to instant and are useful for tracking statistics such as the amount of memory currently in use.

tcpActiveOpens OBJECT-TYPE
    SYNTAX  Counter
    ACCESS  read-only
    STATUS  mandatory
    DESCRIPTION
            "The number of times TCP connections have made a
            direct transition to the SYN-SENT state from the
            CLOSED state."
    ::= { tcp 5 }

You'll explore the differences between SNMP and MRTG terminologies in more detail later. Understanding them will be important in understanding how to use MRTG to track MIB values.

Testing Your MIB Value

Once you have identified an interesting MIB value for your Linux system, you can use the snmpwalk command to poll it. Often the text alias in a MIB references only an OID branch, not the OID of the actual data, which is located in a leaf ending in an additional number such as ".0" or ".1". The snmpget command doesn't work with branches and returns an error stating that the MIB variable couldn't be found.

In the example below, the ssCpuRawUser OID alias looked interesting, but the snmpget command fails to get a value. Following up with the snmpwalk command shows that the value is located in ssCpuRawUser.0 instead. The snmpget command is then successful in retrieving the Counter32 type data, with a current value of 396271.

[root@bigboy tmp]# snmpget -v1 -c craz33guy localhost ssCpuRawUser  
Error in packet
Reason: (noSuchName) There is no such variable name in this MIB.
Failed object: UCD-SNMP-MIB::ssCpuRawUser
[root@bigboy tmp]#
  
[root@bigboy tmp]# snmpwalk -v1 -c craz33guy localhost ssCpuRawUser
UCD-SNMP-MIB::ssCpuRawUser.0 = Counter32: 396241
[root@bigboy tmp]# snmpget -v1 -c craz33guy localhost ssCpuRawUser.0
UCD-SNMP-MIB::ssCpuRawUser.0 = Counter32: 396271
[root@bigboy tmp]#
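If you want to check many candidate objects this way, it can help to pull the leaf name, type, and value out of each output line. The following is a minimal Python sketch, assuming the output keeps the "MIB::object = Type: value" layout shown in the session above; parse_snmp_line is a hypothetical helper, not part of net-snmp.

```python
import re

# Matches lines such as "UCD-SNMP-MIB::ssCpuRawUser.0 = Counter32: 396241".
SNMP_LINE = re.compile(
    r"^(?P<mib>[\w-]+)::(?P<oid>\S+) = (?P<type>\w+): (?P<value>.+)$"
)

def parse_snmp_line(line):
    """Split one snmpwalk/snmpget output line into its parts (hypothetical helper)."""
    match = SNMP_LINE.match(line.strip())
    if match is None:
        return None  # error text such as "Error in packet" does not match
    return match.groupdict()

parsed = parse_snmp_line("UCD-SNMP-MIB::ssCpuRawUser.0 = Counter32: 396241")
# The "oid" field carries the leaf name, including the trailing ".0",
# which is the form snmpget accepts.
print(parsed["oid"], parsed["type"], parsed["value"])
```

The leaf name returned in the "oid" field is the one to try with snmpget.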

The MIB values that work successfully with snmpget are the ones you should use with MRTG.

Differences in MIB and MRTG Terminology

Always keep in mind that MRTG refers to MIB counter values as counter values. It refers to MIB integer and gauge values as gauge. By default, MRTG considers all values to be counters.

MRTG doesn't plot counter values as a constantly increasing graph; it plots only how much the value has changed since the last polling cycle. CPU usage is typically tracked by MIBs as a counter value; fortunately, you can edit your MRTG configuration file to make it graph this information in a percentage use format (more on this later).
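To make the counter behavior concrete, here is a minimal Python sketch of turning two counter samples into a rate. It illustrates the general idea only and is not MRTG's own code; the 32-bit wrap handling and the 300 second interval are assumptions made for the example.

```python
COUNTER32_MODULUS = 2 ** 32  # a Counter32 wraps back to zero after 2**32 - 1

def counter_rate(previous, current, interval_seconds):
    """Return the per-second change between two samples of a 32-bit counter."""
    # The modulo keeps the difference correct if the counter wrapped once.
    delta = (current - previous) % COUNTER32_MODULUS
    return delta / interval_seconds

# The two ssCpuRawUser.0 readings from the earlier session, treated as if
# they were taken 300 seconds apart (an assumed interval).
print(counter_rate(396241, 396271, 300))
```

A gauge value, by contrast, would be plotted as read, with no subtraction.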

The syntax type, the MIB object name, and the description of what it does are the most important things you need to know when configuring MRTG; I'll come back to these later.

The CPU and Memory Monitoring MIB

The UCD-SNMP-MIB MIB keeps track of a number of key performance MIB objects, including the commonly used ones in Table 23-1.

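Table 23-1 Important Objects in the UCD-SNMP-MIB MIB

UCD-SNMP-MIB Object Variable   MIB Type   MRTG Type   Description
ssCpuRawUser                   Counter    Counter     Total CPU usage by applications run by non-privileged users since the system booted. Adding the user, system, and nice values gives a good approximation of total CPU usage.
ssCpuRawSystem                 Counter    Counter     Total CPU usage by applications run by privileged system processes since the system booted.
ssCpuRawNice                   Counter    Counter     Total CPU usage by applications running at a non-default priority level.
ssCpuRawIdle                   Counter    Counter     The percentage of the time that the CPU is idle. Subtracting this value from 100 gives a good approximation of total CPU usage.
memAvailReal                   Integer    Gauge       Available physical memory space on the host.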

The TCP/IP Monitoring MIB

The TCP-MIB MIB keeps track of data connection information and contains the very useful tcpActiveOpens and tcpCurrEstab objects. Table 23-2 details the most important objects in TCP-MIB.

Table 23-2 Important Objects in the TCP-MIB MIB

TCP-MIB Object Variable   MIB Type   MRTG Type   Description
tcpActiveOpens            Counter    Counter     Counts the TCP connections opened by this host (transitions from the CLOSED state to the SYN-SENT state).
tcpCurrEstab              Gauge      Gauge       Measures the number of TCP connections currently in the established state.
tcpInErrs                 Counter    Counter     Total number of TCP segments with bad checksum errors.

Manually Configuring Your MRTG File

The MRTG cfgmaker program creates configuration files for network interfaces only, simultaneously tracking two OIDs: the NIC's input and output data statistics. The mrtg program then uses these configuration files to determine the type of data to record in its data directory. The indexmaker program also uses this information to create the overview, or Summary View Web page for the MIB OIDs you're monitoring.

This Summary View page shows daily statistics only. You have to click on the Summary View graphs to get the Detailed View page behind it with the daily, weekly, monthly, and annual graphs. Some of the parameters in the configuration file refer to the Detailed View, others refer to the Summary View.

If you want to monitor any other pairs of OIDs, you have to create the configuration files manually, because cfgmaker isn't aware of any OIDs other than those related to a NIC. The mrtg and indexmaker programs can be fed individual OIDs from a customized configuration file and will function as expected if you edit the file correctly.

Parameter Formats

MRTG configuration parameters are always followed by a graph name surrounded by square brackets and a colon. The format looks like this:

Parameter[graph name]: value

For ease of editing, the parameters for a particular graph are usually grouped together. Each graph can track two OIDs listed in the Target parameter, which is usually placed at the very top of the graph's parameter list. The two OID values are separated by an & symbol; the first one is the input OID, and the second one is the output OID.
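The format is regular enough to read mechanically. As a rough illustration, the Python sketch below groups "Parameter[graph name]: value" lines by graph name and splits a Target into its input and output OIDs. The function name parse_mrtg_params is hypothetical, and the sketch ignores global options and other details a real MRTG parser handles.

```python
import re

# Parameter[graph name]: value
PARAM_LINE = re.compile(r"^(?P<param>\w+)\[(?P<graph>[^\]]+)\]:\s*(?P<value>.*)$")

def parse_mrtg_params(text):
    """Group 'Parameter[graph]: value' lines by graph name (illustrative only)."""
    graphs = {}
    for line in text.splitlines():
        match = PARAM_LINE.match(line.strip())
        if match is None:
            continue  # skip comments, blank lines, and global options
        graphs.setdefault(match["graph"], {})[match["param"]] = match["value"]
    return graphs

sample = """
Target[graph1]: mib-object-1.0&mib-object-2.0:<SNMP-password>@<IP-address>
LegendI[graph1]: In
LegendO[graph1]: Out
"""
config = parse_mrtg_params(sample)
# The first OID is the input value; the second, up to the colon, is the output.
oid_in, rest = config["graph1"]["Target"].split("&", 1)
oid_out = rest.split(":", 1)[0]
print(oid_in, oid_out)
```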

Legend Parameters

On the Detailed View Web page, each graph has a legend that shows the max, average, and current values of the graph's OID statistics. You can use the legendI parameter for the description of the input graph (first graph OID) and the legendO for the output graph (second graph OID).

The space available under each graph's legend is tiny so MRTG also has legend1 and legend2 parameters that are placed at the very bottom of the page to provide more details. Parameter legend1 is the expansion of legendI, and legend2 is the expansion of legendO.

The YLegend parameter is the legend for the Y axis, the value you are trying to compare. In a default MRTG configuration this would be the data flow through the interface in bits or bytes per second. Here is an example of the legends of a default MRTG configuration:

YLegend[graph1]: Bits per second
Legend1[graph1]: Incoming Traffic in Bits per Second
Legend2[graph1]: Outgoing Traffic in Bits per Second
LegendI[graph1]: In
LegendO[graph1]: Out

You can prevent MRTG from printing the legend at the bottom of the graph by leaving the value of the legend blank like this:

LegendI[graph1]:

Later you'll learn how to match the legends to the OIDs for a variety of situations.

Options Parameters

Options parameters provide MRTG with graph formatting information. The growright option makes sure the data at the right of the screen is for the most current graph values. This usually makes the graphs more intuitively easy to read. MRTG defaults to growing from the left.

The nopercent option prevents MRTG from printing percentage style statistics in the legends at the bottom of the graph. The gauge option alerts MRTG to the fact that the graphed values are of the gauge type. If the value you are monitoring is in bytes, then you can convert the output to bits using the bits option. Likewise, you can convert per second values to per minute graphs using the perminute option. Here are some examples for two different graphs:

options[graph1]: growright,nopercent,perminute

options[graph2]: gauge,bits

If you place this parameter at the top with a label of [_] it gets applied to all the graphs defined in the file. Here's an example.

options[_]: growright

Title Parameters

The title on the Summary View page is provided by the Title parameter; the PageTop parameter provides the title for the Detailed View page. The PageTop string must start with <H1> and end with </H1>.

Title[graph1]: Interface eth0

PageTop[graph1]: <H1>Detailed Statistics For Interface eth0</H1>

Scaling Parameters

The MaxBytes parameter is the maximum amount of data MRTG will plot on a graph. Anything more than this seems to disappear over the edge of the graph.

MRTG also tries to adjust its graphs so that the largest value plotted on the graph is always close to the top. This is so even if you set the MaxBytes parameter.

When you are plotting a value that has a known maximum and you always want to have this value at the top of the vertical legend, you may want to turn off MRTG's auto scaling. If you are plotting percentage CPU usage, and the server reaches a maximum of 60%, with scaling, MRTG will have a vertical plot of 0% to 60%, so that the vertical peak is near the top of the graph image.

When scaling is off, and MaxBytes is set to 100, then the peak will be only 60% of the way up as the graph plots from 0% to 100%. The example removes scaling from the yearly, monthly, weekly, and daily views on the Detailed View page and gives them a maximum value of 100.

Unscaled[graph1]: ymwd
MaxBytes[graph1]: 100
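The effect of Unscaled can be pictured with a deliberately simplified model. The Python sketch below is only a way to reason about the 60% example described above; it is not how MRTG actually computes its axes, and y_axis_top is a hypothetical name.

```python
def y_axis_top(peak, max_bytes, unscaled):
    """Simplified model of the top of the vertical axis (not MRTG's code)."""
    if unscaled:
        return max_bytes           # fixed axis: always plot 0 to MaxBytes
    return min(peak, max_bytes)    # auto scaling: the peak sits near the top

# A server that peaks at 60% CPU, with MaxBytes set to 100.
print(y_axis_top(60, 100, unscaled=False))
print(y_axis_top(60, 100, unscaled=True))
```

With scaling on, the model puts the top of the axis at the 60% peak; with scaling off, the axis stays at 100 and the peak reaches only 60% of the way up.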

Defining the MIB Target Parameters

As stated before, MRTG always tries to compare two MIB OID values that are defined by the Target parameter. You have to specify the two MIB OID objects, the SNMP password and the IP address of the device you are querying in this parameter, and separate them with an & character:

Target[graph1]: mib-object-1.0&mib-object-2.0:<SNMP-password>@<IP-address>

The numeric value, in this case .0, at the end of the MIB is required. The next example uses the SNMP command to return the user mode CPU utilization of a Linux server. Notice how the .0 is tagged onto the end of the output.

[root@silent mibs]# snmpwalk -v 1 -c craz33guy localhost ssCpuRawUser
UCD-SNMP-MIB::ssCpuRawUser.0 = Counter32: 926739
[root@silent mibs]#

The MRTG legends map to the MIBs listed in the target as shown in Table 23-3.

Table 23-3 Mapping MIBs to the Graph Legends

Legend     Maps to Target MIB
Legend1    #1
Legend2    #2
LegendI    #1
LegendO    #2


So in the example below, legend1 and legendI describe mib-object-1.0 and legend2 and legendO describe mib-object-2.0.

Target[graph1]: mib-object-1.0&mib-object-2.0:<SNMP-password>@<IP-address>

Plotting Only One MIB Value

If you want to plot only one MIB value, you can just repeat the target MIB in the definition as in the next example, which plots only mib-object-1. The resulting MRTG graph actually superimposes the input and output graphs one on top of the other.

Target[graph1]: mib-object-1.0&mib-object-1.0:<SNMP-password>@<IP-address>
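If you generate configuration files from a script, the Target value is easy to assemble. This is a small Python sketch of the layout described above; make_target and single_value_target are hypothetical helpers, and the community string and host are the ones this chapter assumes.

```python
def make_target(oid_in, oid_out, community, host):
    """Build the value of an MRTG Target line from two OIDs."""
    return f"{oid_in}&{oid_out}:{community}@{host}"

def single_value_target(oid, community, host):
    """Plot one MIB value by repeating it as both input and output."""
    return make_target(oid, oid, community, host)

print("Target[graph1]: " + single_value_target("tcpCurrEstab.0", "craz33guy", "localhost"))
```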

Adding MIB Values Together for a Graph

You can use the plus sign between the pairs of MIB object values to add them together. The next example adds mib-object-1.0 and mib-object-3.0 for one graph and adds mib-object-2.0 and mib-object-4.0 for the other.

Target[graph1]: mib-object-1.0&mib-object-2.0:<SNMP-password>@<IP-address> + mib-object-3.0&mib-object-4.0:<SNMP-password>@<IP-address>

You can use other mathematical operators, such as subtract (-), multiply (*), and divide (/). Left and right parentheses are also valid. There must be white space before and after all these operators for MRTG to work correctly. If not, you'll get oddly shaded graphs.

Sample Target: Total CPU Usage

Linux CPU usage is occupied by system processes, user mode processes, and a few processes running in nice mode. This example adds them all together in a single plot.

Target[graph1]: ssCpuRawUser.0&ssCpuRawUser.0:<SNMP-password>@<IP-address> + ssCpuRawSystem.0&ssCpuRawSystem.0:<SNMP-password>@<IP-address> + ssCpuRawNice.0&ssCpuRawNice.0:<SNMP-password>@<IP-address>

Be sure to place this command on a single line.
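Long summed targets like this one are easy to mistype, so you may prefer to generate them. The Python sketch below joins single-value targets with " + ", which keeps the required spaces around the operator and produces one line; sum_targets is a hypothetical helper, and the community string and host are the chapter's assumed values.

```python
def sum_targets(oids, community, host):
    """Join single-value targets with ' + ', keeping the required spaces."""
    parts = [f"{oid}&{oid}:{community}@{host}" for oid in oids]
    return " + ".join(parts)

cpu_oids = ["ssCpuRawUser.0", "ssCpuRawSystem.0", "ssCpuRawNice.0"]
# The result contains no line breaks, so it can be pasted in as a single line.
print("Target[graph1]: " + sum_targets(cpu_oids, "craz33guy", "localhost"))
```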

Sample Target: Memory Usage

Here is an example for plotting the amount of free memory versus the total RAM installed in the server. Notice that this is a gauge type variable.

Target[graph1]: memAvailReal.0&memTotalReal.0:<SNMP-password>@<IP-address>
options[graph1]: nopercent,growright,gauge

Next, plot the percentage of available memory. Notice how the mandatory white spaces separate the mathematical operators from the next target element.

Target[graph1]: ( memAvailReal.0&memAvailReal.0:<SNMP-password>@<IP-address> ) * 100 / ( memTotalReal.0&memTotalReal.0:<SNMP-password>@<IP-address> )
options[graph1]: nopercent,growright,gauge
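The arithmetic in that Target expression is simply available memory times 100, divided by total memory. Here it is as a minimal Python sketch; the readings are hypothetical and are not taken from a real server.

```python
def percent_free(avail, total):
    """Same arithmetic as the Target expression: avail * 100 / total."""
    if total <= 0:
        raise ValueError("total memory must be positive")
    return avail * 100 / total

# Hypothetical memAvailReal.0 and memTotalReal.0 readings in the same unit.
print(percent_free(256000, 1024000))
```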

Sample Target: Newly Created Connections

HTTP traffic caused by Web browsing usually consists of many very short lived connections. The tcpPassiveOpens MIB object tracks newly created connections and is suited for this type of data transfer. The tcpActiveOpens MIB object monitors new connections originating from the server. On smaller Web sites you may want to use the perminute option to make the graphs more meaningful.

Target[graph1]: tcpPassiveOpens.0&tcpPassiveOpens.0:<SNMP-password>@<IP-address>
MaxBytes[graph1]: 1000000
Options[graph1]: perminute

Sample Target: Total TCP Established Connections

Other protocols such as FTP and SSH create longer established connections while people download large files or stay logged into the server. The tcpCurrEstab MIB object measures the total number of connections in the established state and is a gauge value.

Target[graph1]: tcpCurrEstab.0&tcpCurrEstab.0:<SNMP-password>@<IP-address>
MaxBytes[graph1]: 1000000
Options[graph1]: gauge

Sample Target: Disk Partition Usage

In this example, you'll monitor the /var and /home disk partitions on the system.

1) First use the df -k command to get a list of the partitions in use.

[root@bigboy tmp]# df -k
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda8               505605    128199    351302  27% /
/dev/hda1               101089     19178     76692  21% /boot
/dev/hda5              1035660    122864    860188  13% /home
/dev/hda6               505605      8229    471272   2% /tmp
/dev/hda3              3921436    890092   2832140  24% /usr
/dev/hda2              1510060    171832   1261520  73% /var
[root@bigboy tmp]#

2) Add two entries to your snmpd.conf file.

disk  /home
disk  /var

3) Restart the SNMP daemon to reload the values.

[root@bigboy tmp]# service snmpd restart

4) Use the snmpwalk command to query the dskPercent MIB object. Object dskPercent.1 refers to the first disk entry in snmpd.conf (/home), and dskPercent.2 refers to the second (/var).

[root@bigboy tmp]# snmpwalk -v 1 -c craz33guy localhost dskPercent.1
UCD-SNMP-MIB::dskPercent.1 = INTEGER: 13
[root@bigboy tmp]# snmpwalk -v 1 -c craz33guy localhost dskPercent.2
UCD-SNMP-MIB::dskPercent.2 = INTEGER: 73
[root@bigboy tmp]#

Your MRTG target for these gauge MIB objects should look like this:

Target[graph1]: dskPercent.1&dskPercent.1:<SNMP-password>@<IP-address>
options[graph1]: growright,gauge
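Because the dskPercent index follows the order of the disk entries in snmpd.conf, you can work out which object belongs to which partition before querying anything. The Python sketch below does that for the two entries used in this example; disk_oids is a hypothetical helper and it only looks at lines that begin with the word disk.

```python
def disk_oids(snmpd_conf_text):
    """Map each 'disk <path>' entry to dskPercent.N in file order."""
    mapping = {}
    index = 0
    for line in snmpd_conf_text.splitlines():
        fields = line.split()
        # Commented-out entries such as "#disk /tmp" are not counted.
        if len(fields) >= 2 and fields[0] == "disk":
            index += 1
            mapping[fields[1]] = f"dskPercent.{index}"
    return mapping

conf = """
disk  /home
disk  /var
"""
print(disk_oids(conf))
```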

Defining Global Variables

You have to make sure MRTG knows where the MIBs you're using are located. The default location MRTG uses may not be valid. Specify their locations with the global LoadMIBs parameter. You must also define where the HTML files will be located; the example specifies the default Fedora MRTG HTML directory.

LoadMIBs: /usr/share/snmp/mibs/UCD-SNMP-MIB.txt, /usr/share/snmp/mibs/TCP-MIB.txt
workdir: /var/www/mrtg/

Implementing Advanced Server Monitoring

You now can combine all you have learned to create a configuration file that monitors all these variables, and then you can integrate it into the existing MRTG configuration.

A Complete Sample Configuration

Here is a sample configuration file that is used to query server localhost for CPU, memory, disk, and TCP connection information.

#
# File: /etc/mrtg/server-stats.cfg
#
# Configuration file for non bandwidth server statistics
#

#
# Define global options
#

LoadMIBs: /usr/share/snmp/mibs/UCD-SNMP-MIB.txt,/usr/share/snmp/mibs/TCP-MIB.txt
workdir: /var/www/mrtg/


#
# CPU Monitoring
# (Scaled so that the sum of all three values doesn't exceed 100)
#

Target[server.cpu]: ssCpuRawUser.0&ssCpuRawUser.0:craz33guy@localhost + ssCpuRawSystem.0&ssCpuRawSystem.0:craz33guy@localhost + ssCpuRawNice.0&ssCpuRawNice.0:craz33guy@localhost
Title[server.cpu]: Server CPU Load
PageTop[server.cpu]: <H1>CPU Load - System, User and Nice Processes</H1>
MaxBytes[server.cpu]: 100
ShortLegend[server.cpu]: %
YLegend[server.cpu]: CPU Utilization
Legend1[server.cpu]: Current CPU percentage load
LegendI[server.cpu]: Used
LegendO[server.cpu]:
Options[server.cpu]: growright,nopercent
Unscaled[server.cpu]: ymwd


#
# Memory Monitoring (Total Versus Available Memory)
#

Target[server.memory]: memAvailReal.0&memTotalReal.0:craz33guy@localhost
Title[server.memory]: Free Memory
PageTop[server.memory]: <H1>Free Memory</H1>
MaxBytes[server.memory]: 100000000000
ShortLegend[server.memory]: B
YLegend[server.memory]: Bytes
LegendI[server.memory]: Free
LegendO[server.memory]: Total
Legend1[server.memory]: Free memory, not including swap, in bytes
Legend2[server.memory]: Total memory
Options[server.memory]: gauge,growright,nopercent
kMG[server.memory]: k,M,G,T,P,X


#
# Memory Monitoring (Percentage usage)
#
Title[server.mempercent]: Percentage Free Memory
PageTop[server.mempercent]: <H1>Percentage Free Memory</H1>
Target[server.mempercent]: ( memAvailReal.0&memAvailReal.0:craz33guy@localhost ) * 100 / ( memTotalReal.0&memTotalReal.0:craz33guy@localhost )
options[server.mempercent]: growright,gauge,transparent,nopercent
Unscaled[server.mempercent]: ymwd
MaxBytes[server.mempercent]: 100
YLegend[server.mempercent]: Memory %
ShortLegend[server.mempercent]: Percent
LegendI[server.mempercent]: Free
LegendO[server.mempercent]: Free
Legend1[server.mempercent]: Percentage Free Memory
Legend2[server.mempercent]: Percentage Free Memory


#
# New TCP Connection Monitoring (per minute)
#
 
Target[server.newconns]: tcpPassiveOpens.0&tcpActiveOpens.0:craz33guy@localhost
Title[server.newconns]: Newly Created TCP Connections
PageTop[server.newconns]: <H1>New TCP Connections</H1>
MaxBytes[server.newconns]: 10000000000
ShortLegend[server.newconns]: c/m
YLegend[server.newconns]: Conns / Min
LegendI[server.newconns]: In
LegendO[server.newconns]: Out
Legend1[server.newconns]: New inbound connections
Legend2[server.newconns]: New outbound connections
Options[server.newconns]: growright,nopercent,perminute
 

#
# Established TCP Connections
#

Target[server.estabcons]: tcpCurrEstab.0&tcpCurrEstab.0:craz33guy@localhost
Title[server.estabcons]: Currently Established TCP Connections
PageTop[server.estabcons]: <H1>Established TCP Connections</H1>
MaxBytes[server.estabcons]: 10000000000
ShortLegend[server.estabcons]:
YLegend[server.estabcons]: Connections
LegendI[server.estabcons]: In
LegendO[server.estabcons]:
Legend1[server.estabcons]: Established connections
Legend2[server.estabcons]:
Options[server.estabcons]: growright,nopercent,gauge
 

#
# Disk Usage Monitoring
#

Target[server.disk]: dskPercent.1&dskPercent.2:craz33guy@localhost
Title[server.disk]: Disk Partition Usage
PageTop[server.disk]: <H1>Disk Partition Usage /home and /var</H1>
MaxBytes[server.disk]: 100
ShortLegend[server.disk]: %
YLegend[server.disk]: Utilization
LegendI[server.disk]: /home
LegendO[server.disk]: /var
Options[server.disk]: gauge,growright,nopercent
Unscaled[server.disk]: ymwd

Testing the Configuration

The next step is to test that MRTG can load the configuration file correctly.

Restart SNMP to make sure the disk monitoring commands in the snmpd.conf file are activated. Run the /usr/bin/mrtg command followed by the name of the configuration file three times. If all goes well, MRTG will complain only about the fact that certain database files don't exist. MRTG then creates the files. By the third run, all the files are created and MRTG should operate smoothly.

[root@bigboy tmp]# service snmpd restart
[root@bigboy tmp]# env LANG=C /usr/bin/mrtg /etc/mrtg/server-stats.cfg

Create a New MRTG Index Page to Include This File

Use the indexmaker command and include your original MRTG configuration file from Chapter 22, "Monitoring Server Performance" (/etc/mrtg/mrtg.cfg), plus the new one you created (/etc/mrtg/server-stats.cfg).

[root@bigboy tmp]# indexmaker --output=/var/www/mrtg/index.html \
/etc/mrtg/mrtg.cfg /etc/mrtg/server-stats.cfg

Configure cron to Use the New MRTG File

The final step is to make sure that MRTG is configured to poll your server every five minutes using this new configuration file. To do so, add this line to your /etc/cron.d/mrtg file.

0-59/5 * * * * root env LANG=C /usr/bin/mrtg /etc/mrtg/server-stats.cfg

Some versions of Linux require you to edit your /etc/crontab file instead. See Chapter 22, "Monitoring Server Performance", for more details. You will also have to restart cron with the service crond restart command so that it reads its new configuration file, which tells it to additionally run MRTG every five minutes using the new MRTG configuration file.

[root@bigboy tmp]# service crond restart
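If the 0-59/5 minute field in the cron entry looks cryptic, it helps to expand it. The Python sketch below handles only this "start-end/step" form, not the full cron syntax, and expand_minutes is a hypothetical helper written for this illustration.

```python
def expand_minutes(field):
    """Expand a 'start-end/step' cron minute field into the minutes it matches."""
    span, _, step = field.partition("/")
    start, _, end = span.partition("-")
    # range() excludes its upper bound, so add 1 to include the end minute.
    return list(range(int(start), int(end) + 1, int(step or 1)))

# The minute field used in the MRTG cron entry.
print(expand_minutes("0-59/5"))
```

The list has twelve entries, one for every fifth minute of the hour, which matches the five minute polling cycle.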

Monitoring Non Linux MIB Values

All the MIBs mentioned so far are for Linux systems; other types of systems will need additional MIBs whose correct installation may be unclear in user guides or just not available. In such cases, you'll need to know the exact value of the OID.

Scenario

Imagine that your small company has purchased a second-hand Cisco switch to connect its Web site servers to the Internet. The basic MRTG configuration shown in Chapter 22, "Monitoring Server Performance", provides the data bandwidth statistics, but you also want to measure the CPU load the traffic is placing on the device. Downloading MIBs from Cisco and using them with the snmpget command was not a success. You do not know what to do next.

Find The OIDs

When MIB values fail, it is best to try to find the exact OID value. Like most network equipment manufacturers, Cisco has an FTP site from which you can download both MIBs and OIDs. The SNMP files for Cisco's devices can be found at ftp.cisco.com in the /pub/mibs directory; OIDs are in the oid directory beneath that.

After looking at all the OID files, you decide that the file CISCO-PROCESS-MIB.oid will contain the necessary values and find these entries inside it.

"cpmCPUTotalPhysicalIndex"  "1.3.6.1.4.1.9.9.109.1.1.1.1.2"
"cpmCPUTotal5sec"           "1.3.6.1.4.1.9.9.109.1.1.1.1.3"
"cpmCPUTotal1min"           "1.3.6.1.4.1.9.9.109.1.1.1.1.4"
"cpmCPUTotal5min"           "1.3.6.1.4.1.9.9.109.1.1.1.1.5"
"cpmCPUTotal5secRev"        "1.3.6.1.4.1.9.9.109.1.1.1.1.6"
"cpmCPUTotal1minRev"        "1.3.6.1.4.1.9.9.109.1.1.1.1.7"
"cpmCPUTotal5minRev"        "1.3.6.1.4.1.9.9.109.1.1.1.1.8"

Testing The OIDs

As you can see, all the OIDs are a part of the same tree starting with 1.3.6.1.4.1.9.9.109.1.1.1.1. The OIDs provided may be incomplete, so it is best to use the snmpwalk command to try to get all the values below this root first.

[root@bigboy tmp]# snmpwalk -v1 -c craz33guy cisco-switch 1.3.6.1.4.1.9.9.109.1.1.1.1
SNMPv2-SMI::enterprises.9.9.109.1.1.1.1.2.1 = INTEGER: 0
SNMPv2-SMI::enterprises.9.9.109.1.1.1.1.3.1 = Gauge32: 32
SNMPv2-SMI::enterprises.9.9.109.1.1.1.1.4.1 = Gauge32: 32
SNMPv2-SMI::enterprises.9.9.109.1.1.1.1.5.1 = Gauge32: 32
[root@bigboy tmp]#

Although listed in the OID file, 1.1.1.1.6, 1.1.1.1.7, and 1.1.1.1.8 are not supported. Notice also how SNMP has determined that the first part of the OID value (1.3.6.1.4.1) in the original OID file maps to the word "enterprises".

Next, you can use the snmpget command to get only one of the OID values returned by snmpwalk.

[root@bigboy tmp]# snmpget -v1 -c craz33guy cisco-switch \
enterprises.9.9.109.1.1.1.1.5.1
SNMPv2-SMI::enterprises.9.9.109.1.1.1.1.5.1 = Gauge32: 33
[root@bigboy tmp]#

Success! Now you can use this OID value, enterprises.9.9.109.1.1.1.1.5.1, for your MRTG queries.
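When you copy OIDs out of a vendor's OID file, you may want to convert the numeric form into the enterprises form that the snmpwalk output uses. The Python sketch below performs only that one substitution and is an illustration, not part of net-snmp; shorten_oid is a hypothetical name, and the trailing .1 instance number still has to come from snmpwalk.

```python
ENTERPRISES_PREFIX = "1.3.6.1.4.1"

def shorten_oid(numeric_oid):
    """Replace the leading 1.3.6.1.4.1 with the 'enterprises' alias."""
    oid = numeric_oid.lstrip(".")
    if oid == ENTERPRISES_PREFIX:
        return "enterprises"
    if oid.startswith(ENTERPRISES_PREFIX + "."):
        return "enterprises" + oid[len(ENTERPRISES_PREFIX):]
    return oid  # not under the enterprises branch; leave unchanged

# cpmCPUTotal5min from the OID file, plus the instance found with snmpwalk.
print(shorten_oid("1.3.6.1.4.1.9.9.109.1.1.1.1.5") + ".1")
```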

Speeding up MRTG with RRDtool

MRTG is a very useful program but it has a limitation. All the graphs and web pages are recreated each time a device is polled. This can potentially overload your MRTG server especially if you have a large number of monitored devices and the graphs take more than five minutes to generate. RRDtool is an application written by the creator of MRTG that can store general purpose data, but generates graphs on demand. Integrating MRTG with RRDtool can have very noticeable performance benefits. The example that follows will show you how to quickly implement a general purpose solution.

Scenario

RRDtool is needed to reduce the load on a monitoring server that has been experiencing very sluggish performance because of the number of MRTG graphs it has to regenerate every polling cycle.

  • Due to space constraints, the RRD database needs to be located in the /var partition.
  • The server has a default Apache configuration with the CGI files needed for dynamically generated content being located in the /var/www/cgi-bin directory.
  • A CGI script is required that will read the new MRTG data in RRDtool format.
  • The MRTG configuration file is /etc/mrtg/mrtg.cfg.

Here's how to proceed.

Installing RRDtool

The RRDtool and RRDtool PERL module file can be downloaded from its website at http://people.ee.ethz.ch/~oetiker/webtools/rrdtool/, but installation can be tricky as the installation program may look for certain supporting libraries in the wrong directories.

Fortunately, the prerequisite rrdtool and rrdtool-perl packages now come as part of most Linux distributions. For more details on installing packages, see Chapter 6, "Installing Linux Software".

Storing the MRTG Data in RRDtool Format

This phase of the integration process can be done in a few minutes, but the steps can be tricky:

  • The first step is to add some new options to your cfgmaker command. The first indicates that MRTG should store its data only in rrdtool format, the second defines /var/mrtg as the directory in which the data should be stored, and the third sets the icon directory. For added security, the data directory should be outside your web server's document root.
--global 'LogFormat: rrdtool' --global "workdir: /var/mrtg"  --global 'IconDir: /mrtg'
The icon directory specifies the location of the miscellaneous MRTG web page icons. It needs to be set because the RRD web interface script we'll install later otherwise uses an incorrect location. The value /mrtg is actually a partial URL, not a filesystem path. In this Fedora scenario we are using the default Apache configuration, which locates the MRTG icon files in the /var/www/mrtg directory. If you are using a non-default Apache MRTG configuration, or another Linux distribution or version, you may have to copy the MRTG PNG format icon files to the directory that the /mrtg URL maps to.
The cfgmaker program is simple to use and is covered in Chapter 22, "Monitoring Server Performance".
  • The next step is to create the data repository directory /var/mrtg and make it owned by the apache user, the account that runs the default Linux web server process.
[root@bigboy tmp]# mkdir /var/mrtg
[root@bigboy tmp]# chown apache /var/mrtg
[root@bigboy tmp]#
Note: If you are using SELinux you'll have to change the context of this directory to match that of the /var/www/html directory so that the apache process will be able to read the database files when your CGI script needs them. These commands compare the contexts of both directories and apply the correct set to /var/mrtg.
Please refer to Chapter 20, "The Apache Web Server" for more details on file contexts with Apache.
[root@bigboy tmp]# ls -alZ /var/www | grep html
drwxr-xr-x  root     root     system_u:object_r:httpd_sys_content_t html
[root@bigboy tmp]# ls -alZ /var | grep mrtg
drwxr-xr-x  apache   root     root:object_r:var_t              mrtg
[root@bigboy tmp]# chcon -R -u system_u -r object_r -t httpd_sys_content_t /var/mrtg
[root@bigboy tmp]#
  • We now need to test that the RRD files are being created correctly. Run MRTG using the /etc/mrtg/mrtg.cfg file as the source configuration file, then check whether the contents of the /var/mrtg directory have changed. A new .rrd file indicates success.
[root@bigboy tmp]# ls /var/mrtg/
localhost_192.168.1.100.rrd
[root@bigboy tmp]# 
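
The regenerate-and-poll steps above could be scripted roughly as follows. This is a sketch under assumptions, not the chapter's own procedure: the function names are hypothetical, the target craz33guy@localhost is inferred from this scenario, and the LANG=C prefix is there because some MRTG versions refuse to run under UTF-8 locales. Note that cfgmaker's --output option overwrites the named configuration file.

```shell
#!/bin/sh
# run CMD...: execute CMD, or only print it when MRTG_DRY_RUN is set.
run() { ${MRTG_DRY_RUN:+echo} "$@"; }

# rebuild_and_poll CFG TARGET
# Regenerates CFG with the RRDtool options from the first step, then polls once.
# CFG is overwritten; TARGET has the form community@host.
rebuild_and_poll() {
    cfg=$1
    target=$2
    run cfgmaker --global 'LogFormat: rrdtool' \
                 --global 'workdir: /var/mrtg' \
                 --global 'IconDir: /mrtg' \
                 --output "$cfg" "$target"
    # One polling cycle; .rrd files should then appear in /var/mrtg.
    run env LANG=C mrtg "$cfg"
}
```

Setting MRTG_DRY_RUN=1 before calling rebuild_and_poll /etc/mrtg/mrtg.cfg craz33guy@localhost prints the two commands without running them, which is a safe way to review them before touching a working configuration.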

The files are being created properly. Now we need to find a script to read the new data format and present it in a web format. This will be discussed next.

The MRTG / RRDtool Integration Script

The MRTG website recommends the script located on the mrtg-rrd website (http://www.fi.muni.cz/~kas/mrtg-rrd/) as being a good one to use. Let's go ahead and install it.

  • Download the script using wget. The site lists several versions; make sure you get the latest one.
[root@bigboy tmp]# wget ftp://ftp.linux.cz/pub/linux/people/jan_kasprzak/mrtg-rrd/mrtg-rrd-0.7.tar.gz
--12:42:12--  ftp://ftp.linux.cz/pub/linux/people/jan_kasprzak/mrtg-rrd/mrtg-rrd-0.7.tar.gz
           => `mrtg-rrd-0.7.tar.gz'
Resolving ftp.linux.cz... 147.251.48.205
Connecting to ftp.linux.cz|147.251.48.205|:21... connected.
Logging in as anonymous ... Logged in!
...
...
...
15:24:50 (53.53 KB/s) - `mrtg-rrd-0.7.tar.gz' saved [20863]
[root@bigboy tmp]# ls
mrtg-rrd-0.7.tar.gz
[root@bigboy tmp]#
  • Extract the contents of the tar file.
[root@bigboy tmp]# tar -xzvf mrtg-rrd-0.7.tar.gz 
mrtg-rrd-0.7/
mrtg-rrd-0.7/COPYING
mrtg-rrd-0.7/FAQ
mrtg-rrd-0.7/TODO
mrtg-rrd-0.7/Makefile
mrtg-rrd-0.7/mrtg-rrd.cgi
mrtg-rrd-0.7/ChangeLog
[root@bigboy tmp]#
  • Create the /var/www/cgi-bin/mrtg directory and copy the mrtg-rrd.cgi file to it.
[root@bigboy tmp]# mkdir -p /var/www/cgi-bin/mrtg
[root@bigboy tmp]# cp mrtg-rrd-0.7/mrtg-rrd.cgi /var/www/cgi-bin/mrtg/
[root@bigboy tmp]#
  • Edit the mrtg-rrd.cgi file and make it refer to the /etc/mrtg/mrtg.cfg file for its configuration details, or you can specify all the .cfg files in your /etc/mrtg directory.
#
# File: mrtg-rrd.cgi (Single File)
#
 
# EDIT THIS to reflect all your MRTG config files
BEGIN { @config_files = qw(/etc/mrtg/mrtg.cfg); }
#
# File: mrtg-rrd.cgi (multiple .cfg files)
#
 
# EDIT THIS to reflect all your MRTG config files
BEGIN { @config_files = </etc/mrtg/*.cfg>; }


  • You should now be able to access your MRTG RRD graphs by visiting this URL:
http://www.my-web-site.org/cgi-bin/mrtg/mrtg-rrd.cgi 
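
The edit to mrtg-rrd.cgi described above can also be made non-interactively. The sketch below assumes that the shipped script contains a line beginning with "BEGIN { @config_files =", as in the listings above, and that GNU sed is available; the function name is hypothetical. A backup copy with a .bak suffix is kept next to the script.

```shell
#!/bin/sh
# set_cgi_config CGI_FILE CFG_PATH
# Points the @config_files line of mrtg-rrd.cgi at a single MRTG config file.
# Assumes the line to change starts with "BEGIN { @config_files =".
set_cgi_config() {
    # "|" is used as the sed delimiter because CFG_PATH contains slashes.
    sed -i.bak "s|^BEGIN { @config_files = .*|BEGIN { @config_files = qw($2); }|" "$1"
}
```

For this scenario the call would be set_cgi_config /var/www/cgi-bin/mrtg/mrtg-rrd.cgi /etc/mrtg/mrtg.cfg. Once the result has been verified, the .bak copy is best removed from the cgi-bin directory.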


Once installed, RRDtool operates transparently with MRTG. You'll have to remember to add the RRD statements to any new MRTG configurations and also add the configuration file to the CGI script. Our monitoring server can now breathe a little easier.

Troubleshooting

The troubleshooting techniques for advanced MRTG are similar to those mentioned in Chapter 22, "Monitoring Server Performance", but because you have done some customizations you'll have to go the extra mile.

  • Verify the IP address and community string of the target device you intend to poll.
  • Make sure you can do an SNMP walk of the target device. If not, revise your access controls on the target device and any firewall rules that may impede SNMP traffic.
  • Ensure you can do an SNMP get of the specific OID value listed in your MRTG configuration file.
  • Check your MRTG parameters to make sure they are correct. Gauge values defined as counter, and vice versa, will cause your graphs to have continuous zero values. Graphs showing values eight times what you expect may have the bits parameter set, which converts bytes to bits.
  • There are a few errors common to initial RRDtool integration.
Web messages like this appear when the reference to the MRTG configuration file in the CGI script is incorrect:
Error: Cannot open config file: No such file or directory
"Permission denied" web messages are usually caused by incorrect file permissions and/or SELinux contexts:
Error: RRDs::graph failed, opening '/var/mrtg/localhost_192.168.1.100.rrd': Permission denied
Errors in the /var/log/httpd/error_log file referring to files or directories that don't exist can be caused by an incorrect IconDir statement in the MRTG configuration file:
[Wed Jan 04 15:42:13 2006] [error] [client 192.168.1.102] File does not exist: /var/www/html/var,
referer: http://bigboy/cgi-bin/mrtg/mrtg-rrd.cgi/ 

[Wed Jan 04 15:45:46 2006] [error] [client 192.168.1.102] script not found or unable to stat:
 /var/www/cgi-bin/mrtg/mrtg-l.png, referer: http://bigboy/cgi-bin/mrtg/mrtg-rrd.cgi/
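
The file-level causes listed above (a missing configuration file and unreadable .rrd files) can be checked from the shell before opening the browser. The function below is a sketch with a hypothetical name; it tests only ordinary file permissions for the user running it, so SELinux contexts still need to be compared with ls -Z as shown earlier.

```shell
#!/bin/sh
# check_rrd_setup CFG WORKDIR
# Reports a missing/unreadable config file and missing/unreadable .rrd files.
# Returns 0 when nothing was found wrong, 1 otherwise.
check_rrd_setup() {
    cfg=$1; workdir=$2; status=0
    if [ ! -r "$cfg" ]; then
        echo "config file missing or not readable: $cfg"
        status=1
    fi
    found=0
    for rrd in "$workdir"/*.rrd; do
        # The pattern stays unexpanded when no file matches; skip that case.
        [ -e "$rrd" ] || continue
        found=1
        if [ ! -r "$rrd" ]; then
            echo "not readable: $rrd"
            status=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "no .rrd files in $workdir"
        status=1
    fi
    return $status
}
```

To reflect what the CGI script actually sees, append a call such as check_rrd_setup /etc/mrtg/mrtg.cfg /var/mrtg and run the script as the apache user rather than as root, since root can read files that apache cannot.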

These quick steps should be sufficient in most cases and will reward you with a more manageable network.

Conclusion

Using the guidelines in this chapter you should be able to graph most SNMP MIB values available on any type of device. MRTG is an excellent, flexible monitoring tool and should be considered as a part of any systems administrator's server management plans.