If you have suggestions for a FAQ or would be willing to write one up, please contact me at netsaint@linuxbox.com
Contents:
Problems compiling NetSaint
Debugging "unknown variable" errors during configuration verification or runtime
Changing the contents of the default web page
Monitoring virtual web servers that use host headers
Monitoring remote host information
Monitoring printers
Monitoring Windows NT servers
Sending SNMP traps to management hosts
Troubleshooting problems with NetSaint
I'm having trouble compiling NetSaint - What can I do?
If you are running Linux, this is probably because you don't have the gcc compiler installed on your system. Either install the compiler yourself or ask your sysadmin to do it for you. If you are running SunOS, IRIX, HP-UX, *BSD, etc., you may have to tweak the Makefile a bit. This may involve changing the compiler name, compiler options, and/or linker options.
For some unknown reason, Solaris does not include the inet_aton() function in the standard shipped libraries. Apparently, installing the gcc compiler and glibc libraries for Solaris will allow the plugins (which make use of the function) to compile properly, or so I've been told. Future versions of NetSaint may include an internal version of the inet_aton() function to make things easier.
If you have to make changes to the Makefile in order to compile NetSaint, let me know what OS you are running and what changes you had to make. I would like to include this information in future releases.
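Before editing the Makefile, it can help to confirm whether a compiler is actually on your PATH. The following is a minimal sketch for a POSIX shell; the function names have_compiler and check_compiler are illustrative and are not part of NetSaint.

```shell
#!/bin/sh
# Return success if the named compiler is on the PATH.
# The default name "gcc" comes from the text above; pass another
# name (for example "cc") if your platform uses a different compiler.
have_compiler() {
    command -v "${1:-gcc}" >/dev/null 2>&1
}

# Print a one-line report for the named compiler.
check_compiler() {
    if have_compiler "$1"; then
        echo "found: ${1:-gcc}"
    else
        echo "missing: ${1:-gcc}"
    fi
}
```

Calling check_compiler with no argument reports on gcc; calling it with cc reports on the vendor compiler, which may be the name you need to put in the Makefile.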
Debugging "unknown variable" errors during configuration file verification or runtime
When trying to run NetSaint or verify your configuration file data using the -v argument, NetSaint may print out a message like "Error in configuration file 'xxxxxxx.cfg' - Line 34 (Unknown variable)". A few simple checks will usually resolve this problem...
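Since the error message names both the file and the line number, a quick first step is to look at that exact line. The following is a minimal sketch for a POSIX shell; show_config_line is a hypothetical helper, not a NetSaint tool, and it simply prints the reported line with one line of context on either side.

```shell
#!/bin/sh
# Hypothetical helper: print the reported line of a config file,
# plus the line before and after it, each prefixed with its number.
show_config_line() {
    file="$1"   # e.g. the .cfg file named in the error message
    line="$2"   # e.g. 34
    awk -v n="$line" \
        'NR >= n - 1 && NR <= n + 1 { printf "%d: %s\n", NR, $0 }' "$file"
}
```

For the message quoted above, you would pass the named configuration file and 34 as the two arguments.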
How do I change the contents of the default web page?
Several people have asked how to modify the default web page so that service detail or service overview information is displayed in the right-hand frame (instead of the intro page). You can do this rather easily by modifying the frameset information in the index.html page (located in the root web directory for NetSaint) as follows.
Default Frame Configuration:
<FRAMESET BORDER="0" FRAMEBORDER="0" FRAMESPACING="0" COLS="180,*">
Modified Configuration:
<FRAMESET BORDER="0" FRAMEBORDER="0" FRAMESPACING="0" COLS="180,*">
Replace xxxxx with one of the following values, or anything else you may want...
How do I monitor virtual web servers that use host headers?
If you are running a web server with multiple virtual servers and only one IP address, this applies to you. Let's say that your web server has an IP address of 192.168.0.1 and two virtual servers running on it - "www.myfirstdomain.com" and "www.myseconddomain.com". Both of these domain names resolve to the same IP address (192.168.0.1) during a DNS lookup. The check_http plugin can handle this type of situation without a problem. You will need to specify the virtual web site name as an additional command line argument to the plugin (using the -hn option). Example:
command[check_http2]=/usr/local/netsaint/check_http $HOSTADDRESS$ -u / -p 80 -hn $ARG1$
The check_http2 command defined here will use the check_http plugin to open a connection to port 80 of the host at IP address 192.168.0.1. It will then send an HTTP/1.1 request for the root document, along with either "Host: www.myfirstdomain.com" or "Host: www.myseconddomain.com" in the request header, depending on the virtual server name passed in as $ARG1$.
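To make the role of the Host header concrete, here is a minimal sketch of the kind of request described above, written for a POSIX shell. It is an illustration only: build_vhost_request is a hypothetical helper, and the GET method and the "Connection: close" header are assumptions, since the text does not say exactly what check_http sends.

```shell
#!/bin/sh
# Hypothetical helper: print an HTTP/1.1 request for the root document
# of one virtual server. Only the Host header differs between sites
# that share the same IP address.
build_vhost_request() {
    vhost="$1"   # e.g. www.myfirstdomain.com (the value passed as $ARG1$)
    # HTTP uses CRLF line endings; an empty line ends the headers.
    # GET and "Connection: close" are assumptions made for this sketch.
    printf 'GET / HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n' "$vhost"
}
```

Running build_vhost_request once with each domain name shows that the two requests are identical apart from the Host line, which is what lets the web server pick the right virtual server.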
How do I monitor remote host information?
Several people have asked how to use the check_disk, check_procs, check_load, and check_users plugins to report remote host information. Although you could use rsh or ssh to execute the plugins on the remote machines, I would suggest using one of the other methods described below...
How can I monitor NT servers?
This is an important question, since anyone running NT knows that about 80% of all network server problems seem to come from these beasts. The good thing is that NT has many performance counters that can be exposed for the purpose of monitoring. The bad thing is that there is no easy way to do this. The only way I currently know of to do it remotely is via SNMP. I have plans to write an NT service that acts similar to James Drews' MRTG Extension for NetWare, but this is a little ways off right now. For the time being, you'll have to use SNMP to obtain various NT system and application statistics.
In order to expose NT performance counters for monitoring, you'll have to run the SNMP service on all servers you want to monitor. You'll also have to install any necessary performance MIBs for the services you want to monitor. I believe these can be found in the NT Resource Kit or in various server admin packages. If you're feeling extra lucky, you can try searching the Microsoft site for the terms SNMP and MIB and maybe you'll find something...
You can search the MRTG mailing list archives for more information on configuring NT servers to expose various performance counters via SNMP. I know this has been discussed in the past, as many people are graphing various NT performance statistics using MRTG. In fact, somebody from Microsoft is actually doing it - you can find their web page at http://snmpboy.rte.microsoft.com/.
Once you've actually got the SNMP stuff working, you can use the check_snmp plugin to query your NT servers and generate alarms.
How do I monitor printers?
Assuming you have HP printers with JetDirect© cards installed, you can use the HP printer plugin to monitor them. Before you begin monitoring printers, you should carefully plan your configuration to match the level of monitoring and response time you need. You need to balance this against the annoyance of getting alerted every time someone takes the printer offline to manually feed a transparency, etc. A lot of admins probably don't care if the printer is jammed or is out of paper, but some tech support people in large corporations might find this to be a useful feature. Anyway, if you decide to do this you will need to do the following things:
Can NetSaint send SNMP traps to management hosts?
Yes, but not directly. NetSaint relies on plugins to handle the gathering of service and host information and event handler scripts to handle events that occur with services and hosts. If you want to have NetSaint send an SNMP trap to a management host in the event that a particular service has a problem, you will have to write a service event handler script and add it to the event_handler option of the service definition. If you have the UCD-SNMP package installed on your host, you could have the script call the snmptrap command to actually send a trap message, depending on what type of service event occurred. Look at the example event handler script to get a better idea of how to write a script.
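The decision logic of such an event handler might look roughly like the sketch below, written for a POSIX shell. Everything specific in it is an assumption: the function names, the argument order (state, host, service description), the state strings, and the SEND_TRAP variable are all hypothetical. The actual snmptrap arguments depend on your UCD-SNMP version, so the sketch leaves that command to you and defaults to echo as a dry run.

```shell
#!/bin/sh
# Decide whether a service state warrants a trap.
# The state names used here are assumptions made for this sketch.
should_send_trap() {
    case "$1" in
        CRITICAL|WARNING|UNKNOWN) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical handler: arguments are state, host name, service description.
handle_service_event() {
    state="$1"
    host="$2"
    desc="$3"
    if should_send_trap "$state"; then
        # SEND_TRAP is a placeholder for your own snmptrap invocation
        # (see the snmptrap man page for your version's arguments).
        # It defaults to echo, so the sketch only prints the message.
        ${SEND_TRAP:-echo} "NetSaint: $desc on $host is $state"
    fi
}
```

With SEND_TRAP unset, calling handle_service_event with a problem state prints the message that would be sent, which makes it easy to test the handler before wiring in a real trap command.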
Something isn't working properly - How can I track down the problem?
I've worked in tech support for a few years and have spent my share of time on a helpdesk. Most people are vague when they report a problem and have no desire whatsoever to try and track down the problem - they just want you to fix it now. I hope you are not that type of person. NetSaint is relatively new and is probably chock full of bugs, so things will not always work properly. If you suspect that either the service check or notification routines are not working, here are a few things you can do to try and track down the problem...
The first thing you should do is verify your configuration data by running NetSaint with the -v option. Example:
./netsaint -v /usr/local/netsaint/etc/netsaint.cfg
If no errors are found, proceed to the next steps. If NetSaint reports some error, go back and fix your configuration files.
The next step will take more time, but will give you more information on what is going on inside of NetSaint. When I first developed NetSaint I added a lot of debugging code to help me track down problems. I still use that code when I add new features or track down bugs myself. Here is how to use the debugging code...
Reconfigure NetSaint and enable one or more debug options as follows, replacing the "--enable-DEBUGx" with one or more of the values from the table below:
./configure --prefix=/your/netsaint/directory --enable-DEBUGx
Debugging Options
Recompile NetSaint.
Verify your configuration data again - you'll see a lot more information this time if you have enabled the DEBUG1 option. Try redirecting output to a file so that you can view or print it at a later time.
If you have defined either the DEBUG3 or DEBUG4 options, run NetSaint as a foreground process and start monitoring your services. Example:
./netsaint /usr/local/netsaint/etc/netsaint.cfg
Kill NetSaint at an appropriate point (e.g. after a service check fails) and look through the output. It should help you track down where the problem is occurring. Some code tweaking may be necessary on your part in order to fix things. Let me know if you have to make any such alterations so I can include the fix in future releases.
If you are unable to determine or fix the problem on your own, email me the following items:
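The verification and foreground-run steps described above both produce output that is easier to study once it has been saved to a file. The following is a minimal sketch for a POSIX shell. The function names, the NETSAINT_BIN and NETSAINT_CFG variables, and the log file names are illustrative and not part of NetSaint. Capturing both standard output and standard error is an assumption, since the text does not say where the debugging output is written.

```shell
#!/bin/sh
# Default paths are taken from the examples above; set NETSAINT_BIN
# and NETSAINT_CFG (illustrative names) to match your installation.

# Verify the configuration and save everything NetSaint prints.
verify_config() {
    log="${1:-verify.log}"
    "${NETSAINT_BIN:-./netsaint}" -v \
        "${NETSAINT_CFG:-/usr/local/netsaint/etc/netsaint.cfg}" > "$log" 2>&1
}

# Run NetSaint in the foreground, saving its output until it is killed.
run_foreground() {
    log="${1:-debug.log}"
    "${NETSAINT_BIN:-./netsaint}" \
        "${NETSAINT_CFG:-/usr/local/netsaint/etc/netsaint.cfg}" > "$log" 2>&1
}
```

After verify_config returns, or after you kill the foreground run, the log file can be viewed or printed at your leisure, or attached to a problem report.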