troubleshooting this type of issue requires checking the following things
Are the traps being sent?
The most likely cause that we've seen is that the CS 1000 isn't configured to send its emergency traps to the correct IP address, so it's the first thing we check. The configuration isn't difficult, but it's often overlooked. For specific instructions, see this post. Assuming the traps are being sent to the correct address, we can move on.
Are the traps being received?
This one is more difficult to diagnose, and it involves checking multiple things.
Is the Windows SNMP Trap Service Running?
The first step is to make sure the Windows SNMP Trap Service is running. In both cases, it was. The screen shot on the right shows the Windows Service MMC Snap-in, indicating clearly that the SNMP Trap Service is running.
The LENS Server also has a handy icon showing the state of the SNMP Trap Service.
Is the Windows Firewall Blocking the Trap?
The trap service may be running, but that doesn't mean the traps are getting through. SNMP traps travel over UDP port 162. If the active firewall rules block that port, the traps never make it to the trap service. Everything will appear to be running fine, but nothing will happen.
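One rough way to test whether UDP datagrams can reach the server at all is to send a probe from another machine and watch for it on the receiving side. The following is a minimal sketch, not something we used on these support calls. The function names and the probe payload are made up for illustration, and the probe is a plain datagram, not a valid SNMP trap. It also assumes you can briefly run the listener in place of the trap service, or on a spare port that has the same firewall rule.

```python
import socket


def open_udp_listener(host="0.0.0.0", port=162):
    """Bind a UDP socket. Passing port=0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def wait_for_datagram(sock, timeout=5.0):
    """Return (payload, sender), or None if nothing arrives in time."""
    sock.settimeout(timeout)
    try:
        return sock.recvfrom(4096)
    except socket.timeout:
        return None


def send_probe(host, port=162, payload=b"udp-162-probe"):
    """Send one plain UDP datagram (hypothetical payload, not a real trap)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

If the listener on the server never sees a probe sent from the same network segment as the CS 1000, a firewall rule somewhere along the path is a likely suspect. A probe that does arrive only shows the path is open; it says nothing about which process is receiving the real traps.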
Is Something Else Using Port 162?
This is the one that tripped us up on these two support calls. There was another application handling traps! In and of itself, this isn't out of the realm of possibility. The reason it didn't occur to us is that nothing else complained about the port being tied up by another process. The Windows SNMP Trap Service didn't raise any errors. It started up just fine. And because the Windows Trap Service was happy, LENS was happy.
We found the issue by running this command from the Windows command prompt:
netstat -ano
We then scanned the output, looking for lines containing port 162. On my development machine without any other trap handlers running, I had the following relevant lines in the command's output:
UDP 0.0.0.0:162 *:* 428
UDP [::]:162 *:* 428
Those lines indicate that port 162 is being used by a process whose Windows Process ID is 428.
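Scanning the netstat output by eye works, but it can also be scripted. Here is a small sketch that pulls the owning PIDs for a UDP port out of the text that netstat -ano prints. The function name is my own, and the third sample line (port 1620, PID 900) is invented to show that near matches are ignored.

```python
def udp_port_owners(netstat_output, port=162):
    """Return the set of PIDs that `netstat -ano` lists for a UDP port."""
    pids = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # UDP rows have four fields: protocol, local address,
        # foreign address, PID. TCP rows carry an extra state column.
        if len(fields) != 4 or fields[0].upper() != "UDP":
            continue
        local_address, pid = fields[1], fields[3]
        # rpartition copes with IPv6 forms such as [::]:162.
        if local_address.rpartition(":")[2] == str(port) and pid.isdigit():
            pids.add(int(pid))
    return pids


sample = """\
  UDP    0.0.0.0:162            *:*                                    428
  UDP    [::]:162               *:*                                    428
  UDP    0.0.0.0:1620           *:*                                    900
"""
print(sorted(udp_port_owners(sample)))  # prints [428]
```

On a live server, the text could come from something like subprocess.run(["netstat", "-ano"], capture_output=True, text=True).stdout. More than one PID in the result is a strong hint that two applications are sharing the port.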
To find out what process that is, simply start up the Task Manager and click the Processes tab. Process IDs are not displayed by default, so if you don't see it, click the View Menu and choose Select Columns... In the window that opens, choose PID (Process Identifier) and click OK.
Now sort on the PID column and find the Process ID matching the one from the output from the netstat command (in my case, 428). You may need to select the Show processes from all users option before it will appear. As you can see from the screen shot here, on my system, Process ID 428 is in fact snmptrap.exe, also known as the SNMP Trap service. Thus, at least on my system, everything is configured properly.
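If you'd rather not click through Task Manager, the same lookup can be done from the output of the Windows tasklist command in CSV form (tasklist /FO CSV /NH). The sketch below is only an illustration: the function names are mine, the memory figures are placeholders, and othertrapd.exe with PID 5120 is a hypothetical conflicting application, not the one we found at either customer site.

```python
import csv
import io


def process_names(tasklist_csv):
    """Map PID to image name from `tasklist /FO CSV /NH` output."""
    names = {}
    for row in csv.reader(io.StringIO(tasklist_csv)):
        # Columns: image name, PID, session name, session number, memory.
        # A header row, if present, is skipped because "PID" is not a digit.
        if len(row) >= 2 and row[1].isdigit():
            names[int(row[1])] = row[0]
    return names


def unexpected_owners(pids, names, expected="snmptrap.exe"):
    """Return the PIDs whose process is not the expected trap service."""
    return sorted(p for p in pids if names.get(p, "").lower() != expected)


# Illustrative tasklist output; the second row is hypothetical.
sample = (
    '"snmptrap.exe","428","Services","0","4,532 K"\n'
    '"othertrapd.exe","5120","Services","0","10,240 K"\n'
)
names = process_names(sample)
print(unexpected_owners({428}, names))        # prints []
print(unexpected_owners({428, 5120}, names))  # prints [5120]
```

An empty list means the only process on the port is snmptrap.exe, which is the healthy case described above. Anything else in the list is worth a closer look.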
This is the expected scenario, but wasn't the case in our two customer support issues. In both instances, once we reached this step, we found another process holding port 162, even though the Windows SNMP Trap service was also running, apparently without any problems.
The only thing left to do at that point was to stop the conflicting application, or if that wasn't possible, find another Windows Server on which to run the LENS Server.
The Lesson
If the LENS Server software doesn't appear to be responding to emergency calls from the CS 1000, don't rely on the fact that the Windows SNMP Trap Service is running. Take the extra two minutes and make sure nothing else is using UDP port 162. Network applications don't always play well together, especially when two of them want to use the same port. Normally, you'd expect one of them to complain. But when neither does, at least there is another way to determine what's going on.
