Now that most of our systems are managed by Puppet, I figured we should be monitoring the Puppet agent to make sure it is actually running, after encountering a few servers where it was stopped or dead. A quick Google search turned up a few Nagios plugins, but they were more advanced than the simple running/dead check I wanted, so I wrote my own. It's just a simple bash if statement that checks whether the agent is running, and it is deployed to all servers with our monitoring module. The obvious catch is that it won't be deployed to servers where Puppet isn't running, but I will find them and fix them!
The script is:
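Roughly along these lines, as a minimal sketch rather than the exact script. It assumes the agent writes its PID to `/var/run/puppet/agent.pid` (the path varies by Puppet version and packaging) and uses the standard Nagios exit codes, 0 for OK and 2 for CRITICAL. The function name `check_puppet_agent` is made up for illustration.

```shell
#!/bin/bash
# Sketch of a running/dead check for the puppet agent.
# Assumption: the pidfile path is passed in by the caller; the real
# location depends on the Puppet version and how it was packaged.

check_puppet_agent() {
    local pidfile="$1" pid

    # The whole check is one if statement: the pidfile must be readable,
    # must contain a PID, and that PID must belong to a live process.
    if [ -r "$pidfile" ] \
        && pid="$(cat "$pidfile")" \
        && [ -n "$pid" ] \
        && kill -0 "$pid" 2>/dev/null; then
        echo "OK: puppet agent is running (pid $pid)"
        return 0    # Nagios OK
    fi

    echo "CRITICAL: puppet agent is not running"
    return 2        # Nagios CRITICAL
}
```

To turn this into a plugin, the file would end with a call such as `check_puppet_agent "${1:-/var/run/puppet/agent.pid}"`, so that the function's return status becomes the script's exit code. Probing a root-owned process with `kill -0` fails for an unprivileged user, which is one plausible reason a check like this has to run through sudo.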
It's deployed to every server with our monitoring module:
And I have added an exec to our monitoring module to ensure that the nagios user has the correct permissions in sudoers to execute the plugin:
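What an exec like that typically runs is an idempotent append: add the rule only if it is not already present. The sketch below shows that idea in plain shell rather than Puppet. The helper name `ensure_sudoers_rule`, the plugin path, and the exact rule text are assumptions for illustration, not the ones from our module.

```shell
#!/bin/bash
# Sketch: append a sudoers rule to a file only when it is missing.
# Assumption: the rule text and target file are supplied by the caller.

ensure_sudoers_rule() {
    local file="$1" rule="$2"

    # -x matches the whole line, -F treats the rule as a fixed string,
    # so running this repeatedly never adds a second copy.
    grep -qxF -- "$rule" "$file" 2>/dev/null \
        || printf '%s\n' "$rule" >> "$file"
}

# Hypothetical invocation (needs root, so it is left commented out):
# ensure_sudoers_rule /etc/sudoers.d/nagios \
#     'nagios ALL=(root) NOPASSWD: /usr/lib/nagios/plugins/check_puppet_agent'
```

Writing to a drop-in file under `/etc/sudoers.d` and checking it with `visudo -cf <file>` afterwards is a safer habit than appending to `/etc/sudoers` directly, since a syntax error in sudoers can lock everyone out of sudo.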
Then finally the line in the nrpe config looks like:
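NRPE command definitions take the form `command[<name>]=<command line>`, so the entry would be something like the following. The command name, the sudo path, and the plugin path are assumptions here and will differ between distributions.

```
command[check_puppet_agent]=/usr/bin/sudo /usr/lib/nagios/plugins/check_puppet_agent
```

The Nagios server then calls it by name through `check_nrpe`, for example with `-c check_puppet_agent`.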