End the Chaos, Get with the Program: Python Logging to Syslog, and Filtering with RSyslog

This article describes how to use Python’s logging
library to send logs to syslog. Then, using rsyslog,
a fancier syslog, we produce a log of pre-filtered
output.

I have a process that runs a few times an hour, and
it spits out results to STDOUT, which cron dutifully
emails to me. As you might imagine, I get thousands
of emails which I never read.

I needed to send this information to a log, and one that’s
not filled with extraneous information.

Rsyslog is the standard logging daemon in Debian-based
distros, and it’s like a magical, difficult-to-use,
all-in-one log routing daemon. Not only does it support
BSD syslog’s facility.priority model, but it can filter
the content of the log line.

(It can do a lot more, but we won’t get into that.)

Python has a elaborate logging library, logging,
that has features that overlap with rsyslog, almost
to the point where rsyslog is superfluous, but to
keep complexity out of the code, we’re going to use
just a few features.

Adding Logging to the Python code

logging is a library that lets you piece together a
logging solution. Here’s the code extracted from the
top of my utility script:

import logging
import logging.handlers
from logging.handlers import SysLogHandler

sysl = SysLogHandler(address='/dev/log', facility=SysLogHandler.LOG_USER)
sysl.setFormatter(logging.Formatter('remote_db_runner: %(levelname)s - %(message)s'))
logger = logging.getLogger('remote_db_runner')
logger.addHandler(sysl)
logger.info("remote_db_runner decorator logging started")

SysLogHandler sends the log lines to a syslogd server.
The named parameter address=’/dev/log’ causes the program
to communicate via a Unix domain socket with the log server.

logging.Formatter sets a template for the log line. logging.getLogger
creates a new logger. addHandler adds a handler to the logger;
you can add more than one.

Then, once it’s all set up, it writes a message to the logger, rsyslogd.

Making a New Log File with Rsyslogd

In Debian, the files in /etc/rsyslog.d/ that end in .conf are used
to configure rsyslogd. Create a new file, /etc/rsyslog.d/company.conf:

*.* -/var/log/company.log

This configuration logs everything to company.log.

Then restart the service: sudo service rsyslogd restart

You should see a new file, /var/log/company.log with all the log messages
on the system going to it. If it’s not too busy, great. You can run
your Python script and see if the log message appears. If it does,
great.

Now, we need to change the configuration so it only logs lines that
contain “remote_db_runner:”. Change the company.conf file to:

# log specific lines to company debug log
:rawmsg,contains,"remote_db_runner:" -/var/log/company.log

Delete company.log, restart rsyslogd, and then look at
the log file after running the script.
It should only show the new lines that we’re logging
from the Python script.

(The configuration we used conflicts a little with the docs, but
reading the IETF standard, RFC5424, hints at why we must match
“rawmsg” instead of “msg”, and why we use “contains” instead of
“startswith”. The logging standard delivers data to the logging
server as a data record, not simply a string.)

With this config file, we can add lines to include other log
lines produced by our code, if we are careful to format the
log lines via logging.Formatter().

References

Rsyslogd

Python logging

RFC5424