Showing posts with label python. Show all posts

2011-04-18

Python Logging

I've been using the Python logging module for a couple of weeks now, and I want to like it because A) it's a standard module, and B) it has some cool features like multiple handlers and a logger hierarchy. But almost every time I use it I feel like I might as well just write my own logging module suited to my purposes... because it seems like I have to do that anyway. The module just seems to require too much scaffolding and setup to use.

Here's what I mean. To do it properly you have to:
  • get a logger
  • set the verbosity level of the logger
  • create a file or stream handler
  • create a formatter (the default needs replacing)
  • add the formatter to the handler
  • add the handler to the logger
  • do this all again if you want to mirror to stderr AND to a file (which is why I started using logging in the first place)
  • put in code to shut down the logging (makes sure the streams get flushed) and for safety use the atexit module, meaning
    • import atexit
    • register the shutdown
  • add an exception hook so that we can log uncaught exceptions too
This is just a little too much for basic proper use, don't you think?
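
For reference, the steps in the list above can be sketched roughly as follows. This is a minimal sketch assuming Python 3; the function name build_logger, the logger name, and the log file path are made up for illustration and are not from any library.

```python
import atexit
import logging
import sys


def build_logger(name, logfile):
    """Hypothetical helper that walks through the manual setup steps."""
    logger = logging.getLogger(name)               # get a logger
    logger.setLevel(logging.DEBUG)                 # set its verbosity level

    # create a formatter to replace the default one
    formatter = logging.Formatter("%(levelname)s: %(message)s")

    file_handler = logging.FileHandler(logfile)    # create a file handler
    file_handler.setFormatter(formatter)           # add the formatter to the handler
    logger.addHandler(file_handler)                # add the handler to the logger

    console = logging.StreamHandler(sys.stderr)    # do it all again for stderr
    console.setLevel(logging.INFO)
    console.setFormatter(formatter)
    logger.addHandler(console)

    atexit.register(logging.shutdown)              # flush the streams at exit

    def log_uncaught(exc_type, exc_value, exc_tb):
        # send uncaught exceptions to the log as well
        logger.error("Uncaught exception", exc_info=(exc_type, exc_value, exc_tb))

    sys.excepthook = log_uncaught
    return logger
```

Calling build_logger("myapp", "myapp.log") once at startup would then give a logger that writes DEBUG and above to the file and INFO and above to stderr.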

To be fair, there is a "simple" way to use logging: call the module-level functions "basicConfig()" and "debug|info|warning|error|etc()" directly, without getting a logger for your module.  But it doesn't give the behaviour I want and even they prefer you don't use it in this manner.
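
As a rough illustration of that "simple" route, assuming Python 3, the sketch below configures the root logger once and then logs through the module-level functions. The format string and messages are arbitrary examples.

```python
import logging

# Configure the root logger in one call. With these arguments it gets a
# single stream handler writing to stderr. Note that basicConfig() does
# nothing if the root logger already has handlers attached.
logging.basicConfig(level=logging.DEBUG, format="%(levelname)s: %(message)s")

# Module-level functions log through the root logger.
logging.debug("starting up")
logging.warning("disk is nearly full")
```

Everything here goes through the root logger, so there is no per-module logger name to filter on, and mirroring to both stderr and a file still needs extra handler setup.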

What I believe is missing is a set of helper functions and/or syntactic sugar to handle common tasks.
  • let more things like Level and Handlers be passed as arguments to getLogger
  • automatically wrap common things like a file-like object or a filename string in a handler instead of requiring you to make one explicitly.
  • an at-exit shutdown should be somewhat implicit (maybe an option to turn it off) as well as the option to trap other exceptions
And what I'd like to have for simple operation:
  • one line (minus "import") to get a logger for my module with any optional formatting and whatnot.
  • one line to configure the root logger with all options, that can deal with an array of logging destinations, that will auto-interpret formatting strings and destinations instead of needing to create sub-handlers and formatters, etc.
Here was my first crack at collapsing all that with two helper functions, but I hate having to add more functions to import for things that should have just been available (yes it's a bit ugly).

import logging
import sys

def add_to_logging(log,whereto=None,level=10,format="%(levelname)s: %(message)s",dateformat='%Y%m%d_%H%M%S'):
    ''' shortcut to attach a destination to an existing logging object
    whereto can be a filename, a file-like object (file, gzip, stream), or None (meaning stderr) '''
    if whereto is None: whereto = sys.stderr
    if isinstance(whereto,(str,unicode)):
        fp = opener(whereto,'w')
    else:
        fp = whereto
    fh = logging.StreamHandler(fp)
    fh.setLevel(level)
    if format: 
        formatter = logging.Formatter(format,dateformat)
        fh.setFormatter(formatter)
    log.addHandler(fh)

def setupLogging(logname=None,rootname='',timestamp=False,consoleLevel=20):
    ''' shortcut to set up a dual stderr/logname LOGGING stream 
    default level for file is DEBUG, for console is INFO
    (set consoleLevel to 0 to turn off console)
    SEE PYTHON LOGGING DOCUMENTATION FOR LOGGING BEHAVIOR
    returns a logging object'''
    import atexit
    logger = logging.getLogger(rootname)
    logger.setLevel(logging.DEBUG)
    dateformat='%Y%m%d_%H%M%S'
    # change the formatting if timestamp
    fmtstring = "%(levelname)s: %(message)s"
    if rootname is not None and rootname != '':
        fmtstring = "%(name)s:" + fmtstring
    if timestamp:
        fmtstring = "[%(asctime)s:%(name)s:%(lineno)s:%(levelname)s] %(message)s"
    # add a file if specified
    if logname:
        assert isinstance(logname,(str,unicode))
        #logging.basicConfig(filename=logname,format=fmtstring,datefmt=dateformat)
        add_to_logging(logger,logname,format=fmtstring,dateformat=dateformat)
    # add a console
    if consoleLevel!=0:
        add_to_logging(logger,sys.stderr,format=fmtstring,level=consoleLevel,dateformat=dateformat)

    # cleanup and exception handling
    atexit.register(logging.shutdown)
    # the following will capture exceptions to the logs as well
    sys.excepthook = lambda *x: logger.error('Uncaught Exception',exc_info=x)
    return logger



In the above, "opener()" is a separate function I have that wraps opening a filename, file object, pipe, or what have you depending on the input, optionally with an encoding.  Sometimes I miss how easily that is dealt with in Perl.
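
The real "opener()" is not shown here, so the following is only a guess at its general shape, written for Python 3. The handling of ".gz" names and of "-" for the standard streams is an assumption made for illustration, and pipes are left out entirely.

```python
import gzip
import io
import sys


def opener(target, mode="w", encoding="utf-8"):
    """Hypothetical stand-in for the opener() helper mentioned above.

    Accepts an already open file-like object, "-" for stdin/stdout,
    a filename ending in ".gz", or a plain filename.
    Only the simple modes "r", "w" and "a" are expected.
    """
    if hasattr(target, "write") or hasattr(target, "read"):
        return target                                   # already file-like
    if target == "-":
        return sys.stdin if "r" in mode else sys.stdout
    if target.endswith(".gz"):
        # "t" asks gzip for a text-mode wrapper around the compressed file
        return gzip.open(target, mode + "t", encoding=encoding)
    return open(target, mode, encoding=encoding)


# A file-like object is passed straight through.
buffer = io.StringIO()
assert opener(buffer) is buffer
```

With something like this in place, add_to_logging can accept either a filename or a stream and hand the result to a StreamHandler.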

2005-11-22

The infamous Python indentation

I don't care how good it looks, how clean it appears, or how visually pleasing it is to the reader... making indentation part of the Python syntax is just an annoyingly bad idea. I use emacs for most coding but occasionally use vi(m) for quick fixes, especially when I'm working from another terminal or on someone else's machine. The editors are not always configured identically, so they produce different indentation widths. It's just an annoyance, but it occurs often enough to make me groan. I've read people defending this kind of syntactically significant whitespace by claiming that all editors worth their salt can be configured to do the right thing and that the clean listing is worth the "slight" inconvenience. I think it's wrong to require a certain level of editor in order to edit code properly and easily. I should be able to use whatever I want without having to configure it, even a DOS editor. But here's a short list of how it inconveniences my edits:
  • When developing, I often disable chunks of code with an "if (0)" or "if (false)". Because there's no end-of-block keyword (indentation marks the end), I can't simply wrap a block like this in Python. I have to resort to highlighting the entire section and re-indenting it en masse. Or I have to highlight it all and comment it out. Again, this requires an editor with the correct macros, and it is still more work than the simple if(true|false) I can use with most other languages.
  • Python's lack of block-end keywords makes it easy to unintentionally nest blocks. Ever have consecutive if-statements and the second if auto-indents under the first one instead of at the same level? This is a common gotcha when I'm editing and replacing a line of code.
  • In emacs I often use tab to auto-indent/clean-up code. Lack of block end keywords means the indentation is not unique. It also means that you cannot have a beautifier program.
  • It's dangerous to cut and paste code from one section to the other, or off of web pages or other programs because the indent may be different (and difficult to clean up) or it may be at the wrong level.
  • Using Python interactively from the command line becomes a pain too since you have to keep track of indents for every line.
I sympathise with the desire to produce consistent, pleasant, and hence more maintainable code. I think enforcing it this way created more annoyances than it meant to solve. A simple beautify program often does the trick when you want the script cleaned up after a lot of cut/paste/reorganization.
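
The accidental nesting mentioned in the list above is easy to reproduce. In this small sketch (function names invented for the example), the two functions contain the same statements and differ only in how far the second "if" is indented, yet they return different results.

```python
def count_flags_sibling(a, b):
    """Two independent checks at the same indentation level."""
    hits = 0
    if a:
        hits += 1
    if b:
        hits += 1
    return hits


def count_flags_nested(a, b):
    """Same lines, but the second 'if' has slipped under the first."""
    hits = 0
    if a:
        hits += 1
        if b:
            hits += 1
    return hits


print(count_flags_sibling(False, True))  # prints 1
print(count_flags_nested(False, True))   # prints 0
```

Both versions are valid Python, so neither the interpreter nor an auto-indenter can tell which one was intended.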

2005-11-16

YAML

YAML once meant "Yet Another Markup Language" and now means "YAML Ain't Markup Language", but I would prefer it to mean "What do you get when you breed a Yak and a Camel". It's a data serialization format that (and this is the important part) is human-readable text. It was brought to my attention a year ago by a coworker, and I have since hacked it into our system to better handle log file parsing. I completely agree that it was the right idea. It has made my log file parsing much less stressful and more consistent. It just sucks that the online community for it is still... lacking. Some of the main YAML sites seem grossly out of date and nearly empty.

My recent frustration is dealing with incomplete implementations in the different scripting languages. I'm currently parsing Python-generated YAML streams with Ruby. The two implementations seem to conform to different versions of the specification; the most troublesome difference for me right now is how they each treat boolean values. My Python implementation wants to turn "+" and "-" into true/false (I have since hacked that out). Ruby wants to turn strings like "true/false", "yes/no", "Y/N", and even "on/off" into boolean true/false. This is causing all sorts of minor annoyances, since I'm dealing with text that contains these as literal strings and not as codes for boolean values. In fact, I think that having the YAML spec interpret strings as booleans is just a bad idea, with perhaps some exceptions. Auto-interpretation of data is a touchy subject and you're bound to upset someone no matter which way you go. I'd prefer the "everything's a string" method, since that would at least guarantee consistency. But for now I just hack in fixes for my own purposes. I may submit a patch someday. I'd feel safer using the format if I had a sense of more active development, however.
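
To make the boolean problem concrete, here is a small Python sketch of the two policies. It is not a YAML parser and does not reproduce any particular library; the word list follows the YAML 1.1 bool type as commonly documented, and the function names are invented for the example.

```python
import re

# Plain (unquoted) scalars that the YAML 1.1 bool type reads as booleans.
YAML11_TRUE = re.compile(r"y|Y|yes|Yes|YES|true|True|TRUE|on|On|ON")
YAML11_FALSE = re.compile(r"n|N|no|No|NO|false|False|FALSE|off|Off|OFF")


def resolve_yaml11(scalar, quoted=False):
    """Mimic YAML 1.1 style boolean resolution for a single scalar."""
    if quoted:
        return scalar        # quoted scalars without an explicit tag stay strings
    if YAML11_TRUE.fullmatch(scalar):
        return True
    if YAML11_FALSE.fullmatch(scalar):
        return False
    return scalar


def resolve_as_string(scalar, quoted=False):
    """The "everything's a string" policy: never reinterpret the text."""
    return scalar


words = ["on", "N", "maybe"]
print([resolve_yaml11(w) for w in words])               # [True, False, 'maybe']
print([resolve_yaml11(w, quoted=True) for w in words])  # ['on', 'N', 'maybe']
print([resolve_as_string(w) for w in words])            # ['on', 'N', 'maybe']
```

The middle line hints at the usual workaround: if the emitting side quotes such values, a conforming loader should leave them as strings, though that only helps when both ends are under your control.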

Ruby seems to be the only safe haven for YAML at the moment. They seem to be trying to turn it into a pickling or marshalling replacement, though. That's not really what I want out of YAML. It's just a really good way to display nested list and hash structures in a clean, pleasant, and consistent way. Anything more may make it too complicated to be portable and general purpose. I'd still rather use it than write my own though, since I don't really have the time.