Python DB Logging Handler for Google Appengine

February 11th, 2010

NOTE: this is really only for CRITICAL logging you want to persist separately from google’s wrap of logging. You will burn up your quota otherwise. Main advantages: implements logging handler, maintains support for identities. Good for audit trails.

UPDATE: Appears the module level variable magic is not liked by appengine-patch. I switched to that from appengine helper because of hybrid auth convenience. I have updated to remove that part to prevent any module level variable wierdness. Updated code below. Alternative: just stick it inside your django tree.
— end UPDATE

I provide here a module that extends the Python logging framework to allow you to write messages to your Google appengine database. A database LogRecord model, Handler, Formatter, and logger manager are implemented.

( jump to the code )

Since you can’t store write access files on appengine, you have no easy way to separate your logging to different file handlers. The built in logging is nifty in the dev cycle. But a long term store is desireable, with the ability to set different log levels for different parts of your system.

So the goals I set out with were:

  • Persist all the usual log record info, including file, line, module, stack, etc. as available
  • Allow multiple logger identities for easy separation and grouping
  • Don’t break anything about logging module’s expected behavior
  • Provide a convience wrapper for zero-effort use

Now, in principle I’d generally avoid logging to a database, since log records are inherently denormalized, sequentially added, and typically read only after insert. They’re the perfect candidate for a flat file. But that is not an option here. Plus, I have confidence that the Appengine table structure should allow this to perform more or less like a flat file. Caveat: depending how you use this, you could hit your quotas substantially faster.

Quick note: this Handler provides its own formatter, which just shapes the data for the Appengine table. It doesn’t make sense to provide this a format string since it is not serializing the record. Similarly, I didn’t think it useful to trim off the lineno and funcname detail, since you can just select what you want from the table.

Otherwise, you use it just like any other logger from the logging module. When you actually write a message, a db entry is created for you. The default logger, which is created when getLogger() is called with no arguments. You can interact with the logger to set levels etc. via the object, as in
log = Log2DB.getLogger()
log.setLevel(logging.DEBUG).

Records created by this logger set the identity value as ‘Log2DB’. You can filter identities from the log table by setting identity = whatever you want.

For example:

import Log2DB
log = Log2DB.getLogger()
log.error('fun times')

import logging
log.logger.setLevel(logging.DEBUG)
log.debug('unfun detail')

All the above go to the default logger identity, “Log2DB”.
To create another logger to use, say for example you want one just for mailer status, you use the getLogger proxy function.

For example:

import Log2DB
maillogger = Log2DB.getLogger('MyMailLogger')
malllogger.error('oh noes')

try:
  raise HellException
except HellException, e:
  maillogger.exception('Error: %s stack to follow', e)

A record from the above is in the same table as the default log, but has identity = ‘MyMailLogger’. I thought about actually dynamically changing the log record model class name so you would get separate tables, but given the google architecture this has limited perf gain at cost of brittleness and complexity. I may add this anyway as an option.

Later I may paste a django view for managing the resulting records.

here is the code via pastie.org:

this replaces the old version at http://pastie.org/432156
moved _LogRecordHelper to inner class to fix an issue

Extend Rails ActiveRecord and ConnectionAdapter to support dirty reads on MySQL

February 11th, 2010

Discussed here is a mixin that extends Rails to provide an easy method to switch your database session between clean and dirty reads, otherwise known as transaction isolation level.

( jump to the code )

At Rescuetime, our most interesting analysis of tracked time requires a lot of complicated data munging on the database side. We extensively optimize the structure of the datastore and the plan of queries to produce quick results. However, there is some, however small, amount of row scanning and index hunting that is inevitable. Some of these tables are also subject to rapid, row overlapping, simultaneous insert loads.

In general, reporting or analytical access has no worries about data being up to date to the nearest microsecond, although as near real time as possible is highly desired. This near real time goal rules out an ETL type solution. Additionally, there is the cost factor. If we can make this work on one database, why build two?

In the quest for minimal stress for the online system, we introduced a method for flagging database work to be dirty reads, thus preventing any kind of locking (especially index locking) on the rows in question, and applied these where possible. This is simple enough in straight SQL, but we wanted to expose it to Rails framework in a consistent manner.

What this code does is:

1) Provide stubs in the abstract database adapter for “dirty()”, “clean()”, “reallyclean()”
2) Implement them in the MySQL adapter
3) Expose them to ActiveRecord::Base as a class method, prefixed with “isolation_”

On #1, #2: the choice of really clean versus clean simply describes the read locking strategy used. EG if you have multiple selects in same transaction on same rows, “clean” returns same result from same snapshot. However “reallyclean” will return newer rows if they exist on the later selects.

On #3, the prefix “isolation_” is added (yielding “isolation_dirty() etc.) since there is already some semantic in place for “dirty” in ActiveRecord.

For MySQL we set:

dirty = READ UNCOMMITTED
clean = REPEATABLE READ
reallyclean = READ COMMITTED

See their reference.

Here is the code:


Just put this code in something like lib/mysql_adapter_extensions.rb in your project, then require that in some controller.

Example use:

Person.isolation_dirty()
result = Person.find_by_name params[:name] # some crazier query here
Person.isolation_clean()

Database Connection Pool Solution for Django + Mysql

February 11th, 2010

The solution:

An alternative DATABASE_ENGINE for django that leverages core django mysql code with minimal overlap and sqlalchemy for connection pooling. Also, a pretty ‘DRY’ stub to mix in your own pooling if desired, or adapt to use postgres instead of mysql.

1. Install sqlalchemy library: http://www.sqlalchemy.org/download.html
I used “Latest 0.5″ myself.

2. Grab: mysql_pool.tgz

3. Make mysql_pool reachable by your project
3a. Unpack in your python’s “site-packages” directory
3b. Or: unpack somewhere in your project directory, and edit the “base.py” file in mysql_pool to fix the import lines containing uw.udjango.db.engine to be the new location mysql_pool.

4. Edit your settings.py to change DATABASE_ENGINE
4a. If 3a, set to uw.udjango.db.engine.mysql_pool
4b. If 3b, set to yourproject.whatever.mysql_pool

5. Edit your settings.py to add these (required) tuning settings:
DBPOOL_WAIT_TIMEOUT = 28800 # your mysql db’s server side inactive connection kill time
# discernable by ’show GLOBAL variables;’ in mysql, look for ‘wait_timeout’, changeable if desired
DBPOOL_SIZE = 20 # the maintained number of dbconnections, over this returned conns are destroyed
DBPOOL_MAX = 100 # the max allow connections, period
DBPOOL_INTERNAL_CONN_TIMEOUT = 10 # how long to wait for mysql to give you a connection

That should do it!

The explanation:

I’ve been reading up on the various was being proposed to make some kind of persistent connection re-use part of core django. The are a couple of options I like, especially having a core database engine choice of sqlalchemy. A lot of implications there though, and I didn’t want to tackle them all.

At first I figured I’d write my own, and stubbed out the code to do that. Since I’m using mysql, I copied it’s engine tree in django.db.backends into my project and began working on it. Quickly I realized I didn’t want to rewrite, or maintain duplicates of, the whole tree. I only cared about two calls really: self.connection = Database.connect() and self.connection.close(). Modifying those would be a way to plug in pooling of my own. Also, in the process I looked to SQLAlchemy’s pool code as an example of Python implementation, and it seemed pretty good. Eventually I’d like to add some more subtlety to the pool grow/shrink strategy.

I started riffing on the solution described on Ed Menendez’s site, and merging in some of my ideas to make it more DRY.

Basically, all the classes in mysql_pool use class naming tricks to extend-without-modification and assume the identity and characteristics of the core mysql engine. That means there is no code in mysql_pool to maintain other than the “base.py” file, and even that one is able to use the same short cut for most of its classes, the exception being DatabaseWrapper, which is updated to use SQLAlchemy’s pool.

Also, I added settings.py level control of SQLAlchemy’s QueuePool, which is the pool type mysql_pool is forced to use.

Other links of note on related topics:

Proposal: user-friendly API for multi-database support

Canon 5D and Pentax-M SMC Asahi 50mm 1.4 K mount

February 11th, 2010

I picked up an older (serial 6021530) K mount 50mm 1.4 after reading the notes that say m42 won’t work on 5D, but the K mount will if aperture pin is cut.

I didn’t find any info on how to cut the pin, and man, that is one tough piece of steal. Of course after totally mangling it I figured out I could have just lifted it out if I removed the ring that is screwed on to the rear element housing and has the flange on it that protects the pin. Oh well.

It turned out that that piece (even with the protecting flange clipped off) needed to be left off anyway. It prevents infinity focus, at least with the adapter I’m using.

So, now the lens has pin and flange ring (and the metal washer/ring under it) removed. In this state the 5D works fine with it. I seem to be able to focus at infinity, but either due to lens or adapter, I can also focus a tiny bit past infinity, and an rare occasion this causes mirror collision.

Now to the good part. All of this is just impression, based on reviewing in Lightroom, and is not scientific. This lens takes beautiful shots. It has slightly more chromatic aberration than Canon EF 50mm 1.4 (expected from its age), but seems to have better microcontrast wide open. Color comes through a little more saturated than the Canon. On the full frame sensor, there also seems to be slightly less barrel distortion than the Canon.

I may have had a bad copy (is there such a thing?) of the new Zeiss 50mm 1.4 T* ZF, but from f/1.4 to f/2.8 I consistently get better looking results with this Asahi. The Zeiss had more loss of detail and more haze, with the contrast very hard to control with exposure. Stopped down past 2.8, the Zeiss was gorgeous, ultra crisp with smooth Zeiss neutral-to-cool tones. I sold it though because of the f/1.4 performance.

As far as wide-open out-of-focus behaviour, the Canon and Asahi are quite similar and both have smooth blur with that lovely circular bokeh. I’ve seen other complaints of the Zeiss’s OOF at wide apertures, and have to say it does have a wormish quality that bugged me. A matter of taste I guess.

Finally, on both Zeiss and Asahi, I had to find the appropriate exposure compensation as I adjusted ISO, in Av mode.
Something about like this:
ISO 100: +- 0 ev
ISO 200: -1/3 ev
ISO 400: -2/3 ev
ISO 800: -1 ev
Of course, the subject matter some times required further decrementing of the ev.

I’ll add some example shots once I pick some I’m not embarrassed to link here.

Murloc ringtone

February 11th, 2010

Yup, it’s the murloc going aggressive sound in mp3 format, with phone-ready bitrate (32bps, mono). I used lame to convert it, “lame -m m -b32 -h input.mp3 output.mp3″. Here is the original source: WowWiki, in normal full bitrate mode.