Planet CherryPy

May 08, 2008

Kevin Dangoor

Paver 0.7: Better than distutils, better docs and much more

I’m delighted to release Paver 0.7. If you missed my original announcement, the short story is that Paver is a new build, distribution and deployment scripting tool geared toward Python projects. My original announcement and the new foreword to the docs explain the motivation.

Ben Bangert and others pointed out a giant documentation bug in 0.4: there was a fair bit of reference doc but no doc that said “here’s how you get started with Paver”. Now there is: Paver’s Getting Started Guide.

Paver 0.7 is a big step up from 0.4 (hence the version number bump). I implemented one of the two major features I had planned for 1.0: distutils/setuptools integration. It’s really cool. Have you ever wanted to just slightly change how “sdist” or “upload” or “develop” worked? Now you can, just by writing a function in your pavement.py file. And don’t worry, you don’t need to duplicate anything between setup.py and pavement.py. It all just moves into pavement.py and Paver can even generate a setup.py file for you, since most people are used to the common “python setup.py install” command.

I’ve gone even farther than that with making it easy to use Paver and not annoy users that don’t yet have Paver. Paver can create a small zip file of Paver’s core bits so that “python setup.py install” will work just fine even for users who don’t have Paver installed. Paver can also create a virtualenv bootstrap script for you, so that users don’t necessarily need to install your package on their systems in order to use it.

Paver’s got new documentation tools that work great with Sphinx. It’s now easy to mark sections of sample code files and then include those sections in your documentation, using the built-in version of Ned Batchelder’s Cog.

And I’m definitely eating my own dogfood. Paver is built using Paver itself and the source distribution includes the paver-minilib so that setup.py install should work fine (let me know if it doesn’t!). The new Getting Started Guide uses the new documentation tools.

There are even more changes than these, and you can look at the changelog for the full list. Note that if you’re using Paver 0.4, there are a couple of trivial breaking changes.

by Kevin Dangoor at May 08, 2008 02:55 AM under Paver

May 05, 2008

Kevin Dangoor

simplejson heading for the stdlib

Seeing this on Bob Ippolito’s blog might seem a little odd to many:

Rewrote test suite with unittest and doctest (no more nosetest dependency)
[From simplejson 1.9]

Why on Earth would someone change from nose-style tests to unittest tests? How about so that the library can go into the Python Standard Library?

simplejson will be a great addition. Thanks to Bob and the others who are working to get simplejson in!

by Kevin Dangoor at May 05, 2008 02:06 AM under json

April 28, 2008

Kevin Dangoor

MichiPUG on Thursday: zc.buildout and Paver

The May Michigan Python Users Group (MichiPUG) meeting is coming up on Thursday, May 1. I’ll be leading discussion on zc.buildout and Paver. Any build/distribution/deployment tool talk is welcome, as well as our usual general talk about Python topics.

I hope to see you there!

by Kevin Dangoor at April 28, 2008 03:49 PM under Python

April 22, 2008

Kevin Dangoor

Paver and the building, distribution, deployment etc. of Python projects

This morning, I released a new open source “build tool” aimed at Python projects: Paver. The goal of Paver is to provide a smooth way to script up the management of your Python projects. You can read all about it on the Paver site, but I wanted to provide some background here.

Look at all these tools!

Python programmers have a great many tools at our disposal. We have tons of libraries that make it so that we don’t have to write lots of code to get our software built. We also have a broad collection of tools to help us manage our projects.

  • Python’s standard library includes distutils, for packaging up and distributing Python projects.
  • setuptools is almost part of the standard library, and quite a few projects require it. setuptools gives you cross-platform dependency management, more packaging options, script generation and a simple plugin framework.
  • zc.buildout helps you with the creation of repeatable, easily installed ready-to-run installations of projects. It gives you a contained environment so that you don’t need to muck with the global Python configuration on the system to make a working installation.
  • zc.buildout supports “recipes” that handle installation and configuration of various parts that your Python project may need
  • virtualenv gives you just the contained installation part of zc.buildout, but it does it in a slightly different way that I’ve found easier for certain, not-egg-friendly projects.
  • PasteScript can be used to generate configuration files and complete skeleton projects
  • Sphinx is a new package for generating documentation from reStructuredText sources. It’s very cool, and it’s what I used for Paver’s site.
  • and there are many, many more

Seems great, what’s the problem?

I have personally used all of these tools at one time or another. In fact, I’ve used them all recently. In working with them, I couldn’t help but notice some aspects that made my life harder than it needed to be.

For example, when using distutils or setuptools, it’s very easy to add behavior that runs before or after the setup command, because your setup files are just plain Python. It’s not as easy to customize the way a command behaves, or to add a new command entirely. You need to read the docs, make a new class and register that class somehow.

zc.buildout is awesome and makes it easy to get a predictable collection of components installed. It uses an INI file as its file format, which means that adding behavior is not straightforward. Creating a new zc.buildout recipe is very much like creating a new setuptools command: create a separate class and refer to it in an egg. I believe there’s a zc.buildout recipe for putting some commands in your INI file. Do you want Python code in your INI file?

Which also brings up another point: distutils and setuptools use a Python file and keyword parameters for their configuration. (There is also an optional INI file.) zc.buildout uses an INI file. Sphinx uses a .py file.

What would I want?

It seemed to me that life would be better if:

  • If I need to do something that takes 5 lines of Python, I could do it in little more than 5 lines of Python without adding another file for that purpose.
  • If configuration could take on a consistent, predictable form
  • If things that I do often in managing my projects took even less Python to script.
  • If the system could be used easily with multiple projects by not requiring anything else, but taking advantage of other packages when they’re present

It is with those goals and looking around that Paver came into existence. As with TurboGears, I did not want to reinvent the various parts of the whole that I’m angling for. The idea is to use zc.buildout’s machinery, not reinvent it. I used Jason Orendorff’s great path.py module rather than inventing my own abstraction there.

I didn’t set out to invent the scripting format, either. I seriously considered Zed Shaw’s Vellum which has shaped up quite nicely. But in trying it out, I realized that I really wanted my projects managed by Python scripts that had little headache and little overhead. Doing computations, loops and breaking code up into separate functions (or other organizing blocks) are all obvious for a Python programmer if the language is Python. For the record, Zed wants his build files to be just “data”, for perfectly rational reasons. For me, though, I want Python.

Now for the “this is an early release” caveat. Paver is functional, and I use it. But its support for the various libraries is quite shallow right now, and zc.buildout/virtualenv are not at all represented yet. What I’ve released is basically the parts that I’ve needed so far, and I’ll be adding on as I need things. I figured that if others think the approach is worthwhile, we can pool our efforts and build out the Paver Standard Library a bit quicker. I should also note that while Paver should work on Windows, it’s only been used on Macs and Linux. Finally, it’s possible that Paver’s pavement.py syntax may change along the way to 1.0, but I can promise to document those changes and I don’t expect a great deal of pain in making the transitions.

Note also that if you maintain a Python library that is useful in helping people work with the projects, it’s easy to add Paver targets and such to your own library. Paver itself includes support for other libraries because of the chicken-and-egg problem. People won’t support Paver until people are using it and people wouldn’t use it if it didn’t support the kinds of things that people already do. So, Paver includes the connectors for the libraries that I need.

I’ve set up all of the various project goodies for Paver:

Ian Bicking has given me a great mass of useful feedback, which I have not yet fully digested. Mark Ramm, Ben Bangert and David Stanek also had some helpful input.

And, I’d love to hear more from you!

by Kevin Dangoor at April 22, 2008 04:45 PM under Paver

April 14, 2008

Kevin Dangoor

Amazon preannounces persistent storage for EC2

I don’t know if this preannouncement comes as a result of all of the Google App Engine publicity, but here it is: Amazon Web Services Blog: Storage Space, The Final Frontier. In a nutshell: AWS now lets you create a storage volume of 1GB to 1TB that can be mounted in one EC2 instance and will persist beyond the lifetime of an EC2 instance. As an added bonus, you can have automatic snapshots of your volume plunked into S3.

They say that this storage is a low-latency, high-throughput block device. So, you can run all kinds of traditional software on top of it.

This will change the competitive outlook between AWS and GAE a bit, because it makes it easier for people to use all of the software pieces that they’re used to when they use AWS to manage the hardware infrastructure. This means that it’s easier to take your existing apps and skills and get them up on AWS. GAE has a fight ahead in terms of getting people to write their apps differently… but the benefit to doing so is that you no longer think of hardware infrastructure at all.

by Kevin Dangoor at April 14, 2008 01:34 PM under aws

April 10, 2008

Kevin Dangoor

App Engine: I agree with Ian Bicking the Luddite

Isn’t quoting out of context great:

I hate computers.
[From Ian Bicking: App Engine: Commodity vs. Proprietary]

More seriously, though, in saying “I hate computers”, Ian is actually talking about the opposite of being a Luddite. He’s dreaming of a world in which much of computing just works in the background, so that we can spend our time doing more important and interesting things in the foreground.

I’m linking to Ian here because he’s said exactly what I have been thinking about App Engine: from a Python programming perspective the APIs are simple and clear. I can easily imagine a ZODB-based implementation of Google’s data store API. Just change your imports, and you can be off of Google’s infrastructure and on to your own.

Of course, for a great many people there won’t be any reason to be off of Google’s infrastructure. App Engine is just so darn easy. Amazon Web Services is impressive because it makes scalability affordable and available. App Engine interests me because, for its broad-but-still-limited set of use cases, it makes scalability a no brainer. “Build your app like this, and you never have to think of scaling” is a nice thought. I’ve been around enough to know that people using App Engine will still have to think of scaling some, but not nearly as much as with just about any other solution.

Back to the lock-in aspect, though. I still see App Engine as likely to be utterly unsuccessful with large businesses. That is, until a new Google Appliance comes out. I’ve been predicting such a beast since Google Docs was first introduced, and I think App Engine makes it all the more likely. I still believe that there will come a time when Google will sell boxes to big companies that those companies can toss into some racks on their networks and deploy App Engine apps locally, as well as run Google Docs on their private nets. Things will get even more interesting at that point.

You can bet that Amazon is studying App Engine closely and considering their own high-level service as I write this. From a developer’s perspective, this competition is going to be awesome.

by Kevin Dangoor at April 10, 2008 01:17 AM under google

April 08, 2008

Kevin Dangoor

Google App Engine, not really an AWS competitor?

It occurred to me just now that Google App Engine and Amazon Web Services are only barely in competition right now. If you want an infinite storage system like AWS S3 in App Engine, you need to code it yourself (ignoring the preview limits App Engine currently has). If you want to deploy apps as easily as you can with App Engine in AWS, you need a bunch of infrastructure that AWS does not provide.

I’m happy to see that App Engine’s datastore is transactional, unlike SimpleDB. I didn’t see anything in my skim of some docs about whether App Engine has eventual consistency or if you can immediately pull out data that you stuff in. My guess is that you can immediately pull out the data you shove in. This is a win over SimpleDB, in my opinion.

App Engine is just tons higher-level than AWS. Of course, you can host anything you want in AWS. But, by trading away a bunch of that flexibility, Google has made a service that allows people to build apps that scale well with a minimum of fuss.

by Kevin Dangoor at April 08, 2008 11:09 AM under google

Google App Engine - big apps for Python folks

Google’s App Engine has been released. This is much cooler than just opening up BigTable for outside access (which is what TechCrunch reported over the weekend). One big difference between App Engine and Amazon Web Services is that the Development Environment lets you build an app locally, including Google’s auth API and datastore. That’s very clever. You can build up an app completely and then deploy it when ready.

Or, in the case of the preview period, when you get an account… which, sadly, I didn’t. I rather wish there was a bit more information about when more developer slots will be opened. It would be a shame to create a cool app and have to sit on it for months. It would also be nice to know what pricing will look like, but given what they are giving away for free, I’m guessing it will not be unreasonable.

Overall, I’ve got to say that it looks like a great service on the surface.

by Kevin Dangoor at April 08, 2008 10:43 AM under google

April 06, 2008

Kevin Dangoor

Rumor: Google To Launch BigTable As Web Service

Interesting rumor, and totally plausible. Just as Amazon has thought “why not make some money off of all of this great infrastructure we’ve built”, it looks like Google is going to do the same thing: Source: Google To Launch BigTable As Web Service

Google may be releasing BigTable, its internal database system, as a web service to compete with Amazon SimpleDB, according to a source with knowledge of the launch.

This will be one more non-traditional database among the many interesting choices that exist today.

by Kevin Dangoor at April 06, 2008 02:12 AM under google

April 05, 2008

Kevin Dangoor

InternetNews - Python Fans Take Aim at the Enterprise

David Goodger, Michael Foord and I talk about Python’s enterpriseyness in InternetNews Realtime IT News – Python Fans Take Aim at the Enterprise

After years in the shadows, the open source Python programming language is becoming increasingly mainstream. There are more users and more tools. Backers of Python now argue that Python is ready for the enterprise.

by Kevin Dangoor at April 05, 2008 01:28 AM under Python

April 04, 2008

Kevin Dangoor

TurboGears Ultimate DVD now online, free at ShowMeDo

In June 2006, I shipped the TurboGears Ultimate DVD, featuring several hours of useful screencast material for TurboGears programmers. I produced the DVD with TurboGears 0.9a6 (or so) code, but much of what is talked about there applies to TurboGears 1.0x users. A big thanks to Ian Ozsvald and Kyran Dale for putting the effort into getting 2GB of material transcoded and online: TurboGears Ultimate DVD (TG v1.0) - video tutorials to learn turbogears, web_development, web_framework, python, javascript, web_application, cheeseshop, generic_function, JSON, metaclass, widget, API, sqlalchemy, cherrypy, sqlobject, WSGI

by Kevin Dangoor at April 04, 2008 04:06 PM under TurboGears

April 03, 2008

Robert Brewer

CherryPy 3 request_queue_size

Well, that was instructive. Leaving server.request_queue_size at the default 5:

C:\Python24\Lib\site-packages>python cherrypy\test\benchmark.py
Starting CherryPy app server...
Started in 1.10800004005 seconds

Client Thread Report (1000 requests, 14 byte response body, 10 server threads):

threads | Completed | Failed | req/sec | msec/req | KB/sec |
     10 |      1000 |      0 |  736.81 |    1.357 | 119.36 |
     20 |      1000 |      0 |  436.07 |    2.293 |  70.64 |
     30 |      1000 |      0 |  348.38 |    2.870 |  56.44 |
     40 |      1000 |      0 |  233.10 |    4.290 |  37.76 |
     50 |      1000 |      0 |  296.77 |    3.370 |  48.08 |
Average |    1000.0 |    0.0 | 410.226 |    2.836 | 66.456 |

Client Thread Report (1000 requests, 14 bytes via staticdir, 10 server threads):

threads | Completed | Failed | req/sec | msec/req | KB/sec |
     10 |      1000 |      0 |  421.73 |    2.371 |  87.30 |
     20 |      1000 |      0 |  374.87 |    2.668 |  77.60 |
     30 |      1000 |      0 |  306.71 |    3.260 |  63.49 |
     40 |      1000 |      0 |  240.08 |    4.165 |  49.70 |
     50 |      1000 |      0 |  170.03 |    5.881 |  35.20 |
Average |    1000.0 |    0.0 | 302.684 |    3.669 | 62.658 |

Size Report (1000 requests, 50 client threads, 10 server threads):

    bytes | Completed | Failed | req/sec | msec/req |   KB/sec |
       10 |      1000 |      0 |  187.98 |    5.320 |    29.70 |
      100 |      1000 |      0 |  207.45 |    4.820 |    51.45 |
     1000 |      1000 |      0 |  186.89 |    5.351 |   210.81 |
    10000 |      1000 |      0 |  228.12 |    4.384 |  2262.07 |
   100000 |      1000 |      0 |  245.60 |    4.072 | 24022.01 |
100000000 |      1000 |     10 |   20.83 |   48.001 | 20358.12 |

Upping server.request_queue_size to 128:

C:\Python24\Lib\site-packages>python cherrypy\test\benchmark.py
Starting CherryPy app server...
Started in 1.10700011253 seconds

Client Thread Report (1000 requests, 14 byte response body, 10 server threads):

threads | Completed | Failed | req/sec | msec/req |  KB/sec |
     10 |      1000 |      0 |  745.38 |    1.342 |  120.75 |
     20 |      1000 |      0 |  772.32 |    1.295 |  125.12 |
     30 |      1000 |      0 |  654.11 |    1.529 |  105.97 |
     40 |      1000 |      0 |  929.02 |    1.076 |  150.50 |
     50 |      1000 |      0 |  641.03 |    1.560 |  103.85 |
Average |    1000.0 |    0.0 | 748.372 |   1.3604 | 121.238 |

Client Thread Report (1000 requests, 14 bytes via staticdir, 10 server threads):

threads | Completed | Failed | req/sec | msec/req |  KB/sec |
     10 |      1000 |      0 |  547.89 |    1.825 |  113.41 |
     20 |      1000 |      0 |  588.10 |    1.700 |  121.74 |
     30 |      1000 |      0 |  704.42 |    1.420 |  145.82 |
     40 |      1000 |      0 |  547.89 |    1.825 |  113.41 |
     50 |      1000 |      0 |  516.96 |    1.934 |  107.01 |
Average |    1000.0 |    0.0 | 581.052 |   1.7408 | 120.278 |

Size Report (1000 requests, 50 client threads, 10 server threads):

    bytes | Completed | Failed | req/sec | msec/req |   KB/sec |
       10 |      1000 |      0 |  622.35 |    1.607 |    98.33 |
      100 |      1000 |      0 |  604.74 |    1.654 |   149.37 |
     1000 |      1000 |      0 |  667.74 |    1.498 |   752.54 |
    10000 |      1000 |      0 |  890.31 |    1.123 |  8837.25 |
   100000 |      1000 |      0 |  728.44 |    1.373 | 71247.09 |
100000000 |      1000 |    202 |   12.81 |   78.094 |     None |
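
For readers wondering what a setting like this controls, here is a small sketch. It assumes (the benchmark output above does not say so) that server.request_queue_size ends up as the backlog argument to socket.listen(), which is how the identically named attribute behaves in the standard library's socketserver module. The names NullHandler and BigBacklogServer are made up for illustration, and the code is written for current Python 3 rather than the Python 2.4 used in the benchmark.

```python
import socketserver


class NullHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler that accepts a connection and does nothing."""

    def handle(self):
        pass


class BigBacklogServer(socketserver.TCPServer):
    # socketserver passes this value to socket.listen() when the server
    # is activated; TCPServer's own default is 5.
    request_queue_size = 128


default_backlog = socketserver.TCPServer.request_queue_size

# Port 0 asks the OS for any free port; constructing the server binds
# the socket and calls listen() with the configured backlog.
server = BigBacklogServer(("127.0.0.1", 0), NullHandler)
configured_backlog = server.request_queue_size
server.server_close()
```

With a backlog of 5 and 50 client threads connecting at once, pending connections can pile up faster than they are accepted, which would be consistent with the slower numbers in the first run.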

by fumanchu at April 03, 2008 05:00 AM under WSGI

Please don't use wsgiapp

Gordon Tillman has a wiki page up on how to mix Django content into a CherryPy site. It's easy and probably works, but please don't do it anymore.

We're officially going to deprecate the wsgiapp Tool because 1) it doesn't conform to the WSGI spec (and cannot be fixed to do so), and 2) there's a better way to mix content in a CherryPy site: tree.graft.

The tree.graft(app, script_name) method is the proper way to add Django or other WSGI content to an existing CherryPy site. Instead of nesting the two frameworks, we branch instead. To take Gordon's example, instead of:

class DjangoApp(object):
    _cp_config = {
        'tools.wsgiapp.on': True,
        'tools.wsgiapp.app': AdminMediaHandler(WSGIHandler()),
    }
...
cherrypy.tree.mount(DjangoApp(), '/')

You should always write this instead:

cherrypy.tree.graft(AdminMediaHandler(WSGIHandler()), '/')

Look, if you nest the one inside the other, CherryPy's going to do an awful lot of HTTP request parsing that is going to be completely redundant, since Django's going to do it again anyway. And this code is not very fast. Your site is going to crawl. That's strike one for nesting.

Strike two is the "always on" nature of nesting as opposed to branching. When you write your request/response cycle like an onion, every component which could possibly play a part in the request has to be called, even if just to reply "I'm not involved in this one". Given the slowness of Python function calls, this is rarely a good thing. If you thought your site was crawling before... This was a major design flaw of CherryPy 2, and is a major reason CherryPy 3 is 3x faster: the old Filters were called all the time, even if you didn't need them; the new Tools are only called when they're applicable.

Strike three against the nested approach is that it's always easier to traverse a tree of siblings than it is to traverse a nested set; programmers, for some reason, like to hide information from you, including how their site components go together. The branched version will be much easier to reason about, statically analyze, and write inspection tools for.
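
To make the nesting-versus-branching distinction concrete, here is a standard-library-only sketch written for current Python 3. It is not how CherryPy implements tree.graft; the branch, make_app and call helpers are hypothetical names invented for this example. The idea is simply to pick one app by its script_name prefix and hand it the request untouched, so apps that are not involved are never called.

```python
from wsgiref.util import setup_testing_defaults


def make_app(label, calls):
    """Build a toy WSGI app that records each time it is invoked."""
    def app(environ, start_response):
        calls.append(label)
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [label.encode("ascii")]
    return app


def branch(mounts):
    """Return a WSGI app that dispatches to exactly one mounted app.

    'mounts' maps a script_name prefix (e.g. "/admin") to a WSGI app.
    The longest matching prefix wins; other apps are never called.
    """
    ordered = sorted(mounts.items(), key=lambda item: len(item[0]),
                     reverse=True)

    def dispatcher(environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix, app in ordered:
            base = prefix.rstrip("/")
            if path == prefix or path.startswith(base + "/"):
                # Move the prefix from PATH_INFO to SCRIPT_NAME.
                environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + base
                environ["PATH_INFO"] = path[len(base):]
                return app(environ, start_response)
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]
    return dispatcher


def call(app, path):
    """Invoke a WSGI app once with a fake request; return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    seen = {}

    def start_response(status, headers):
        seen["status"] = status

    body = b"".join(app(environ, start_response))
    return seen["status"], body
```

Mounting one toy app at "/" and another at "/admin" and requesting "/admin/users" invokes only the second; the first never sees the request at all.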

So please, use tree.graft, and stop using the wsgiapp Tool in CherryPy 3. We're going to formally deprecate it soon.

by fumanchu at April 03, 2008 05:00 AM under WSGI

Lines of code

I was asked last week how many lines of code some of my projects are, and didn't have an answer handy. Fortunately, it's easy to write a LOC counter in Python:

"""Calculate LOC (lines of code) for a given package directory."""

import os
import re

def loc(path, pattern=r"^.*\.py$"):
    """Return the number of lines of code for all files in the given path.

    If the 'pattern' argument is provided, it must be a regular expression
    against which each filename will be matched. By default, all filenames
    ending in ".py" are analyzed.
    """
    lines = 0
    for root, dirs, files in os.walk(path):
        for name in files:
            if re.match(pattern, name):
                f = open(os.path.join(root, name), 'rb')
                for line in f:
                    line = line.strip()
                    if line and not line.startswith("#"):
                        lines += 1
                f.close()
    return lines

I've added the above to my company's public-domain misc package at http://projects.amor.org/misc/. Here are the results for my high-priority projects (some are proprietary):

>>> from misc import loc
>>> loc.loc(r"C:\Python24\Lib\site-packages\raisersedge")
2290
>>> loc.loc(r"C:\Python24\Lib\site-packages\dejavu")
7703
>>> loc.loc(r"C:\Python24\Lib\site-packages\geniusql")
9509
>>> loc.loc(r"C:\Python24\Lib\site-packages\cherrypy")
16391
>>> loc.loc(r"C:\Python24\Lib\site-packages\endue")
9339
>>> loc.loc(r"C:\Python24\Lib\site-packages\mcontrol")
11512
>>> loc.loc(r"C:\Python24\Lib\site-packages\misc")
4648

~= 61 kloc. Pretty hefty for a single in-house web app stack. :/ But, hey, nobody said integration projects were easy.

by fumanchu at April 03, 2008 05:00 AM under CherryPy

Web Site Process Bus

WSGI has enabled an ecosystem where site deployers can, in theory, mix multiple applications from various frameworks into a single web site, served by a single HTTP server. And that's great. But there are several areas where WSGI is purposefully silent, where there is still room for standards-based collaboration:

  • managing WSGI HTTP servers (start/stop/restart)
  • construction of the WSGI component graph (servers -> middlewares -> apps)
  • main process state control (start/stop/restart/graceful)
  • site-wide services (autoreload, thread monitors, site logging)
  • config file formats and parsing for all of the above

Most frameworks address all of the above already, to varying degrees; however, they still tend to do so in a very monolithic manner. Paste is notable for attempting to provide some of them in discrete pieces (especially WSGI graph construction and a config format tailor-made for it).

But I'm going to focus here on just two of these issues: process state and site-wide services. I believe we can separate these two from the rest of the pack and provide a simple, common specification for both, one that's completely implementable in 100 lines of code by any framework.

The problem

One of the largest issues when combining multiple frameworks in a single process is answering the question, "who's in control of the site as a whole?" Multiple frameworks means multiple code bases that all think they should provide:

  • the startup script
  • daemonization
  • dropping privileges
  • PID file management
  • site logging
  • autoreload
  • signal handling
  • sys.exit calls
  • atexit handlers
  • main thread error trapping

...and they often disagree about those behaviors. Throw Apache or lighttpd into the mix and you've got some serious deployment issues.

The typical solution to this is to have each component provide a means of shutting off each process-controlling feature. For example, CherryPy 3 obeys the config entry engine.autoreload_on = False, while django-admin.py takes a --noreload command-line arg. But these are different for each framework, and difficult to coordinate as the number of components grows. Since, for example, only one autoreloader is needed per site, a more usable solution would be to selectively turn on just one instead of turning off all but one.

For a worse example, let's look at handling SIGTERM. Currently, we have the following:

[Figure: SIGTERM before WSPBus]

OK, Django doesn't actually provide a SIGTERM handler, but you get the idea. If several components register a SIGTERM handler, only one of them will "win" by virtue of being the last one to register. And chances are, the winning handler will shut down its component cleanly and then exit the process, leaving other components to fend for themselves.
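
The "last one to register wins" behavior is easy to see with the standard signal module. In this sketch (current Python 3, run from the main thread) the two handler functions are hypothetical stand-ins for what two frameworks might install; neither is real framework code.

```python
import signal


def framework_a_shutdown(signum, frame):
    """Hypothetical SIGTERM handler installed by the first framework."""


def framework_b_shutdown(signum, frame):
    """Hypothetical SIGTERM handler installed by the second framework."""


# Remember whatever was installed before so the demo can clean up.
original = signal.signal(signal.SIGTERM, framework_a_shutdown)

# The second registration silently replaces the first. signal.signal()
# returns the handler it displaced, which is the only trace left of it.
displaced = signal.signal(signal.SIGTERM, framework_b_shutdown)
active = signal.getsignal(signal.SIGTERM)

# Restore the pre-demo handler (None means it was not set from Python).
if original is not None:
    signal.signal(signal.SIGTERM, original)
```

Unless the second framework deliberately chains to the handler it displaced, the first framework's cleanup code never runs.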

In fact, there's a whole list of negatives for the monolithic approach to process control and site services:

  1. Frameworks and servers have to provide all desirable site behaviors, or force their packagers/deployers to develop them ad-hoc.
  2. Frameworks and servers all have different APIs for changing process state. Race conditions and unpredictable outcomes are common.
  3. Frameworks and servers all have different APIs for reacting to process state changes. Resource acquisition and cleanup becomes a huge unknown.
  4. Frameworks and servers have to know they're being deployed alongside other frameworks and servers.

We could attempt to solve this with a Grand Unified Site Container, but that would most likely:

  1. force a single daemon implementation, thus eliminating innovation in process invocation,
  2. force a single configuration syntax, thus denying any market over declaration styles,
  3. force a static set of site services, limiting any improvements in process interaction,
  4. add an additional dependency to every framework,
  5. deny using HTTP servers like Apache and lighttpd in the same process (since they do their own process control), and
  6. be a dumping-ground for every other aspect of web development, from databases to templating.

A solution: the Web Site Process Bus

The Web Site Process Bus uses a simple publish/subscribe architecture to loosely connect WSGI components with site services. Here's our SIGTERM example, implemented with a WSPBus:

[Figure: SIGTERM after WSPBus]

The singleton Bus object does three things:

  1. It models server-availability state via a "state" attribute, which is a sentinel value from the set: (STARTING, STARTED, STOPPING, STOPPED).
  2. It possesses methods to change the state, such as "start", "stop", "restart", "graceful", and "exit".
  3. It possesses "publish" and "subscribe"/"unsubscribe" methods for named channels.

Each method which changes the state also has an equivalent named channel. Any framework, server, or other component may register code as a listener on any channel. For example, a web framework can register database-connection code to be run when the "start" method is called, and disconnection code for the "stop" method:

bus.subscribe("start", orm.connpool.start)
bus.subscribe("stop", orm.connpool.stop)

Any channel which has no listeners will simply ignore all published messages. This allows component code to be much simpler; callers do not need to know whether their actions are appropriate--they are appropriate if a listener is subscribed to that channel.
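
A minimal sketch of that behavior, written for current Python 3. MiniBus is a hypothetical stand-in that keeps only subscribe and publish; it is neither the specification nor CherryPy's implementation, and the two listener functions are invented for the demo.

```python
class MiniBus:
    """Hypothetical, stripped-down pub/sub bus for illustration only."""

    def __init__(self):
        self.listeners = {}  # channel name -> list of callbacks

    def subscribe(self, channel, callback):
        """Register callback on channel unless it is already there."""
        callbacks = self.listeners.setdefault(channel, [])
        if callback not in callbacks:
            callbacks.append(callback)

    def publish(self, channel, *args, **kwargs):
        """Call every listener on channel and return their results."""
        return [cb(*args, **kwargs)
                for cb in self.listeners.get(channel, [])]


bus = MiniBus()
log = []


def connect_db():
    log.append("db connected")
    return "db"


def warm_cache():
    log.append("cache warmed")
    return "cache"


# Nobody listens on "start" yet, so publishing is a harmless no-op.
nothing = bus.publish("start")

bus.subscribe("start", connect_db)
bus.subscribe("start", warm_cache)
results = bus.publish("start")
```

The caller of publish never checks who is listening: the first call returns an empty list, and the second runs both listeners in subscription order.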

In addition to the builtin state-transition channels, components are free to define their own pub/sub channels. CherryPy's current implementation, for example, defines the additional channels start_thread and stop_thread, and registers channels for signals, such as "SIGTERM", "SIGHUP", and "SIGUSR1" (which then typically call bus methods like "restart" and "exit"). Some of these could be standardized. Other custom channels would be more naturally tightly-coupled, requiring awareness on the part of callers and callees.

Since WSPB state-changing method calls are expected to be sporadic, and often fundamentally serial (e.g., "autoreload"), their execution is synchronous. Subscribers (mostly of custom channels), however, are free to return immediately, and continue their operation asynchronously.

Benefits

The WSPB cleanly solves all of the problems outlined above. The various components are no longer in competition over process state; instead, there is a single race-free state machine. However, no single component has to know whether or how many other components are deployed in the same site.

Frameworks and servers can provide a subset of all site services, with a common, imperative-Python API for deployers to add or substitute their own. However, the WSPB doesn't define a config syntax, so each framework can continue to provide its own unique layer to translate config into that API. A deployer of a combined Pylons/Zope website could choose a Pylons startup script and config syntax to manage the lifecycle of the Zope components.

The WSPB doesn't try to instantiate or compose WSGI components (server -> middleware -> app) either. So there's even room for site daemons which provide no traditional web app functionality; instead, they specialize in providing tools to compose WSGI component graphs via a config file or even a GUI.

It also "plays nice" with mod_python, mod_proxy, mod_wsgi, FastCGI, and SCGI. Those who develop WSGI gateways for these will have a clear incentive to consolidate their ad-hoc startup and shutdown models into the WSPB. For example, a mod_python gateway can use apache.register_cleanup to just call bus.stop() instead of providing custom cleanup-declaration code.

Best of all, the WSPB can be defined as a specification which any framework can provide in a small amount of code. Rather than attempt to draft the specification here (that can be hashed out on Web-SIG, since this is by no means complete), I'm just going to provide an example:

try:
    set
except NameError:
    from sets import Set as set
import sys
import threading
import time
import traceback as _traceback


# Use a flag to indicate the state of the bus.
class _StateEnum(object):
    class State(object):
        pass
states = _StateEnum()
states.STOPPED = states.State()
states.STARTING = states.State()
states.STARTED = states.State()
states.STOPPING = states.State()


class Bus(object):
    """Process state-machine and messenger for HTTP site deployment."""

    states = states
    state = states.STOPPED

    def __init__(self):
        self.state = states.STOPPED
        self.listeners = dict([(channel, set()) for channel
                               in ('start', 'stop', 'exit',
                                   'restart', 'graceful', 'log')])
        self._priorities = {}

    def subscribe(self, channel, callback, priority=None):
        """Add the given callback at the given channel (if not present)."""
        if channel not in self.listeners:
            self.listeners[channel] = set()
        self.listeners[channel].add(callback)

        if priority is None:
            priority = getattr(callback, 'priority', 50)
        self._priorities[(channel, callback)] = priority

    def unsubscribe(self, channel, callback):
        """Discard the given callback (if present)."""
        listeners = self.listeners.get(channel)
        if listeners and callback in listeners:
            listeners.discard(callback)
            del self._priorities[(channel, callback)]

    def publish(self, channel, *args, **kwargs):
        """Return output of all subscribers for the given channel."""
        if channel not in self.listeners:
            return []

        exc = None
        output = []

        items = [(self._priorities[(channel, listener)], listener)
                 for listener in self.listeners[channel]]
        items.sort()
        for priority, listener in items:
            # All listeners for a given channel are guaranteed to run even
            # if others at the same channel fail. We will still log the
            # failure, but proceed on to the next listener. The only way
            # to stop all processing from one of these listeners is to
            # raise SystemExit and stop the whole server.
            try:
                output.append(listener(*args, **kwargs))
            except (KeyboardInterrupt, SystemExit):
                raise
            except:
                self.log("Error in %r listener %r" % (channel, listener),
                         traceback=True)
                exc = sys.exc_info()[1]
        if exc:
            raise
        return output

    def start(self):
        """Start all services."""
        self.state = states.STARTING
        self.log('Bus starting')
        self.publish('start')
        self.state = states.STARTED

    def restart(self):
        """Restart the process (may close connections)."""
        self.stop()

        self.log('Bus restart')
        self.publish('restart')

    def graceful(self):
        """Advise all services to reload."""
        self.log('Bus graceful')
        self.publish('graceful')

    def block(self, state=states.STOPPED, interval=0.1):
        """Wait for the given state, KeyboardInterrupt or SystemExit."""
        try:
            while self.state != state:
                time.sleep(interval)
        except (KeyboardInterrupt, IOError):
            # The time.sleep call might raise
            # "IOError: [Errno 4] Interrupted function call" on KBInt.
            self.log('Keyboard Interrupt: shutting down bus')
            self.stop()
        except SystemExit:
            self.log('SystemExit raised: shutting down bus')
            self.stop()
            raise

    def stop(self):
        """Stop all services."""
        self.state = states.STOPPING
        self.log('Bus stopping')
        self.publish('stop')
        self.state = states.STOPPED

    def exit(self, status=0):
        """Stop all services and exit the process."""
        self.stop()

        self.log('Bus exit')
        self.publish('exit')
        sys.exit(status)

    def log(self, msg="", traceback=False):
        if traceback:
            exc = sys.exc_info()
            msg += "\n" + "".join(_traceback.format_exception(*exc))
        self.publish('log', msg)
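
As a usage sketch, here is how a deployer might wire components together through subscribe and publish. `MiniBus` is a condensed, hypothetical stand-in that keeps only the channel and priority logic of the Bus above (no state machine, logging, or error handling), and the three callbacks and the `flush_cache` channel are invented for illustration.

```python
class MiniBus(object):
    """Condensed stand-in for the Bus example: subscribe/publish only."""

    def __init__(self):
        self.listeners = {}      # channel -> set of callbacks
        self._priorities = {}    # (channel, callback) -> int

    def subscribe(self, channel, callback, priority=None):
        self.listeners.setdefault(channel, set()).add(callback)
        if priority is None:
            priority = getattr(callback, "priority", 50)
        self._priorities[(channel, callback)] = priority

    def publish(self, channel, *args, **kwargs):
        # Lower priority numbers run first, as in the Bus example.
        callbacks = sorted(self.listeners.get(channel, ()),
                           key=lambda cb: self._priorities[(channel, cb)])
        return [cb(*args, **kwargs) for cb in callbacks]


calls = []

def open_database():
    calls.append("db")

def start_http_server():
    calls.append("http")

def flush_cache(reason):
    calls.append("flush:" + reason)

bus = MiniBus()
bus.subscribe("start", start_http_server, priority=75)
bus.subscribe("start", open_database, priority=25)
bus.subscribe("flush_cache", flush_cache)   # custom channel, created on demand

bus.publish("start")
bus.publish("flush_cache", "config changed")
print(calls)  # ['db', 'http', 'flush:config changed']
```

Neither callback knows the other exists; the priorities alone decide that the database opens before the HTTP server starts.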

by fumanchu at April 03, 2008 05:00 AM under WSGI

It's official: CherryPy rocks

From sucks-rocks.com:

CherryPy rocks

No, really. It rocks. Rocks, rocks, rocks.

(Thanks, jamwt!)

by fumanchu at April 03, 2008 05:00 AM under CherryPy

PyCon 2007 and CherryPy

PyCon 2007 is nearing a close; here are some notes on how it affected CherryPy:

Web application deployment

Chad Whitacre (author of Aspen) herded several cats into a room on Sunday and forced us to discuss the various issues surrounding Python web application deployment. This is hinted at in the WSGI spec:

Finally, it should be mentioned that the current version of WSGI does not prescribe any particular mechanism for "deploying" an application for use with a web server or server gateway. At the present time, this is necessarily implementation-defined by the server or gateway. After a sufficient number of servers and frameworks have implemented WSGI to provide field experience with varying deployment requirements, it may make sense to create another PEP, describing a deployment standard for WSGI servers and application frameworks.

There were three basic realms where the participants agreed we could try to collaborate/standardize:

  1. Process control: stop, start, restart, daemonization, signal handling, socket re-use, drop privileges, etc. If you're familiar with CherryPy 3, you'll recognize this list as 95% of the current cherrypy.engine object. The CherryPy team has already been discussing ways of breaking up the Engine object; this may facilitate that (and vice-versa). Joseph Tate volunteered to look at socket re-use issues specifically, but the general consensus seemed to be that much of this would be hashed out on Web-SIG.

  2. WSGI stack composition: Jim Fulton proposed that we could all agree on Paste Deploy (at least a good portion of the API) to manage this in a cross-framework manner. Most heads nodded, "yes". Jim also proposed that each of the framework authors take the next week to refamiliarize themselves with Deploy, and then start pestering Ian Bicking with specific API issues. Ian suggested that he should fork Paste Deploy into another project specifically for this. For CherryPy, this would first mean offering standard egg entry points. [Personally, I'd like to standardize on a pure-Python API for deploy, not a config file format API. In other words, make the config file format optional, so that users of CP-only apps could avoid having to learn a distinct config file format for deployment. It should be possible to transform various config file formats into the same Python object(s).]

  3. Benchmarks: Jim also suggested we create a standard WSGI HTTP server benchmark suite, with various test applications and concurrency scenarios. This would compare various WSGI HTTP servers, as opposed to CherryPy's existing benchmark suite which compares successive versions of the full CP stack. Ian volunteered to begin work on that project (with the expectation that others would contribute substantial use cases, etc).

Others who were present for at least a portion of the long discussion: me, Mark Ramm, Kevin Dangoor, Ben Bangert, Jonathan Ellis, Matt Good, Brian Beck, and Calvin Hendryx-Parker.

WSGI middleware authoring

After some discussion with Mark (and he with Ian and Ben), we agreed that CherryPy could do more in the WSGI-middleware-authoring department. There is continuous pressure to simply re-use or fix up the existing CherryPy request object to fill this need; however, there are some fundamental problems with that approach (such as the use of threadlocals to manage context, and the difficulty of streaming WSGI output through a CherryPy app). At the moment, I'm leaning toward adding a new API to CherryPy which would be similar to the application API, but specifically targeted at middleware authoring.

by admin at April 03, 2008 05:00 AM under WSGI

help(CherryPy 3.0)

Abstract

  1. CherryPy just grew its first metaclass.
  2. CherryPy just grew its first stdlib monkeypatch.
  3. Because of 1 and 2, CherryPy is now a heck of a lot easier to learn and use.
  4. Points 1, 2, and 3 all apply to unreleased trunk code and are subject to change.

Intro

I've been a proper fool (and might still be). I've been telling everyone that CherryPy 3 is much easier to learn and use because it's been tailored to be help()-ful. What I meant was that you could open an interactive interpreter, type help(cherrypy.<thing>) and have at least some idea of what it does. I spent quite a bit of time honing the top-level namespace down to as few components as possible (and some of the component namespaces, too) in order to make help() easier to read.

This is harder to do than you might think. Unlike simple linear scripts or libraries, the most important objects when CherryPy is "live" don't exist at an interactive prompt. The Request, Response, and Session objects are all heavily dependent on the context of a real HTTP conversation. They're hard to create in a vacuum. And although there's one of each per thread while the system is running, they are implemented as thread local objects so that the CherryPy programmer can treat each of them as if there were only one: a global.

Reusing thread locals

Thread locals are a great invention, but they suffer from one serious drawback when used in a threaded framework: they allow anyone to add attributes to them. If the framework re-uses the same thread for multiple requests, it becomes difficult to reliably clean out all of those attributes between requests.

CherryPy's solution to that was to add a container in 2.1; instead of a separate thread local for each of the Request, Response, and Session objects, there is a single, hidden thread local called cherrypy._serving, and the Request, Response, and Session objects for each thread are attributes of the "serving" object. This makes cleanup code easy: it just calls cherrypy._serving.__dict__.clear() when the request ends. (Aside: this technique also allows the Request, Response and Session types to be overridden.)

However, pushing those objects into a container means they're no longer so easy to reference. CherryPy code would become uglier and more difficult if, instead of:

cherrypy.request.method

...you had to write:

cherrypy._serving.request.method

So a _ThreadLocalProxy class was introduced to allow CherryPy code to keep writing the nicer, shorter syntax. In short, it passes __getattr__ (and other double-underscore methods) through to a wrapped object. So cherrypy.request became a proxy object to a wrapped Request object. Ditto for response and session.

That was fine for CherryPy 2, but one of the goals for version 3.0 is better IDE support. Most IDEs at least provide calltips for code completion, but there aren't usually any HTTP requests coming in as you're writing code! CP 2's thread local proxies didn't have a request object in the main thread (or any thread that wasn't started by the HTTP server), so typing cherrypy.request. couldn't result in a calltip as you coded. The solution for CherryPy 3 was to have the proxy's __getattr__ and friends wrap a default object if a live object could not be found. And the default objects' attributes are true defaults; if they're not overridden (in config or code), they won't change when the system goes live. This makes interactive exploration even easier; you can forget all about the threading and pretend you're looking at live, global objects.

help(proxy) isn't helpful

But there's another catch: one of the few problems with using a proxy object in pure Python is that it's no longer of the same type as the wrapped object. Unfortunately for us, Python's builtin help function uses pydoc, and pydoc calls type(obj) quite a bit.

You can certainly call help(cherrypy.request.run) and get the correct docstring, because "run" is an attribute of cherrypy.request, the proxy calls __getattr__ first, and then type() is called on the attribute, not the request object/proxy. But if you attempt help(cherrypy.request), you're in for some confusion, because the proxy implementation leaks out.

Or rather, it did leak out until just now. I took the plunge and CherryPy now monkeypatches pydoc, so that it "passes the help() call through the proxy". Monkeypatching the standard library is of course a huge no-no, but the alternative was to essentially copy and paste most of pydoc and distribute the result with CherryPy. Now, help(cherrypy.response) at least prints:

>>> help(cherrypy.response)
Help on Response in module cherrypy._cprequest object:

class Response(__builtin__.object)
 |  An HTTP Response, including status, headers, and body.
 |  
 |  Application developers should use Response.headers (a dict) to
 |  set or modify HTTP response headers. When the response is finalized,
 |  Response.headers is transformed into Response.header_list as
 |  (key, value) tuples.
 |  
 |  Methods defined here:
 |  
 |  __init__(self)
 |  
 |  check_timeout(self)
 |      If now > self.time + self.timeout, set self.timed_out.
 |      
 |      This purposefully sets a flag, rather than raising an error,
 |      so that a monitor thread can interrupt the Response thread.
 |  
 |  collapse_body(self)
 |  
 |  finalize(self)
 |      Transform headers (and cookies) into self.header_list.
 |  
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |  
 |  __dict__ = <dictproxy object>
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__ = <attribute '__weakref__' of 'Response' objects>
 |      list of weak references to the object (if defined)
 |  
 |  body = <cherrypy._cprequest.Body object>
 |      The body of the HTTP response (the response entity).
 |  
 |  cookie = <SimpleCookie: >
 |  
 |  header_list = []
 |  
 |  headers = {}
 |  
 |  status = ''
 |  
 |  stream = False
 |  
 |  time = None
 |  
 |  timed_out = False
 |  
 |  timeout = 300

Documenting data

But there's a further flaw with the above output of help(): none of the data members of the Response class are documented! A few of them are mentioned in the class docstring, to be sure, but hardly to a truly useful extent. The Request object is in an even poorer state, since it has so many more data members.

The solution for that issue is somewhat complicated, as well. It turns out that there are plenty of good documentation generators for Python code (that emit HTML or text; epydoc and pudge spring to mind), but no serious helpers for making help() more informative. This is a real shame; I would almost always rather have help() be truly helpful than go read a book or search online docs.

So I proposed a (small!) metaclass to help alleviate the problem for CherryPy. When you look at CherryPy source code, now, you might see something like this:

class Request(object):
    """An HTTP request."""

    __metaclass__ = cherrypy._AttributeDocstrings

    prev = None
    prev__doc = """
    The previous Request object (if any). This should be None
    unless we are processing an InternalRedirect."""

    # Conversation/connection attributes
    local = http.Host("localhost", 80)
    local__doc = \
        "An http.Host(ip, port, hostname) object for the server socket."

    remote = http.Host("localhost", 1111)
    remote__doc = \
        "An http.Host(ip, port, hostname) object for the client socket."

The _AttributeDocstrings metaclass does one thing: finds class members whose names look like <attrname>__doc, takes their str value, formats it, and folds it into the class docstring. Here's a snippet of the resulting help() output:

Help on Request in module cherrypy._cprequest object:

class Request(__builtin__.object)
 |  An HTTP request.
 |  
 |  local [= http.Host('localhost', 80, 'localhost')]:
 |      An http.Host(ip, port, hostname) object for the server socket.
 |  
 |  prev [= None]:
 |      The previous Request object (if any). This should be None
 |      unless we are processing an InternalRedirect.
 |  
 |  remote [= http.Host('localhost', 1111, 'localhost')]:
 |      An http.Host(ip, port, hostname) object for the client socket.

Christian's first question was, "why not just write it yourself by hand in the docstring?" Here's the long answer. The metaclass:

  1. Places the docstring nearer to the attribute declaration.
  2. Makes attribute docs more uniform ("name (default): doc").
  3. Automatically gets the attribute name right in the docstring.
  4. Automatically gets the default value right in the docstring.

I chose the naming convention because it allows the attribute name and the attribute__doc name to line up horizontally (it doesn't matter which comes first; I prefer to put the doc after the attribute). It also looks similar to the conventions in Python's C code, where doc variable names look like module_attribute__doc__ or sometimes just attribute_doc.
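
For readers curious how such a metaclass might work, here is a rough sketch. It is not CherryPy's `_AttributeDocstrings`; it uses Python 3 class syntax (`metaclass=`) rather than `__metaclass__`, the formatting only approximates the help() output shown above, and the `method` attribute is a hypothetical example.

```python
class AttributeDocstrings(type):
    """Fold <name>__doc class members into the class docstring."""

    def __new__(mcls, name, bases, namespace):
        namespace = dict(namespace)
        entries = []
        for key in sorted(namespace):
            if not key.endswith("__doc"):
                continue
            attr = key[:-len("__doc")]
            text = " ".join(namespace[key].split())   # collapse whitespace
            default = namespace.get(attr)
            entries.append("%s [= %r]:\n    %s" % (attr, default, text))
        if entries:
            doc = namespace.get("__doc__") or ""
            namespace["__doc__"] = doc + "\n\n" + "\n\n".join(entries)
        return super().__new__(mcls, name, bases, namespace)


class Request(metaclass=AttributeDocstrings):
    """An HTTP request."""

    prev = None
    prev__doc = """
    The previous Request object (if any). This should be None
    unless we are processing an InternalRedirect."""

    method = "GET"
    method__doc = "The HTTP method (hypothetical example attribute)."


print(Request.__doc__)
```

Because the name and default are read from the class namespace itself, the generated entries cannot drift out of sync with the code, which is the point of items 3 and 4 in the list above.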

Code faster

Hopefully these two improvements, although more awkward than I like implementation-wise, will make using CherryPy much easier and faster. Feel free to help() us out by writing a few data member docstrings!

by fumanchu at April 03, 2008 05:00 AM under CherryPy

CherryPy 3 has fastest WSGI server yet

A couple of months ago, in response to someone else's speed claims, I posted a comment that CherryPy's built-in WSGI server could serve 1200 simple requests per second. The demo used Apache's "ab" tool to test ("-k -n 3000 -c %s"). In the last few days before the release of CherryPy 3.0 final, I've done some further optimization of cherrypy.wsgiserver, and now get 2000+ req/sec on my modest laptop.

threads | Completed | Failed | req/sec | msec/req | KB/sec |
     10 |      3000 |      0 | 2170.79 |    0.461 | 358.18 |
     20 |      3000 |      0 | 2080.34 |    0.481 | 343.26 |
     30 |      3000 |      0 | 1920.31 |    0.521 | 316.85 |
     40 |      3000 |      0 | 2051.84 |    0.487 | 338.55 |
     50 |      3000 |      0 | 2051.84 |    0.487 | 338.55 |

The improvements are due to a variety of optimizations, including:

  • Replacing mimetools/rfc822.Message with custom code for reading headers.
  • Using socket.sendall instead of a socket fileobject for writes.
  • Generic hand-tuning of code loops.

I want to make it clear that the benchmark does not exercise any part of CherryPy other than the WSGI server. I used a very simple WSGI application (not the full CherryPy stack):

def simple_app(environ, start_response):
    """Simplest possible application object"""
    status = '200 OK'
    response_headers = [('Content-type','text/plain'),
                        ('Content-Length','19')]
    start_response(status, response_headers)
    return ['My Own Hello World!']

The full stack of CherryPy includes the WSGI application side as well, and consequently takes more time. But that has risen from about 380 requests per second in October to:

Client Thread Report (1000 requests, 14 byte response body, 10 server threads):

threads | Completed | Failed | req/sec | msec/req | KB/sec |
     10 |      1000 |      0 |  536.86 |    1.863 |  85.36 |
     20 |      1000 |      0 |  509.47 |    1.963 |  81.01 |
     30 |      1000 |      0 |  499.28 |    2.003 |  79.39 |
     40 |      1000 |      0 |  491.90 |    2.033 |  78.21 |
     50 |      1000 |      0 |  504.32 |    1.983 |  80.19 |
Average |    1000.0 |    0.0 | 508.366 |    1.969 | 80.832 |

If you want to benchmark the full CherryPy stack on your own, just install CherryPy and run the script at cherrypy/test/benchmark.py.

Here's the other script for the "bare server" benchmarks:

import re
import sys
import threading
import time
from cherrypy import _cpmodpy

AB_PATH = ""
APACHE_PATH = "apache"
SCRIPT_NAME = ""
PORT = 8080


class ABSession:
    """A session of 'ab', the Apache HTTP server benchmarking tool."""
    parse_patterns = [('complete_requests', 'Completed',
                       r'^Complete requests:\s*(\d+)'),
                      ('failed_requests', 'Failed',
                       r'^Failed requests:\s*(\d+)'),
                      ('requests_per_second', 'req/sec',
                       r'^Requests per second:\s*([0-9.]+)'),
                      ('time_per_request_concurrent', 'msec/req',
                       r'^Time per request:\s*([0-9.]+).*concurrent requests\)$'),
                      ('transfer_rate', 'KB/sec',
                       r'^Transfer rate:\s*([0-9.]+)'),
                      ]

    def __init__(self, path=SCRIPT_NAME + "/", requests=3000, concurrency=10):
        self.path = path
        self.requests = requests
        self.concurrency = concurrency

    def args(self):
        assert self.concurrency > 0
        assert self.requests > 0
        return ("-k -n %s -c %s http://localhost:%s%s" %
                (self.requests, self.concurrency, PORT, self.path))

    def run(self):
        # Parse output of ab, setting attributes on self
        args = self.args()
        self.output = _cpmodpy.read_process(AB_PATH or "ab", args)
        for attr, name, pattern in self.parse_patterns:
            val = re.search(pattern, self.output, re.MULTILINE)
            if val:
                val = val.group(1)
                setattr(self, attr, val)
            else:
                setattr(self, attr, None)


safe_threads = (25, 50, 100, 200, 400)
if sys.platform in ("win32",):
    # For some reason, ab crashes with > 50 threads on my Win2k laptop.
    safe_threads = (10, 20, 30, 40, 50)


def thread_report(path=SCRIPT_NAME + "/", concurrency=safe_threads):
    sess = ABSession(path)
    attrs, names, patterns = zip(*sess.parse_patterns)
    rows = [('threads',) + names]
    for c in concurrency:
        sess.concurrency = c
        sess.run()
        rows.append([c] + [getattr(sess, attr) for attr in attrs])
    return rows

def print_report(rows):
    widths = []
    for i in range(len(rows[0])):
        lengths = [len(str(row[i])) for row in rows]
        widths.append(max(lengths))
    for row in rows:
        print
        for i, val in enumerate(row):
            print str(val).rjust(widths[i]), "|",
    print


if __name__ == '__main__':

    def simple_app(environ, start_response):
        """Simplest possible application object"""
        status = '200 OK'
        response_headers = [('Content-type','text/plain'),
                            ('Content-Length','19')]
        start_response(status, response_headers)
        return ['My Own Hello World!']

    from cherrypy import wsgiserver as w
    s = w.CherryPyWSGIServer(("localhost", PORT), simple_app)
    threading.Thread(target=s.start).start()
    try:
        time.sleep(1)
        print_report(thread_report())
    finally:
        s.stop()

by fumanchu at April 03, 2008 05:00 AM under CherryPy

April 02, 2008

Christian Wyglendowski

Reading Chunked HTTP/1.1 Responses

For work today I wanted a way to iterate over an HTTP response with chunked transfer-coding on a chunk-for-chunk basis. I didn’t see a builtin way to do that with httplib. It supports chunked reads but you have to specify the amount that you want to read if you don’t want it to buffer. I just wanted it to read and yield each chunk that it received from the server.

For my first crack at it I really just tried to use the httplib basics:

import httplib
 
conn = httplib.HTTPConnection('localhost:8080')
conn.request('GET', '/')
r = conn.getresponse()
data = r.read(10)
while data:
    print data
    data = r.read(10)

That worked, but since I won’t know the chunk size in real life, I would probably get output similar to this:

Chunk 0
Ch
unk 1
Chun
k 2
Chunk
3
Chunk 4
...

I really wanted that chunk-for-chunk iteration. After taking a look at the very readable httplib source this evening, it wasn’t very hard to accomplish. I basically just took the httplib.HTTPResponse._read_chunked method and modified it to be a generator. I subclassed HTTPResponse and stuck my generator in an __iter__ method. Behold; now you can do this sort of thing:

if __name__ == "__main__":
    import httplib
    import iresponse
    conn = httplib.HTTPConnection('localhost:8080')
    conn.response_class = iresponse.IterableResponse
    conn.request('GET', '/')
    r = conn.getresponse()
    for chunk in r:
        print chunk

With nice results like this:

Chunk 0
Chunk 1
Chunk 2
Chunk 3
Chunk 4
...
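
The core of such a generator can be sketched independently of httplib. This is not the iresponse module's code: `iter_chunks` is a hypothetical helper that reads the chunked transfer-coding from any binary file-like object, and it ignores trailers after the final chunk.

```python
import io

def iter_chunks(fp):
    """Yield each chunk body from a chunked transfer-coded stream."""
    while True:
        size_line = fp.readline()
        if not size_line:
            raise ValueError("stream ended before the terminating chunk")
        # The size is hexadecimal; anything after ';' is a chunk extension.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            break   # last chunk; trailers are not read in this sketch
        data = fp.read(size)
        if len(data) != size:
            raise ValueError("stream ended in the middle of a chunk")
        fp.read(2)  # discard the CRLF that follows each chunk's data
        yield data

raw = b"7\r\nChunk 0\r\n7\r\nChunk 1\r\n0\r\n\r\n"
chunks = list(iter_chunks(io.BytesIO(raw)))
print(chunks)  # [b'Chunk 0', b'Chunk 1']
```

Yielding once per size line is what preserves the server's chunk boundaries, instead of re-slicing the body into fixed-size reads.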

You can download the iresponse module from my projects site. There is also a small CherryPy application that serves some data with chunked transfer-coding in case any of you want to fiddle with it.

cw

by christian at April 02, 2008 04:35 AM under work

April 01, 2008

Kevin Dangoor

MichiPUG meeting on Thursday, April 3

The Michigan Python Users Group (MichiPUG) monthly meeting is coming up on Thursday, April 3.

We’ve got two exciting topics for this month’s meeting. The first part is Mark Ramm leading some discussion on documentation tools (including Sphinx). I roped Mark into this because I haven’t yet run Sphinx, but it looks great. He’ll also talk about Idiopidae which Zed Shaw and Mark were hacking on at PyCon.

I’ll also lead some discussion and experimentation with EasyExtend, Kay Schluehr’s tool for monkeying with Python’s syntax.

Should be a fun time, and I hope to see you there!

by Kevin Dangoor at April 01, 2008 04:23 PM under sphinx

Psychotic: optimizing Python compiler

An idea that I had at PyCon has finally come to fruition: introducing Psychotic, a pure Python optimizing compiler that achieves some pretty impressive results.

For a new project, Psychotic has a good deal of documentation. There’s also the introductory screencast, which is “lightning talk sized” (under 5 minutes).

There’s actually a fuller announcement over at the SitePen blog.

by Kevin Dangoor at April 01, 2008 11:08 AM under psychotic

March 31, 2008

Kevin Dangoor

March 22, 2008

Kevin Dangoor

March 20, 2008

Kevin Dangoor

TurboGears looking for Google SoC Students!

TurboGears has been accepted as a Google Summer of Code organization. Chris Arndt has put together a page at docs.turbogears.org to pull together all of the TG Summer of Code information. Congrats to Chris and the others for pulling together this year’s Summer of Code effort for TG!

Next week is signup time for students (March 24-31), so if you’re interested in getting paid to work on TurboGears this summer, check out the TG SoC info page.

by Kevin Dangoor at March 20, 2008 11:19 AM under TurboGears

March 03, 2008

Kevin Dangoor

MichiPUG meeting: Nose by Jason Pellerin

Jason Pellerin is going to be giving a talk about his Nose testing tool at the Michigan Python Users Group (MichiPUG). I’ve been a Nose user since the very beginning, so I’m happy that Jason himself is giving a talk on it.

As usual, the meeting is on the first Thursday of the month (March 6th, in this case) at 7PM. Meetings are always free.

And, as has been the case the past several months, SRT Solutions is going to be hosting the meeting at their perfect downtown Ann Arbor location. In addition to the instructions on how to get to SRT, I’ll also mention that street parking is free after 6PM and usually readily available a couple blocks north at Ann and Fifth (right next to the City Hall/Police Station building).

Unfortunately, I most likely won’t see you there for this month’s meeting, as it’s kindergarten roundup time for Ann Arbor Schools and we’re trying to figure out the best kindergarten choice for our daughter. Hopefully, I will be seeing some of you at PyCon in a couple of weeks!

by Kevin Dangoor at March 03, 2008 02:40 AM under Python

February 08, 2008

Kevin Dangoor

Cobra programming language

So, we’ve got Jython and IronPython as Python language reimplementations. There’s also Boo, which is clearly heavily inspired by Python but has some interesting extensions (static typing, for example). I just came across Cobra.

Cobra, like Boo, is built on the .NET platform. The syntax is clearly inspired by Python, which I consider a good thing. In keeping line noise to a minimum, Cobra even ditches the “:” at the end of the line preceding a block of code. Chuck Esterbrook has also pulled inspiration from a number of other places. I recognize some D and Eiffel in there (it’s got design by contract and unit tests built right into the classes). There’s a comparison to Python available right on the Cobra site.

Something that’s interesting about Cobra is that it’s self-hosting. Even though C# has been getting more powerful over time, I’m sure that Cobra can move forward more quickly with its even more succinct syntax.

As a Python guy, though, I can’t help but notice things that seem to be missing (or are possibly just missing from the docs).

  • Functions as first class objects. All of the examples are inside of classes, which just seems silly. I also haven’t seen the syntax for passing a function around (can you even do that?) This is a powerful feature.
  • Metaclasses don’t seem to exist in Cobra. You don’t need them all the time, but you can make some APIs a lot nicer if you use them when appropriate.
  • Function parameter declaration is weaker. Function parameter capabilities seem to be the same as in any other .NET language. It allows you to have variable arguments, but that’s about as fancy as you can get.
  • Objects are not extensible. You can’t just go and hang random attributes off of an object, and there are actually some times when this is convenient to do.
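
For comparison, here is roughly what three of those features look like on the Python side (metaclasses are left out for brevity). The function and class names are made up for illustration.

```python
def shout(text):
    return text.upper() + "!"

def apply_twice(func, value):
    """Functions are ordinary objects, so they can be passed as arguments."""
    return func(func(value))

def describe(name, *args, **kwargs):
    """Positional, variable and keyword arguments in one signature."""
    return (name, args, kwargs)

class Plain(object):
    pass

thing = Plain()
thing.color = "red"   # attach an attribute that the class never declared

print(apply_twice(shout, "hi"))          # HI!!
print(describe("x", 1, 2, flag=True))    # ('x', (1, 2), {'flag': True})
print(thing.color)                       # red
```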

I do think it’s interesting to see more languages popping up that offer both static and dynamic typing. I’ll be curious to see how that plays out over time.

by Kevin Dangoor at February 08, 2008 05:18 PM under Python

January 18, 2008

Kevin Dangoor

When something goes missing, you realize how much you use it

Like, say, the Python Cheeseshop. I’ve been using Python eggs extensively since mid-2005 and have grown used to how easy it is to easy_install “random package”. Lately, I’ve been using zc.buildout to get entire environments set up, and it’s Cheeseshop dependent in the same way that easy_install is.

Thanks to everyone who’s been involved with coding and running the Cheeseshop. It’s a great resource!

Now can we have it back, please? :)

Error...

There's been a problem with your request

psycopg.OperationalError: no connection to the server

by Kevin Dangoor at January 18, 2008 03:25 PM under Python

January 16, 2008

Uche and Chimezie Ogbuji

January 14, 2008

Uche and Chimezie Ogbuji

December 13, 2007

Uche and Chimezie Ogbuji

December 11, 2007

Kevin Dangoor

Rich UI webapps with TurboGears 2 and Dojo

I’ve just received notice that my proposed talk for PyCon 2008 has been accepted!

People who were at my PyCon 2006 “Effective Ajax with TurboGears” talk might remember that I talked about sprinkling Ajax throughout a webapp as appropriate. A lot has changed in the nearly two years since. We can now move a lot more presentation logic to the browser, and the server side becomes much more of a web services layer.

I’m going to focus on what the server side looks like (this is PyCon, after all!), using Dojo to easily implement the client side. I’m also going to talk about how you can integrate Comet (where the server sends messages back to the client) into a TurboGears 2 app.

The ideas will work in any web framework, and the code samples will also work with very little change in Pylons.

I’m looking forward to this talk, and I hope to see you there! (And don’t forget to say ‘hi’ if you’re at CodeMash next month!)

Update: Wow! Check out the PyCon 2008 list of talks. This will be my third PyCon, and I must say that this is the most impressive list of talks of the three.

by Kevin Dangoor at December 11, 2007 12:13 PM under pycon

December 04, 2007

Kevin Dangoor

Michael Foord’s Mock

I’ve started using Mock - Mocking and Test Utilities by Michael Foord (that’s fuzzyman to you :). Python’s dynamic nature (monkeypatching and whatnot) means that you don’t need some of the elaborate testing constructs that Java folk have had to come up with. Mock gives you a few handy features to make stubbing out and verifying behavior easy and natural.
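To give a feel for the stub-then-verify style, here is a small sketch using `unittest.mock`, the standard-library descendant of Michael’s module. Its API may differ in detail from the 0.3.x releases, and `notify` and the mailer are hypothetical names invented for the example.

```python
from unittest.mock import Mock


def notify(mailer, user):
    """Send a welcome message through whatever mailer is passed in."""
    return mailer.send(to=user, subject="Welcome")


# Stub out the collaborator: attribute access on a Mock yields another Mock.
mailer = Mock()
mailer.send.return_value = True

result = notify(mailer, "kevin@example.com")

# Verify the behavior after the fact; this raises AssertionError on a mismatch.
mailer.send.assert_called_once_with(to="kevin@example.com", subject="Welcome")
```

There is no interface to declare and no expectation script to record first: the code runs, and then you ask the mock what happened.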

Fuzzyman’s new release of Mock (0.3.1) includes a couple of features that I suggested to make it more nose-friendly and to eliminate the need for global variables when using the “patch” decorator.
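Here is a sketch of a “patch”-decorated test written as a plain function, the kind nose can collect. It uses `unittest.mock.patch` from today’s standard library, which may not match the 0.3.1 behavior exactly, and `current_dir_name` is a made-up function under test.

```python
import os
from unittest.mock import patch


def current_dir_name():
    """Return the last path component of the working directory."""
    return os.path.basename(os.getcwd())


@patch("os.getcwd")
def test_current_dir_name(mock_getcwd):
    # The replacement arrives as an argument, so no module-level global
    # is needed to reach it from inside the test.
    mock_getcwd.return_value = "/home/user/project"
    assert current_dir_name() == "project"
    mock_getcwd.assert_called_once_with()


test_current_dir_name()
```

The original `os.getcwd` is restored as soon as the decorated function returns.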

Thanks for the groovy module, Michael!

by Kevin Dangoor at December 04, 2007 01:22 PM under Python

Michigan Python Users Group December 2007 Meeting

The Michigan Python Users Group (MichiPUG) December meeting is coming up this Thursday, December 6th at 7PM. We’ll once again be meeting at the SRT Solutions office in downtown Ann Arbor.

This month’s topic is distributed version control systems. I’ll be talking about and demoing Mercurial, and we’ll have a general discussion about version control systems and other Python topics.

See you there!

by Kevin Dangoor at December 04, 2007 02:58 AM under mercurial dvcs michipug
