tag:blogger.com,1999:blog-17001572362062005972008-07-16T16:28:37.870-07:00Valued LessonsI've learned. I'll share.Peter Thatcherhttp://www.blogger.com/profile/01092342988993218446noreply@blogger.comBlogger14125tag:blogger.com,1999:blog-1700157236206200597.post-83224526885897854992008-06-27T13:43:00.000-07:002008-06-27T13:44:10.914-07:00Message Passing Conccurrency (Actor Model) in PythonAs I wrote previously, I'm a fan of message-passing concurrency. But to use it in python, I had to write my own infrastructure. It's far from perfect, but it's working for me. I've simplified it a bit and prepared some code to share with you.<br /><br />The basic idea of my framework is to be and simple and easy to use as possible. There's no Erlang-style pattern matching or process linking. There isn't even any built-in error handling. But, there's a lot of support for the typical "react loop" style of message handling. <br /><br />The architecture is very simple. There's one thread per actor. An actor is just an object you can send a message to. The actor receives the messages in the order they were sent. You always know the sender of the message; it's built right in. There's also "Bridge" that allows you to interact with actors from non-actor code. <br /><br />The only tricky thing you'll see is a "handles" decorator. This is used to help you identify how you want an actor to handle a given message type. It helps you avoid re-creating a message loop and dispatcher yourself for every actor. <br /><br />I've included a little example of how to use the framework. It obviously quite simple, but keep in mind that I used little more for the foundation of a rather complex file synchronization application. A little bit goes a long way.<br /><br />So, here it is. Cut, paste, and run.<br /><br /><pre><br />from threading import Thread, Event<br />from Queue import Queue, Empty<br /><br />class IActor:<br /> pass<br /> # send(message, sender)<br /><br />class HandlerRegistry(dict):<br /> # should be used as a decorator<br /> def __call__(self, typ):<br /> def register(func):<br /> self[typ] = func<br /> return func<br /> return register<br /><br />OtherMessage = "Other Message"<br /><br />class Stop(Exception):<br /> def __repr__(self):<br /> return "Stop()"<br /><br />class Stopped:<br /> def __repr__(self):<br /> return "Stopped()"<br /><br />class ActorNotStartedError(Exception):<br /> def __init__(self):<br /> Exception.__init__(self, "actor not started")<br /><br />class Actor(IActor):<br /> @classmethod<br /> def spawn(cls, *args, **kargs):<br /> self = cls(*args, **kargs)<br /> self.mailbox = Mailbox()<br /> start_thread(target = self.act, as_daemon = True)<br /> return self<br /><br /> def send(self, message, sender):<br /> if self.mailbox is None:<br /> raise ActorNotStartedError()<br /> else:<br /> self.mailbox.send(message, sender)<br /><br /> def receive(self):<br /> if self.mailbox is None:<br /> raise ActorNotStartedError()<br /> else:<br /> return self.mailbox.receive()<br /><br /> # override if necessary<br /> def act(self):<br /> self.handleMessages()<br /><br /><br /><br /> handles = HandlerRegistry()<br /><br /> @classmethod<br /> def makeHandles(*classes):<br /> return HandlerRegistry((typ, handler) for cls in classes for (typ, handler) in cls.handles.iteritems())<br /><br /> def handleMessages(self):<br /> try:<br /> while True:<br /> message, sender = self.receive()<br /> self.handleMessageWithRegistry(message, sender)<br /> except Stop:<br /> pass<br /><br /> def handleMessageWithRegistry(self, message, sender):<br /> registry = self.__class__.handles<br /> handler = registry.get(message.__class__) or registry.get(OtherMessage)<br /> if handler is not None:<br /> handler(self, message, sender)<br /><br /> @handles(OtherMessage)<br /> def onOther(self, message, sender):<br /> pass<br /><br /> @handles(Stop)<br /> def onStop(self, message, sender):<br /> sender.send(Stopped(), self)<br /> raise message<br /><br />def start_thread(target, as_daemon, name = None):<br /> thread = Thread(target = target)<br /> if name:<br /> thread.setName(name)<br /> thread.setDaemon(as_daemon) <br /> thread.start()<br /> return thread<br /><br />class Mailbox:<br /> def __init__(self):<br /> self.mailbox = Queue()<br /><br /> def send(self, message, sender):<br /> self.mailbox.put((message, sender), block = False)<br /><br /> def receive(self, timeout = None):<br /> return self.mailbox.get(block = True, timeout = timeout)<br /><br />class Bridge(IActor):<br /> def __init__(self):<br /> self.mailbox = Mailbox()<br /><br /> def send(self, message, sender):<br /> self.mailbox.send(message, sender)<br /><br /> def call(self, target, request, timeout, default = None):<br /> self.sendRequest(target, request)<br /> return self.receiveResponse(timeout, default)<br /><br /> # targeted_requests can be an iterator<br /> def multiCall(self, targeted_requests, timeout, default = None):<br /> count = 0<br /> for target, request in targeted_requests:<br /> self.sendRequest(target, request)<br /> count += 1<br /> <br /> for _ in xrange(count):<br /> yield self.receiveResponse(timeout, default)<br /><br /> def stop(self, actors, timeout):<br /> stop = Stop()<br /> return list(self.multiCall(((actor, stop) for actor in actors), timeout, default = None))<br /><br /> def sendRequest(self, target, request):<br /> target.send(request, self)<br /><br /> def receiveResponse(self, timeout, default):<br /> try:<br /> message, sender = self.mailbox.receive(timeout = timeout)<br /> return message<br /> except Empty:<br /> return default<br /> <br /><br />if __name__ == "__main__":<br /> import time<br /><br /> class GetInventory:<br /> pass<br /><br /> class Task:<br /> def __init__(self, input, destination):<br /> self.input = input<br /> self.destination = destination<br /><br /> class Worker(Actor):<br /> handles = Actor.makeHandles()<br /><br /> def __init__(self, skill):<br /> self.skill = skill<br /><br /> @handles(Task)<br /> def onTask(self, task, sender):<br /> output = self.skill(task.input)<br /> task.destination.send(output, self)<br /><br /> class Warehouse(Actor):<br /> handles = Actor.makeHandles()<br /><br /> def __init__(self):<br /> self.inventory = []<br /><br /> @handles(GetInventory)<br /> def onGetInventory(self, message, sender):<br /> # copy the inventory to avoid anyone mutating it<br /> sender.send(list(self.inventory), self)<br /> <br /> @handles(OtherMessage)<br /> def onTaskResult(self, result, sender):<br /> self.inventory.append(result)<br /><br /> worker = Worker.spawn(lambda x : x * 2)<br /> positives = Warehouse.spawn()<br /> negatives = Warehouse.spawn()<br /> bridge = Bridge()<br /><br /> for val in [1, 2, 3, -2, -4, -6]:<br /> warehouse = positives if val >= 0 else negatives<br /> worker.send(Task(val, warehouse), sender = None)<br /><br /> print bridge.call(positives, GetInventory(), 1.0) #should be [ 2, 4, 6]<br /> print bridge.call(negatives, GetInventory(), 1.0) #should be [-4, -8, -12]<br /> print bridge.stop([worker, positives, negatives], 1.0) #should be [Stopped(), Stopped(), Stopped()]<br /><br /> <br /> class Start:<br /> def __init__(self, target):<br /> self.target = target<br /><br /> class Ping:<br /> def __repr__(self):<br /> return "Ping()"<br /><br /> class Pong:<br /> def __repr__(self):<br /> return "Pong()"<br /><br /> class Pinger(Actor):<br /> handles = Actor.makeHandles()<br /><br /> @handles(Start)<br /> def onStart(self, start, sender):<br /> start.target.send(Ping(), self)<br /><br /> @handles(Pong)<br /> def onPong(self, pong, sender):<br /> print "-",<br /> sender.send(Ping(), self)<br /> <br /> class Ponger(Actor):<br /> handles = Actor.makeHandles()<br /><br /> @handles(Ping)<br /> def onPing(self, ping, sender):<br /> print "+",<br /> sender.send(Pong(), self)<br /><br /> # should print lots of +-+-+-<br /> pinger = Pinger.spawn()<br /> ponger = Ponger.spawn()<br /> pinger.send(Start(ponger), sender = None)<br /> time.sleep(0.1)<br /> bridge.stop([pinger, ponger], 1.0)<br /></pre>Peter Thatcherhttp://www.blogger.com/profile/01092342988993218446noreply@blogger.comtag:blogger.com,1999:blog-1700157236206200597.post-9377274089626625342008-06-27T12:43:00.001-07:002008-06-27T12:46:08.962-07:00My Experience with Message Passing ConcurrencyI'm working on a peer-to-peer file synchronization program. It's <i>really</i> concurrent and distributed, and it's forced me to<br />learn a thing or two about the concurrency models that we use as programmers. Over the past few years, I've tried both the shared-state model common to C, C++, Java, C#, Python, etc, and the message-passing model unique to Erlang and Scala. In my experience, <b>the message-passing model (aka Actor Model) is far superior to the shared-state model for writing distributed applications</b>. After having used both extensively, I'd go so far as to say that <b>for distributed programming, shared-state concurrency is upside down and backwards</b>. <br /><br />Here's my informal proof that message-passing concurrency is necessary in a distributed system:<br /><br /><ol><br /> <li>Since the system in distributed, real shared state is impossible.</li><br /> <li>The only way for the distributed components of an application to<br /> communicate is by sending a message from one to another.</li><br /> <li>Everything else is an abstraction on top of message passing.</li><br /></ol><br /><br />HTTP is an abstraction on top of message-passing. AJAX is an abstraction on top of HTTP on top of message-passing. XML-RPC is an abstraction on top of HTTP on top of message-passing. SOAP is an abstraction on top of an abstraction on top of an abstraction on top of HTTP on top of message-passing. Any RPC mechanism is an abstraction on top of message-passing. SQL queries to a remote database are an abstraction on top of message-passing. CORBA is a nasty, tangled mess on top of a foundation of message-passing.<br /><br />See a pattern?<br /><br />Not only are all of these abstractions, but they are <i>leaky</i> abstractions. Just about all RPC (Remote Procedure Call) frameworks try to pretend that remote objects or local. But the facade is impossible to keep from leaking. Calls to a remote object might fail or take arbitrarily long. If you want to make the same call to two different remote nodes, those calls must be made synchronously and sequentially; in order to call them in parallel on the remote nodes, they most be called in parallel on the local node. If a call to a remote node triggers a call back to the local node, which may trigger a call to a third node, you end up with a huge spaghetti mess of calls and threads.<br /><br />I've been down that road. It wasn't pretty. There is a better way.<br /><br />I think AJAX has opened our eyes a bit. It's a lot closer to message-passing than it is to RPC. In a REST architecture, you "send" messages by POSTing to a URL and you "receive" messages by GETing a URL. All messages are clearly local copies that were serialized and deserialized from the remote data. There isn't any leaky abstraction of data or classes pretending to be in two places at once. Data, timeouts, and failures are in your face and you have to deal with them. So, AJAX is a lot less leaky.<br /><br />But AJAX is far from perfect. I can't help but think of the Greenspun's Tenth Rule of Programming with a twist on distributed programming: "Any sufficiently complicated distributed platform contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of a message-passing system (Erlang)". Once you get message-passing in your head, you can't help but think of AJAX in that way. <br /><br />I've been down a better road. It was much more pleasant. <br /><br />About six months ago, I rewrote major portions of our application with message-passing concurrency. I took a long time. It was tricky. It hurt my head. But it worked. It's a success. It's capable of things that the shared-state system could never do.<br /><br />Having done it, I can emphatically say that if I could start all over, I would go with message-passing. Most of the work was building the infrastructure and wrapping my head around a new way of thinking. But now that I've done both of those, I've paid the costs and I'm reaping the rewards. We're moving the application in directions that would have been nearly in possible with the old concurrency model. <br /><br /><b>In my experience, message-passing concurrency is the best way to write distributed applications.</b> But it isn't an easy road to go down. Support for it just isn't there in most programming languages and environments. So far, Erlang has been the pioneer. I would humbly agree that the way I've implemented message-passing in Python, compared to Erlang, looks like an ad-hoc, bug-ridden half-implementation. But for various reasons, I can't use Erlang. I'm hoping for someone to create Erlang++ or E# or Eython or something that combines the concurrency model of Erlang with a modern programming language. Until then, I'll just keep on cobbling together what I can onto the programming language I happen to be using. <br /><br />More on that in another article.Peter Thatcherhttp://www.blogger.com/profile/01092342988993218446noreply@blogger.comtag:blogger.com,1999:blog-1700157236206200597.post-87279093328575490522008-05-23T18:01:00.000-07:002008-05-23T18:13:32.872-07:00Wii Fit From A Programmer's Perspective (after one day)I bought Wii Fit. It came in the mail yesterday and I had my first try at it last night and my first workout this morning. Although I haven't had a lot of time with it so far, I think I've had enough that I can share some observations.<br /><br />I have read a lot about Wii Fit from a video game or fitness perspective, but I've never read about if from the perspective of a software developer. I think as software developers, we should have extra interest in it for three reasons:<br /><br /><ol><br /> <li>Most software developers are physically weak.</li><br /> <li>The Wii Fit Balance Board is a new and unique user interface device.</li><br /> <li>There might be some money to be made!</li><br /></ol><br /><br /><h2> Wii Fit as an antidote to programmer physical weakness</h2><br /><br />I talked myself into buying Wii Fit because of #1. <b>As a programmer, I sit all day with little or no physical activity</b>. Most of my life, in fact, has involved sitting with little or no physical activity. If you're reading this, I'm guessing that you probably don't get much physical activity either. Well guess what: that makes us weak!<br /><br />I know not <i>all</i> programmers are weak. My good friend Wes is a programmer, and he's pretty buff. But I'm weak, and I'm sure I'm not the only one. I didn't understand how weak until I played Wii Fit the first time. Just about every activity (and there are many) involved a seldom-used muscle. After a short while, I realized that <b>I have a lot of unused muscles</b>.<br /><br />Will Wii Fit make me strong? I don't know. I suppose that depends on how much I use it. But if it pushes me to use lots of muscles that I never did, then it can only make me stronger, right? In other words, if I keep using it, <b>my strength will increase monotonically</b>. Eventually it will plateau, and it's limit is probably lower than going to a gym. But I'm not going to a gym, and this is sure better than nothing.<br /><br />Plus, it's surprisingly enjoyable! I enjoyed my first workout this morning so much that an hour sped by before I looked at the time. Part of that is novelty, I'm sure, but there's something more to it, and I'm optimistic that I'll keep it up. I'm already looking forward to my next round tomorrow. <br /><br />Wow. Did you hear what I just said? I said <b>I'm looking forward to getting up early to have time to excersize</b>. Either I got hit in the head with a coconut or Wii Fit is really on to something.<br /><br /><br /><h2> Wii Fit Balance Board as a user interface device. </h2><br /><br />Although I bought for the exercise, I'm interested in the potential of the Balance Board that comes with Wii Fit. It's one of the most<br />unique user interface devices I've ever used.<br /><br />Nintendo seems to be on a role with user interface devices. It's success at turning a gyroscope into sports simulation led to the huge success of the Wii. The availability of wiimotes led to an explosion of <a href="http://www.cs.cmu.edu/~johnny/projects/wii/">interesting<br />uses</a> and <a href="http://mtrr.org/blog/index.php?s=wiimote">interest in hacking it</a>. It's certainly an interface device success story.<br /><br /><b>The big question is: how useful with the Balance Board be?</b> In Wii Fit, there seem to be two kinds of uses:<br /><br /><ol><br /> <li>A measuring device.</li><br /> <li>A control device.</li><br /></ol><br /><br />As a measure device, I think it's very useful. By "measure", I mean that the board measures how well you do a task. For example, I was doing "lunges", and it noticed that I was bending my leg too far, and told me so. That's pretty impressive. When I doing push-ups, it had a rhythm to follow and it really helped to know that someone was "watching"; I couldn't wimp out. Even just standing on it let me know that my whole life I've been standing too much on my heals. How's that for some realization!<br /><br />Just like Wii Sports was the perfect application for the Wiimote, I think Wii Fit is the perfect application for the Balance Board. I can't think of an activity better suited to it's strong point: measuring. <br /><br />But what else can it do? When we think of a user interface device, we usually think of controlling something, whether it be a web browser or a game character. As a developer, this is the possibility I was most excited about. After having used it, how well does the Balance Board work as a controlling device? I'm not so sure.<br /><br />Wii Fit comes with a couple of "balance games" showing various ways of using the board to control something. There's leaning left and right to ski, crouching and standing to ski jump, and leaning in all<br />directions to roll a ball into a hole. Some of them are actually quite impressive. But I still doubt the board's effectiveness for precise control.<br /><br />The problem is <b>I find it's really hard to control my balance</b>. I'm not kidding. Just rolling a little ball down into the hole was<br />really hard. My first time on the ski slope was a disaster. I read that a skateboard game is going to support the board. That might end up being so realistic that I can't control it, just like in real life!<br /><br />I think a big part of it is that <b>controlling my balancing is something I've never really done before.</b> Imagine handing someone who has never played video games an Xbox controller and watching them play Halo. It won't be pretty. Similarly, I remember when I was a kid and the mouse was a new input device. My teacher at school couldn't handle it. She'd never had to control anything like it. Like her, I am facing a new control device, and there's a learning curve.<br /><br />Long story short: <b>The Wii Fit Balance Board is hard to use for precise control, but maybe it's just me</b>. Hopefully, my skill with it will improve, and we as software developers will think of<br />interesting ways of using it.<br /><br /><h2>Profit</h2><br /><br />There's definetely some money to be made writing software for the Balance Board. How so? <br /><br /><ul><br /> <li>Millions of households already have a Balance Board</li><br /> <li>Millions more will soon.</li><br /> <li>They will show it to <b>everyone they know</b>.</li><br /> <li>Many will want more beyond Wii Fit.</li><br /></ul><br /><br />Even though it's only one player at a time, it a hilarious game to play in a group. Some family happened to be over last night, and they couldn't get enough of it. It was a surprisingly fun watching each other try and hula hoop and dodge shoes. The first decent, cheap party game featuring compatibility with the Balance Board is going to sell like hot cakes (which...sell fast, right?).<br /><br />I'm not saying <i>who</i> is going to make a lot of money. Nintendo might make it all. But there's probably room for little guys too.<br />With WiiWare, games like <a href="http://en.wikipedia.org/wiki/LostWinds">LostWinds</a> have shown that a small developer can take a simple idea for a new interface, add a little bit of style, and make a great game. If someone made a game like LostWinds for the Balance Board for $10, I<br />wouldn't hesitate to buy it.<br /><br />I'd be excited just to have some Linux drivers and an interesting open source hack to control my computer somehow, even if it's something silly like that MacBook-turned-into-lightsaber trick. Someone might even think of way to really "surf the web". I don't think there's much many in that, though :).<br /><br /><br /><br />So far, I'm glad I bought Wii Fit. I hope to keep it up and strengthen my muscles. It's a very fun party game, which should lead to widespread adoption and an opportunity to create interesting uses for the Balance Board. It's hard to control precisely, but I hope it will get easier as we learn to control our balance (something that some of us have never done before).Peter Thatcherhttp://www.blogger.com/profile/01092342988993218446noreply@blogger.comtag:blogger.com,1999:blog-1700157236206200597.post-73461415379297560462008-04-28T15:10:00.000-07:002008-04-28T16:37:47.028-07:00Events in PythonI'm a huge fan of the <a href="http://en.wikipedia.org/wiki/Actor_model"> Actor Model </a>. I think that for most applications, it's the best way to do concurrency. As CPUs get more cores, concurrency becomes more important for programmers, and I think the Actor Model will become more important, too.<br /><br />But, sometimes you need something more light-weight for handling "events". For example, imagine you have some code listening for file changes in a particular directory. What you'd like to do is make an "event" that is "fired" whenever a file change is detected. When fired, there may be a "handler" or "listener" which is notified of the event. That "handler" is ultimately just some code which is executed when the event occurs.<br /><br />A while ago, <b>I wanted an event system like this for Python</b>. I didn't see anything builtin or any library available, so I decided to write my own. I'd like to share what I created with you.<br /><br />But first, I want to follow an example through other programming languages to give you an idea of what I was trying to accomplish and how it compares with what's out there. After that, I'll give you my implemenation in Python. Our example will be listening for changes on the file system. We want to keep the "watcher" code decoupled from the rest of the code, so we use events. <br /><br />The best implementation of events that I've used is in C#, so we'll start there. In C# 1.0, our file watcher would looks something like this:<br /><br /><pre><br />public delegate void FileChangeHandler(string source_path);<br /><br />class FileWatcher<br />{<br /> public event FileChangeHandler FileChanged;<br /><br /> public void WatchForChanges()<br /> {<br /> ...<br /> if(FileChanged != null)<br /> {<br /> FileChanged(source_path);<br /> }<br /> ...<br /> }<br />}<br /><br />class FileChangeLogger<br />{<br /> public void OnFileChanged(string source_path)<br /> {<br /> Console.WriteLine(String.Format("{0} changed.", source_path));<br /> }<br />}<br /><br />watcher = FileWatcher();<br />logger = FileChangeLogger();<br />watcher.FileChanged += new FileChangeHandler(logger.OnFileChanged);<br />watcher.WatchForChanges();<br /><br /><br /></pre><br /><br />It's pretty nice. The best part is at the end, where you can write "watcher.FileChange += ...". But, you need to type the completely useless "new FileChangeHandler", and you also need to wrap it all in a separate FileChangeLogger class. Luckily, in C# 2.0, they added <a href="http://msdn2.microsoft.com/en-us/magazine/cc163970.aspx">Anonymous Delegates</a>, which makes this much nicer:<br /><br /><pre><br />watcher.FileChanged += delegate(string source_path)<br />{<br /> Console.WriteLine(String.Format("{0} changed.", source_path));<br />}<br /></pre><br /><br />And in C# 3.0, they've made it even nicer!<br /><br /><pre><br />watcher.FileChanged += (source_path => Console.WriteLine(String.Format("{0} changed.", source_path)));<br /></pre><br /><br /><b>C# 3.0 has an event system that's downright slick, with type-inferencing and everything</b>. I don't think it gets much better than that. <br /><br />Actually, there is one thing. If there's no handler registered with the event, calling "FileChanged()" will blow up because it sees FileChanged == null until a handler is registered. This means that you have to write "if(Event != null){Event(...):}" every single time you fire the event. Every single time. If you forget, your code will seem to work fine until there's no handler, in which case it will blow up and you'll smack your forhead because you forgot that you have to repeat that line of code every single time you fire an event. I really mean every single time. It's by far the worst wart on an otherwise elgant system. I have no idea why the designers of C# thought this was a good idea. When would you ever want to fire and event and have it blow up in your face?<br /><br /><br />Anyway, let's try a different programming language, perhaps Java. It has the worst implementation of events I've ever seen. Here's our FileWatcher example: <br /><br />...<br /><br />...<br /><br />Ok, nevermind, I don't have the heart. I can imagine the code full of IListeners, and implementations, and keeping an array of them, and iterating over them, etc, and I just can't do it. I got seriously upset at one useless line in C#. In Java, it's at least 10 times worse. If you really have the stomach for it, go look at http://java.sun.com/docs/books/tutorial/uiswing/events/index.html. To me, it appears that <b>in Java, events are one giant hack around the lack of first-class functions.</b> If Java had first-class functions, none of that nonsense would be necessary. <br /><br /><br />Now that we've seen a good implementation of events in C# and avoided a bad one in Java, let's make one for Python. I'd rather it be like the C# event system, so let's see what our example would look like:<br /><br /><pre><br />from Event import Event<br /><br />class FileWatcher:<br /> def __init__(self):<br /> self.fileChanged = Event()<br /><br /> def watchFiles(self):<br /> ...<br /> self.fileChanged(source_path)<br /> ...<br /><br />def log_file_change(source_path):<br /> print "%r changed." % (source_path,)<br /><br />watcher = FileWatcher()<br />watcher.fileChanged += log_file_change<br /><br /></pre><br /><br />I think that looks pretty good. So what does the implementation of Event look like?<br /><br /><pre><br /><br />class Event(IEvent):<br /> def __init__(self):<br /> self.handlers = set()<br /><br /> def handle(self, handler):<br /> self.handler.add(handler)<br /> return self<br /><br /> def unhandle(self, handler):<br /> try:<br /> self.handlers.remove(handler)<br /> except:<br /> raise ValueError("Handler is not handling this event, so cannot unhandle it.")<br /> return self<br /><br /> def fire(self, *args, **kargs):<br /> for handler in self.handlers:<br /> handler(*args, **kargs)<br /><br /> def getHandlerCount(self):<br /> return len(self.handlers)<br /><br /> __iadd__ = handle<br /> __isub__ = unhandle<br /> __call__ = fire<br /> __len__ = getHandlerCount<br /></pre><br /><br />Wow. That was pretty short. Actually, this is one of the reasons I love Python. If the language doesn't have a feature, we can probably add it. <b>We just added one of C#'s best features to Python in 26 lines of code</b>. More importantly, we now have a nice, light-weight, easy-to-use event system for Python. <br /><br />For all of you how like full examples that you can cut and paste, here is one that you can run. Enjoy!<br /><br /><pre><br /><br />class Event:<br /> def __init__(self):<br /> self.handlers = set()<br /><br /> def handle(self, handler):<br /> self.handlers.add(handler)<br /> return self<br /><br /> def unhandle(self, handler):<br /> try:<br /> self.handlers.remove(handler)<br /> except:<br /> raise ValueError("Handler is not handling this event, so cannot unhandle it.")<br /> return self<br /><br /> def fire(self, *args, **kargs):<br /> for handler in self.handlers:<br /> handler(*args, **kargs)<br /><br /> def getHandlerCount(self):<br /> return len(self.handlers)<br /><br /> __iadd__ = handle<br /> __isub__ = unhandle<br /> __call__ = fire<br /> __len__ = getHandlerCount<br /><br />class MockFileWatcher:<br /> def __init__(self):<br /> self.fileChanged = Event()<br /><br /> def watchFiles(self):<br /> source_path = "foo"<br /> self.fileChanged(source_path)<br /><br />def log_file_change(source_path):<br /> print "%r changed." % (source_path,)<br /><br />def log_file_change2(source_path):<br /> print "%r changed!" % (source_path,)<br /><br />watcher = MockFileWatcher()<br />watcher.fileChanged += log_file_change2<br />watcher.fileChanged += log_file_change<br />watcher.fileChanged -= log_file_change2<br />watcher.watchFiles()<br /><br /></pre>Peter Thatcherhttp://www.blogger.com/profile/01092342988993218446noreply@blogger.comtag:blogger.com,1999:blog-1700157236206200597.post-53389488136750325962008-04-02T17:16:00.000-07:002008-04-16T22:28:14.063-07:00Banana Cream Pie Shake in 30 SecondsI don't intend this blog to be purely about programming matters. I intend it to be about anything I've learned and would like to share. Today I'd like to share with you an incredibly delicious and easy desert recipe that I've never seen before recently but stumbled upon. You'll love it. But if you are really only interested in programming topics, then I suggest you change your bookmark to <a href="http://www.valuedlessons.com/search/label/programming">a programming-label-only link</a> or your feed reader to <a href="http://www.valuedlessons.com/feeds/posts/default/-/programming">a programming-label-only feed</a> to enjoy just my posts about programming (at least I hope you enjoy them :).<br /><br />No about that recipe... If you like banana cream pie or a banana shake, this recipe is right up your alley. It's a banana cream pie shake. It tastes exactly like a banana cream pie, except that you can drink it, it's healthy for you, and it only takes 30 seconds. Here's the recipe:<br /><br /><blockquote><br />2 frozen bananas (I chop them a bit to blend easier)<br />1 cup of liquid (I use soy milk)<br />blend until smooth<br /></blockquote><br /><br />I'm still astounded that something that costs so little (clearly less than $1) and takes so little time (a few minutes if you are slow) is so delicious and healthy.<br /><br />Finally, a little FAQ:<br /><br /><br />Q: Does it really only takes 30 seconds?<br /><br />A: It only takes 30 seconds to blend, but obviously bananas take a few hours to freeze, and you may spend a minute or two cutting, pouring, and cleaning up. Not counting the freezing, the whole process is just a few minutes. <br /><br /><br />Q: What kind of blender do I need?<br /><br />I have a <a href="http://www.amazon.com/Blendtec-TB-621-BHM-Professionals-500-Watt-Blender/dp/B000GIGZXM/ref=pd_bbs_sr_1?ie=UTF8&s=home-garden&qid=1207180812&sr=8-1">Blendtec TotalBlender</a>. Have you seen them <a href="http://www.willitblend.com/">blend an iPhone</a>? That's my blender. I'll hopefully review it for you someday; it really is that good. It's some of the best $400 I've ever spent. Frozen bananas are no challenge for it.<br /><br />If you have a lesser blender and it doesn't work at first, I'd recommend that you let the bananas thaw for a little while.<br /><br /><br />Q: Does it really taste that good?<br /><br />Yes. It's so good I'm about to stop writing this to make some more. I drank the last one so fast it didn't even make it through this whole post.<br /><br /><br />Q: Is it really healthy?<br /><br />If the ingredients are healthy, it's healthy. Bananas are healthy. Just use healthy liquid. You can even toss in extra healthy ingredients like flax seed and make it even healthier without altering the taste much. Adding chocolate might make it more tasty, but it will also make it less healthy. I prefer it without chocolate anyway.<br /><br /><br />Q: I tried it and it's <b>too</b> thick and creamy.<br /><br />A: Add ice. It dilutes it while keeping it cold and smooth. <br /><br /><br /><br />Enjoy!<br /><br /><br />Update: I made it again with a different batch of bananas, and it tasted different. It appears that the ripeness of the bananas makes a big difference. More ripe is more tasty. So, wait until the bananas are very ripe before freezing them.<br /><br />Update 2: I sprinkled on a little cinnamon, and it tastes like a mix between a shake, banana bread, and even egg nog. It's delicious!Peter Thatcherhttp://www.blogger.com/profile/01092342988993218446noreply@blogger.comtag:blogger.com,1999:blog-1700157236206200597.post-30579403945616660402008-04-01T21:37:00.000-07:002008-04-02T16:40:46.122-07:00You Could Have Invented Monadic ParsingEven though Pysec (and functional programming in python in general) <a href="http://www.valuedlessons.com/2008/03/why-are-my-monads-so-slow.html">turned out to be 2-3x slower</a>, monadic parsing is still great for certain tasks. After I wrote about Pysec <a href="http://www.valuedlessons.com/2008/02/pysec-monadic-combinatoric-parsing-in.html">the first time</a>, I received comments regarding <a href="http://pyparsing.wikispaces.com/">pyparsing</a>, which is a great parsing library that looks very similar. Today, I'd like to show you a very simple example where monadic parsing really shines above all other parsing techniques, including pyparsing. This is a real-world need that I had, not just an academic thought-experiment. <br /><br />The situation is that I'm writing a peer-to-peer file synchronizer and I need to serialize binary data in an efficient way. I've choosen to use what I call "sized binary", which is basically <a href="http://en.wikipedia.org/wiki/Netstrings">netstrings</a> without the comma or <a href="http://en.wikipedia.org/wiki/Bencode">bencode's byte string</a>. The byte-string "ABC", for example, would be encoded as "3:ABC", or "Hello World!" would be "12:Hello World!". This is very efficient because "ABC" or "Hello World!" can be <b>any</b> bytes \x00-\xFF, without any sort of encoding or decoding like <a href="http://en.wikipedia.org/wiki/Uuencode">uuencode</a>.<br /><br />Stop and think for a minute how you would define a grammer to parse "3:ABC". It's so simple, right? Well, try and write a grammar for it. Try using BNF, EBNF, yacc/bison, pyparsing, SimpleParse, or any parsing libarary you know of, in any programming language. If you manage, please write me an email and let me know how you did it, because I'm not so sure it's even possible. Maybe it is in the <a href="http://www.ibm.com/developerworks/linux/library/l-cpregex.html">Perl6 grammars</a>; they're throwing the kitchen sink into that thing :). The best I've seen is that the creator of pyparsing hacked a grammar object to change how it parses the "ABC" part after a different object parsed the "3" part. This basically uses the grammar object as a global variable. It works in a single-core world, but it won't work in the many-core world we're headed for. <br /><br />But now you're getting bored with me because it's silly ringing my hands over such a simple format. I bet you could parse this format in just 4 lines of code:<br /><br /><pre><br />def parse_sized_binary(stream):<br /> size_bytes, stream = stream.readUntil(":")<br /> size = int(size_bytes)<br /> bytes, stream = stream.read(size)<br /> return bytes, stream<br /></pre><br /><br />You're right, it's really simple, at least until you have to embed this little "sized binary language" inside of a bigger language. Perhaps, like me, you need to embed binary data inside of a bigger data structure, and so you need to define a serialization for things like ints, floats, lists, and hash tables. For that stuff, you can define a grammer for a language like <a href="http://en.wikipedia.org/wiki/JSON">JSON</a> or <a href="http://en.wikipedia.org/wiki/YAML">YAML</a> and hand it over to a grammar-based parser library. But now you need a way to tell the library, "OK, when you see my binary data, hand control over to my four lines of code, and then I'll hand control back". You might even say "You know, why don't you just let me hand you any function of the form stream -> (value, stream) and let me control the parsing for a while". <br /><br /><b>Congratulations, you just invented monadic parsing</b>. The function you wrote, parse_sized_binary, of the form stream -> (bytes, stream), is a Parser Monad. Monadic parsing is just combining lots of these little parsers together. And so, it's incredibly easy to insert arbitrary parsing code, like parse_sized_binary, into any parser. <br /><br />As a complete example, using Pysec, here's a full parser for a language just like <a href="http://en.wikipedia.org/wiki/Bencode">bencode</a> but with support for floats and unicode. Notice that our parse_size_binary is renamed to bencode_sized and is given a more functional definition. <br /><br /><pre><br />from Pysec import Stub, until, lift, read, many_until, branch<br /><br />bencode_any = Stub()<br />bencode_sized = until(":") >> lift(int) >> read<br />bencode_unsized = until("e")<br />bencode_list = many_until(bencode_any, "e")<br />bencode_any.set(branch({"s" : bencode_sized,<br /> "u" : bencode_sized >> lift(lambda bytes : bytes.decode("utf8")),<br /> "i" : bencode_unsized >> lift(int),<br /> "f" : bencode_unsized >> lift(float),<br /> "l" : bencode_list,<br /> "d" : bencode_list >> lift(dict)}))<br /></pre><br /><br />I think this is sort of like <a href="http://en.wikipedia.org/wiki/Greenspun's_Tenth_Rule">Greenspun's Tenth Rule</a>. Any implementation of netstrings/sized-binary-data probably has an ad-hoc implementation of monadic parsing in it. Even my own non-monadic version of this parser (which I use for performance reasons) has an ad-hoc implementation of monadic parsing. I'm convinced that if you write a parser for a similar format, you'll implement/invent monadic parsing as well, even if you don't know it.Peter Thatcherhttp://www.blogger.com/profile/01092342988993218446noreply@blogger.comtag:blogger.com,1999:blog-1700157236206200597.post-25567463109265830472008-03-19T14:10:00.000-07:002008-03-19T14:13:01.860-07:00Why are my monads so slow?If you've read the last few posts, you know that I choose to use monadic parsing for some application code I am writing. At an abstract level, it turned out beautifully. I was able to write the parser very elgantly and compactly despite the serialization format having some unique requirements. But at a practical level, it failed pretty miserably. <b>It was horribly slow</b>. With a deadline looming, I had to abandon my monadic parser and go back to using an older serialization format. But not content to give up so easily, I decided to figure out <b>why are my monads so slow?</b> I hope that my investigation may be of help to you if you choose to use similar techniques in your code.<br /><br />First, I tried running a profiler like <a href="http://docs.python.org/lib/module-profile.html">cProfile</a>. While such profilers normally work quite well, they don't work so well with lots of first-class functions. And since monadic code is full of first-class functions, the profiler didn't give much valuable information. <br /><br />So, I had to do optimizing the "old fashioned" way. I wrote a benchmark and ran both of my serializers against it. To start, I had two data points: one in traditional imperitave code and one in fully monadic code, and this is what I got:<br /><br /><pre><br />imperative: 0.465 seconds<br />monadic: 3.893 seconds<br /></pre><br /><br />Monadic code was 8x slower! But why? To narrow it down, I added some more data points by implementing the serializer in gradually more monadic ways. For example, I wrote a serializer in a functional/immutable style that passes around a "stream" rather than a string, and then another that passed around "state" in a very monadic way, but withoug using the monad class that I used in Pysec. Then I got this:<br /><br /><pre><br />imperative: 0.465 seconds<br />functional with stream: 1.018 seconds<br />almost monadic with state: 2.450 seconds<br />monadic: 3.893 seconds<br /></pre><br /><br />This gave some answers. One answer it gave was that passing around an immutable "stream" object is twice as expensive as merely passing around a string. Also, passing around a state object on top of that is even more expensive. With these clues, I was able to optimize in certain ways and reduce it to this:<br /><br /><pre><br />imperative: 0.465 seconds<br />functional with stream: 1.018 seconds<br />almost monadic with state: 1.098 seconds<br />monadic: 1.283 seconds<br /></pre><br /><br />That's a nice 3x improvement. What things were I able to do? By far the biggest cost that I was able to eliminate was object creation. I flattened two layers of abstraction into one, and thus split the number of objects created in half. The second thing I did was I created a "read until" operation that could be used more efficiently in the "grammar" of the parser. This is a form of pushing performance-critical code down the layers of abstraction. Finally, I didn't used decorators, for some often-called functions. <br /><br />In the end, <b>it looks like monadic code is about 2-3x slower in Python</b>. Almost all of that is actually the result of just doing things in a functional/immutable way. In particular, creating data structures appears to be the main bottleneck. In functional languagues, these types of operations are very common and optimized heavily, and so are typically very fast. It looks like it's not the same in Python. Just like you have to avoid recursion because there's no tail recursion, it looks like you have to avoid a functional/immutable coding style if you care about performance because object creation is so slow. On the other hand, if you don't mind the peformance hit, it makes the code much more elegant, just like recursion usually does.<br /><br />One thing of interest is that there is essentially no performance penatly for using monads over using a functional/immutable code style. The 20% penatly seen between "almost monadic" and "monadic" is only because I'm wrapping the monad in a Parser class which allows nice operator overloading. <br /><br />Here's a summary of what you can do to speed up any functional/immutable-style code, including monadic code when writing in Python:<br /><br /><ul><br /><li>Make object creation as fast as possible. Don't do anything fancy in __init__.</li><br /><li>Use as few layers of abstraction as possible, especially when there is an object created in each layer.</li><br /><li>Push common or expensive operations down the layers of abstaction, especially if it avoids creating objects.</li><br /><li>Avoid using decorators for heavily used functions.</li><br /><li>Don't use wrapper classes if you don't have to.</li><br /></ul><br /><br />As a final thought, I'd like to mention that while there's currently a substantial performance penalty for using immutable data structures, that style is going to become increasingly important as we enter the many-core era. No matter what style of concurrency you like, immutable data is always easier to work with when concurrency is involved. Concurrency and mutable data are just a bad combo. I think that it's going to be very important for language designers to address this when working on the peformance of their languages. I certainly hope future versions Python are much faster with immutable data. If they are, then the peformance penatly of using Monads will almost disappear.Peter Thatcherhttp://www.blogger.com/profile/01092342988993218446noreply@blogger.comtag:blogger.com,1999:blog-1700157236206200597.post-33089330406903835642008-02-12T13:51:00.001-08:002008-02-15T14:09:07.322-08:00Pysec: Monadic Combinatoric Parsing in Python (aka Parsec in Python)<b>Update</b>: It turns out that there's a similar technique being used by <a href="http://pyparsing.wikispaces.com/">pyparsing</a>. I hadn't seen it before and when I first saw it I thought had reinvted the wheel and wasted my time. But upon further inspection, Pysec does a few things a much better than pyparsing, which happen to be the exact things that I need. There's no coincedence, of course, that Pysec matches my needs well. I'll be covering this in more detail in a future article. <br /><br /><b>Update 2</b>: I got <a href="http://www.valuedlessons.com/2008/01/monads-in-python-with-nice-syntax.html">@do syntax</a> to work! Again, stay tuned for another article on how.<br /><br />I've talked about monads in the past, but I never really covered what purpose they serve. I covered the "how" in <a href="http://www.valuedlessons.com/2008/01/monads-in-python-with-nice-syntax.html">Python</a> and <a href="http://www.valuedlessons.com/2008/01/monads-in-ruby-with-nice-syntax.html">Ruby</a>, but I doubt I'll ever full cover the "why" because it's simply too big of a subject. But today, I'd like to share with you one example of how monads are useful. In fact, it's the example that motivated me to do all of the monad stuff in the first place: parsing.<br /><br />Parsing is a topic that's been around pretty much forever in computer science and most people thing it's pretty much "solved". My experience is that we've still got a long way to go. Specifically, I'm writing an application with lots of distributed concurrency, which requires lots of data serialization and deserialization (aka parsing). There are very few good serialization libraries out there, and I've been through three or four versions of various techniques. <b>Finally, I think I have found a parsing technique that works well: monadic combinatoric parsing</b>. And it's in Python.<br /><br />What the heck does that mean? "monadic" means we're using monads. "combinatoric" means we can take monad parsers and combine them to make new monad parsers, which is extremly powerful. I call it Pysec. The design is a copy of <a href="http://legacy.cs.uu.nl/daan/parsec.html">Parsec</a> brought to Python. Notice how I said "design"; I didn't bother looking at any of their code; The design described on their web page was good enough guidance for me. But, I'm sure that their implementation is WAY better than mine. If you want to see real monadic parsing, look at Parsec. If you're interested in monadic parsing for Python, keep reading.<br /><br />Here's an example of Pysec for parsing a subset of JSON:<br /><br /><pre><br />from Pysec import Parser, choice, quoted_chars, group_chars, option_chars, digits, between, pair, spaces, match, quoted_collection<br /><br /># json_choices is a hack to get around mutual recursion <br /># a json is value is one of text, number, mapping, and collection<br /># text is any characters between quotes<br /># a number is like the regular expression -?[0-9]+(\.[0-9]+)?<br /># "parser >> Parser.lift(func)" means "pass the parsed value into func and return a new Parser"<br /># quoted_collection(start, space, inner, joiner, end)<br /># means "a list of inner separated by joiner surrounded by start and end"<br /># we have to put a lot of "spaces" in since JSON allows lot of optional whitespace<br /><br />json_choices = []<br />json = choice(json_choices)<br />text = quoted_chars("'", "'")<br />number = group_chars([option_chars(["-"]), digits, option_chars([".", digits])]) >> Parser.lift(float)<br />joiner = between(spaces, match(","), spaces)<br />mapping_pair = pair(text, spaces & match(":") & spaces & json)<br />collection = quoted_collection("[", spaces, json, joiner, "]") >> Parser.lift(list)<br />mapping = quoted_collection("{", spaces, mapping_pair, joiner, "}") >> Parser.lift(dict)<br />json_choices.extend([text, number, mapping, collection])<br /><br />print json.parseString("{'a' : -1.0, 'b' : 2.0, 'z' : {'c' : [1.0, [2.0, [3.0]]]}}")<br /></pre><br /><br />Like most monadic or functional code, it's pretty dense, so don't feel bad if you go cross-eyed looking at it the first time. Realize that most of the code is building a Parser monad called "json", which parses the test string at the end. I tried to comment each individual part to explain what's going on.<br /><br />You may be thinking "why would I want to write code like this?". One response is to look at the code: it's the bulk of JSON parsing in 15 lines that look like a grammar definition! Another response I can give is a challange: go write a parser to parse that string and then compare your code to this code. Which is shorter? Which is easier to read? Which is more elegant? While in college, I had a team project to write a compiler for a subset of Pascal. We were smart enough to use Python, but dumb enough to use Yacc and Flex. I'm sure the parser portion was pretty fast, but it was incredibly painful to get it right. Once we did, we dared not touch it for fear of breaking it. I really wish I had Parse/Pysec back then (ok, Parsec was around back then, but I hadn't even heard of Haskell or monads).<br /><br />But <b>monadic combinatoric parsing isn't just about making your code look like a grammar definition</b>. It makes it possible to combine parsers in incredibly flexible ways. For example, let's say that on a differnt project, you wrote a simplified CSV parser for numbers like this one:<br /><br /><pre><br />def line(cell):<br /> return sep_end_by(cell, match(","))<br /><br />def csv(cell):<br /> return sep_end_by(line(cell), match("\n"))<br /><br />print csv(number).parseString("1,2,3\n4,5,6")<br /></pre><br /><br />And now you realize you'd really like to put whole JSON values in your simplified CSV. In other words, you want to combine the CVS and JSON parsers. I think that you'll find that doing so really isn't as easy as it sounds. Imagine trying to combine two Yacc grammars. It hurts just thinking about that. Luckily, monadic combinatoric parsers make this incredibly easy:<br /><br /><pre><br />print csv(json).parseString("{'a' : 'A'},[1, 2, 3],'zzz'\n-1.0,2.0,-3.0")<br /></pre><br /><br />While this is a slighly contrived example, you must understand that with this technique <b>you can combine any two parsers</b> in a similar fasion. I don't know about you, but that really feels right to me. I haven't seen any parsing techniques as elegant as that. <br /><br />Everything it's perfect of course, especially since making this work in Python is a bit of a hack. Here are some downsides to this approach:<br /><br /><ul><br /> <li>It's hard to debug when you do something wrong.</li><br /> <li>It's annoying to import so many things from Pysec.</li><br /> <li>You have to hack around mutual recursion of values in a strict language like python.</li><br /> <li>There's probably a performance hit.</li><br /></ul><br /><br />Most of these can be either fixed or worked around, though, so I think long-term monadic parsing is good bet. I'd like to see what you can do with it or to make it better. <br /><br />I know how much you all like a working example, so here's some code that you can just cut, paste, and run (in Python 2.5; older versions of Python may require some tweaking). Please ignore the top half. It's mostly stuff I've covered in other articles, like <a href="http://www.valuedlessons.com/2007/12/immutable-data-in-python-record-or.html">Immutable Records</a> and <a href="http://www.valuedlessons.com/2008/01/monads-in-python-with-nice-syntax.html">monad base classes</a>. The meat starts where it says "Parser Monad". <br /><br />I'd like to talk about it in more detail, but I'll save that for another article. For now, please play around with it and see what you think.<br /><br /> <br /><pre><br />##### Base Libraries included here for convenience ###########<br /><br />def Record(*props):<br /> class cls(RecordBase):<br /> pass<br /><br /> cls.setProps(props)<br /><br /> return cls<br /><br />class RecordBase(tuple):<br /> PROPS = ()<br /><br /> def __new__(cls, *values):<br /> if cls.prepare != RecordBase.prepare:<br /> values = cls.prepare(*values)<br /> return cls.fromValues(values)<br /><br /> @classmethod<br /> def fromValues(cls, values):<br /> return tuple.__new__(cls, values)<br /><br /> def __repr__(self):<br /> return self.__class__.__name__ + tuple.__repr__(self)<br /><br /> ## overridable<br /> @classmethod<br /> def prepare(cls, *args):<br /> return args<br /><br /> ## setting up getters and setters<br /> @classmethod<br /> def setProps(cls, props):<br /> for index, prop in enumerate(props):<br /> cls.setProp(index, prop)<br /> cls.PROPS = props<br /><br /> @classmethod<br /> def setProp(cls, index, prop):<br /> getter_name = prop<br /> setter_name = "set" + prop[0].upper() + prop[1:]<br /><br /> setattr(cls, getter_name, cls.makeGetter(index, prop))<br /> setattr(cls, setter_name, cls.makeSetter(index, prop))<br /><br /> @classmethod<br /> def makeGetter(cls, index, prop):<br /> return property(fget = lambda self : self[index])<br /><br /> @classmethod<br /> def makeSetter(cls, index, prop):<br /> def setter(self, value):<br /> values = (value if current_index == index<br /> else current_value<br /> for current_index, current_value<br /> in enumerate(self))<br /> return self.fromValues(values)<br /> return setter<br /><br />class ByteStream(Record("bytes", "index")):<br /> @classmethod<br /> def prepare(cls, bytes, index = 0):<br /> return (bytes, index)<br /><br /> def get(self, count):<br /> start = self.index<br /> end = start + count<br /> bytes = self.bytes[start : end]<br /> return bytes, (self.setIndex(end) if bytes else self)<br /><br />def make_decorator(func, *dec_args):<br /> def decorator(undecorated):<br /> def decorated(*args, **kargs):<br /> return func(undecorated, args, kargs, *dec_args) <br /> <br /> decorated.__name__ = undecorated.__name__<br /> return decorated<br /> <br /> decorator.__name__ = func.__name__<br /> return decorator<br /><br />decorator = make_decorator<br /><br />class Monad:<br /> ## Must be overridden<br /> def bind(self, func):<br /> raise NotImplementedError<br /><br /> @classmethod<br /> def unit(cls, val):<br /> raise NotImplementedError<br /><br /> @classmethod<br /> def lift(cls, func):<br /> return (lambda val : cls.unit(func(val)))<br /><br /> ## useful defaults that should probably NOT be overridden<br /> def __rshift__(self, bindee):<br /> return self.bind(bindee)<br /><br /> def __and__(self, monad):<br /> return self.shove(monad)<br /> <br /> ## could be overridden if useful or if more efficient<br /> def shove(self, monad):<br /> return self.bind(lambda _ : monad)<br /><br />class StateChanger(Record("changer", "bindees"), Monad):<br /> @classmethod<br /> def prepare(cls, changer, bindees = ()):<br /> return (changer, bindees)<br /><br /> # binding can be slow since it happens at bind time rather than at run time<br /> def bind(self, bindee):<br /> return self.setBindees(self.bindees + (bindee,))<br /><br /> def __call__(self, state):<br /> return self.run(state)<br /><br /> def run(self, state0):<br /> value, state = self.changer(state0) if callable(self.changer) else self.changer<br /> state = state0 if state is None else state<br /><br /> for bindee in self.bindees:<br /> value, state = bindee(value).run(state)<br /> return (value, state)<br /><br /> @classmethod<br /> def unit(cls, value):<br /> return cls((value, None))<br /><br />######## Parser Monad ###########<br /><br />class ParserState(Record("stream", "position")):<br /> @classmethod<br /> def prepare(cls, stream, position = 0):<br /> return (stream, position)<br /><br /> def read(self, count):<br /> collection, stream = self.stream.get(count)<br /> return collection, self.fromValues((stream, self.position + count))<br /><br />class Parser(StateChanger):<br /> def parseString(self, bytes):<br /> return self.parseStream(ByteStream(bytes))<br /> <br /> def parseStream(self, stream):<br /> state = ParserState(stream)<br /> value, state = self.run(state)<br /> return value<br /><br />class ParseFailed(Exception):<br /> def __init__(self, message, state):<br /> self.message = message<br /> self.state = state<br /> Exception.__init__(self, message)<br /><br />@decorator<br />def parser(func, func_args, func_kargs):<br /> def changer(state):<br /> return func(state, *func_args, **func_kargs)<br /> changer.__name__ = func.__name__<br /> return Parser(changer)<br /><br />##### combinatoric functions #########<br /><br />@parser<br />def tokens(state0, count, process):<br /> tokens, state1 = state0.read(count)<br /><br /> passed, value = process(tokens)<br /> if passed:<br /> return (value, state1)<br /> else:<br /> raise ParseFailed(value, state0)<br /> <br />def read(count):<br /> return tokens(count, lambda values : (True, values))<br /><br />@parser<br />def skip(state0, parser):<br /> value, state1 = parser(state0)<br /> return (None, state1)<br /><br />@parser<br />def option(state, default_value, parser):<br /> try:<br /> return parser(state)<br /> except ParseFailed, failure:<br /> if failure.state == state:<br /> return (default_value, state)<br /> else:<br /> raise<br /> <br />@parser<br />def choice(state, parsers):<br /> for parser in parsers:<br /> try:<br /> return parser(state)<br /> except ParseFailed, failure:<br /> if failure.state != state:<br /> raise failure<br /> raise ParseFailed("no choices were found", state)<br /><br />@parser<br />def match(state0, expected):<br /> actual, state1 = read(len(expected))(state0)<br /> if actual == expected:<br /> return actual, state1<br /> else:<br /> raise ParseFailed("expected %r" % (expected,), state0)<br /><br />def between(before, inner, after):<br /> return before & inner >> (lambda value : after & Parser.unit(value))<br /><br />def quoted(before, inner, after):<br /> return between(match(before), inner, match(after))<br /><br />def quoted_collection(start, space, inner, joiner, end):<br /> return quoted(start, space & sep_end_by(inner, joiner), end)<br /><br />@parser<br />def many(state, parser, min_count = 0):<br /> values = []<br /><br /> try:<br /> while True:<br /> value, state = parser(state)<br /> values.append(value)<br /> except ParseFailed:<br /> if len(values) < min_count:<br /> raise<br /><br /> return values, state<br /> <br />@parser<br />def group(state, parsers):<br /> values = []<br /><br /> for parser in parsers:<br /> value, state = parser(state)<br /> values.append(value)<br /><br /> return values, state<br /><br />def pair(parser1, parser2):<br /> # return group((parser1, parser2))<br /> return parser1 >> (lambda value1 : parser2 >> (lambda value2 : Parser.unit((value1, value2))))<br /><br />@parser<br />def skip_many(state, parser):<br /> try:<br /> while True:<br /> value, state = parser(state)<br /> except ParseFailed:<br /> return (None, state)<br /><br />def skip_before(before, parser):<br /> return skip(before) & parser<br /><br />@parser<br />def skip_after(state0, parser, after):<br /> value, state1 = parser(state0)<br /> _, state2 = after(state1)<br /> return value, state2<br /><br />@parser<br />def option_many(state0, first, repeated, min_count = 0):<br /> try:<br /> first_value, state1 = first(state0)<br /> except ParseFailed:<br /> if min_count > 0:<br /> raise<br /> else:<br /> return [], state0<br /> else:<br /> values, state2 = many(repeated, min_count-1)(state1)<br /> values.insert(0, first_value)<br /> return values, state2<br /><br /># parser separated and ended by sep<br />def end_by(parser, sep_parser, min_count = 0):<br /> return many(skip_after(parser, sep_parser), min_count)<br /><br /># parser separated by sep<br />def sep_by(parser, sep_parser, min_count = 0):<br /> return option_many(parser, skip_before(sep_parser, parser), min_count)<br /> <br /># parser separated and optionally ended by sep<br />def sep_end_by(parser, sep_parser, min_count = 0):<br /> return skip_after(sep_by(parser, sep_parser, min_count), option(None, sep_parser))<br /><br />##### char-specific parsing ###########<br /><br />def satisfy(name, passes):<br /> return tokens(1, lambda char : (True, char) if passes(char) else (False, "not " + name))<br /><br />def one_of(chars):<br /> char_set = frozenset(chars)<br /> return satisfy("one of %r" % chars, lambda char : char in char_set)<br /><br />def none_of(chars):<br /> char_set = frozenset(chars)<br /> return satisfy("not one of %r" % chars, lambda char : char and char not in char_set)<br /><br />def maybe_match_parser(parser):<br /> return match(parser) if isinstance(parser, str) else parser<br /><br />def maybe_match_parsers(parsers):<br /> return tuple(maybe_match_parser(parser) for parser in parsers)<br /><br />def many_chars(parser, min_count = 0):<br /> return join_chars(many(parser, min_count))<br /><br />def option_chars(parsers):<br /> return option("", group_chars(parsers))<br /><br />def group_chars(parsers):<br /> return join_chars(group(maybe_match_parsers(parsers)))<br /> #return join_chars(group(parsers))<br /><br />def join_chars(parser):<br /> return parser >> Parser.lift("".join)<br /><br />def while_one_of(chars, min_count = 0):<br /> return many_chars(one_of(chars), min_count)<br /><br />def until_one_of(chars, min_count = 0):<br /> return many_chars(none_of(chars), min_count)<br /><br />def char_range(begin, end):<br /> return "".join(chr(num) for num in xrange(ord(begin), ord(end)))<br /><br />def quoted_chars(start, end):<br /> assert len(end) == 1, "end string must be exactly 1 character"<br /> return quoted(start, many_chars(none_of(end)), end)<br /><br />digit = one_of(char_range("0", "9"))<br />digits = many_chars(digit, min_count = 1)<br />space = one_of(" \v\f\t\r\n")<br />spaces = skip_many(space)<br /><br /><br />############# simplified JSON ########################<br /><br />#from Pysec import Parser, choice, quoted_chars, group_chars, option_chars, digits, between, pair, spaces, match, quoted_collection, sep_end_by<br /><br />#HACK: json_choices is used to get around mutual recursion <br />#a json is value is one of text, number, mapping, and collection, which we define later <br />json_choices = []<br />json = choice(json_choices)<br /><br />#text is any characters between quotes<br />text = quoted_chars("'", "'")<br /><br />#sort of like the regular expression -?[0-9]+(\.[0-9]+)?<br />#in case you're unfamiliar with monads, "parser >> Parser.lift(func)" means "pass the parsed value into func but give me a new Parser back"<br />number = group_chars([option_chars(["-"]), digits, option_chars([".", digits])]) >> Parser.lift(float)<br /><br />#quoted_collection(start, space, inner, joiner, end) means "a list of inner separated by joiner surrounded by start and end"<br />#also, we have to put a lot of spaces in there since JSON allows lot of optional whitespace<br />joiner = between(spaces, match(","), spaces)<br />mapping_pair = pair(text, spaces & match(":") & spaces & json)<br />collection = quoted_collection("[", spaces, json, joiner, "]") >> Parser.lift(list)<br />mapping = quoted_collection("{", spaces, mapping_pair, joiner, "}") >> Parser.lift(dict)<br /><br />#HACK: finish the work around mutual recursion<br />json_choices.extend([text, number, mapping, collection])<br /><br /><br />############# simplified CSV ########################<br /><br />def line(cell):<br /> return sep_end_by(cell, match(","))<br /><br />def csv(cell):<br /> return sep_end_by(line(cell), match("\n"))<br /><br />############# testing ####################<br /><br />print json.parseString("{'a' : -1.0, 'b' : 2.0, 'z' : {'c' : [1.0, [2.0, [3.0]]]}}")<br />print csv(number).parseString("1,2,3\n4,5,6")<br />print csv(json).parseString("{'a' : 'A'},[1, 2, 3],'zzz'\n-1.0,2.0,-3.0")<br /></pre>Peter Thatcherhttp://www.blogger.com/profile/01092342988993218446noreply@blogger.comtag:blogger.com,1999:blog-1700157236206200597.post-29866233241690603482008-02-01T14:46:00.000-08:002008-02-13T17:13:00.839-08:00The Easy Way to Print Flash Cards (with Python and ImageMagick)Here's a valued lessons that I learned the long and hard way: how to easily print flash cards. Don't waste any time doing it the hard way like I did. Follow my easy route and save yourself some time. <br /><br />The other day, my wife had a very simple computer need: print and cut pictures into little uniform cards, like flash cards with pictures. It sounds like a simple problem, but there's a simple way and an easy way, and I'd like to share with you the easy way.<br /><br />The simple way is what she did: use some software like <a href="http://www.scribus.net/">Scribus</a> (in her case) or Microsoft Word (if she were almost anyone else), copy the pictures in, manually align them, and print. That works great ... until you try and do 700 pictures (about 80 pages of 9 pictures per page). Scribus ate all of the computer's memory, the hard drive thrashed like mad, everything slowed down, and my poor wife spent hours putting it all together. I doubt Word would have fared much better. <br /><br />But it still wasn't over. When she tried to print, the printer printed out one page with "error" on it. So, we exported to a PDF, and tried printing that: "error" again. So, we tried to print one page at a time: after about 5 minutes, it printed. All I could figure was that something bad was going on between the computer and the printer and involving sending these images uncompressed.<br /><br />At this point, I decided that it was time to call it quits on manual work and use automated work: write a program to do this for me. I'm not about to manually tell 78 pages to individually print. So, with some quick bash command line work, I told it to print each page by itself. That worked for a few pages, but some were still too big and we got the dreaded "error".<br /><br />Finally, I decdied to really role up my sleeves and write a program like we should have in the first place: in go pictures, out comes a PDF, and your print the PDF. Additionally, I wanted like to control the DPI of the images used to control the data flow to the printer. I did so, made the PDF, printed it and (after a while, still), it printed. Hurray! Even better, now if we wanted to change pictures or change sizes, we'll be able to do it with very little work.<br /><br />So here's the program. Put some pictures in a directory, run the program, and print the pdf of "flash cards" or "image thumbnails" or "reading lessons" or whatever you want to call them. You could even make this into an Automator task in Mac OS X. It uses <a href="http://www.imagemagick.org">ImageMagick</a>, though, so make sure you have that installed first.<br /><br /><pre><br />from subprocess import call<br /><br />MONTAGE = ["montage", "-bordercolor", "white"]<br />CONVERT = ["convert", "-bordercolor", "white"]<br /><br />def montage_files_of_paths(paths, name,<br /> columns = 3, rows = 3,<br /> width = 10, height = 8, xborder = .33, yborder = .33, density = 200,<br /> frame = 5, spacing = 10, <br /> intermediate_ext = "png", final_ext = "pdf"):<br /><br /> width, height, xborder, yborder = (int(val * density) for val in (width, height, xborder, yborder))<br /> thumbnail_width = int((width - (xborder * 2)) / columns)<br /> thumbnail_height = int((height - (yborder * 2)) / rows) <br /><br /> call(MONTAGE + ["-tile", "%rx%r" % (rows, columns),<br /> "-geometry", "%rx%r+%r+%r" % (thumbnail_width, thumbnail_height, spacing, spacing),<br /> "-frame", str(frame),<br /> "-density", str(density)]<br /> + [path + "[0]" for path in paths] #[0] to ignore animated GIFS<br /> + [name + "." + intermediate_ext])<br /> <br /> call(CONVERT + ["-border", "%rx%r" % (xborder, yborder),<br /> "-density", str(density),<br /> "-annotate", "0x0+%r+%r" % (xborder, yborder), name,<br /> name + "*." + intermediate_ext,<br /> name + "." + final_ext])<br /><br />import sys<br /><br />if __name__ == "__main__":<br /> if len(sys.argv) < 2:<br /> print "usage: %s image_directory, image_directory, image_directory ..." % (sys.argv[0],)<br /> else:<br /> for dirname in sys.argv[1:]:<br /> montage_files_of_paths([dirname+"/*"], dirname)<br /></pre>Peter Thatcherhttp://www.blogger.com/profile/01092342988993218446noreply@blogger.comtag:blogger.com,1999:blog-1700157236206200597.post-3911760103285407252008-01-21T11:30:00.000-08:002008-01-21T22:13:04.515-08:00Garlic Programmers for Silver Code?I just read Larry O'Brien's article about <a href="http://www.knowing.net/PermaLink,guid,fde0f610-3773-47b8-9be6-d6e5a8a76858.aspx"> there being no silver programmers</a>. He makes the case that although there may be great variance in programmer productivity, "the significant thing is not that some professional programmers are awesome, it's that some professional programmers suck". I admire Larry O'Brien greatly, and I think there is much wisdom in what he says here.<br /><br />But I think he's missing something.<br /><br />I have noticed a trend in my own productivity: <b>my productivity varies greatly with the code base I'm working on</b>. This variance may be greater than that between individual programmers. While many are rightly concerned with individual and team productivity, no ever talks about code base productivity. If we do, where does it lead us?<br /><br />A few years ago, I was working for a small medical software company. One project was a new venture with a new code base. Though new, the code was already difficult to work with. Its design forced the developer to duplicate work. In my first meeting about the project, I was told we had to do lots of "fat fingering". I didn't even know what that meant. It turned out to mean typing almost the exact same thing over and over. I was horrified and ultimately wrote a code generation tool to automate the work. Though creating and maintaining it was extra work in itself, without it, <b>doing many common tasks in this code easily took 3x longer than necessary</b>.<br /><br />After that project was "shelved", I was moved to the company's main product, which was older and had a much more "mature" code base. Actually, half was new-style (similar to what I had been working on) and half was old-style. The old-style code was even worse to work with. It was so bad that I was almost giddy whenever I could work on the new-style code. I'm sure you know what kind of code I'm talking about. <b>Doing anything in this code easily took 3x even longer</b>.<br /><br />Combined, that's 9x slower. That seems too high. Is it even possible? How can a code base reduce productivity so much? Here are a few possibilities I thought of; I'm sure you can think of more:<br /><ul><br /> <li>Bad code leads to more code, and more code is slower to work with</li><br /> <li>Bad code is hard to understand</li><br /> <li>Bad code is dangerous to change</li><br /> <li>Bad code leads duplication of effort</li><br /> <li>Bad code leads to a slow change-compile-test cycle, leading to wasted time and loss of focus</li><br /> <li>Bad code from third party libraries can drive you crazy by crashing, malfunctioning randomly, or having little documentation</li><br /> <li>Bad code saps energy, interest, and morale</li><br /></ul><br /><br /><b>A bad code base makes any programmer less productive.</b> Have you ever noticed that when you work from scratch with no code base to weigh you down, you can be much more productive? Imagine that as your baseline productivity. Now think of how long it takes you to accomplish something similar on a project that you're working on right now. How does it compare? I bet you're slower on the existing project.<br /><br />But wait just a minute. Imagine if my list of the effects of bad code were reversed. What if, instead, it read:<br /><br /><ul><br /> <li>Good code leads to less code, and less code is easier to work with</li><br /> <li>Good code is easy to understand</li><br /> <li>Good code is safe to change</li><br /> <li>Good code avoids duplication of effort</li><br /> <li>Good code leads to a quick change-compile-test cycle</li><br /> <li>Good code uses libraries which remove the need for "dirty work" and let you focus on the important problems</li><br /> <li>Good code is fun to work with!</li><br /></ul><br /><br /><b>A good code base makes any programmer more productive.</b> Have you ever worked on a project that was so wonderful that you could crank out magnificent results almost effortlessly? Perhaps <a href="http://www.gnu.org/software/emacs/">it has a great DSL embedded in it</a> or has <a href="http://www.loudthinking.com/about.html">powerful infrastructure</a>. <br /><br />Even with conservative estimates, <b>if bad code slows us down 5x and good code speeds us up just 2x faster, that's easily a 10x difference. </b> <br /><br />Maybe silver programmers don't exist. But is there "silver code"? What if there are programmers who are more adept at writing silver code? If they can make a code base that's just 2x easier to work with long-term, then that makes everyone that works on that code 2x more productive, possibly forever. If such programmers exist, then they must be extremely valuable, not so much because of their own productivity, but because of the effect they will have on the productivity of others in the future. Perhaps rather than "silver bullet" programmers that kill werewolves, these are just "garlic programmers" that ward of vampires.<br /><br />Given that productivity can vary so much with quality of existing code, <b>perhaps its worth looking for "garlic programmers" who will write code that will keep the rest us of productive long-term.</b> I leave it as an exercise for the reader to figure out how to measure a programmer's garlic-ness :).Peter Thatcherhttp://www.blogger.com/profile/01092342988993218446noreply@blogger.comtag:blogger.com,1999:blog-1700157236206200597.post-11799323600886807422008-01-15T08:44:00.000-08:002008-01-21T11:41:18.800-08:00List Monad in Ruby and PythonRecently, I wrote about ways to add something like Haskell's do syntax to programming languages like <a href="http://www.valuedlessons.com/2008/01/monads-in-python-with-nice-syntax.html">python</a> and <a href="http://www.valuedlessons.com/2008/01/monads-in-ruby-with-nice-syntax.html">ruby</a>. The python trick used bidirectional generators and the ruby trick used callcc. I have since been notified that there is one serious problem with both of them: the bindee (the function given to bind) can only be called once. For many monads, this limitation is acceptable, but for some, it means you can't use that nice syntax.<br /><br />Take the List monad, for example, in ruby:<br /><br /><pre><br />class ListMonad < Monad<br /> attr_accessor :array<br /><br /> def initialize(array)<br /> @array = array<br /> end<br /><br /> def bind(bindee)<br /> ListMonad.new(concat(array.map { |val| maybeUnit(bindee.call(val)).array }))<br /> end<br /><br /> def self.unit(val)<br /> self.new([val])<br /> end<br /><br /> def to_s<br /> joined = array.join(', ')<br /> "ListMonad([#{joined}])"<br /> end<br />end<br /><br />## I'm hoping this if faster than using fold (inject)<br />def concat(arrays)<br /> all = []<br /> for array in arrays<br /> all += array<br /> end <br /> all<br />end<br /><br />class Array<br /> def bind(bindee)<br /> ListMonad.new(self).bind(bindee)<br /> end<br /><br /> def bindb(&block)<br /> ListMonad.new(self).bindb(&block)<br /> end<br />end<br /></pre><br /><br />Ruby gives us the ability to extend Array to get a pretty sweet looking use of the monad:<br /><br /><pre><br />list1 = with_monad ListMonad do<br /> x =xx [1,2,3]<br /> y =xx [10,20,30]<br /> x + y<br />end<br /></pre><br /><br />It looks great, but list1 should be ListMonad([11, 21, 31, 12, 22, 32, 13, 23, 33]), be we get ListMonad([11]) instead.<br /><br />Traditional binding works fine:<br /><br /><pre><br />list2 = [1, 2, 3].bindb do |x|<br /> [10, 20, 30].bindb do |y|<br /> x + y<br /> end<br />end<br /></pre><br /><br />list2 is ListMonad([11, 21, 31, 12, 22, 32, 13, 23, 33]). So, we know our monad is fine, but our with_monad doesn't work.<br /><br />We have the same problem in python:<br /><br /><pre><br />class ListMonad(Monad):<br /> def __init__(self, vals):<br /> self.vals = vals<br /><br /> def bind(self, bindee):<br /> return self.__class__(concat((self.maybeUnit(bindee(val)) for val in self.vals)))<br /><br /> @classmethod<br /> def unit(cls, val):<br /> return cls([val])<br /><br /> def __repr__(self):<br /> return "ListMonad(%s)" % self.vals<br /><br /> def __iter__(self):<br /> return iter(self.vals)<br /><br />def concat(lists):<br /> return [val for lst in lists<br /> for val in lst]<br /><br />@do(ListMonad)<br />def with_list_monad():<br /> x = yield [1, 2, 3]<br /> y = yield [10, 20, 30]<br /> mreturn(x + y)<br /><br />list1 = with_list_monad()<br /><br />def with_list_monad_binding():<br /> return \<br /> ListMonad([ 1, 2, 3]) >> (lambda x:<br /> ListMonad([10, 20, 30]) >> (lambda y:<br /> x + y))<br /><br />list2 = with_list_monad_binding()<br /> <br /></pre><br /><br />Again, it looks good, but list1 is ListMonad([11, None, None, None, None]) while list2 is ListMonad([11, 21, 31, 12, 22, 32, 13, 23, 33]). The result of list1 sure is bizarre.<br /><br />Luckily, I have found a solution for ruby, and a bad hack that could be construed as a solution for python if you ignore some issues.<br /><br />In Ruby, the trick is to monkey with the rbind continuation so that at the end of with_monad it returns control to the point where the continuation is called. In other words, we need to store a second continuation at the point right after the first continuation is called. Confused yet? Continuations will melt your brain, and this took a little while for met get right:<br /><br /><pre><br />def with_monad_ext(monad_class, &block)<br /> finishers = []<br /><br /> rbind_ext = lambda do |monad|<br /> begin<br /> checked_callcc do |outer_cont|<br /> monad.bindb do |val|<br /> callcc do |inner_cont|<br /> finishers.push(inner_cont)<br /> outer_cont.call(val)<br /> end<br /> end<br /> end<br /> rescue ContinuationUnused => unused<br /> raise MonadEscape.new(unused.result)<br /> end<br /> end<br /><br /> val = begin<br /> monad_class.maybeUnit(block.call(rbind_ext))<br /> rescue MonadEscape => escape<br /> escape.monad<br /> end<br /><br /> finisher = finishers.pop()<br /> if finisher<br /> finisher.call(val)<br /> else<br /> val<br /> end<br />end<br /><br />list3 = with_monad_ext ListMonad do |rbind|<br /> x = rbind.call [1,2,3]<br /> y = rbind.call [10,20,30]<br /> x + y<br />end<br /></pre><br /><br />Success! list3 is correctly ListMonad([11, 21, 31, 12, 22, 32, 13, 23, 33]). The only real problem is that the normal rbind won't work. We have to make a special one and give it to the block as an argument. This wouldn't be that bad, except that Ruby has this silly limitation so we have to say rbind.call rather than just rbind, which makes it less fun to type. I named it with_monad_ext because I don't think it will be needed as often as with_monad, and with_monad is more convenient.<br /><br />Now, back to python. The problem is harder. It's rooted in the fact that for a given iterator, itr.send() is not reentrant. But, if we could copy the iterator, that would give us a possible solution. I did a little googling and found that python isn't going to have copy.copy(itr) anytime soon because <a href="http://www.python.org/dev/peps/pep-0323/">no one cares about them</a>. Some people <a href="http://www.fiber-space.de/generator_tools/doc/generator_tools.html">have made some progress</a>, but it segfaults on my computer (64-bit Linux, in case you're wondering). <br /><br />So, I came up with this terrible little hack which makes it possible to "copy" iterators:<br /><br /><pre><br />class CopyableIterator:<br /> def __init__(self, generator, log = ()):<br /> self.generator = generator<br /> self.log = list(log) #hmmm... if the logs were immutable, we wouldn't have to do this<br /> self.iterator = None<br /><br /> def getIterator(self):<br /> if self.iterator is None:<br /> self.iterator = self.generator()<br /> for value in self.log:<br /> self.iterator.send(value)<br /> return self.iterator<br /> <br /> def send(self, value):<br /> iterator = self.getIterator()<br /> self.log.append(value)<br /> return iterator.send(value)<br /><br /> def next(self):<br /> return self.send(None)<br /><br /> def copy(self):<br /> return self.__class__(self.generator, self.log)<br /></pre><br /><br />That let's us define @do_ext:<br /><br /><pre><br />@decorator_with_args<br />def do_ext(func, func_args, func_kargs, Monad):<br /> @handle_monadic_throws(Monad)<br /> def run_maybe_iterator():<br /> itr = func(*func_args, **func_kargs)<br /><br /> if isinstance(itr, types.GeneratorType):<br /> @handle_monadic_throws(Monad)<br /> def send(itr, val):<br /> try:<br /> # here's the real magic<br /> monad = Monad.maybeNew(itr.send(val))<br /> return monad.bind(lambda val : send(itr.copy(), val))<br /> except StopIteration:<br /> return Monad.unit(None)<br /> <br /> return send(CopyableIterator(lambda : func(*func_args, **func_kargs)), None)<br /> else:<br /> #not really a generator<br /> if itr is None:<br /> return Monad.unit(None)<br /> else:<br /> return itr<br /><br /> return run_maybe_iterator()<br /><br />@do_ext(ListMonad)<br />def with_list_monad_ext():<br /> x = yield [1, 2, 3]<br /> y = yield [10, 20, 30]<br /> mreturn(x + y)<br /><br />list3 = with_list_monad_ext()<br /><br /></pre><br /><br />And list3 is ListMonad([11, 21, 31, 12, 22, 32, 13, 23]). Success again! There is a downside, though. First, all of the calculations are done for all of the combinations. This is especially bad if you do this:<br /><br /><pre><br />@do_ext(ListMonad)<br />def with_list_monad_ext():<br /> print "foo"<br /> x = yield [1, 2, 3]<br /> y = yield [10, 20, 30]<br /> mreturn(x + y)<br /></pre><br /><br />Because now you'll print "foo" 9 times instead of 1.<br /><br />So there you have it: you can do the List monad with do syntax in python and ruby, but you'll have to decide whether the trade-offs are worth it.Peter Thatcherhttp://www.blogger.com/profile/01092342988993218446noreply@blogger.comtag:blogger.com,1999:blog-1700157236206200597.post-33503541366842393292008-01-09T10:34:00.000-08:002008-01-10T07:06:46.419-08:00Monads in Ruby (with nice syntax!)My last <a href="http://www.valuedlessons.com/2008/01/monads-in-python-with-nice-syntax.html">article</a> was about how you can do monads in python with nice syntax. Now I'd like to present nice monad syntax in Ruby. I'm not explaining what monads are or how you can use them. For now, I'm expecting you to be familiar with monads. If you aren't, go read a <a href="http://sigfpe.blogspot.com/2006/08/you-could-have-invented-monads-and.html">nice article</a> on them.<br /><br />Imagine you have a Monad base class like this:<br /><br /><pre><br />class Monad<br /> def bind(func)<br /> raise "not implemented"<br /> end<br /><br /> def self.unit(val)<br /> raise "not implemented"<br /> end<br /><br /> # bind to block<br /> def bindb(&func)<br /> bind(func)<br /> end<br />end<br /></pre><br /><br />As an aside, Please excuse there being a "bind" and a "bindb". It's silly that Ruby differentiates between a block and Proc like that. I<br />also think it's silly that I have to keep typing "lambda {|val| func(val)}" to turn a method into a Proc and "proc.call(val)" to call a Proc. In python, methods, functions, and lambdas are all the same thing, and it's a lot easier to code with. In this sense, python is way better. But Ruby has really slick syntax for passing in the block, and python's lambda restrictions are annoying. Why can't we have the best of both? I guess then I'd need Scala or Perl6. End aside.<br /><br />Let's implement the Either monad, which I call it Failable:<br /><br /><pre><br />class Failable < Monad<br /> def initialize(value, success)<br /> @value = value<br /> @success = success<br /> end<br /><br /> def bind(bindee)<br /> if @success<br /> bindee.call(@value)<br /> else<br /> self<br /> end<br /> end<br /><br /> def self.unit(val)<br /> self.new(val, true)<br /> end<br /><br /> def to_s<br /> if @success<br /> "Success(#{@value})"<br /> else<br /> "Failure(#{@value})"<br /> end<br /> end<br />end<br /><br />def success(val)<br /> Failable.new(val, true)<br />end<br /><br />def failure(val)<br /> Failable.new(val, false)<br />end<br /></pre><br /><br />Now we can write some code that safely handles divide by zero without using exceptions (actually, exceptions essentiatlly are the Either monad, but never mind that for now):<br /><br /><pre><br />def fdiv(a, b)<br /> if b == 0<br /> failure("cannot divide by zero")<br /> else<br /> success(a / b)<br /> end<br />end<br /><br />def fdiv_with_binding(first_divisor)<br /> fdiv(2.0, first_divisor).bindb do |val1|<br /> fdiv(3.0, 1.0) .bindb do |val2|<br /> fdiv(val1, val2) .bindb do |val3|<br /> success(val3)<br /> end<br /> end<br /> end<br />end<br /><br />puts fdiv_with_binding(1.0)<br />puts fdiv_with_binding(0.0)<br /></pre><br /><br />Which prints:<br /><br /><pre><br />Success(0.666666666666667)<br />Failure(cannot divide by zero)<br /></pre><br /><br />But it's not very pretty. Luckily, ruby has callcc, which makes it very easy to define a function which I'll call "rbind" which can do this:<br /><br /><pre><br />def fdiv_with_rbinding(first_divisor)<br /> with_monad Failable do<br /> val1 = rbind fdiv(2.0, first_divisor)<br /> val2 = rbind fdiv(3.0, 1.0)<br /> val3 = rbind fdiv(val1, val2)<br /> val3<br /> end<br />end<br /><br />def with_monad(monad_class, & block)<br /> begin<br /> val = block.call()<br /> if val.class == monad_class<br /> val<br /> else<br /> monad_class.unit(val)<br /> end<br /> rescue MonadEscape => escape<br /> escape.monad<br /> end<br />end<br /><br />def rbind(monad)<br /> begin<br /> checked_callcc {|cc| monad.bind(cc)}<br /> rescue ContinuationUnused => unused<br /> raise MonadEscape.new(unused.result)<br /> end<br />end<br /><br />class MonadEscape < RuntimeError<br /> attr :monad<br /><br /> def initialize(monad)<br /> @monad = monad<br /> end<br />end<br /><br />def checked_callcc(&with_cc)<br /> callcc do |cont|<br /> val = with_cc.call(lambda {|val| cont.call(val)})<br /> raise ContinuationUnused.new(val) if cont<br /> val<br /> end<br />end<br /><br />class ContinuationUnused < RuntimeError<br /> attr :result<br /> <br /> def initialize(result)<br /> @result = result<br /> end<br />end <br /></pre><br /><br />Ruby's block syntax makes it very nice to say "with_monad Monad do", which I like. But I don't really like seeing "= rbind". I'd really like it if we could override "<<" and read that as "=<<", but I think we're stuck with unary operators ("+", "-", "~", all of which look bad here) or words. At least we can choose our name, unlike in python where we have to use "yield". Does anyone have any idea for a better name?<br /><br />Maybe "xx" would work:<br /><br /><pre><br />def fdiv_with_rbinding(first_divisor)<br /> with_monad Failable do<br /> val1 =xx fdiv(2.0, first_divisor)<br /> val2 =xx fdiv(3.0, 1.0)<br /> val3 =xx fdiv(val1, val2)<br /> val3<br /> end<br />end<br /></pre><br /><br />So there you have it. You can have nice monad syntax in Ruby using callcc. Unfortunately, I've read that not all implementations of Ruby have continuatoins, which means JRuby and IronRuby may be left out in the cold. Keep in mind that Ruby's "yield" syntax is NOT a true generator, which means that we can't use the same trick that we used in python. It's callcc or nothing.<br /><br />Here's the complete code in case you want to cut and paste it:<br /><br />############# Monad Base ##############<br /><br /><pre><br />class Monad<br /> def bind(func)<br /> raise "not implemented"<br /> end<br /> <br /> def self.unit(val)<br /> raise "not implemented"<br /> end<br /><br /> # bind to block<br /> def bindb(&func)<br /> bind(func)<br /> end<br />end<br /><br />def with_monad(monad_class, &block)<br /> begin<br /> val = block.call()<br /> if val.class == monad_class<br /> val<br /> else<br /> monad_class.unit(val)<br /> end<br /> rescue MonadEscape => escape<br /> escape.monad<br /> end<br />end<br /><br /># "reverse" bind, or bind to the return value, or bind to the continuation<br />def rbind(monad)<br /> begin<br /> mycallcc {|cc| monad.bind(cc)}<br /> rescue ContinuationUnused => unused<br /> raise MonadEscape.new(unused.result)<br /> end<br />end<br /><br /><br />class MonadEscape < RuntimeError<br /> attr :monad<br /><br /> def initialize(monad)<br /> @monad = monad<br /> end<br />end<br /><br />def mycallcc(&with_cc)<br /> used = false<br /> val = callcc do |cc|<br /> fake_cc = lambda do |val| <br /> used = true<br /> cc.call(val)<br /> end<br /><br /> with_cc.call(fake_cc)<br /> end<br /><br /> raise ContinuationUnused.new(val) unless used<br /> val<br />end <br /><br />class ContinuationUnused < RuntimeError<br /> attr :result<br /> <br /> def initialize(result)<br /> @result = result<br /> end<br />end <br /><br />############# Failable Monad ##################<br /><br />class Failable < Monad<br /> def initialize(value, success)<br /> @value = value<br /> @success = success<br /> end<br /> <br /> def bind(bindee)<br /> if @success<br /> bindee.call(@value)<br /> else<br /> self<br /> end<br /> end<br /><br /> def self.unit(val)<br /> self.new(val, true)<br /> end<br /><br /> def to_s<br /> if @success<br /> "Success(#{@value})"<br /> else<br /> "Failure(#{@value})"<br /> end<br /> end<br />end<br /><br />def success(val)<br /> Failable.new(val, true)<br />end<br /><br />def failure(val)<br /> Failable.new(val, false)<br />end<br /><br />######## Failable Monad Example ##########<br /><br />def fdiv(a, b)<br /> if b == 0<br /> failure("cannot divide by zero")<br /> else<br /> success(a / b)<br /> end<br />end<br /><br />def with_failable_binding(first_divisor)<br /> fdiv(2.0, first_divisor).bindb { |val1|<br /> fdiv(3.0, 1.0) .bindb { |val2|<br /> fdiv(val1, val2) .bindb { |val3|<br /> success(val3)<br /> }<br /> }<br /> }<br />end<br /><br />def xx(monad)<br /> rbind(monad)<br />end<br /><br />def with_failable_rbinding(first_divisor)<br /> with_monad Failable do<br /> val1 =xx fdiv(2.0, first_divisor)<br /> val2 =xx fdiv(3.0, 1.0)<br /> val3 =xx fdiv(val1, val2)<br /> val3<br /> end<br />end<br /><br />puts with_failable_binding(1.0)<br />puts with_failable_binding(0.0)<br />puts with_failable_rbinding(1.0)<br />puts with_failable_rbinding(0.0)<br /></pre>Peter Thatcherhttp://www.blogger.com/profile/01092342988993218446noreply@blogger.comtag:blogger.com,1999:blog-1700157236206200597.post-67994722606644094622008-01-07T11:21:00.000-08:002008-01-08T11:42:56.302-08:00Monads in Python (with nice syntax!)<span style="font-weight: bold;">Update:</span> I've been informed that I didn't clearly explain what monads are and what the code examples are supposed to do. Sorry. I guess I assumed too much preexposure to monads. If you're no familiar with them, there are already so many tutorials on monads that I'll simply direct you to two of my favorites: <a href="http://www.loria.fr/%7Ekow/monads/index.html">spacesuits</a> and <a href="http://en.wikipedia.org/wiki/Monads_in_functional_programming">wikipedia</a>. I've also added more explanations of what the code snippets are supposed to do. I hope that helps.<br /><br /><span style="font-weight: bold;">Update 2: </span>By popular demand, I'm including some code that you can actually run :). See the bottom of the article.<br /><br />Recently, I used monads in production code for a soon-to-be-publically-released application. Although many think they are strange, estoric, and perhaps useless, monads were the best way to solve the problem. My code is not in Haskell; it's in Python. I'm not doing anything wierd like IO in a purely functional way; I'm just parsing a file.<br /><br />The crazy, unexpected conclusion I came to is that <span style="font-weight: bold;">you can and should use monads in your code in almost any programming language. </span>There are two parts to this: "can" and "should". I think I'll save "should" for another article. Right now, I'm excited to show you how you can.<br /><br />As a preview for "should", please consider that you may be using monads already without knowing it. <a href="http://www.google.com/url?sa=t&ct=res&cd=1&url=http%3A%2F%2Fresearch.microsoft.com%2F%7Eemeijer%2FPapers%2FLINQ20.pdf&ei=RYiCR8DJO5HEgwPei8GCBw&usg=AFQjCNE8AG_VXzu_tHYeXDy7Jw85eWUcig&sig2=SLBcZbWmgDnnAwaEygDrCA">LINQ in C# is a monad (pdf)</a>, so if you've ever used it, you've used a monad. The C# guys used monads for queries for the same reason I'm using for them for parsing: they're the right tool for the job. But unlike them, I can't change the syntax of the programming language.<br /><br />The biggest challange with using monads in a "normal" programming language is that monads involve lots of closures. This is exactly the same problem you run into with <a href="http://blogs.msdn.com/wesdyer/archive/2007/12/22/continuation-passing-style.aspx">CPS</a>, which isn't surprising since a monad's "bind" operator is CPS and since continuations can be implemented with monads. By the way, if your programming laungage doesn't have closures (meaning you are stuck programming in C, C++, or Java), then monads are probably out of the question. Assuming you have closures and use them with monads directly, you end up with code like the following. It's python using the <a href="http://en.wikipedia.org/wiki/Monads_in_functional_programming#Maybe_monad">Maybe monad</a> to handle divide by zero errors. I'm using ">>" (__rshift__) overloaded to mean "bind".<br /><br /><pre><br />def mdiv(a, b):<br /> if b == 0:<br /> return Nothing<br /> else:<br /> return Something(a / b)<br /><br />def with_maybe():<br /> return \<br /> mdiv(2.0, 2.0) >> (lambda val1 :<br /> mdiv(3.0, 0.0) >> (lambda val2 :<br /> mdiv(val1, val2) >> (lambda val3 :<br /> Something(val3)))) <br /></pre><br /><br />That's not very pretty. We need a way to clean that up. How we can do so depends on the programming language. Haskell has "do" syntax built-in, which makes monadic code look like an impertive language or even a list comprehension. Ruby and Scheme have call/cc which makes it trivial to wrap a bind call with a continuation to make any monadic code look "normal". C# has LINQ, which is practically Haskell's do notation with funny names.<br /><br />But I'm using python. What does python have? Luckily for me, in python 2.5 they added bidirectional generators, and I found a way to use them to make something like "do" notation. Now we can write the above code like this (@do and mreturn are defined later):<br /><br /><pre><br />@do(Maybe)<br />def with_maybe(first_divisor):<br /> val1 = yield mdiv(2.0, 2.0)<br /> val2 = yield mdiv(3.0, 0.0)<br /> val3 = yield mdiv(val1, val2)<br /> mreturn(val3)<br /></pre><br /><br />I even copied the names "do" and "return" from Haskell, although I had to spell "return" as "mreturn". All you really have to remember is that "yield" means "bind" and that the end result is a monad. There are limitations to this technique, but it's working very well for me so far. I've implemented the Maybe monad, the Error monad, the StateChanger monad, and the Continuation monad (which will require another article to explain). I particularly like the continuation monad because it allows me to write callcc, which lets me do threadless actors (message passing) in python:<br /><br /><br /><pre><br />from collections import deque<br /><br />class Mailbox:<br /> def __init__(self):<br /> self.messages = deque()<br /> self.handlers = deque()<br /><br /> def send(self, message):<br /> if self.handlers:<br /> handler = self.handlers.popleft()<br /> handler(message)()<br /> else:<br /> self.messages.append(message)<br /><br /> def