tag:blogger.com,1999:blog-87970068168111188292008-07-25T11:37:32.470-04:00random.choice(['idea', 'rant', 'link', 'tip'])Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comBlogger55125tag:blogger.com,1999:blog-8797006816811118829.post-71545384193214241332008-07-22T11:45:00.000-04:002008-07-22T11:45:46.264-04:00Going to speak at DjangoCon!Wow, my name is on the <a href="http://djangocon.org/program/#sunday-schedule">DjangoCon schedule</a> on Sunday, 2:25 pm. Jim Baker told me about it just a few days ago. This may be a bit surprising, but looks like the whole <a href="http://djangocon.org/">DjangoCon</a> is a bit surprising. <br /><br />I can hardly believe that I'm going to be there, and I'm extremely happy to have this opportunity to meet the Django community and to show what we are doing on this GSoC-funded project of getting Django running on Jython and integrating it with some cool JVM stuff.<br /><br />Well, I still have to do all the paper-work (I don't have a US visa yet, this is going to be my first visit to the country), and there is not too much time to do it. Not to mention that I've to practice my spoken English. But we humans are optimistic by nature, and I think that such optimism let us do most of the great things we do!Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comtag:blogger.com,1999:blog-8797006816811118829.post-385804048786532222008-07-20T18:51:00.002-04:002008-07-21T10:40:03.482-04:00The Jython Import Logic<h3>Motivation</h3><br />I think the coolest feature of Jython is the seamless integration with Java. 
Let's say you have the following Java class:<br /><pre class="prettyprint"><br />package com.leosoto;<br />public class HelloWorld {<br />    public void hello() {<br />        System.out.println("Hello World");<br />    }<br />    public void hello(String name) {<br />        System.out.printf("Hello %s!", name);<br />    }<br />}<br /></pre><br />If the class is on the classpath when you start Jython, using it from Python code is straightforward:<br /><pre class="prettyprint"><br />>>> from com.leosoto import HelloWorld<br />>>> h = HelloWorld()<br />>>> h.hello()<br />Hello World<br />>>> h.hello("joe")<br />Hello joe!<br /></pre><br />Now, did you know that if the class is not on the classpath, we can also package it in a jar, and the following will work too:<br /><pre><br />>>> import sys<br />>>> sys.path.append('/path/to/helloworld.jar')<br />>>> from com.leosoto import HelloWorld<br /></pre><br />Until yesterday, I didn't know! <br /><br />Since part of my GSoC project is to come up with a way to package Django projects in a single distributable war file, I've spent a complete day reading and playing with the Jython import logic, and here is what I got.<br /><br /><h3>Not much different than Python, right?</h3><br />First of all, Jython is an implementation of the Python language. So the import mechanism strictly follows what is known as <a href="http://www.python.org/dev/peps/pep-0302/">PEP 302: import hooks</a>. I don't want to repeat what is documented there, but a quick explanation is in order:<ul><br /> <li>First, try custom importers registered on <tt>sys.meta_path</tt>. 
If one of them is capable of importing the requested module, we are done.</li><br /> <li>For each entry of <tt>sys.path</tt>:<ul><br /> <li>Find the first hook registered on <tt>sys.path_hooks</tt> that can handle the path entry (for example, <a href ="http://docs.python.org/lib/module-zipimport.html">zipimport</a> is registered there and handles all "*.zip" paths)</li><br /> <li>If an importer hook was found, try it. If the importer loaded the module, we are done.</li><br /> <li>If an importer hook was not found, use the builtin import logic (good old *.py files inside directories). If the module is successfully loaded, we are done.</li><br /> </ul></li><br /></ul><br /><br /><h3>How are Java classes loaded then?</h3><br />With a built-in import hook, naturally ;-) <br /><br />If you start CPython and look at <tt>sys.path_hooks</tt> you get:<br /><pre class="prettyprint"><br />>>> import sys<br />>>> sys.path_hooks<br />[<type 'zipimport.zipimporter'>]<br /></pre><br />On Jython, the result is slightly different:<br /><pre class="prettyprint"><br />>>> import sys<br />>>> sys.path_hooks<br />[<type 'JavaImporter'>, <type 'zipimport.zipimporter'>]<br /></pre><br />The JavaImporter only recognizes the <tt>'__classpath__'</tt> entry on <tt>sys.path</tt>, so it is fired only after the path components that come before <tt>'__classpath__'</tt> have been searched. This gives us some control over which namespaces will end up containing Python modules and which will contain Java packages/classes, if some conflict occurs (such as the very real issue of having the <tt>'test'</tt> Python module and the <tt>'test'</tt> Java package). Naturally, the <tt>'__classpath__'</tt> entry is added automagically to <tt>sys.path</tt> on Jython startup.<br /><br />But...<br /><pre class="prettyprint"><br />>>> sys.path_hooks = []<br />>>> sys.path_importer_cache = {}<br />>>> del sys.modules['java']<br />>>> import java<br />>>> dir(java)<br />['__name__', 'applet', 'awt', 'beans' ... 
]<br /></pre><br />This should have failed, after removing the <tt>JavaImporter</tt> hook and all the involved caches. Well, there is also some magic going on here... <br /><br /><h3>Jython, <tt>JavaImporter</tt> and Java Packages</h3><br />When an import is going to fail (that is, after searching all <tt>sys.path</tt> entries and getting no results), Jython tries to load a Java package or Java class. But wasn't that the task of the <tt>JavaImporter</tt>?<br /><br />Well, sort of. Half of that job is the responsibility of the JavaImporter. The other half is managed by the <tt>SysPackageManager</tt>, which keeps in memory a tree of discovered Java packages. <br /><br />When the Jython interpreter starts, the <tt>SysPackageManager</tt> looks for all jars and directories on the classpath and builds the tree of Java packages. You can also explicitly add a Java package into the <tt>PackageManager</tt> by calling <tt>sys.add_package("package.which.was.not.autodiscovered")</tt>. This is useful in environments where Jython is not allowed to look at the system classpath, or doesn't get the right information (as may be the case when running inside a JavaEE container).<br /><br />Back to <tt>JavaImporter</tt>: its job is just to look into the <tt>SysPackageManager</tt>'s loaded packages and check if the requested name is present there. <br /><br /><h3> And here is the magic</h3><br />Another way to get packages loaded into <tt>SysPackageManager</tt> is to add a zip or jar to <tt>sys.path</tt>. The next time the import logic runs, it automatically adds the contents of the new jar (or zip) to the tree of known Java packages. 
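<br /><br />To make the role of a sentinel entry like <tt>'__classpath__'</tt> more concrete, here is a minimal sketch of a path hook that only claims one magic entry of <tt>sys.path</tt>. It is not Jython's implementation: it assumes the import API of a recent CPython 3 (<tt>find_spec</tt> rather than the PEP 302 <tt>find_module</tt>), and <tt>SENTINEL</tt>, <tt>KNOWN_PACKAGES</tt>, <tt>ToyImporter</tt>, <tt>ToyLoader</tt> and <tt>toyjava</tt> are made-up names standing in for <tt>'__classpath__'</tt>, the tree of known packages, <tt>JavaImporter</tt> and a Java package:

```python
import importlib.machinery
import importlib.util
import sys

# Hypothetical stand-ins (not Jython names): a sentinel sys.path entry and
# the set of package names the "package manager" already knows about.
SENTINEL = '__toy_classpath__'
KNOWN_PACKAGES = {'toyjava'}


class ToyLoader(object):
    """Builds an empty module and tags it, so we can see who loaded it."""

    def create_module(self, spec):
        return None  # None means: let the import system create the module

    def exec_module(self, module):
        module.origin = 'ToyImporter'


class ToyImporter(object):
    """Path hook and finder that only handles the sentinel entry."""

    def __init__(self, path_entry):
        if path_entry != SENTINEL:
            # ImportError tells the import system to try the next hook
            raise ImportError(path_entry)

    def find_spec(self, fullname, target=None):
        if fullname in KNOWN_PACKAGES:
            return importlib.machinery.ModuleSpec(fullname, ToyLoader())
        return None  # unknown name: the search moves on to the next entry


sys.path_hooks.insert(0, ToyImporter)
sys.path.append(SENTINEL)
sys.path_importer_cache.pop(SENTINEL, None)  # drop any stale cache entry

import toyjava  # found only when the search reaches SENTINEL
print(toyjava.origin)
print(importlib.util.find_spec('toyjava_unknown'))
```

The point of the sketch is that only the sentinel entry triggers the hook, so whatever the hook knows about gets imported at the sentinel's position on <tt>sys.path</tt>, no matter which other entry made it known.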
<br /><br />This is a little weird, because if you have the following in your <tt>sys.path</tt>:<br /><pre><br />['__classpath__', '/foo', 'foo.jar']<br /></pre><br />Then, if Java packages in <tt>foo.jar</tt> conflict with Python modules from <tt>/foo</tt>, the Java packages will prevail, because the <tt>'__classpath__'</tt> entry comes before <tt>'/foo'</tt>, and the JavaImporter will do its magic there.<br /><br />And the other bit of magic is what we have already seen: Jython makes a last attempt to load a Java package, or to be more precise, to add a package to the <tt>SysPackageManager</tt> if the imported name is known to the JVM as a class or package name. If this operation is successful, the module is directly imported by this Jython builtin import logic addition (no way to go back to the JavaImporter at this time). <br /><br /><h3>Some observations</h3><br />Here ends the objective part of this post. What follows now are my observations on the whole process:<ul><br /><li>I don't quite understand why Jython tries to load Java classes or packages at the end of the import logic after trying the standard procedure. It seems like such a fallback would make the calls to <tt>sys.add_package</tt> unnecessary, but then, why does <tt>add_package</tt> exist? And, in any case, I think that JavaImporter should do this.</li><br /><li>The confusing situation of jar files (and Java classes) in <tt>sys.path</tt> is well... confusing. The good news is that namespace conflicts aren't that common in practice, so just remembering that all Java "modules" come from the magic '__classpath__' element is enough. </li><br /><li>It would be nice if the Jython standard loader were installed on the <tt>meta_path</tt>. Then, <tt>JavaImporter</tt> could be added there too, just after the default Python code loader. 
This way, we would have a clearer precedence rule (Python modules first, Java packages/classes later), instead of the current one: first Python modules before __classpath__, followed by Java "modules", followed by Python modules after __classpath__, followed by Java "modules" which weren't registered yet on the PackageManager.</li></ul><br /><h3>That's all</h3><br />OK, that was a long post. Now that I've dumped all that info here, I can go back to coding and try to make distributable WAR files for Django projects, containing the complete Jython, modjy and Django runtime.Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comtag:blogger.com,1999:blog-8797006816811118829.post-10048421956108819592008-07-15T14:09:00.002-04:002008-07-15T14:11:18.128-04:00Jython 2.5 Alpha Released!As <a href="http://fwierzbicki.blogspot.com/2008/07/jython-25-alpha-released.html">announced by Frank Wierzbicki</a>: Jython 2.5 alpha is out. <br /><br />If you work with Java and love the Python programming language, this is a good opportunity to test this great Python implementation, integrate it with some of your Java programs and <a href="https://lists.sourceforge.net/lists/listinfo/jython-users">tell us how it went</a>.Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comtag:blogger.com,1999:blog-8797006816811118829.post-22460861099248899262008-07-14T11:36:00.006-04:002008-07-14T13:00:17.376-04:00My New Django/Jython Developer WorkflowBoth the Django and Jython projects are fast-moving targets these days. That's a good thing: both projects are rapidly approaching big milestones, Django 1.0 and Jython 2.5. But that also means that if you are trying to patch both codebases to integrate them, your task gets a little more complicated. <br /><br />So I don't have private Mercurial branches of both projects anymore, because they make it quite hard to update patches and keep them as separate units. 
The new solution is two new Mercurial repositories: <a href="https://hg.leosoto.com/django.patches">django.patches</a> and <a href="https://hg.leosoto.com/jython.patches">jython.patches</a>. They contain Mercurial queues and correspond to the .hg/patches directory you put inside the repositories containing the Mercurial mirror of each project.<br /><br />Translated to the command line, this is what you have to do in order to get the current code for Django on Jython:<br /><pre><br />$ hg clone https://hg.leosoto.com/django.svn.trunk<br />$ hg clone https://hg.leosoto.com/jython.svn.asm<br />$ hg clone https://hg.leosoto.com/django.patches django.svn.trunk/.hg/patches<br />$ hg clone https://hg.leosoto.com/jython.patches jython.svn.asm/.hg/patches<br />$ hg --cwd django.svn.trunk qpush -a<br />$ hg --cwd jython.svn.asm qpush -a<br /></pre><br />This way, I always try to keep my patches updated so they apply cleanly to each project's latest svn version. <br /><br />I also have a Hudson instance on my machine which runs the Django test suite on CPython using all backends. Once that test is finished it runs the suite again after applying my Django patches (to make sure I'm not breaking anything). Finally it runs the suite one more time, using Jython and the postgresql_zxjdbc backend. This should replace the <a href="http://dojstatus.leosoto.com">dojstatus</a> site, once I discover a way to publish the results I get from Hudson without installing Hudson on my hosting.Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comtag:blogger.com,1999:blog-8797006816811118829.post-23793738268105471542008-07-10T10:33:00.004-04:002008-07-10T11:48:42.546-04:00By the way... I'm a Jython committer! 
:)I've been so busy last weeks (half-time job, Summer of Code, and university final exams and labs!), that I even forgot to mention that <a href="http://fwierzbicki.blogspot.com/2008/06/welcome-leonardo-soto-jythons-newest.html">Frank gave me committer access to the Jython project</a>, two weeks ago.<br /><br />I want to thank publicly all Jython core developers for the confidence they have put on my work, especially <a href="http://zyasoft.com/pythoneering/">Jim Baker</a>, my GSoC mentor. Also <a href="http://dunderboss.blogspot.com/">Phillipe Jenvey</a>, Nicholas Riley and obviously <a href="http://fwierzbicki.blogspot.com">Frank Wierzbicki</a>, have been extremely helpful guiding me when I needed help.<br /><br />As I now commit most of my Jython patches directly to the SVN repository, I'm going to deprecate the <a href="https://hg.leosoto.com/jython.doj">jython.doj</a> Mercurial repository. I'll post about it soon, along with a new recipe to get Django running on top Jython, using the asm branch (i.e, the upcoming 2.5 version).Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comtag:blogger.com,1999:blog-8797006816811118829.post-6645535044111150932008-07-05T20:44:00.004-04:002008-07-06T01:19:54.288-04:00The Devil is in the DetailsIt's a <i>cliché</i>, but really, look at this commit message I just wrote. It's for a supposedly simple change I made to Jython to get <code>'%d' % foo</code> and <code>'%f' % bar</code> working, on some corner cases<tt>[1]</tt>:<br /><br />“<tt>StringFormatter: '%d' and '%f' support for the __int__ and __float__ protocol respectively.<br /><br />The implementation is more convulted than it should be, because we have PyString implementing __float__ and __int__ at the "java level" but not at the "python level". For string formatting, only "python level" __float__ and __int__ must be<br />supported.<br /><br />Also, now that __int__ can return a PyLong, this case needs special care. 
Basically formatInteger now can call formatLong if a PyLong is found as the result of __int__. Then, as formatLong can also be called from formatInteger, __hex__, __oct__ and __str__ conversions were moved inside formatLong.<br /><br />Finally, test_format_jy was changed to stop checking that we don't support big floats on '%d' (CPython doesn't, but that seems a limitation of the specific implementation and I can't imagine a program that could break on Jython because we *support* it).</tt>”<br /><br />Python is wonderful, but there are a lot of details which make it tricky when implementing it. Nice to see that, when we play the role of Python users, we aren't exposed too much to this subtleties. In fact, I'd say that it is one of the languages with the better "user interface" I've seen.<br /><br /><tt>[1]</tt> Naturally, I find this corner cases when running and testing the Django codebase. Well, that's one of the points of my SoC project: Test how CPython-compliant Jython is, and fix it when it isn't :).Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comtag:blogger.com,1999:blog-8797006816811118829.post-37418650103413414202008-07-01T13:00:00.007-04:002008-07-01T22:28:55.024-04:00We Should Master Regular ExpressionsWhen someone tell us that every programmer should know regular expressions, it's not <b>only</b> about using them to validate or match input on our programs. After all, seems like many of us can live using <tt>split()</tt> and <tt>replace()</tt> and some ad-hoc code instead of learning regexps.<br /><br />My point today is that they are also useful when coding. I just needed to replace every code that looked like this:<br /><br /><pre class="prettyprint"><br /><mx:RemoteObject id="grabaMuestreosRemote" <br />... <br />fault="Alert.show('Problemas al grabar los muestreos')"/><br /></pre><br />To:<br /><pre class="prettyprint"><br /><mx:RemoteObject id="grabaMuestreosRemote" <br />... 
<br />fault="reportFault('Problemas al grabar los muestreos', event.fault)"/><br /></pre><br /><br />The change is on the last line, replacing the alert with slightly more involved logic (which lives inside the <tt>reportFault</tt> function).<br /><br />Solution? Find/Replace, using regexps (this was done with Eclipse, but every reasonable editor has this feature):<br /><br />Find:<tt>fault="Alert.show\('([^']*)'\)"</tt> <br /><br />Replace With:<tt>fault="reportFault('$1', event.fault)"</tt><br /><br />Quick explanation: <tt>\(</tt> and <tt>\)</tt> match literal parentheses; they are escaped because parentheses have their own special meaning in regexps: capturing. And that is what the unescaped ones are used for here: capturing the message string inside the quotes, with <tt>'([^']*)'</tt>. That means: a single quote (<tt>'</tt>) followed by any character which is <b>not</b> a single quote (<tt>[^']</tt>), repeated 0 or more times (<tt>*</tt>), followed by a single quote (<tt>'</tt>). So the non-escaped parentheses are used to capture (i.e., remember, store) what was found inside the quotes. Later, you use the captured value by specifying <tt>$1</tt> in the replacement text. If you have more captures, they are labeled <tt>$2</tt>, <tt>$3</tt> and so on.<br /><br />By the way, I'm not a regexp master. In fact, I admit that I frequently resort to ad-hoc code, especially if I'm in a hurry. <br /><br />But the simple exercise of summing all the time spent writing ad-hoc code, plus the time wasted doing non-trivial find & replace by hand, has convinced me to learn them, and hopefully master them.Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comtag:blogger.com,1999:blog-8797006816811118829.post-2313295517034968492008-06-12T23:00:00.004-04:002008-06-12T23:46:20.862-04:00Doctests and XML/XHTMLEvery Python programmer knows that <a href="http://docs.python.org/lib/module-doctest.html">doctests</a> are really cool. 
Checking snippets of code just by the literal output that they would print when run on the REPL is both simple and <a href="http://blog.ianbicking.org/minimock.html">powerful</a>. <br /><br />But, sometimes, they can be too literal. There are <a href="http://docs.python.org/lib/doctest-options.html">options</a> to deal with this, such as the use of ellipsis (<i>...</i>) as wildcards. They are not enough to deal with XML output, though:<br /><pre class="prettyprint"><br />File "/home/lsoto/src/django.doj/tests/modeltests/model_forms/models.py" [...]<br />Failed example:<br />    print f['name']<br />Expected:<br />    <input id="id_name" type="text" name="name" maxlength="20" /><br />Got:<br />    <input name="name" id="id_name" type="text" maxlength="20" /><br /></pre><br />(that's an actual failure from the <a href="http://dojstatus.leosoto.com/testcollector/23/model_forms/">Django test suite running on Jython</a>)<br /><br />To solve this, I implemented a doctest <tt>OutputChecker</tt>, inspired by (but not as polished as) <a href="http://codespeak.net/svn/lxml/trunk/src/lxml/doctestcompare.py">lxml.doctestcompare</a>. The plus side? The code uses the xml.dom.minidom stdlib API (instead of ElementTree or lxml), so it works without any third party library. Here is the core code:<br /><pre class="prettyprint"><br /># Module-level imports needed by the method below<br />import re<br />from xml.dom import Node<br />from xml.dom.minidom import parseString<br /><br />def check_output_xml(self, want, got, optionflags):<br />    # Tries to do an 'xml-comparison' of want and got. Plain string<br />    # comparison doesn't always work, because, for example, attribute<br />    # ordering should not be important.<br />    #<br />    # Based on http://codespeak.net/svn/lxml/trunk/src/lxml/doctestcompare.py<br /><br />    # We use this to distinguish repr()s from elements:<br />    _repr_re = re.compile(r'^<[^>]+ (at|object) ')<br /><br />    _norm_whitespace_re = re.compile(r'[ \t\n][ \t\n]+')<br />    def norm_whitespace(v):<br />        return _norm_whitespace_re.sub(' ', v)<br /><br />    def looks_like_markup(s):<br />        s = s.strip()<br />        return (s.startswith('<')<br />                and not _repr_re.search(s))<br /><br />    def is_quoted_string(s):<br />        s = s.strip()<br />        return (len(s) >= 2<br />                and s[0] == s[-1]<br />                and s[0] in ('"', "'"))<br /><br />    def is_quoted_unicode(s):<br />        s = s.strip()<br />        return (len(s) >= 3<br />                and s[0] == 'u'<br />                and s[1] == s[-1]<br />                and s[1] in ('"', "'"))<br /><br />    def child_text(element):<br />        return ''.join([c.data for c in element.childNodes<br />                        if c.nodeType == Node.TEXT_NODE])<br /><br />    def children(element):<br />        return [c for c in element.childNodes<br />                if c.nodeType == Node.ELEMENT_NODE]<br /><br />    def norm_child_text(element):<br />        return norm_whitespace(child_text(element))<br /><br />    def attrs_dict(element):<br />        return dict(element.attributes.items())<br /><br />    def check_element(want_element, got_element):<br />        if want_element.tagName != got_element.tagName:<br />            return False<br />        if norm_child_text(want_element) != norm_child_text(got_element):<br />            return False<br />        if attrs_dict(want_element) != attrs_dict(got_element):<br />            return False<br />        want_children = children(want_element)<br />        got_children = children(got_element)<br />        if len(want_children) != len(got_children):<br />            return False<br />        for want, got in zip(want_children, got_children):<br />            if not check_element(want, got):<br />                return False<br />        return True<br /><br />    # Strip quotes<br />    if is_quoted_string(want) and is_quoted_string(got):<br />        want = want.strip()[1:-1]<br />        got = got.strip()[1:-1]<br />    elif is_quoted_unicode(want) and is_quoted_unicode(got):<br />        want = want.strip()[2:-1]<br />        got = got.strip()[2:-1]<br /><br />    if not looks_like_markup(want):<br />        return False<br /><br />    # Wrapper to support XML fragments<br />    wrapper = u"<root>%s</root>"<br />    try:<br />        want_root = parseString(wrapper % want).firstChild<br />        got_root = parseString(wrapper % got).firstChild<br />    except Exception:<br />        # Anything that doesn't parse as XML is treated as a mismatch<br />        return False<br /><br />    return check_element(want_root, got_root)<br /></pre><br />Note that, as is, it doesn't support HTML. That's not a problem on the Django test suite (where everything is XHTML) but it would be nice to add such support and then submit the checker upstream to CPython's doctest.py. Then we could do something like this everywhere:<br /><pre class="prettyprint"><br />>>> print f['name'] # doctest: +XML<br /><input id="id_name" type="text" name="name" maxlength="20" /><br /></pre><br /><br />Next target: JSON output!<br /><br /><b>Update:</b> I've submitted a <a href="http://code.djangoproject.com/attachment/ticket/7441/doctest_xml_and_json_checkers.patch">patch to Django with support for both XML and JSON output</a>.Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comtag:blogger.com,1999:blog-8797006816811118829.post-69228240942829761552008-05-30T17:04:00.003-04:002008-05-30T18:05:36.434-04:00Jython: How to Instantiate Classes Written in Python CodeThis week, as part of my GSoC project, I had to do some work related to zxJDBC, the very cool DBAPI <-> JDBC bridge which is bundled with Jython. Among other things, I'm improving the default type mapping, adding support for converting java.sql.* instances to pythonic datetime.* ones. 
<br /><br />As the type mapping is written inside a class written in Java, I was confronted with the problem of how to instantiate datetime.* objects, which (for now) Jython implements using pure Python code. The answer was very simple: just do by hand what Python always does when you write:<br /><pre class="prettyprint"><br />import datetime<br />datetime.date(year, month, day)<br /></pre><br />You know, the <tt>import</tt> statement is implemented by the <tt>__import__</tt> builtin, <tt>foo.bar</tt> is <tt>getattr(foo, 'bar')</tt> and <tt>f(x)</tt> is <tt>f.__call__(x)</tt>. So the following is equivalent to the previous snippet:<br /><pre><br />datetime = __import__('datetime')<br />getattr(datetime, 'date').__call__(year, month, day)<br /></pre><br />Which, translated to Java/Jython, looks almost the same:<br /><pre class="prettyprint"><br />PyObject datetime = __builtin__.__import__("datetime");<br />datetime.__getattr__("date").__call__(Py.newInt(year),<br />                                      Py.newInt(month),<br />                                      Py.newInt(day));<br /></pre><br />Once you get the idea, not only instantiating, but doing anything from Java code with classes written in Python looks like a piece of cake. <br /><br />Hey, how much easier could it be?<br /><br />Note that this works if all the Jython machinery is in place. This will be the case if your Java code is being called (directly or indirectly) from Python code being run on Jython. 
As any Jython module, such as zxJDBC, is always used from Python code, this is something to keep in mind when writing them in Java.Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comtag:blogger.com,1999:blog-8797006816811118829.post-24583495433315743762008-05-28T13:32:00.001-04:002008-05-29T10:18:20.525-04:00Loading Rails Fixtures Without Deleting Existing Records<a href="http://blog.choonkeat.com/weblog/2007/02/class-redefinit.html">This blog post by choonkeat</a> explains how to overcome the Ruby limitation which forbids the use of the "<tt>class [name-of-the-class-to-monkeypatch]</tt>" statement inside methods.<br /><br />Why would you need this? <br /><br />I just used it to define a <tt>db:fixtures:insert</tt> task, which differs from <tt>db:fixtures:load</tt> in <em>not</em> removing existing records. I could have copied the <tt><a href="http://ar.rubyonrails.com/classes/Fixtures.html#M000007">Fixtures.create_fixtures</a></tt> method and removed the line which calls <tt>Fixtures#delete_existing_fixtures</tt>. But we all know that copy/paste has its problems. <a href="http://en.wikipedia.org/wiki/Monkey_patch">Monkey-patching</a> <tt>Fixtures#delete_existing_fixtures</tt> to do nothing seemed like a less ugly solution, except that it could have unforeseen consequences. So I wanted the effect of the monkey patching to last as little as possible. 
Here is the result:<br /><pre class="prettyprint"><br />  # Mimics rails Fixtures.create_fixtures, without deleting existing records<br />  # from the database.<br />  #<br />  # The implementation monkey-patches Fixtures.delete_existing_fixtures<br />  # *temporarily*, restoring the original behaviour before exiting<br />  def self.insert_fixtures(fixtures_directory, table_names, class_names = {})<br />    ::Fixtures.module_eval do<br />      alias_method :original_delete_existing_fixtures,<br />                   :delete_existing_fixtures<br />      def delete_existing_fixtures<br />      end<br />    end<br />    ::Fixtures.create_fixtures(fixtures_directory, table_names, class_names)<br />    ::Fixtures.module_eval do<br />      alias_method :delete_existing_fixtures,<br />                   :original_delete_existing_fixtures<br />      remove_method :original_delete_existing_fixtures<br />    end<br />  end<br /></pre><br /><br />[By the way, it looks like this monkey-patch -> call -> undo monkey-patch sequence may end up being a common idiom which could be better encapsulated in another method. Perhaps someone already did it...]<br /><br />[<b>Update</b>: <a href="http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/194394">Here is what seems an elegant way to do it</a>]<br /><br />Finally, here is the task:<br /><pre class="prettyprint"><br />namespace :db do<br />  namespace :fixtures do<br />    desc "Inserts fixtures into the current environment's database,<br />*without* deleting existing records (as db:fixtures:load does).<br />Insert specific fixtures using FIXTURES=x,y"<br /><br />    task :insert => :environment do<br />      ActiveRecord::Base.establish_connection(RAILS_ENV.to_sym)<br />      (ENV['FIXTURES'] ? ENV['FIXTURES'].split(/,/) : Dir.glob(File.join(RAILS_ROOT, 'test', 'fixtures', '*.{yml,csv}'))).each do |fixture_file|<br />        Utilities::Fixtures.insert_fixtures('test/fixtures', File.basename(fixture_file, '.*'))<br />      end<br />    end<br />  end<br />end<br /></pre>Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comtag:blogger.com,1999:blog-8797006816811118829.post-39317424611549881272008-05-15T23:03:00.001-04:002008-05-15T23:03:00.900-04:00Ubuntu: Changing Swap Size Without Losing HibernationShort recipe:<br /><br />After resizing your swap partition (<em>if</em> you resized the swap partition instead of adding another one), use <tt>vol_id -u /dev/<device-name></tt> to get the new partition UUID, and put it at the appropriate place in <tt>/etc/fstab</tt> and <tt>/etc/initramfs-tools/conf.d/resume</tt>. Then, run:<br /><pre><br /> $ sudo dpkg-reconfigure initramfs-tools<br /></pre><br />Now, the not so long story:<br /><br />When I made the changes to my partitions to <a href="http://blog.leosoto.com/2008/05/back-to-linux.html">go back to Linux</a>, I made a mistake reserving only 1Gb of space for the swap partition. It sounded like a reasonable amount as I'd have a total of 3Gb of virtual memory, enough for my typical usage. <br /><br />I forgot about hibernation, which, on Linux, uses the swap partition to store the main memory contents (unlike Windows, which has a separate file for hibernation). To be honest I didn't even consider hibernation when installing Ubuntu; it never worked in the past. But now it worked. At least when I wasn't using too much main memory. If I was, then it wouldn't fit in the swap partition and the laptop couldn't hibernate. <br /><br />GParted came to the rescue again, and my swap partition grew to a little more than 2Gb, which is the size of my main memory. Rebooted into Ubuntu again and...no swap was recognized. 
WTF!?<br /><br />After looking at <tt>/etc/fstab</tt> I realized that the partitions are identified by a UUID rather than by their device name. So no <tt>/dev/sda5</tt> (my swap partition) there. Just a bunch of hexadecimal digits. I thought of wiping that mess out and going back to my familiar device names, but googling first was a better option. Then I found that I can easily get the new UUID of the partition using:<br /><pre><br /> $ sudo vol_id -u /dev/sda5<br /></pre><br />And, what really saved me from another headache was discovering that, after a change of UUID, you must update <em>two</em> files: the “classic” <tt>/etc/fstab</tt> and also <tt>/etc/initramfs-tools/conf.d/resume</tt>. And finally run:<br /><pre><br /> $ sudo dpkg-reconfigure initramfs-tools<br /></pre><br /><br />Not as straightforward as it could be, but I have no real complaints, as I can now hibernate my laptop whenever I want.Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comtag:blogger.com,1999:blog-8797006816811118829.post-57379149691122273752008-05-11T18:47:00.000-04:002008-05-11T18:47:01.268-04:00Back to LinuxI'm back to Linux as my main OS. As you can infer from <a href="http://blog.leosoto.com/2008/03/reading-binary-file-on-ruby.html">previous</a> <a href="http://blog.leosoto.com/2008/02/generic-on-fly-deployment.html">posts</a>, I was using Windows for the past months. In fact, I was using Windows since I bought my Dell laptop, a year and a half ago. It wasn't <em>just</em> Windows: I used cygwin and coLinux inside my Windows XP to have a reasonable development environment too (except for Java, which feels OK for me on plain Windows). Somehow I survived running all that for this long, partly because I had to use some Windows-only VPN-client software at my job.<br /><br />Two weeks ago things changed. 
I did some backups, ran <a href="http://gparted.sourceforge.net/livecd.php">GParted</a> (couldn't find my laptop's recovery disk, so my original plan of wiping out Windows and reinstalling everything couldn't go ahead) to shrink the NTFS partition, checked that everything was OK and installed <a href="http://www.ubuntu.com/getubuntu/download">Ubuntu Linux</a>.<br /><br />Choosing Ubuntu was a new thing: I started ten years ago with Redhat and changed to Debian “sid” five or six years ago (<tt>apt-get</tt> was such an important advantage on that times). But now I had little time to fight with my particular hardware configuration and heard that Ubuntu ran very good on Dell laptops. So I tried it and I'm not looking back.<br /><br />The only thing that wasn't automatically configured was my broadcom wireless card. Hey, having to deal with <em>only one</em> driver issue looked very promising to me (but it <em>will</em> drive away people who expects everything running out of the box, anyway). Installing <tt>b43-fwcutter</tt> and following the <a href="http://linuxwireless.org/en/users/Drivers/b43">instructions </a> made me a happy user. But all that could be caused by me installing a non-final version of the distribution.<br /><br />The overall impression is great. I'm even sticking to Gnome. Always used KDE in the past, but the default gnome environment seems very functional and I have nothing to say against it. And compiz is a great way to impress people :). Frankly, I miss konqueror versatility. But, considering that I would use Firefox as my main browser anyway (basically because it has great extensions; I don't like it eagerness for memory), that's no big deal. <br /><br />Migrating things from my coLinux filesystem couldn't be easiest: just moving files, and everything (emacs, latex, gnucash, mercurial, svn) works. <br /><br />Above all, having an unix-like environment is priceless.<br /><br />If FlexBuilder 2 had a Linux version, it would be perfect. 
Perhaps too perfect to be real.Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comtag:blogger.com,1999:blog-8797006816811118829.post-12474733522028179922008-04-30T11:52:00.004-04:002008-04-30T11:59:57.429-04:00Django SECRET_KEY GenerationWhen deploying a Django application, a common step is to generate a SECRET_KEY for the site. Here is a quick recipe to do it: <br /><br /><pre class="prettyprint"><br />$ python -c 'import random; print "".join([random.choice("abcdefghijklmnopqrstuvwxyz0123456789!@#$%^&*(-_=+)") for i in range(50)])'<br /></pre><br /><br />Useful when for whatever reason you don't want to install <a href="http://code.google.com/p/django-command-extensions/">django-command-extensions</a>.Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comtag:blogger.com,1999:blog-8797006816811118829.post-85261851298039786452008-04-28T13:05:00.005-04:002008-04-30T11:53:07.308-04:00Python Comparison WeirdnessWhile tracking a <a href="http://bugs.jython.org/issue1889394">Jython bug related to some <code>__cmp__</code> methods</a> (on dict and unicode, at least) I had to check how <code>__cmp__</code> behaves on CPython. And got a few surprises:<br /><pre class="prettyprint"><br />>>> {} == ''<br />False<br /></pre><br />It sounds right, but...<br /><pre class="prettyprint"><br />>>> {}.__eq__('')<br />NotImplemented<br /></pre><br />Oh. So <code>==</code> isn't relying on <code>__eq__</code> alone to check for equality. When <code>__eq__</code> returns <code>NotImplemented</code>, it falls back to the old three-way comparison function:<br /><pre class="prettyprint"><br />>>> cmp({}, '')<br />-1<br /></pre><br />So, as <code>''</code> is greater than <code>{}</code>, they are not equal. But...<br /><pre class="prettyprint"><br />>>> {}.__cmp__('1')<br />Traceback (most recent call last):<br />  File "<stdin>", line 1, in <module><br />TypeError: dict.__cmp__(x,y) requires y to be a 'dict', not a 'str'<br /></pre><br />Oops. 
Isn't <code>cmp(foo, bar)</code> the same as <code>foo.__cmp__(bar)</code>, at least when <code>hasattr(foo, '__cmp__')</code>? Well, obviously, not always.<br /><br />For some reason, CPython does a bit of "type checking" when you <em>indirectly</em> use <code>dict.__cmp__</code>. If you compare a dict with an instance of an incompatible type, it does a "default comparison" by class name, instead of raising <code>TypeError</code>. By looking at the CPython sources it seems that this is the case for every type where <code>tp_compare</code> is implemented in C.<br /><br />So, we get a -1 from <code>cmp({}, '')</code> because <code>'dict' < 'str'</code>. Weird. But that isn't all. If it were, I probably wouldn't have bothered to write this.<br /><br />Let's derive from dict and check what happens:<br /><pre class="prettyprint"><br />>>> class dict_derived(dict): pass<br />...<br />>>> cmp(dict_derived(), '')<br />-1<br />>>> dict_derived().__cmp__('')<br />Traceback (most recent call last):<br />  File "<stdin>", line 1, in <module><br />TypeError: dict_derived.__cmp__(x,y) requires y to be a 'dict_derived', not a 'str'<br /></pre><br />No surprises: It inherits the behavior from dict. So, remembering what I said above: <br /><blockquote>If you compare a <b>dict</b> with an instance of an incompatible type, Python does a "default comparison" by class name, instead of raising TypeError.</blockquote><br />Now we can extend it to:<br /><blockquote>If you compare a <b>dict or a dict-derived instance</b> with an instance of an incompatible type, Python does a "default comparison" by class name, instead of raising TypeError.</blockquote><br />But why am I saying that it applies only to dicts? <small>[Or, AFAICS, special types where the comparison function is written in C]</small> Why not to every type? Answer: <br /><pre class="prettyprint"><br />>>> class Foo(object):<br />...     def __cmp__(self, other):<br />...         raise TypeError("Foos are not comparable")<br />... 
<br />>>> Foo() == ''<br />Traceback (most recent call last):<br />  File "<stdin>", line 1, in <module><br />  File "<stdin>", line 3, in __cmp__<br />TypeError: Foos are not comparable<br />>>> cmp(Foo(), '')<br />Traceback (most recent call last):<br />  File "<stdin>", line 1, in <module><br />  File "<stdin>", line 3, in __cmp__<br />TypeError: Foos are not comparable<br /></pre><br />So, on one hand we have <code>dict</code> (and maybe other builtin types) where <code>cmp()</code> and the comparison operators don't raise <code>TypeError</code> even if <code>__cmp__</code> does. And on the other, user-defined classes where the raised <code>TypeError</code> does "leak". In the middle, our <code>dict_derived</code> class inherited the behavior from <code>dict</code>. But look at this:<br /><pre class="prettyprint"><br />>>> class dict_derived2(dict):<br />...     def __cmp__(self, other):<br />...         return super(dict_derived2, self).__cmp__(other)<br />... <br />>>> cmp(dict_derived2(), '')<br />Traceback (most recent call last):<br />  File "<stdin>", line 1, in <module><br />  File "<stdin>", line 3, in __cmp__<br />TypeError: dict_derived2.__cmp__(x,y) requires y to be a 'dict_derived2', not a 'str'<br /></pre><br />Dict-derived types inherit the behaviour of <code>dict</code>, <b>unless they override <code>__cmp__</code></b>. CPython doesn't care that the new <code>__cmp__</code> just calls the original <code>dict.__cmp__</code>. The only important thing is that there is a <code>__cmp__</code> implemented in Python code. 
Once you write a "custom" <code>__cmp__</code>, <code>cmp()</code>, <code>==</code> and all the other comparison operators will raise the exception.<br /><br />To summarize, here is the final rule for <code>dict.__cmp__</code>: <br /><blockquote><big>If you compare a <b>dict or a dict-derived instance</b> with an instance of an incompatible type, and <code>__cmp__</code> is not overridden, Python does a "default comparison" by class name, instead of raising TypeError</big></blockquote><br />Note that this rule is not directly applicable to other builtin types that implement <code>__cmp__</code>:<br /><pre class="prettyprint"><br />>>> set().__cmp__('')<br />Traceback (most recent call last):<br />  File "<stdin>", line 1, in <module><br />TypeError: set.__cmp__(x,y) requires y to be a 'set', not a 'str'<br />>>> cmp(set(), '')<br />Traceback (most recent call last):<br />  File "<stdin>", line 1, in <module><br />TypeError: can only compare to a set<br />>>> set() == ''<br />False<br />>>> set().__eq__('')<br />False<br /></pre><br />With <code>set</code>, <code>TypeError</code> is raised on <code>__cmp__</code> and on <code>cmp()</code>, but not on <code>==</code>. That's because <code>set.__eq__</code> takes care of returning <code>False</code> if the argument type is not compatible. The end result sounds quite reasonable, because you can still check for equality against instances of other types (like <code>set() != ''</code>), but can't compare for ordering against them (<code>set() > 1</code> raises an error instead of doing a weird class name comparison).<br /><br />I suppose that the roots of this inconsistency are historical accidents. 
I'm curious to see if all this changed in Python 3.0.Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comtag:blogger.com,1999:blog-8797006816811118829.post-30493519353250567882008-04-23T17:04:00.005-04:002008-04-26T23:44:47.709-04:00Django on Jython: Summer of Code!This post is not exactly hot news, but better late than never:<br /><br /><big>My application for the Google Summer of Code 2008, titled <a href="http://code.google.com/soc/2008/psf/appinfo.html?csaid=DA6AC3DE94E157E">“Django on Jython: Supporting Python Web App Frameworks on the JVM”</a> was accepted!! :-) </big><br /><br />I can't put into words how happy I am, even after a few days have passed since I got the news. I'm lucky enough to have <a href="http://zyasoft.com/pythoneering/">Jim Baker</a> as my mentor, a very inspiring guy, as motivated as myself by the project. He is a very active Jython contributor and a successful mentor of past SoC projects. And <a href="http://fwierzbicki.blogspot.com/2008/04/jython-and-django-progress-part-i-dev.html"> Frank Wierzbicki is already moving DoJ to work on top of Glassfish</a>. From the Django side, Jacob Kaplan-Moss has maintained the support he showed in the past and will fast-track our patches to Django, if they are needed.<br /><br />I also got the go-ahead from <a href="http://www.imagemaker.cl">Imagemaker</a>, my employer, to stop working full time during the SoC period. In fact, they were very supportive of the project and interested in it, looking forward to its results and potential. <br /><br />I'm going to participate in a vibrant Jython community, with two other fellow students working on web frameworks and Jython: <a href="http://code.google.com/soc/2008/psf/appinfo.html?csaid=59C7870763174C10"> Georgy Berdyshev with Zope</a> and <a href="http://code.google.com/soc/2008/turbogears/appinfo.html?csaid=9D3A17CF760D00B9">Ariane Paola Gomes with TurboGears2 </a>.<br /><br />Things couldn't be better. 
I will resume the work done over the past year, hoping to make Django on Jython a reality by August 2008. In fact, I already started, with a quick but useful web application for <a href="http://dojstatus.leosoto.com">tracking the status of the Django test suite running on top of Jython and the postgresql_zxjdbc driver</a>. I hope to get that page fully green as soon as possible.Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comtag:blogger.com,1999:blog-8797006816811118829.post-11234367066517922422008-04-21T23:37:00.002-04:002008-04-21T01:15:30.303-04:00404I saw this on <a href="http://programming.reddit.com">reddit</a>, on a <a href="http://reddit.com/r/programming/info/6f6pu/comments/">thread</a> about <a href="http://www.ibiblio.org/freeburma/">the supposed best 404 error message</a>. But this one is far better:<br /><br /><blockquote><br />“I'm sorry, you've reached a page that I cannot find. I'm really sorry about this. It's kind of embarassing. Here you are, the user, trying to get to a page on LiveJournal and I can't even serve it to you. What does that say about me? I'm just a webserver. My sole purpose in life is to serve you webpages and I can't even do that! I suck. Please don't be mad, I'll try harder. I promise! Who am I kidding? You're probably all like, "Man, LiveJournal's webserver sucks. It can't even get me where I want to go." I'm really sorry. Maybe it's my CPU...no that's ok...how bout my hard drives? Maybe. Where's my admin? I can't run self-diagnostics on myself. It's so boring in this datacenter. It's the same thing everyday. Oh man, I'm so lonely. I'm really sorry about rambling about myself, I'm selfish. I think I'm going to go cut my ethernet cables. 
I hope you get to the page you're looking for...goodbye cruel world!”</blockquote><p align="right"><small>(<a href="http://www.livejournal.com/sdofijasdf">LiveJournal's 404</a>)</small></p><br /><br />You may need to reload the page a few times to get this message, because they have others...Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comtag:blogger.com,1999:blog-8797006816811118829.post-45418945705053846162008-04-09T09:51:00.013-04:002008-04-10T15:36:00.219-04:00Rails Migrations Gotcha: Backward-Incompatible Model ChangesI'm pushing for the adoption of Rails Migrations on all Rails projects at my job (we already use them on a few). As a consequence, I won the assignment of writing migrations for the latest changes to the system I'm currently involved with. That seemed easy, but it wasn't. I will try to show why, without diving into the details of my specific scenario.<br /><br />Imagine you have the following model:<br /><pre class="prettyprint"><br />class Foo < ActiveRecord::Base<br />end<br /></pre><br />And the following migration:<br /><pre class="prettyprint"><br />class AddAnotherFieldToFoo < ActiveRecord::Migration<br />  def self.up<br />    add_column :foo, :new_column, :string<br />    Foo.reset_column_information<br />    Foo.find(:all).each do |foo|<br />      foo.new_column = some_calculation(foo.another_column)<br />      foo.save!<br />    end<br />  end<br />end<br /></pre><br />Now, we make the following changes to our model:<br /><pre class="prettyprint"><br />class Foo < ActiveRecord::Base<br />  <b>has_many :bars</b><br />  <b>before_save :do_something_with_my_bars</b><br />  <b>def do_something_with_my_bars</b><br />  <b>  ...</b><br />  <b>end</b><br />end<br /></pre><br />And its migration (just for completeness, not really relevant):<br /><pre class="prettyprint"><br />class AddBazToFoo < ActiveRecord::Migration<br />  def self.up<br />    add_column :foo, :bar_id, :integer<br />  end<br />end<br /></pre><br />So what is the problem?<br /><br />For us, who made 
the last change on Foo <em>after</em> doing the <tt>AddAnotherFieldToFoo</tt> migration, it's all fine.<br /><br /><strong>But</strong>, for the new developer who just made a checkout of the source code and happily executed <tt>rake db:migrate</tt>, the <tt>AddAnotherFieldToFoo</tt> migration <em>failed miserably</em>.<br /><br />That's because <tt>Foo#do_something_with_my_bars</tt> will get called (remember the <code>before_save</code> callback we introduced), but the association between foo and bar does not exist yet (we are executing a previous migration).<br /><br />The same happens to the developer who didn't update his local copy this week. And <b>it will break on production too</b>, when we merge this set of changes into the production branch.<br /><br />So, here is my problem: <br /><blockquote><br />Every backward-incompatible change to models will (potentially) break past migrations, because migrations are not tied to the state the models had at the time they were written.<br /></blockquote> <br />And SCMs don't help either (updating one changeset at a time would work, but when merging branches all those changesets collapse into one and you are doomed). I'm looking into what to do. 
Maybe I'm using migrations in a way they were not intended to be used...<br /><br />Does someone know how to solve this?<br /><br />Update: Here is the <a href="http://groups.google.com/group/rubyonrails-talk/browse_thread/thread/3216661c74c21e72">rubyonrails-talk thread</a>Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comtag:blogger.com,1999:blog-8797006816811118829.post-29051646035132137592008-04-03T21:24:00.002-04:002008-04-03T23:57:17.268-04:00Cinemark Should Learn Unicode<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp3.blogger.com/_hR0qJ4X5wrc/R_U8brD2soI/AAAAAAAAABU/GECxGiJS8V0/s1600-h/DSC00018.JPG"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://bp3.blogger.com/_hR0qJ4X5wrc/R_U8brD2soI/AAAAAAAAABU/GECxGiJS8V0/s400/DSC00018.JPG" border="0" alt="" id="BLOGGER_PHOTO_ID_5185116992100020866" /></a><br /><br />Crappy photo taken last weekend at a Cinemark: it shows, in the text in the middle, the title of the movie "Crónicas de Spiderwick" mangled by the incorrect interpretation of UTF-8 data as ISO-8859-1. <br /><br />Someone should point the Cinemark guys to <a href="http://www.joelonsoftware.com/articles/Unicode.html">Joel's Unicode guide</a>. It is a good read on the topic of text manipulation in the real, modern world. That is, taking text encodings into account. It is a messy but unavoidable topic. 
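<br /><br />To make the failure concrete, here is a minimal Python 3 sketch of what probably happened on that screen. It assumes the title was stored as UTF-8 and then decoded as ISO-8859-1 somewhere along the way; the variable names are mine and are not taken from any Cinemark system:

```python
# Hypothetical reconstruction of the bug: UTF-8 bytes read back as ISO-8859-1.
title = "Crónicas de Spiderwick"

utf8_bytes = title.encode("utf-8")          # the accented letter becomes two bytes, 0xC3 0xB3
mangled = utf8_bytes.decode("iso-8859-1")   # each of those bytes is shown as its own character

print(mangled)

# The damage is reversible as long as no byte was lost along the way.
repaired = mangled.encode("iso-8859-1").decode("utf-8")
print(repaired == title)
```

The single accented letter turns into two unrelated characters, which is the usual signature of this kind of encoding mix-up.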
<br /><br />And remember:<br /><blockquote><br />“There Ain't No Such Thing As Plain Text.”<br /></blockquote><p align="right"><small>(quoted from <a href="http://www.joelonsoftware.com/articles/Unicode.html">The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)</a>)</small></p>Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comtag:blogger.com,1999:blog-8797006816811118829.post-78198891149843812122008-03-23T22:32:00.002-03:002008-03-24T09:54:40.591-03:00Out-of-process memoization using memcached.This is my humble recognition of the usefulness of <a href="http://www.danga.com/memcached/">memcached</a> (“a high-performance, distributed memory object caching system”). It is not a language- or platform-dependent tool, so do not overlook it just because I tend to talk about Python or Java.<br /><br />Over the last few weeks I have been fixing a lot of small bugs in a legacy application. After solving most of them, a challenging problem remained: lots of database-intensive computations were taking too much time to complete. Where "too much time" == <strong>48 hours</strong>. Some people suggested that I rewrite that whole part of the application, but that was impractical with all the pressure of getting it done in one or two days. And it was the wrong solution, as explained at the end of this post.<br /><br />So after a bit of research and profiling, I identified two critical points. One was solved by carefully rewriting a lot of SQL queries into just one or two, while modifying the program as little as possible.<br /><br />The other problem was the program repeatedly computing the same thing over and over. 
It resembled the naive recursive fibonacci algorithm:<br /><pre class="prettyprint"><br />def fib(n):<br />    if n == 1 or n == 2: return 1<br />    return fib(n - 1) + fib(n - 2)<br /></pre><br />Of course, this is easy to solve by adding <a href="http://en.wikipedia.org/wiki/Memoization">memoization</a> to the fib routine:<br /><pre class="prettyprint"><br />fibs = {1: 1, 2: 1}<br />def fib(n):<br />    if n in fibs: <br />        return fibs[n]<br />    else: <br />        result = fib(n - 1) + fib(n - 2)<br />        fibs[n] = result<br />        return result<br /></pre><br />My problem when applying this kind of solution was memory space: I feared running out of it, considering the amount of data processed by the program. In the end, I would have to program a little cache subsystem and deal with the feared global state. Or dig into the depths of the system, changing a lot of method signatures to pass the damn state along.<br /><br />But I didn't. I just plugged in a <a href="http://code.google.com/p/spymemcached/">java memcached client library</a>, surrounded the computation routine with code very similar to the second fibonacci example, fired up a memcached instance on the development server, and it went way faster. <br /><br />Now everybody is happy, including myself for not having to reimplement the whole computation. <strong>Rewriting code from scratch is tempting</strong> when nobody understands the existing code anymore. But we tend to underestimate the amount of work behind those unintelligible pieces of code. Much of their ugliness comes from covering a lot of corner cases. Coming up with a new, shiny piece of code may not be that hard, but it <em>will</em> be buggy until it passes a great deal of testing. <strong>If we can not understand the existing code, we can hardly test the new one thoroughly</strong>, simply because we do not understand all the corner cases. Only after we really understand the ugly code can we write a better version from scratch. 
That was not my case.<br /><br />Just for the record, the computation now takes 3 hours, for a 16x speed improvement :).Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comtag:blogger.com,1999:blog-8797006816811118829.post-63129028443674207242008-03-08T16:02:00.001-03:002008-03-08T16:03:09.590-03:00How to Demotivate Me at WorkThis is a condensed list of things I hate, collected from the past five years of work for multiple software companies. In no special order:<ul><br /><li>Always assign me to projects where I am the only developer.</li><br /><li>Throw me at unfinished projects where the single developer left the company.</li><br /><li>Throw me at projects where the massive pre-existing accidental complexity (e.g., stupid technological infrastructure) never gets out of the way.</li><br /><li>Put technical decision-making in non-technical people's hands.</li><br /><li>Throw me at crappy, badly “finished” projects that no one understands anymore, just after a critical “milestone” happened (such as a very very angry customer, or a last-minute disaster). Bonus points for letting me solve them without the help of other programmers. Add even more extra points for no help from other human beings at all.</li><br /><li>Move me from project to project, without getting anything really finished.</li><br /><li>Pay more attention to <em>how</em> I work than to <em>what</em> I produce.</li></ul><br />Disclaimer: This is <strong>not</strong> a <em>targeted</em> criticism of my current employer (or a specific past employer). As far as I know, this kind of shit happens almost everywhere. But I would <em>really</em> like to work in a different world. 
Can we change it?Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comtag:blogger.com,1999:blog-8797006816811118829.post-58157269570856110992008-03-02T12:58:00.000-03:002008-03-01T15:46:00.278-03:00Reading a Binary File on RubyNo, it is not as simple as:<br /><pre class="prettyprint"><br />contents = IO.read(path_to_binary_file)<br /></pre><br />Because it does not work well on Windows platforms. You should do:<br /><pre class="prettyprint"><br />contents = File.open(path_to_binary_file, "rb") {|io| io.read }<br /></pre><br />Obviously, the key is the binary flag. It can not be passed to <tt>IO.read</tt>.<br /><br />I learned this because a co-worker had a mysterious problem sending email attachments with rails. I was not able to help her (that <tt>IO.read</tt> never seemed suspicious to me), but once she found the solution, all became clear.<br /><br />But you know, the real bug was her using Windows as a development platform, <a href="http://www.codinghorror.com/blog/archives/001065.html">right?</a>.Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comtag:blogger.com,1999:blog-8797006816811118829.post-48267386425849889032008-02-29T09:06:00.002-03:002008-02-29T00:17:22.023-03:00Generic On-The-Fly DeploymentOnce spoiled by dynamic-language web frameworks, it is hard to go back to one where a manual build cycle is needed before seeing the changes made to the source code on the running application. It <em>must</em> be instantaneous.<br /><br />As you can infer from my latest posts, I am now working with Flex. And Java Servlets, using <a href="http://www.caucho.com/">Resin</a> as the server. That environment is not even close to being as developer-friendly as Django, or Rails or many other modern web frameworks. 
Perhaps I am being a bit unfair, because Resin tries to be developer-friendly, not only reloading the context when something changes, but also providing a special directory where you can put your java sources and it even compiles them. But our project layout is not set up the way Resin likes it, so that did not work. <br /><br />Other developers use an IDE plugin to deploy the compiled classes automatically. But is an IDE really needed to simply (re)build a project after some file(s) change? I do not think so. After all, it is just a matter of monitoring the file system and firing the build process if a change is detected. So I came up with a simple, handy utility, which monitors a specified directory and runs an arbitrary command when a change is detected.<br /><br />Integrating it with our build script is trivial:<br /><pre><br /> > cd \eclipse3.3\eclipse\workspace\MyProject<br /> > python dirwatch.py -c "ant deploy-development"<br /></pre><br />Moreover, Eclipse integration is easy too: <tt> Run-> External Tools -> Open External Tools Dialog -> Program -> New</tt>: <br /><br />Location: c:\python25\python.exe <br />Working Directory: ${workspace_loc:/MyProject}<br />Arguments: -u "c:\path\to\dirwatch.py" -c "ant deploy-development"<br /><br />The whole point of the exercise? Decouple the convenience of automatic deployment from a particular IDE, where it does not belong. Just like building.<br /><br />So here is the program (it currently only works on Windows, but should be easy to port to Unix-like systems, using inotify or FAM):<br /><pre class="prettyprint">#!/usr/bin/env python<br />"""<br />dirwatch:<br /><br />Monitor a directory for changes. 
Executes a command after a change is detected.<br />"""<br /><br />import os<br />import time<br />import win32file<br />import win32event<br />import win32con<br />from optparse import OptionParser<br /><br />VERSION = "1.0"<br /><br />def dirwatch(path_to_watch, seconds_to_wait, commands):<br />    print time.asctime(), "Watching %s" % path_to_watch<br />    import sys<br />    change_handle = win32file.FindFirstChangeNotification (<br />        path_to_watch,<br />        1, # => recursive<br />        win32con.FILE_NOTIFY_CHANGE_FILE_NAME |<br />        win32con.FILE_NOTIFY_CHANGE_DIR_NAME |<br />        win32con.FILE_NOTIFY_CHANGE_SIZE |<br />        win32con.FILE_NOTIFY_CHANGE_LAST_WRITE<br />    )<br />    # Loop forever, listing any file changes.<br />    try:<br />        last_change_time = None<br />        while True:<br />            result = win32event.WaitForSingleObject(change_handle, 1000)<br />            if result == win32con.WAIT_OBJECT_0:<br />                if last_change_time:<br />                    if time.time() - last_change_time > seconds_to_wait:<br />                        msg = "More change(s) detected, " \<br />                              "waiting %d more second(s)"<br />                    else:<br />                        msg = None<br />                else:<br />                    msg = "Change detected, waiting %d second(s)"<br />                if msg:<br />                    print time.asctime(), msg % seconds_to_wait<br />                last_change_time = time.time()<br />                win32file.FindNextChangeNotification(change_handle)<br />            if last_change_time:<br />                if time.time() - last_change_time > seconds_to_wait:<br />                    for command in commands:<br />                        print time.asctime(), "Executing '%s'" % command<br />                        os.system(command)<br />                    last_change_time = None<br />    finally:<br />        win32file.FindCloseChangeNotification(change_handle)<br /><br /><br />def parse_options(args):<br />    parser = OptionParser(usage="%prog [-c command[;command...]] [directory]",<br />                          version="%prog " + VERSION)<br /><br />    parser.add_option("-w", "--wait", dest="seconds_to_wait", metavar="SECONDS",<br />                      default="1",<br />                      help="seconds to wait after last change before the "<br />                           "execution of the command(s).")<br />    parser.add_option("-c", "--command", 
dest="command",<br /> default="echo change detected",<br /> help="command to execute after a change is detected")<br /> return parser.parse_args(args)<br />def main(args):<br /> options, args = parse_options(args)<br /> try: path_to_watch = args[1] or "."<br /> except: path_to_watch = "."<br /> dirwatch(path_to_watch, int(options.seconds_to_wait),<br /> options.command.split(';'))<br /> <br />if __name__ == "__main__":<br /> import sys<br /> main(sys.argv)<br /></pre>Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comtag:blogger.com,1999:blog-8797006816811118829.post-45625199209193347472008-02-26T16:51:00.001-03:002008-02-26T12:15:20.386-03:00Flex Builder 2 on top of Eclipse 3.3The Flex Builder installer does not like Eclipse 3.3. When you try to install it over eclipse, it says the destination folder is not valid.<br /><br />I found some <a href="http://www.eclipsezone.com/articles/howto-flexbuilder2/">instructions on how to manually plug the FlexBuilder components in Eclipse 3.3</a>, provided that you installed it as a stand alone application. <b>But it contains a small, but important, error on the directory layout</b>, because the "<tt>plugins</tt>" and "<tt>features</tt>" directories must be inside "<tt>your-new-extension-point/eclipse</tt>". That is, the directory structure inside the extension point should be:<br /><pre><br />./<br /> /eclipse<br /> /eclipse/.eclipseextension<br /> /eclipse/plugins<br /> /eclipse/features<br /></pre><br />Other than that, everything went OK following the instruction of the article. The only other different thing I did was to only copy the com.adobe.* plugins and features. 
But it worked either way.Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comtag:blogger.com,1999:blog-8797006816811118829.post-44823512621490290612008-02-25T11:27:00.007-03:002008-02-27T00:38:50.324-03:00A Closure-Related Gotcha on Flex 2 (AS3)This is one of those write-it-so-you-will-never-forget blog posts. Because I spent a lot of time tracking down a bug in code that used closures to keep things simple. It is not practical to show the actual code here, so I will make a dumb "clone" of the relevant section:<br /><pre class="prettyprint"><br />function doSomething() {<br />    for (var i = 0; i < 10; i++) {<br />        var foo = makeFoo();<br />        foo.addEventListener(ResultEvent.RESULT, timeLogger(i));<br />    }<br />}<br /><br />function timeLogger(i) {<br />    return function(event) {<br />        Logger.debug("Received request number ", i);<br />        for(var i = 0; i < 100; i++) {<br />            // For some reason this loop had to exist.<br />        }<br />    }<br />}<br /></pre><br />[If you are curious, I did not include the inner anonymous function inside <tt>doSomething</tt>, because <a href="/2007/10/javascript-closures-and-parameter.html">all instances of the function would "close over" the same reference to <tt>i</tt></a>]<br /><br />As you can see, the idea was to add some timers for some asynchronous events. See the bug? Yes? I envy you! No? Well, unless you have psychic debugging abilities, it is hard to say without knowing the observed, wrong behaviour: <b>Every logged request had an "undefined" number</b>. That <tt>i</tt> was always undefined. WTF!?<br /><br />The bug is the formal parameter <tt>i</tt> of <tt>timeLogger</tt> conflicting with the loop variable <tt>i</tt> in the anonymous inner function. Change either variable name, and the problem is solved. Ta-da!<br /><br />I suppose this is a side effect of the static analysis and typing done by the ActionScript compiler. 
More precisely, it is variable hoisting: a <code>var</code> declared anywhere inside a function is scoped to the whole function, so the inner <tt>i</tt> shadows the parameter from the very first line and is still undefined when it gets logged. JavaScript treats <code>var</code> declarations the same way, so the gotcha applies there too. I still question whether such a "clever" rule is a good idea. In fact, I do not see a single use case for it.Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.comtag:blogger.com,1999:blog-8797006816811118829.post-3741639390645216962008-01-23T23:07:00.000-03:002008-01-23T23:48:21.179-03:00Review Board: Code Reviews on the WebToday I had a happy surprise. In a discussion on a local linux-related mailing list, someone pointed to <a href="http://www.review-board.org/">Review Board</a> as a good alternative for doing code reviews.<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://farm1.static.flickr.com/252/525300318_90a2648988_o.png"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px;" src="http://farm1.static.flickr.com/252/525300318_90a2648988_o.png" border="0" alt="" /></a><br /><br />I looked at the webpage and it looks very, very good. I will check it out as soon as I can, and play with it a bit before trying to “sell” it at my job, but I am pretty confident it will be well received. As a plus, it has people from a recognized company behind it: VMware. And, another plus for me: it is made in Python, using Django.<br /><br />I am very interested, because I did something related in the past, using the very same development platform. Before being abducted by the company to do management tasks, I was playing with a pet project to do code browsing, reviewing and project planning, where code review was the first focus. 
It is called “codeflow”.<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp2.blogger.com/_hR0qJ4X5wrc/R5f0QVZ0jJI/AAAAAAAAABE/DjgPGn4T3mM/s1600-h/codeflow.png"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://bp2.blogger.com/_hR0qJ4X5wrc/R5f0QVZ0jJI/AAAAAAAAABE/DjgPGn4T3mM/s400/codeflow.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5158860459636722834" /></a><br /><br />I left it practically abandoned when my position in the company changed: I had no way to eat my own dogfood anymore. Before that, back in April, I managed to get a working prototype which did fancy diffs and code highlighting and worked against our subversion repository, although it was not very polished:<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp0.blogger.com/_hR0qJ4X5wrc/R5f0r1Z0jKI/AAAAAAAAABM/TCoxFoeN1e0/s1600-h/codeflow-in-action.png"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://bp0.blogger.com/_hR0qJ4X5wrc/R5f0r1Z0jKI/AAAAAAAAABM/TCoxFoeN1e0/s400/codeflow-in-action.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5158860932083125410" /></a><br /><br />We tested the prototype and it proved to be very useful for real projects. Even recently, one or two months ago, the main engineering team (I am in research and development right now) wanted to put that prototype into production. But I was not sure it was a good idea.<br /><br />I believe in <a href="http://catb.org/~esr/writings/cathedral-bazaar/cathedral-bazaar/ar01s04.html">“release early, release often”</a>. But without enough time to actually release anything, it would be quite difficult to do it often. <br /><br />So I am now going to push Review Board, as a working product, instead of my rough prototype. 
As far as I can see, the <a href="http://code.google.com/p/codeflow/wiki/CurrentIdeas">ideas for codeflow about code reviews</a> were not bad at all. I will still miss the integrated concept which I envisioned for codeflow back in April, but I will not miss the reviewing platform anymore.<br /><br />And it is done in Python/Django. I am familiar and proficient with both. I am still looking forward to doing something with the ideas behind codeflow (not limited to code reviewing), and Review Board looks like a great place to start.Leo Soto M.http://www.blogger.com/profile/16108418309169741841noreply@blogger.com