tag:blogger.com,1999:blog-72543860644391463712009-06-01T23:53:17.133-07:00akaliasakaliasnoreply@blogger.comBlogger30125tag:blogger.com,1999:blog-7254386064439146371.post-69998487986511906042009-05-30T06:33:00.000-07:002009-05-30T07:17:16.254-07:00New Wave: PyJamas?The Wave demo is very impressive. The thing that stood out the most to me was their use of Google Web Toolkit. This is based on Java, a language typically ridiculed by the dynamic language crowd. It must have something going for it.<br /><br />One of the head Wave guys mentioned `the amazing tools` surrounding Java (IDEs, automatic refactoring helpers). A lot of this is only possible due to its static nature.<br /><br />For this numbskull, used to writing little scripts of generally only about 500 lines, Python is OK. Lately I have been starting to break up my scripts into more than one file. It's actually really painful making changes when they affect multiple files. I'm a `sketcher` so I like to change things a lot.<br /><br />So... Java has `amazing` refactoring capabilities. The only thing I don't really like about it, from what I have seen, is the intense verbosity and visual noise. You'd think all that would be a lot of weight and mud underneath your boots. <br /><br />Perhaps not. With snippets, auto-completion and refactoring helpers available in a good IDE you may even be able to develop faster in Java than Python. You are working at a lower level, but a level that your stupid servant knows well enough to be able to help you. "No you fucking idiot! It's a MetaDuperPolymorpher class! Go get me the fucking arg spec will you!... Ah forget it. I'll do it myself."<br /><br />Maybe programmers need to go beyond plain-text editors and use markup for type annotations and whatever other metadata the computers need. It *IS* 2009. We could and should have our cake and eat it too; a nice syntax for humans AND all the metadata a compiler could want. <br /><br />We need rich-*source* editors/languages. 
I wonder if in 2035 people will still be using plain text wrapped at an 80 column width?<br /><br />To bridge the gap maybe we could integrate debuggers/shells right into our editors. Why couldn't you edit a `live` program with all the introspective data that would provide? Sure, you'd have to be careful about side effects, but if you had a nice interface to the `debugger` you could launch into the program at any point. <br /><br />I have an editor plugin that allows me to push/pull lines from IPython, and pull auto-completion. So my exploratory missions form the basis of my scripts. Sure beats tedious copy/pasting in between. Sometimes I'll push the whole file through and then edit functions in my editor, pushing the changes and then testing them out in the shell. Who the hell wants to edit multi-line functions inside IPython? Really I'd like a REPL inside my editor and *smart* access to its history and type introspection.<br /><br />With the latest version of Wing-IDE, it actually pulls the source-completion data from the debugged process when you are debugging. Set a breakpoint and start editing `live` (kind of). I really like this workflow. The line between REPL shell and editor blurs. An *Integrated* Development Environment.akaliasnoreply@blogger.com1tag:blogger.com,1999:blog-7254386064439146371.post-87928283202841415662009-05-29T06:25:00.000-07:002009-05-29T06:49:54.689-07:00SproxAn interesting CRUD library based on sqlalchemy / formencode / toscawidgets. It actually generates the html. Looks quite handy if you wanted fully generated forms.<br /><br /><a href="http://sprox.org/#form-generation"> Sprox </a><br /><br />I'm not sure if it can do nested model forms or, if it can, whether that functionality is decoupled from the html generation and available for use on its own. 
It seems like the author has put a lot of <a href="http://sprox.org/class_diagram.html">thought</a> into it. I will definitely investigate it and ToscaWidgets further. I haven't really had much experience with auto-generated forms. I get the feeling that they are hard to customize but it's likely worth auto-generating when possible.<br /><br />There don't seem to be many projects focused on (semi or fully) automatically creating list/pagination filter forms. This is quite commonly required in admin sections of a site. <br /><br />I can't even remember if Django does.akaliasnoreply@blogger.com0tag:blogger.com,1999:blog-7254386064439146371.post-49615563056551389932009-05-29T01:55:00.000-07:002009-05-29T06:21:18.886-07:00sqaencode 0.1I have been messing around a bit and now pretty much have a working lib for encoding nested models into forms and back, and for automatically generating formencode Schemas from sqlalchemy models.<br /><br /><pre class="blackboard">E:\python_repos\sqaencode>nosetests --with-coverage --cover-package=sqaencode<br />................<br />Name Stmts Exec Cover Missing<br />----------------------------------------------------<br />sqaencode 6 6 100%<br />sqaencode.constants 10 10 100%<br />sqaencode.decode 47 45 95% 126, 161<br />sqaencode.encode 28 26 92% 90-91<br />sqaencode.util 12 12 100%<br />sqaencode.validators 46 42 91% 30, 64, 66, 129<br />----------------------------------------------------<br />TOTAL 149 141 94%<br />----------------------------------------------------------------------<br />Ran 16 tests in 0.962s<br /><br />OK</pre><br />I created a ModelSchema class, subclassing formencode.Schema and giving it an inline __metaclass__ inheriting from formencode.declarative.DeclarativeMeta. 
I over-rode the __repr__ to output an (almost) eval()able representation.<br /><br /><pre class='blackboard'><span class='storage'>class</span> <span class='entity'>ModelSchema</span>(<span class='superclass'>Schema</span>):<br /> <span class='storage'>class</span> <span class='entity'>__metaclass__</span>(<span class='superclass'>DeclarativeMeta</span>):<br /> <span class='storage'>def</span> <span class='support'>__repr__</span>(<span class='variable'>cls</span>):<br /> model <span class='keyword'>=</span> <span class='variable'>cls</span>.__model__.<span class='variable'>__name__</span><br /> base <span class='keyword'>=</span> [ <span class='string'>"class </span><span class='stringInterpolation'>%(model)s</span><span class='string'>Schema(model_schema(</span><span class='stringInterpolation'>%(model)s</span><span class='string'>)):"</span> <span class='keyword'>%</span> <span class='support'>dict</span><span class='metaFunctionCallPy'> (<br /> </span><span class='variable'>model</span><span class='metaFunctionCallPy'> </span><span class='keyword'>=</span><span class='metaFunctionCallPy'> model)</span> ]<br /><br /> <span class='keyword'>for</span> arg <span class='keyword'>in</span> SCHEMA_ARGS:<br /> <span class='metaFunctionCallPy'>base.append(</span><span class='string'>' </span><span class='stringInterpolation'>%-20s</span><span class='string'> = </span><span class='stringInterpolation'>%r</span><span class='string'>'</span><span class='metaFunctionCallPy'> </span><span class='keyword'>%</span><span class='metaFunctionCallPy'> (arg, </span><span class='support'>getattr</span><span class='metaFunctionCallPy'>(</span><span class='variable'>cls</span><span class='metaFunctionCallPy'>, arg)))</span><br /><br /> <span class='metaFunctionCallPy'>base.append(</span><span class='string'>''</span><span class='metaFunctionCallPy'>)</span><br /> <span class='keyword'>for</span> key, validator <span class='keyword'>in</span> <span class='support'>sorted</span><span 
class='metaFunctionCallPy'>(</span><span class='variable'>cls</span><span class='metaFunctionCallPy'>.fields.items())</span>:<br /> <span class='keyword'>if</span> <span class='metaFunctionCallPy'>key.startswith(</span><span class='string'>'_'</span><span class='metaFunctionCallPy'>)</span>: <span class='keyword'>continue</span><br /><br /> args <span class='keyword'>=</span> <span class='metaFunctionCallPy'>non_default_validator_args(validator)</span><br /> <span class='metaFunctionCallPy'>base.append(</span><span class='string'>' </span><span class='stringInterpolation'>%-20s</span><span class='string'> = </span><span class='stringInterpolation'>%s</span><span class='string'>(</span><span class='stringInterpolation'>%s</span><span class='string'>)'</span><span class='metaFunctionCallPy'> </span><span class='keyword'>%</span><span class='metaFunctionCallPy'> (<br /> key, </span><span class='support'>type</span><span class='metaFunctionCallPy'>(validator).__name__, args) )</span><br /><br /> <span class='keyword'>return</span> <span class='string'>'</span><span class='constant'>\n</span><span class='string'>'</span>.<span class='metaFunctionCallPy'>join(base)</pre><br />Using the model_schema factory to create a ModelSchema and then getting a repr:<br /><pre class='blackboard'>In [<span class='constant'>1</span>]: <span class='keyword'>from</span> sqaencode <span class='keyword'>import</span> model_schema<br /><br />In [<span class='constant'>2</span>]: <span class='metaFunctionCallPy'>model_schema(Product)</span><br />Out[<span class='constant'>2</span>]:<br /><span class='storage'>class</span> <span class='entity'>ProductSchema</span>(<span class='metaFunctionCallPy'>model_schema(Product)</span>):<br /> ignore_key_missing <span class='keyword'>=</span> <span class='constant'>True</span><br /> allow_extra_fields <span class='keyword'>=</span> <span class='constant'>True</span><br /> pre_validators <span class='keyword'>=</span> []<br /><br /> active <span 
class='keyword'>=</span> <span class='metaFunctionCallPy'>Bool()</span><br /> amount <span class='keyword'>=</span> <span class='metaFunctionCallPy'>UnicodeString()</span><br /> colour <span class='keyword'>=</span> <span class='metaFunctionCallPy'>UnicodeString()</span><br /> description <span class='keyword'>=</span> <span class='metaFunctionCallPy'>UnicodeString()</span><br /> featured <span class='keyword'>=</span> <span class='metaFunctionCallPy'>Bool()</span><br /> <span class='support'>id</span> <span class='keyword'>=</span> <span class='metaFunctionCallPy'>Int()</span><br /> image <span class='keyword'>=</span> <span class='metaFunctionCallPy'>UnicodeString()</span><br /> image_thumb <span class='keyword'>=</span> <span class='metaFunctionCallPy'>UnicodeString()</span><br /> image_zoom <span class='keyword'>=</span> <span class='metaFunctionCallPy'>UnicodeString()</span><br /> keywords <span class='keyword'>=</span> <span class='metaFunctionCallPy'>UnicodeString()</span><br /> material <span class='keyword'>=</span> <span class='metaFunctionCallPy'>UnicodeString()</span><br /> name <span class='keyword'>=</span> <span class='metaFunctionCallPy'>UnicodeString()</span><br /> ordernum <span class='keyword'>=</span> <span class='metaFunctionCallPy'>Int()</span><br /> sku <span class='keyword'>=</span> <span class='metaFunctionCallPy'>UnicodeString()</span><br /> views <span class='keyword'>=</span> <span class='metaFunctionCallPy'>Int()</span><br /></pre><br />The __metaclass__ __repr__ hack serves a dual purpose. a) for debugging and b) as templating. By printing the repr you can use that as a template for customizing a model schema. You actually inherit from a dynamically generated class ( a function call taking a model and optional arguments). I wasn't even too sure you could do that in python. 
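<br /><br />A minimal sketch of that trick, under stated assumptions: it uses only plain Python (no formencode), it is written in Python 3 syntax rather than the Python 2 inline `__metaclass__` used above, and `SchemaMeta`, the toy `model_schema` factory and its string "validators" are hypothetical stand-ins for illustration only.

```python
class SchemaMeta(type):
    """Metaclass whose __repr__ prints the class as pasteable source."""

    def __repr__(cls):
        lines = ["class %sSchema(model_schema(%s)):" % (cls.model_name, cls.model_name)]
        for key, value in sorted(cls.fields.items()):
            lines.append("    %-12s = %r" % (key, value))
        return "\n".join(lines)


def model_schema(model_name, **fields):
    """Hypothetical factory: builds and returns a brand new class on each call."""
    namespace = {"model_name": model_name, "fields": dict(fields)}
    return SchemaMeta(model_name + "Schema", (object,), namespace)


# The base class is just the value of an expression, so a function call works.
class ProductSchema(model_schema("Product", name="text", views="int")):
    extra = True


# The subclass inherits the metaclass, so repr() gives the template form.
print(repr(ProductSchema))
```

The base list of a class statement accepts any expression that evaluates to a class, which is why subclassing the result of a call is legal.<br /><br />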
It's nice I don't have to create a new meta class mechanism and can stick with existing formencode semantics.<br /><br />Note the `ignore_key_missing` flag that is by default set to True. I think when using this I will just define which fields to validate purely by virtue of what is included in the html. eg If there is no `sku` field in the form then it will not be validated.<br /><br />What if I *didn't* want to globally ignore missing keys and wanted to manually declare which to ignore? Formencode has an in-built sub-classing mechanism whereby if you declare `some_key = None` then some_key will not be validated at all.<br /><br />To declare a Product model_schema with plural Colors inline:<br /><pre class='blackboard'><span class='storage'>class</span> <span class='entity'>ProductSchema</span>(<span class='metaFunctionCallPy'>model_schema(Product, </span><span class='variable'>nested</span><span class='keyword'>=</span><span class='constant'>True</span><span class='metaFunctionCallPy'>)</span>):<br /> colors <span class='keyword'>=</span> <span class='metaFunctionCallPy'>model_schema(Color, </span><span class='variable'>plural</span><span class='keyword'>=</span><span class='constant'>True</span><span class='metaFunctionCallPy'>)</span><br /></pre><br />With Python 2.6 you can do inline customisable declarations of relations:<br /><pre class='blackboard'><span class='storage'>class</span> <span class='entity'>ProductSchema</span>(<span class='metaFunctionCallPy'>model_schema(Product, </span><span class='variable'>nested</span><span class='keyword'>=</span><span class='constant'>True</span><span class='metaFunctionCallPy'>)</span>):<br /> <span class='entity'>@sqaencode.plural</span><br /> <span class='storage'>class</span> <span class='entity'>colors</span>(<span class='metaFunctionCallPy'>model_schema(Color)</span>):<br /> some_field <span class='keyword'>=</span> <span class='metaFunctionCallPy'>NonDefault()</pre><br /><br />I'm thinking about creating a 
mechanism whereby I subclass sqlalchemy.types.* for metadata purposes to further drive automatic schema generation. A cool thing about Django's tight integration is the high level data types: Url, Email etc. You declare higher level properties on top of what are essentially stored as VARCHAR types in the database. It's not *just* a string.<br /><br />SqlAlchemy, while really great, (rightly) doesn't try and abstract beyond basic SQL. There is nothing stopping an end user, however, from doing something like this:<br /><pre class='blackboard'><span class='storage'>class</span> <span class='entity'>Url</span>(<span class='superclass'>Unicode</span>): <span class='keyword'>pass</span><br /><span class='storage'>class</span> <span class='entity'>Email</span>(<span class='superclass'>Unicode</span>): <span class='keyword'>pass</span><br /><br />higher_level_table <span class='keyword'>=</span> <span class='metaFunctionCallPy'>Table ( </span><span class='string'>'higher_level'</span><span class='metaFunctionCallPy'>, metadata,<br />    Column(</span><span class='string'>'url'</span><span class='metaFunctionCallPy'>, Url(</span><span class='constant'>32</span><span class='metaFunctionCallPy'>)),<br />    Column(</span><span class='string'>'email'</span><span class='metaFunctionCallPy'>, Email(</span><span class='constant'>32</span><span class='metaFunctionCallPy'>)),<br />    Column(</span><span class='string'>'id'</span><span class='metaFunctionCallPy'>, Integer(), </span><span class='variable'>primary_key</span><span class='keyword'>=</span><span class='constant'>True</span><span class='metaFunctionCallPy'>, </span><span class='variable'>autoincrement</span><span class='keyword'>=</span><span class='constant'>True</span><span class='metaFunctionCallPy'>, 
</span><span class='variable'>nullable</span><span class='keyword'>=</span><span class='constant'>False</span><span class='metaFunctionCallPy'>),<br />)</span><br /></pre><br />If Url and Email were imported column types from sqaencode.types then you could add them to the sqlalchemy => formencode type mapping and they would be picked up by model_schema().<br /><br />What's on the todo? <br /><ul><li> Options for automatically generating child Schemas. </li> <li> Setting the length on String/UnicodeString validators automatically to the max length of the corresponding column. </li> <li> Automatically creating unique column validators. </li> </ul>akaliasnoreply@blogger.com0tag:blogger.com,1999:blog-7254386064439146371.post-40104491101003745522009-05-27T22:20:00.000-07:002009-05-28T22:53:28.285-07:00ReadingA few days ago I scribbled an entry into my TODO.txt:<blockquote> Learn about Python packaging, namespaces etc, best practices. setup.py etc etc </blockquote> I started reading a lot and watching whatever screencasts I could. One useful book was <a href="http://www.packtpub.com/expert-python-programming/book">Expert Python Programming</a>. It is light on actual python and heavy on the soft skills in using open source python tools to refine your work flow. That is exactly what I was looking for. 
It is only about 300 pages so I skimmed through most of it in a day.<br /><br />The <a href="http://showmedo.com/videotutorials/series?name=mcfckfJ4w">Agile Development Tools in Python</a> series by <a href="http://percious.com/blog/">Christopher Perkins</a>, while pretty light, gave some nice quick overviews of some very useful tools.<br /><br />Between these two sources I got a good overview of:<br /><ul><li>setuptools</li><li>distutils</li></ul><ul><li>virtualenv</li><li>paster</li></ul><ul><li>nosetests</li><li>coverage</li></ul><ul><li>docutils</li><li>reST</li><li>sphinx</li></ul><br /><h3>virtualenv</h3><br />virtualenv will set up an isolated version of python, where applications can remain blissfully ignorant of the rapid change in the outside world. If you have one application that needs a particular version of a lib and another requiring a different one, then this is the tool for the job.<br /><br />It will set up a folder structure with (std) Lib and site-packages folders and a scripts directory. One of the scripts is of course the virtual python interpreter. It also has a tremendously useful `activate` script. This will put the Scripts (bin on *nix) folder on PATH and the lib folders on PYTHONPATH. It also modifies your console prompt to display the name of the current virtualenv.<br /><pre>(myenv) C:\myenv</pre><br />I haven't used it much as of yet but I can imagine that it could get pretty confusing once you had a tonne of them, so that is a nice and thoughtful addition.<br /><br />The net result is that if you `activate` an environment and then run `python some_python.py` from an arbitrary directory it will run it using the virtualenv. Likewise for any of the scripts in your (*nix: bin, win: Scripts) folder. There is no need for explicitly referencing them. 
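<br /><br />To see which interpreter an activated environment actually hands you, a small check like the following can help. This is only a sketch: `in_virtualenv` is a hypothetical helper name, and it relies on `sys.real_prefix` (set by older virtualenv releases) and `sys.base_prefix` (set by Python 3's standard `venv` module and newer virtualenv), the latter of which postdates this post.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to an isolated environment."""
    # Older virtualenv releases set sys.real_prefix on the environment's interpreter.
    if hasattr(sys, "real_prefix"):
        return True
    # The stdlib venv module (and newer virtualenv) leave sys.base_prefix pointing
    # at the original installation while sys.prefix points at the environment.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print("prefix:", sys.prefix)
print("inside a virtualenv:", in_virtualenv())
```

Run once before and once after `activate`; the reported prefix should move from the system installation to the environment folder.<br /><br />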
When you are done you just `deactivate` the environment with the corresponding script.<br /><br />setuptools is installed by default into each new virtualenv and easy_install is included in the scripts folder. You use easy_install to, funnily enough, easily install the packages you require.<br /><br />virtualenv seems of great utility; however, I imagine it could get pretty wasteful in terms of network usage. You don't really want to be downloading 10 packages EVERY time you start some new project. It would be handy if all of your local virtualenv environments used some sort of global local cache when using easy_install.<br /><br />If you needed, for example, cherrypy and you already had the latest version on your hard drive somewhere, it would just pull the egg from there. Otherwise it would first pull the package into the cache from pypi.<br /><br />It would be good if this all happened transparently and each virtualenv's easy_install respected some global distutils.cfg setting for cache location. Where do eggs lay about usually? In a Nest. I'll probably happily find that this has been sorted out.<br /><br />Even if you had a cache, saving yourself some bandwidth, if you just copied an egg into each environment that would still be disk wastage. This seems like a good use for symbolic linking. It would be silly to reinvent that in python. Although, I'm not too sure how well Windows supports something like that. *nix is definitely superior in that respect.<br /><h3>paver</h3><br />Another interesting util is one called paver that sort of ties together virtualenv and distutils/setuptools to allow a more `pythonic` zc.buildout. Buildout uses declarative ini files for everything and is apparently harder to hack if you want to do anything out of the ordinary. (I wouldn't know for sure if these are valid criticisms.)<br /><br />From what I have read of buildout vs paver/virtualenv I think I will invest in the latter. 
It just seems to appeal more to my personal sense of aesthetics.<br /><h3>paster</h3><br />Paster is a hodgepodge MacGyver Swiss Army knife created by Ian Bicking. It can launch WSGI applications, launch missiles, sink ships and god knows what else. This utility, `eclectic` in the words of Ian, is basically the kitchen sink.<br /><br />Of interest to me are the project template creation commands. It will create a folder with setup.py, README.txt etc and run you through a command line based wizard to set it all up, asking version number, author name and the like. These project templates can give a bit of an insight into how other people structure their projects.<br /><br />Where do the tests go? Inside the distribution proper? Housed inside the same folder as the setup.py? I'm somewhat leaning towards having the tests included in the public package, i.e. package.tests.fixtures. I think that makes it easy to import fixtures in doctested example documentation.<br /><h3>distutils / setuptools</h3><br />With regard to structuring projects, if there is one thing I can hope to take away from all this, then it is the `setup.py develop` command.<br /><br />This distutils (or is it setuptools?) command will put your under-development package on sys.path so you don't have to run `setup.py install` every time you make changes. Quite handy. Of course you could have a virtualenv where you have a package in `develop` mode and another where you have run a full `setup.py install` at a certain version.<br /><h3>nosetests</h3><br />Nosetests has a really great coverage plugin. It will run your tests and, for whichever package/module you specify, dump a list of line ranges that haven't been `covered` by your test code. This makes it super easy to tell which code paths still need testing (or culling!). I'm assuming this is only really of use for pure python code. 
If so, that is another tick next to dynamic `interpreted` languages.<br /><h3>sphinx</h3><br />I also experimented with docutils/ reST / Sphinx for documentation. Sphinx is the spiffy new reST based documentation system developed originally for Python proper. It has a built-in indexer and javascript search client and seemingly many other great features. For this reason it seems a LOT of python projects are using it now. I imagine the fact that it's quite easy on the eye out of the box doesn't hurt take-up much either.<br /><h3> A new project</h3><br />I started on creating a setup.py enabled, nose/doc tested, sphinx documented project. It *really* slows down the whole process. I suppose it's a lot to try and learn all at once.<br /><br />I'm still not really sold on test first prototyping. TDD of something you have a little bit of experience with, sure. TDD/documenting something when you are in new terrain and you KNOW that you are going to mess up and have to redux anyway, just seems wasteful.<br /><br />I probably just don't `get it` yet. `Slow is fast` has generally held true in a lot of other areas of my experience.<br /><br />What else to read up on? I need to learn more about the standard library logging package.akaliasnoreply@blogger.com2tag:blogger.com,1999:blog-7254386064439146371.post-1867956015807108282009-05-22T22:39:00.001-07:002009-05-22T22:40:30.709-07:00Souvenir<blockquote><br />prison color blue<br />it's a uniform of choice<br />count yourself lucky <br />that you don't write the software<br /></blockquote><br /><br />Some lines from the great Neil Finn's song `Souvenir`. 
Count yourself lucky indeed.akaliasnoreply@blogger.com0tag:blogger.com,1999:blog-7254386064439146371.post-8337550977881549372009-05-22T20:13:00.000-07:002009-05-27T23:39:46.001-07:00FormEncode + SqlAlchemy = SQAEncodeThe way I do admin model CRUD at the moment is to have a models.py file containing all my <a href="http://www.sqlalchemy.org/">sqlalchemy</a> tables/classes and their respective <a href="http://formencode.org/Design.html#basic-metaphor">formencode</a>.Schema/s in a forms.py file. The problem is that I end up having to repeat all the fields and to me that just stinks.<br /><br />It's not a big deal as far as typing them out goes, as I will typically declare a table / model in sqlalchemy and then use a homebrewed scaffolding function to generate derived code for the Schema and html for the actual form. When actually editing a model form I use <a href="http://genshi.edgewall.org/">genshi</a>'s <a href="http://genshi.edgewall.org/wiki/Documentation/filters.html#html-form-filler">HTMLFormFiller</a> filter to set the blank form's values.<br /><br />This approach really is in keeping with the spirit of the <a href="http://formencode.org/">formencode</a> library. The author's <a href="http://formencode.org/Design.html#presentation">philosophy</a> is that, after a while, configuring an `autogenerated` form from a model by declaring options in code is just as much work as declaring it in html.<br /><br />Configuration includes black/white listing certain fields, whether they are allowed to be empty (null/None) and also logical/presentational ordering and grouping of individual fields.<br /><br /><a href="http://docs.formalchemy.org/forms.html">formalchemy</a> on the other hand will take a model and automatically generate a form with the values already set. It cannot, though, do nested models. 
i.e. a parent model with an arbitrary number of inline children to CRUD in the same form. The author considers this `almost always bad design`. It only allows what I term `relating` the parent/root model to existing models.<br /><br />While it *may* sometimes save you from doing any `<a href="http://docs.formalchemy.org/forms.html#configuring-and-rendering-forms">configuration</a>` whatsoever in the case where the form validation schema perfectly reflects the model, in my (admittedly meagre) experience this would be fairly rare.<br /><br />The authors of <a href="http://docs.formalchemy.org/forms.html">formalchemy</a> recommend the use of CSS to customize the appearance of the auto generated forms. For more extensive customization you can override each field's renderer (&lt;input type='text'&gt; vs &lt;textarea&gt; etc) or the global form generation function.<br /><br />Having not really used <a href="http://docs.formalchemy.org/forms.html">formalchemy</a> that much, I can't say with certainty just how unwieldy it gets trying to customize a form's options / looks. I'd guess that you would end up having to do just as much `configuration`, i.e. repeating of fields / `work`.<br /><br />Looking at a lot of my formencode Schemas in relation to their underlying model, it seems they are declared in a `white list` manner. The schemas (and html) simply don't declare fields that aren't `public`. The underlying type map is reasonably consistent. A sqlalchemy Bool becomes a formencode Bool, a Unicode maps to a UnicodeString etc. It's typically only the options that are changed.<br /><br />I have an embryonic idea to auto generate <a href="http://formencode.org/">formencode</a> validators from an sqlalchemy model, mapping validators to column/relation types. 
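<br /><br />The type mapping described above might be sketched roughly as follows. Everything here is an assumption for illustration: the column classes are plain stand-ins rather than sqlalchemy's own, validators are represented only by name, and `TYPE_MAP`, `validator_name` and `derive_fields` are hypothetical names.

```python
# Stand-in column types (hypothetical; not sqlalchemy's classes).
class Unicode(object):
    def __init__(self, length=None):
        self.length = length


class Boolean(object):
    pass


class Integer(object):
    pass


# Column type -> validator name, following the pairs mentioned in the text.
TYPE_MAP = {Unicode: "UnicodeString", Boolean: "Bool", Integer: "Int"}


def validator_name(column_type):
    """Look up a validator by walking the type's MRO, so subclasses fall back."""
    for klass in type(column_type).__mro__:
        if klass in TYPE_MAP:
            return TYPE_MAP[klass]
    raise LookupError("no validator mapped for %s" % type(column_type).__name__)


def derive_fields(columns):
    """Map each column name to the validator name its type implies."""
    return dict((name, validator_name(col)) for name, col in columns.items())


print(derive_fields({"name": Unicode(32), "active": Boolean(), "views": Integer()}))
```

Walking the MRO means a specialised subclass of a column type picks up its parent's validator unless it is mapped explicitly.<br /><br />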
I imagine `configuring` this would look something like the following.<br /><br /><pre class='blackboard'><span class='storage'>class</span> <span class='entity'>ProductSchema</span>(<span class='superclass'>ModelSchema</span>): <br />    _model <span class='keyword'>=</span> Product<br /><br />    long_description <span class='keyword'>=</span> <span class='metaFunctionCallPy'>options(</span><span class='variable'>not_empty</span><span class='metaFunctionCallPy'> </span><span class='keyword'>=</span><span class='metaFunctionCallPy'> </span><span class='constant'>True</span><span class='metaFunctionCallPy'>)</span><br />    date_created <span class='keyword'>=</span> <span class='constant'>None</span><br />    name <span class='keyword'>=</span> <span class='metaFunctionCallPy'>UniqueName()</span></pre><br /><br />The ModelSchema would use some type of __metaclass__. Any `private` field declared as None would be blacklisted from the underlying derived schema. Above, `options` would just be an alias for dict. A dict would be used to configure the auto mapped validator. `not_empty = True` would override the not_empty argument mapped from the respective column's nullable property. UniqueName above would completely override the auto mapped UnicodeString. (In fact any column with unique = True could probably have an auto generated `unique field` validator attached.)<br /><br />I'm also currently working on some sqlalchemy formencode integration functions that will take basic one table models and their relations and encode them into nested dictionaries / primary key lists. Also, the other way round, taking a nested dict (as from a formencode.NestedVariables pre validated Schema) and creating/updating/deleting an object tree/graph. 
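<br /><br />For a feel of what such a nested dict looks like, here is a simplified, standalone re-implementation of the flat-key decoding step. It is an assumption-laden sketch, not formencode's code: it assumes form keys of the shape `name-index.field` (dots for dicts, `-N` for list rows), and `nest` is a hypothetical helper name.

```python
import re

_INDEXED = re.compile(r"^(.*)-(\d+)$")


class _Rows(dict):
    """Marker type: maps a row index to its value, later turned into a list."""


def _listify(node):
    # _Rows must be checked first because it is itself a dict subclass.
    if isinstance(node, _Rows):
        return [_listify(node[index]) for index in sorted(node)]
    if isinstance(node, dict):
        return dict((key, _listify(value)) for key, value in node.items())
    return node


def nest(flat):
    """Turn {'items-0.name': 'x'} into {'items': [{'name': 'x'}]}."""
    root = {}
    for key, value in flat.items():
        node = root
        parts = key.split(".")
        for position, part in enumerate(parts):
            is_last = position == len(parts) - 1
            match = _INDEXED.match(part)
            if match:
                rows = node.setdefault(match.group(1), _Rows())
                slot = int(match.group(2))
                if is_last:
                    rows[slot] = value
                else:
                    node = rows.setdefault(slot, {})
            elif is_last:
                node[part] = value
            else:
                node = node.setdefault(part, {})
    # Gaps in the row indexes are compacted when rows become a list.
    return _listify(root)


form = {"number": "INV-7", "items-0.name": "Widget", "items-1.name": "Gadget"}
print(nest(form))
```

A function like object_graph would then walk this structure against the model's relations.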
<br /><br />SqlAlchemy mappers make it quite easy to introspect classes for relations so the object_graph func signature just looks like:<br /><br /><pre class="blackboard"><span class="metaFunctionCallPy">object_graph(nd_dict, root_model)</span></pre><br /><br />This all works fine in basic unit test land AND for one model forms. The problem I'm having is ONETOMANY child objects in a form. Imagine an Invoice form with an inline InvoiceItems table with each row representing an individual item. On a new invoice there would be 3 `blank` rows. How would you know which ones to ignore? Which ones to delete?<br /><br />I want to set a `_keep` flag for each item. The _keep flag will be reflected as a checkbox in the form and if unchecked will mean that the child item should be ignored in the validation process.<br /><br />I could leave the child tables (invoice items in the concrete sense) empty and use javascript to add a [+] button to add new children. That to me seems like a `bent over` concession.<br /><br />The `object_graph` function which CRUDS model[s] from a nestable dict is currently decoupled from the validation process and knows nothing of form errors. This might make it hard. Before that I was successfully using `child_crud` hooks in my CRUD controllers for this purpose.<br /><br />I'll have to sit down and work out which cases I'll need to accommodate. I want to be able to update/delete existing models, create new models and ignore cases where the child CRUD form partials have not been edited.<br /><br />How to get the formencode.ForEach validator to ignore items without messing up the errors index? 
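<br /><br />One possible shape for the `_keep` idea, as a rough sketch only: it sidesteps formencode entirely, and `split_children`, `validate_children` and the `validate` callable are hypothetical names. The point is just that remembering each row's original position lets errors be reported against the row the user actually saw.

```python
def split_children(rows):
    """Separate rows to validate from rows to skip, keeping original indexes."""
    kept, skipped = [], []
    for index, row in enumerate(rows):
        # An unchecked checkbox is simply absent from a submitted form.
        if row.get("_keep"):
            kept.append((index, row))
        else:
            skipped.append(index)
    return kept, skipped


def validate_children(rows, validate):
    """Validate only kept rows; key any errors by the original row index."""
    kept, _skipped = split_children(rows)
    errors = {}
    for index, row in kept:
        problem = validate(row)  # hypothetical callable: None or a message
        if problem:
            errors[index] = problem
    return errors


def require_name(row):
    return None if row.get("name") else "name is required"


rows = [
    {"_keep": "on", "name": "Widget"},
    {"name": ""},                 # blank row, unchecked: ignored
    {"_keep": "on", "name": ""},  # kept but invalid
]
print(validate_children(rows, require_name))
```

Here the error lands on row 2 even though it is only the second row validated.<br /><br />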
These problems were all solved using a child_crud hook in the controllers but trying to abstract and decouple things is making it a lot harder.<br /><br />It will be a nut worth cracking anyway.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7254386064439146371-833755097788154937?l=akalias.blogspot.com'/></div>akaliasnoreply@blogger.com0tag:blogger.com,1999:blog-7254386064439146371.post-45868480103329479452009-05-11T06:55:00.001-07:002009-05-22T22:02:53.051-07:00PyParsing + SqlAlchemy = Basic Search EngineToday I learned about writing recursive descent parsers using the PyParsing library. I managed to cobble together an sqlalchemy expression builder for a basic search engine. <br /><br /><pre class='blackboard'><span class="lineNumber"> 1 </span><span class='comment'>#################################### IMPORTS ###################################<br /><span class="lineNumber"> 2 </span></span><br /><span class="lineNumber"> 3 </span><span class='comment'># PyParsing<br /><span class="lineNumber"> 4 </span></span><span class='keyword'>from</span> pyparsing <span class='keyword'>import</span> ( CaselessLiteral, Literal, Word, alphas, quotedString,<br /><span class="lineNumber"> 5 </span> removeQuotes, operatorPrecedence, ParseException,<br /><span class="lineNumber"> 6 </span> stringEnd, opAssoc ) <br /><span class="lineNumber"> 7 </span><br /><span class="lineNumber"> 8 </span><span class='comment'># SqlAlchemy<br /><span class="lineNumber"> 9 </span></span><span class='keyword'>from</span> sqlalchemy <span class='keyword'>import</span> and_, not_, or_<br /><span class="lineNumber">10 </span><br /><span class="lineNumber">11 </span><span class='comment'>################################## LIKE ESCAPE #################################<br /><span class="lineNumber">12 </span></span><br /><span class="lineNumber">13 </span>LIKE_ESCAPE <span class='keyword'>=</span> <span class='string'>r'</span><span 
class='constant'>\\</span><span class='string'>'</span><br /><span class="lineNumber">14 </span><br /><span class="lineNumber">15 </span><span class='storage'>def</span> <span class='entity'>like_escape</span>(<span class='variable'>s</span>):<br /><span class="lineNumber">16 </span> <span class='keyword'>return</span> <span class='string'>'%'</span> <span class='keyword'>+</span> ( <span class='metaFunctionCallPy'>s.replace(</span><span class='string'>'</span><span class='constant'>\\</span><span class='string'>'</span><span class='metaFunctionCallPy'>, </span><span class='string'>'</span><span class='constant'>\\\\</span><span class='string'>'</span><span class='metaFunctionCallPy'>)</span><br /><span class="lineNumber">17 </span> .<span class='metaFunctionCallPy'>replace(</span><span class='string'>'%'</span><span class='metaFunctionCallPy'>, </span><span class='string'>'</span><span class='constant'>\\</span><span class='string'>%'</span><span class='metaFunctionCallPy'>)</span><br /><span class="lineNumber">18 </span> .<span class='metaFunctionCallPy'>replace(</span><span class='string'>'_'</span><span class='metaFunctionCallPy'>, </span><span class='string'>'</span><span class='constant'>\\</span><span class='string'>_'</span><span class='metaFunctionCallPy'>)</span> ) <span class='keyword'>+</span> <span class='string'>'%'</span><br /><span class="lineNumber">19 </span><br /><span class="lineNumber">20 </span><span class='comment'>############################### REUSABLE ACTIONS ###############################<br /><span class="lineNumber">21 </span></span><br /><span class="lineNumber">22 </span><span class='storage'>class</span> <span class='entity'>UnaryOperation</span>(<span class='support'>object</span>):<br /><span class="lineNumber">23 </span> <span class='storage'>def</span> <span class='support'>__init__</span>(<span class='variable'>self</span>, <span class='variable'>t</span>):<br /><span class="lineNumber">24 </span> <span 
class='variable'>self</span>.op, <span class='variable'>self</span>.a <span class='keyword'>=</span> t[<span class='constant'>0</span>]<br /><span class="lineNumber">25 </span><br /><span class="lineNumber">26 </span> <span class='storage'>def</span> <span class='support'>__repr__</span>(<span class='variable'>self</span>):<br /><span class="lineNumber">27 </span> <span class='keyword'>return</span> <span class='string'>"</span><span class='stringInterpolation'>%s</span><span class='string'>:(</span><span class='stringInterpolation'>%s</span><span class='string'>)"</span> <span class='keyword'>%</span> (<span class='variable'>self</span>.name, <span class='support'>str</span><span class='metaFunctionCallPy'>(</span><span class='variable'>self</span><span class='metaFunctionCallPy'>.a)</span>) <br /><span class="lineNumber">28 </span><br /><span class="lineNumber">29 </span> <span class='storage'>def</span> <span class='entity'>express</span>(<span class='variable'>self</span>):<br /><span class="lineNumber">30 </span> <span class='keyword'>return</span> <span class='variable'>self</span>.operator[<span class='constant'>0</span>]<span class='metaFunctionCallPy'>(</span><span class='variable'>self</span><span class='metaFunctionCallPy'>.a.express())</span><br /><span class="lineNumber">31 </span><br /><span class="lineNumber">32 </span><span class='storage'>class</span> <span class='entity'>BinaryOperation</span>(<span class='support'>object</span>):<br /><span class="lineNumber">33 </span> <span class='storage'>def</span> <span class='support'>__init__</span>(<span class='variable'>self</span>, <span class='variable'>t</span>):<br /><span class="lineNumber">34 </span> <span class='variable'>self</span>.op <span class='keyword'>=</span> t[<span class='constant'>0</span>][<span class='constant'>1</span>]<br /><span class="lineNumber">35 </span> <span class='variable'>self</span>.operands <span class='keyword'>=</span> t[<span class='constant'>0</span>][<span 
class='constant'>0</span>::<span class='constant'>2</span>]<br /><span class="lineNumber">36 </span><br /><span class="lineNumber">37 </span> <span class='storage'>def</span> <span class='support'>__repr__</span>(<span class='variable'>self</span>):<br /><span class="lineNumber">38 </span> <span class='keyword'>return</span> <span class='string'>"</span><span class='stringInterpolation'>%s</span><span class='string'>:(</span><span class='stringInterpolation'>%s</span><span class='string'>)"</span> <span class='keyword'>%</span> ( <span class='variable'>self</span>.name, <br /><span class="lineNumber">39 </span> <span class='string'>","</span>.<span class='metaFunctionCallPy'>join(</span><span class='support'>str</span><span class='metaFunctionCallPy'>(oper) </span><span class='keyword'>for</span><span class='metaFunctionCallPy'> oper </span><span class='keyword'>in</span><span class='metaFunctionCallPy'> </span><span class='variable'>self</span><span class='metaFunctionCallPy'>.operands)</span> ) <br /><span class="lineNumber">40 </span><br /><span class="lineNumber">41 </span> <span class='storage'>def</span> <span class='entity'>express</span>(<span class='variable'>self</span>):<br /><span class="lineNumber">42 </span> <span class='keyword'>return</span> <span class='variable'>self</span>.operator[<span class='constant'>0</span>]<span class='metaFunctionCallPy'>(</span><span class='keyword'>*</span><span class='metaFunctionCallPy'>( oper.express() </span><span class='keyword'>for</span><span class='metaFunctionCallPy'> oper </span><span class='keyword'>in</span><span class='metaFunctionCallPy'> </span><span class='variable'>self</span><span class='metaFunctionCallPy'>.operands ))</span> <br /><span class="lineNumber">43 </span><br /><span class="lineNumber">44 </span><span class='storage'>class</span> <span class='entity'>SearchAnd</span>(<span class='superclass'>BinaryOperation</span>):<br /><span class="lineNumber">45 </span> name <span 
class='keyword'>=</span> <span class='string'>'AND'</span><br /><span class="lineNumber">46 </span> operator <span class='keyword'>=</span> [and_]<br /><span class="lineNumber">47 </span><br /><span class="lineNumber">48 </span><span class='storage'>class</span> <span class='entity'>SearchOr</span>(<span class='superclass'>BinaryOperation</span>):<br /><span class="lineNumber">49 </span> name <span class='keyword'>=</span> <span class='string'>'OR'</span><br /><span class="lineNumber">50 </span> operator <span class='keyword'>=</span> [or_]<br /><span class="lineNumber">51 </span><br /><span class="lineNumber">52 </span><span class='storage'>class</span> <span class='entity'>SearchNot</span>(<span class='superclass'>UnaryOperation</span>):<br /><span class="lineNumber">53 </span> name <span class='keyword'>=</span> <span class='string'>'NOT'</span><br /><span class="lineNumber">54 </span> operator <span class='keyword'>=</span> [not_]<br /><span class="lineNumber">55 </span><br /><span class="lineNumber">56 </span><span class='comment'>############################### REUSABLE GRAMMARS ##############################<br /><span class="lineNumber">57 </span></span><br /><span class="lineNumber">58 </span>AND <span class='keyword'>=</span> <span class='metaFunctionCallPy'>CaselessLiteral(</span><span class='string'>"and"</span><span class='metaFunctionCallPy'>)</span> <span class='keyword'>|</span> <span class='metaFunctionCallPy'>Literal(</span><span class='string'>'+'</span><span class='metaFunctionCallPy'>)</span><br /><span class="lineNumber">59 </span>OR <span class='keyword'>=</span> <span class='metaFunctionCallPy'>CaselessLiteral(</span><span class='string'>"or"</span><span class='metaFunctionCallPy'>)</span> <span class='keyword'>|</span> <span class='metaFunctionCallPy'>Literal(</span><span class='string'>'|'</span><span class='metaFunctionCallPy'>)</span><br /><span class="lineNumber">60 </span>NOT <span class='keyword'>=</span> <span 
class='metaFunctionCallPy'>CaselessLiteral(</span><span class='string'>"not"</span><span class='metaFunctionCallPy'>)</span> <span class='keyword'>|</span> <span class='metaFunctionCallPy'>Literal(</span><span class='string'>'!'</span><span class='metaFunctionCallPy'>)</span><br /><span class="lineNumber">61 </span><br /><span class="lineNumber">62 </span>searchTermMaster <span class='keyword'>=</span> (<br /><span class="lineNumber">63 </span> <span class='metaFunctionCallPy'>Word(alphas)</span> <span class='keyword'>|</span> <span class='metaFunctionCallPy'>quotedString.copy()</span>.<span class='metaFunctionCallPy'>setParseAction( removeQuotes )</span> )<br /><span class="lineNumber">64 </span><br /><span class="lineNumber">65 </span><span class='comment'>########################## THREAD SAFE PARSER FACTORY ##########################<br /><span class="lineNumber">66 </span></span><br /><span class="lineNumber">67 </span><span class='storage'>def</span> <span class='entity'>like_parser</span>(<span class='variable'>model</span>, <span class='variable'>fields</span><span class='keyword'>=</span>[]):<br /><span class="lineNumber">68 </span> <span class='storage'>class</span> <span class='entity'>SearchTerm</span>(<span class='support'>object</span>):<br /><span class="lineNumber">69 </span> <span class='storage'>def</span> <span class='support'>__init__</span>(<span class='variable'>self</span>, <span class='variable'>tokens</span>):<br /><span class="lineNumber">70 </span> <span class='variable'>self</span>.term <span class='keyword'>=</span> tokens[<span class='constant'>0</span>]<br /><span class="lineNumber">71 </span><br /><span class="lineNumber">72 </span> <span class='storage'>def</span> <span class='entity'>express</span>(<span class='variable'>self</span>):<br /><span class="lineNumber">73 </span> <span class='keyword'>return</span> <span class='metaFunctionCallPy'>or_ (<br /><span class="lineNumber">74 </span> </span><span class='keyword'>*</span><span 
class='metaFunctionCallPy'>( </span><span class='support'>getattr</span><span class='metaFunctionCallPy'>(model, field).like( like_escape(</span><span class='variable'>self</span><span class='metaFunctionCallPy'>.term),<br /><span class="lineNumber">75 </span> </span><span class='variable'>escape</span><span class='metaFunctionCallPy'> </span><span class='keyword'>=</span><span class='metaFunctionCallPy'> LIKE_ESCAPE) <br /><span class="lineNumber">76 </span> </span><span class='keyword'>for</span><span class='metaFunctionCallPy'> field </span><span class='keyword'>in</span><span class='metaFunctionCallPy'> fields )<br /><span class="lineNumber">77 </span> )</span><br /><span class="lineNumber">78 </span><br /><span class="lineNumber">79 </span> <span class='storage'>def</span> <span class='support'>__repr__</span>(<span class='variable'>self</span>):<br /><span class="lineNumber">80 </span> <span class='keyword'>return</span> <span class='variable'>self</span>.term<br /><span class="lineNumber">81 </span> <br /><span class="lineNumber">82 </span> searchTerm <span class='keyword'>=</span> <span class='metaFunctionCallPy'>searchTermMaster.copy()</span>.<span class='metaFunctionCallPy'>setParseAction(SearchTerm)</span><br /><span class="lineNumber">83 </span><br /><span class="lineNumber">84 </span> searchExpr <span class='keyword'>=</span> <span class='metaFunctionCallPy'>operatorPrecedence( searchTerm,<br /><span class="lineNumber">85 </span> [ (NOT, </span><span class='constant'>1</span><span class='metaFunctionCallPy'>, opAssoc.RIGHT, SearchNot),<br /><span class="lineNumber">86 </span> (AND, </span><span class='constant'>2</span><span class='metaFunctionCallPy'>, opAssoc.LEFT, SearchAnd),<br /><span class="lineNumber">87 </span> (OR, </span><span class='constant'>2</span><span class='metaFunctionCallPy'>, opAssoc.LEFT, SearchOr) ] )</span><br /><span class="lineNumber">88 </span><br /><span class="lineNumber">89 </span> <span class='keyword'>return</span> 
searchExpr <span class='keyword'>+</span> stringEnd<br /><span class="lineNumber">90 </span><br /><span class="lineNumber">91 </span><span class='comment'>########################### SEARCH FIELDS LIKE HELPER ##########################<br /><span class="lineNumber">92 </span></span><br /><span class="lineNumber">93 </span><span class='storage'>def</span> <span class='entity'>search_fields_like</span>(<span class='variable'>s</span>, <span class='variable'>model</span>, <span class='variable'>fields</span>):<br /><span class="lineNumber">94 </span> <span class='keyword'>if</span> <span class='support'>isinstance</span><span class='metaFunctionCallPy'>(fields, </span><span class='support'>basestring</span><span class='metaFunctionCallPy'>)</span>: fields <span class='keyword'>=</span> [fields]<br /><span class="lineNumber">95 </span> parser <span class='keyword'>=</span> <span class='metaFunctionCallPy'>like_parser(model, fields)</span><br /><span class="lineNumber">96 </span> <span class='keyword'>return</span> <span class='metaFunctionCallPy'>parser.parseString(s)</span>[<span class='constant'>0</span>].<span class='metaFunctionCallPy'>express()</span><br /><span class="lineNumber">97 </span><br /><span class="lineNumber">98 </span><span class='comment'>################################################################################<br /><span class="lineNumber">99 </span></pre><div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7254386064439146371-4586848010332947945?l=akalias.blogspot.com'/></div>akaliasnoreply@blogger.com0tag:blogger.com,1999:blog-7254386064439146371.post-71794762633498471392009-04-03T03:59:00.000-07:002009-05-03T21:49:56.170-07:00ListFilter Prototype<div id="media"> <object id="csSWF" classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="768" height="554" codebase="http://active.macromedia.com/flash7/cabs/ swflash.cab#version=9,0,28,0"> <param name="src" 
value="http://blogdata.akalias.net/FilterPrototype/FilterPrototype.flv"/> <param name="bgcolor" value="#1a1a1a"/> <param name="quality" value="best"/> <param name="allowScriptAccess" value="always"/> <param name="allowFullScreen" value="true"/> <param name="scale" value="showall"/> <param name="flashVars" value="autostart=false"/> <embed name="csSWF" src="http://blogdata.akalias.net/FilterPrototype/FilterPrototype_controller.swf" width="768" height="554" bgcolor="#1a1a1a" quality="best" allowScriptAccess="always" allowFullScreen="true" scale="showall" flashVars="autostart=false" pluginspage="http://www.macromedia.com/shockwave/download/index.cgi?P1_Prod_Version=ShockwaveFlash"></embed> </object> </div><br /><br />Multiple space-terminated regular expressions for list filtering.<br /><br />The examples presented are somewhat hare-brained: a) because I'm hare-brained and b) because I'm really tired at the moment. Note the dramatic pauses :) Can.....you.....see....what....I'm.........do....ing? You can imagine the usefulness though.<br /><br />The filter is a prototype implementation of a filtering syntax for a QuickOpen Panel specifically designed for `quick` (try not to laugh) *multiple item* selection. The idea is to allow you to just keep typing to refine your search. NOT this OR that. NOT that. `Open All In New Window` etc. <br /><br />It is implemented using the editor's actual text buffer API.<br /><br />My editor's current QuickOpen panel (while it does support multiple selection using ctrl-enter) works a lot like the Firefox URL history search: space-terminated tokens that each list item must match, or else it is filtered from the list. This works great for selecting one item, but what if you want to open up multiple items at a time? <br /><br />Plain regex searches are too unwieldy and lack the speed of entry desired.<br /><br />The filtering is a regex extension of the current `all space terminated word chunks in any order`. 
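<br /><br />A minimal sketch of that kind of filter in plain Python, not the editor plugin itself. The assumptions: every space-separated token is its own regex, a leading `!` negates a token, `|` is just ordinary regex alternation inside a token, matching is case-insensitive, and a half-typed regex that fails to compile is treated as literal text. `filter_items` and `compile_token` are made-up names.

```python
import re


def compile_token(token):
    """Compile a token as a regex, falling back to a literal match."""
    try:
        return re.compile(token, re.IGNORECASE)
    except re.error:
        # A half-typed pattern such as "foo(" should not blow up the filter.
        return re.compile(re.escape(token), re.IGNORECASE)


def filter_items(query, items):
    """Keep the items that satisfy every space-separated token in query."""
    tests = []
    for token in query.split():
        negate = token.startswith("!") and len(token) > 1
        pattern = compile_token(token[1:] if negate else token)
        tests.append((pattern, negate))

    return [
        item for item in items
        if all(bool(pattern.search(item)) != negate for pattern, negate in tests)
    ]


files = ["models/user.py", "models/order.py", "tests/test_user.py", "README.txt"]
print(filter_items("py$ !test user|order", files))
# ['models/user.py', 'models/order.py']
```

Plain alphanumeric tokens behave exactly like the old substring search, so nothing gets slower to type unless you actually reach for a regex character.<br /><br />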
The regex version maintains most of the benefits and generally works as before for non-regex characters. If you just need to type in some alphanumerics then it should be just as quick. It's essentially multiple regexes instead of multiple substring matches.<br /><br />OPERATORS:<br /><br />' ' AND operator<br />! NOT operator<br />| OR operator<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7254386064439146371-7179476263349847139?l=akalias.blogspot.com'/></div>akaliasnoreply@blogger.com0tag:blogger.com,1999:blog-7254386064439146371.post-77724253990748741012009-04-02T18:32:00.000-07:002009-04-03T04:22:04.559-07:00jQuery = $;<img src="http://blogdata.akalias.net/hash_events.jpg"/><br /><br />I have been making a foray into JavaScript recently for work and, having had good experiences with `jQuery Lightbox`, decided to use jQuery as the basis of a job I was doing.<br /><br />The site is essentially a carpet gallery with collections of colors (1:M). The designer wanted it to be `AJAX` (a term that seems to have been hijacked to mean any page updated without a slow browser refresh).<br /><br />In the middle of the page is a large image. Above it are next/prev collection links and tiny `swatches` (thumbnails) containing links to each color for the current selection. <br /><br />To either side of the image are next/prev color-in-collection links, and on mouseover a window appears showing a magnified area that follows the cursor, which changes to a crosshair. 
The cursor will change via CSS styling to `cursor:wait` whilst waiting for the zoom image to load.<br /><br />Upon changing color (via swatch or next/prev collection/color) an animated gif will show while the chosen color's image is loading.<br /><br />At first I was keeping a counter of the current color's position in the collection's array of colors (clicking on one of the little thumbnails would take its title attribute, slug it and use that for the color, updating the current position index with $.inArray(color, colors)).<br /><br />The filenames to load were, and are, a function of the current collection and color.<br /><br />The problem was that keeping the state in an internal counter didn't really work for `open in new tab` or for sending links. "Hey check out this carpet... no the red one... Did you type in that url properly? Just paste it in."<br /><br />I did some googling for `ajax urls` and stumbled upon a technique that sounded useful: polling the location hash for changes and then setting the page state as a function of the hash. They said the ideal `polling` rate was 100ms. It sounded pretty hacky but at least the urls worked.<br /><br />I searched for and found a <a href="http://plugins.jquery.com/files/jquery.hashhistory.js.txt"> plugin </a> for jQuery that allows you to set event handlers for when the window hash changes. It uses polling but is responsive (42 ms) and works on IE 6.<br /><br />I therefore just set a callback to update all the links and the image whenever the hash changes. I split the hash on '--' to find the current collection and color.<br /><br />The great thing about it is that the event is fired on page load, so it goes through the `change color` routine: it updates all the links, shows the loading animated gif etc.<br /><br />You can send links to people, open in a new tab from any of the links etc.<br /><br />jQuery and its plugins made everything really straightforward. The only head scratching was getting the magnifying glass to work. 
None of the plugins worked out of the box for images that changed src. They worked mainly for a `static` gallery.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7254386064439146371-7772425399074874101?l=akalias.blogspot.com'/></div>akaliasnoreply@blogger.com0tag:blogger.com,1999:blog-7254386064439146371.post-19698507193687680922009-04-02T17:55:00.000-07:002009-05-03T21:52:49.390-07:00CTags 2: TDDThis is part two of CTags. See <a href="http://akalias.blogspot.com/2009/03/captain-bisect-and-his-rag-ctag-crew.html"> part 1 </a>.<br /><br />Test Driven Development<br /><br />Adherents religiously write tests first for *every* function they write.<br /><br />Personally, I think a tonne of unit tests while prototyping is a waste of time for `simple` stuff with only one programmer working. It tends to feel like walking in mud. I prefer higher-level tests that, while they may not pinpoint the exact cause of failure, won't double refactoring efforts. Especially when prototyping. Spend ages testing that `The Wrong Way` works? No thanks... Prototype, then rewrite with tests.<br /><br />Sometimes, though, `TDD` really is indispensable while exploring, especially when working on stuff that is pushing the limits of your understanding. If I start encountering bugs I usually see it as a sign I need to start writing tests. 
Rather than using transient print statements to debug I'll write some unit tests.<br /><br />The TagFile class below (now commented) is an example of when I found testing while prototyping invaluable.<br /><br /><pre class='blackboard'><span class="lineNumber"> 1 </span><span class='comment'>#################################### IMPORTS ###################################<br /><span class="lineNumber"> 2 </span></span><br /><span class="lineNumber"> 3 </span><span class='keyword'>from</span> __future__ <span class='keyword'>import</span> with_statement<br /><span class="lineNumber"> 4 </span><br /><span class="lineNumber"> 5 </span><span class='keyword'>import</span> os<br /><span class="lineNumber"> 6 </span><span class='keyword'>import</span> bisect<br /><span class="lineNumber"> 7 </span><span class='keyword'>import</span> mmap<br /><span class="lineNumber"> 8 </span><span class='keyword'>import</span> unittest<br /><span class="lineNumber"> 9 </span><br /><span class="lineNumber"> 10 </span><span class='comment'>################################### CONSTANTS ##################################<br /><span class="lineNumber"> 11 </span></span><br /><span class="lineNumber"> 12 </span><br /><span class="lineNumber"> 13 </span><span class='string'>"""<br /><span class="lineNumber"> 14 </span>The tags in a `tags` file are listed one per line formatted as so:<br /><span class="lineNumber"> 15 </span><br /><span class="lineNumber"> 16 </span> tag_name<TAB>file_name<TAB>ex_cmd;"<TAB>extension_fields<br /><span class="lineNumber"> 17 </span><br /><span class="lineNumber"> 18 </span>"""</span><br /><span class="lineNumber"> 19 </span><br /><span class="lineNumber"> 20 </span><span class='comment'># symbolic constants for column indexes<br /><span class="lineNumber"> 21 </span></span>SYMBOL <span class='keyword'>=</span> <span class='constant'>0</span><br /><span class="lineNumber"> 22 </span>FILENAME <span class='keyword'>=</span> <span class='constant'>1</span><br /><span 
class="lineNumber"> 23 </span><br /><span class="lineNumber"> 24 </span><span class='comment'>################################################################################<br /><span class="lineNumber"> 25 </span></span><br /><span class="lineNumber"> 26 </span><span class='storage'>class</span> <span class='entity'>TagFile</span>(<span class='support'>object</span>):<br /><span class="lineNumber"> 27 </span> <span class='storage'>def</span> <span class='support'>__init__</span>(<span class='variable'>self</span>, <span class='variable'>p</span>, <span class='variable'>column</span>):<br /><span class="lineNumber"> 28 </span> <span class='string'>""" <br /><span class="lineNumber"> 29 </span> <br /><span class="lineNumber"> 30 </span> Instantiate a new TagFile<br /><span class="lineNumber"> 31 </span><br /><span class="lineNumber"> 32 </span> @p path to `tags` file<br /><span class="lineNumber"> 33 </span> @column which column to read<br /><span class="lineNumber"> 34 </span><br /><span class="lineNumber"> 35 </span> """</span><br /><span class="lineNumber"> 36 </span><br /><span class="lineNumber"> 37 </span> <span class='variable'>self</span>.p <span class='keyword'>=</span> p<br /><span class="lineNumber"> 38 </span> <span class='variable'>self</span>.column <span class='keyword'>=</span> column<br /><span class="lineNumber"> 39 </span><br /><span class="lineNumber"> 40 </span> <span class='storage'>def</span> <span class='support'>__getitem__</span>(<span class='variable'>self</span>, <span class='variable'>index</span>):<br /><span class="lineNumber"> 41 </span> <span class='string'>"self.fh is the mmap opened by get"</span><br /><span class="lineNumber"> 42 </span> <span class='comment'># Seek to a certain byte index<br /><span class="lineNumber"> 43 </span></span> <span class='variable'>self</span><span class='metaFunctionCallPy'>.fh.seek(index)</span><br /><span class="lineNumber"> 44 </span><br /><span class="lineNumber"> 45 </span> <span 
class='comment'># The position is likely to be halfway through a line so read up to<br /><span class="lineNumber"> 46 </span></span> <span class='comment'># the first new line and throw away the `junk`<br /><span class="lineNumber"> 47 </span></span> <span class='variable'>self</span><span class='metaFunctionCallPy'>.fh.readline()</span><br /><span class="lineNumber"> 48 </span><br /><span class="lineNumber"> 49 </span> <span class='comment'># Note that it's actually returning the column from the line *after* the<br /><span class="lineNumber"> 50 </span></span> <span class='comment'># line region containing the index.<br /><span class="lineNumber"> 51 </span></span> <br /><span class="lineNumber"> 52 </span> <span class='keyword'>return</span> <span class='variable'>self</span><span class='metaFunctionCallPy'>.fh.readline()</span>.<span class='metaFunctionCallPy'>split(</span><span class='string'>'</span><span class='constant'>\t</span><span class='string'>'</span><span class='metaFunctionCallPy'>)</span>[<span class='variable'>self</span>.column]<br /><span class="lineNumber"> 53 </span><br /><span class="lineNumber"> 54 </span> <span class='storage'>def</span> <span class='support'>__len__</span>(<span class='variable'>self</span>):<br /><span class="lineNumber"> 55 </span> <span class='comment'># bisect.bisect_left search must know how large the file is<br /><span class="lineNumber"> 56 </span></span> <span class='keyword'>return</span> <span class='metaFunctionCallPy'>os.stat(</span><span class='variable'>self</span><span class='metaFunctionCallPy'>.p)</span>.st_size<br /><span class="lineNumber"> 57 </span><br /><span class="lineNumber"> 58 </span> <span class='storage'>def</span> <span class='entity'>get</span>(<span class='variable'>self</span>, *<span class='variable'>tags</span>):<br /><span class="lineNumber"> 59 </span> <span class='string'>"""<br /><span class="lineNumber"> 60 </span> <br /><span class="lineNumber"> 61 </span> Get all lines for one or 
more tags<br /><span class="lineNumber"> 62 </span> <br /><span class="lineNumber"> 63 </span> """</span><br /><span class="lineNumber"> 64 </span><br /><span class="lineNumber"> 65 </span> <span class='keyword'>with</span> <span class='support'>open</span><span class='metaFunctionCallPy'>(</span><span class='variable'>self</span><span class='metaFunctionCallPy'>.p, </span><span class='string'>'r+'</span><span class='metaFunctionCallPy'>)</span> <span class='keyword'>as</span> fh:<br /><span class="lineNumber"> 66 </span> <span class='comment'># mmap is not needed but delivers performance increase<br /><span class="lineNumber"> 67 </span></span> <span class='variable'>self</span>.fh <span class='keyword'>=</span> <span class='metaFunctionCallPy'>mmap.mmap(fh.fileno(), </span><span class='constant'>0</span><span class='metaFunctionCallPy'>)</span><br /><span class="lineNumber"> 68 </span><br /><span class="lineNumber"> 69 </span> <span class='keyword'>for</span> tag <span class='keyword'>in</span> tags:<br /><span class="lineNumber"> 70 </span> <span class='comment'># As __getitem__ returns the colum from the line region *after*<br /><span class="lineNumber"> 71 </span></span> <span class='comment'># that containing the index pt then bisect( alias of<br /><span class="lineNumber"> 72 </span></span> <span class='comment'># bisect_right ) will give the wrong index.<br /><span class="lineNumber"> 73 </span></span><br /><span class="lineNumber"> 74 </span> b4 <span class='keyword'>=</span> <span class='metaFunctionCallPy'>bisect.bisect_left(</span><span class='variable'>self</span><span class='metaFunctionCallPy'>, tag)</span><br /><span class="lineNumber"> 75 </span><br /><span class="lineNumber"> 76 </span> <span class='comment'># Move the file to the position found<br /><span class="lineNumber"> 77 </span></span> <span class='metaFunctionCallPy'>fh.seek(b4)</span><br /><span class="lineNumber"> 78 </span><br /><span class="lineNumber"> 79 </span> <span 
class='comment'># Iterate over file. There may be more than one tag line to get<br /><span class="lineNumber"> 80 </span></span> <span class='comment'># per symbol/filename<br /><span class="lineNumber"> 81 </span></span> <span class='keyword'>for</span> l <span class='keyword'>in</span> fh:<br /><span class="lineNumber"> 82 </span> <span class='comment'># Compare search vs line at column<br /><span class="lineNumber"> 83 </span></span> comp <span class='keyword'>=</span> <span class='support'>cmp</span><span class='metaFunctionCallPy'>(l.split(</span><span class='string'>'</span><span class='constant'>\t</span><span class='string'>'</span><span class='metaFunctionCallPy'>)[</span><span class='variable'>self</span><span class='metaFunctionCallPy'>.column], tag)</span><br /><span class="lineNumber"> 84 </span><br /><span class="lineNumber"> 85 </span> <span class='comment'># This handles the case of being `left of left` due to <br /><span class="lineNumber"> 86 </span></span> <span class='comment'># __getitem__ index being left of symbol it returns<br /><span class="lineNumber"> 87 </span></span> <span class='comment'># ie wait until catch up<br /><span class="lineNumber"> 88 </span></span> <span class='keyword'>if</span> comp <span class='keyword'>==</span> <span class='keyword'>-</span><span class='constant'>1</span>: <span class='keyword'>continue</span><br /><span class="lineNumber"> 89 </span> <span class='comment'># If line is greater then have moved on to next symbol<br /><span class="lineNumber"> 90 </span></span> <span class='keyword'>elif</span> comp: <span class='keyword'>break</span><br /><span class="lineNumber"> 91 </span><br /><span class="lineNumber"> 92 </span> <span class='comment'># Found tag!<br /><span class="lineNumber"> 93 </span></span> <span class='keyword'>yield</span> l<br /><span class="lineNumber"> 94 </span><br /><span class="lineNumber"> 95 </span> <span class='comment'># close mmap<br /><span class="lineNumber"> 96 </span></span> 
<span class='variable'>self</span><span class='metaFunctionCallPy'>.fh.close()</span><br /><span class="lineNumber"> 97 </span><br /><span class="lineNumber"> 98 </span><span class='comment'>##################################### TESTS ####################################<br /><span class="lineNumber"> 99 </span></span><br /><span class="lineNumber">100 </span><span class='storage'>class</span> <span class='entity'>CTagsTest</span>(<span class='superclass'>unittest.TestCase</span>):<br /><span class="lineNumber">101 </span> <span class='storage'>def</span> <span class='entity'>test_tags_files</span>(<span class='variable'>self</span>):<br /><span class="lineNumber">102 </span> <span class='string'>"""<br /><span class="lineNumber">103 </span><br /><span class="lineNumber">104 </span> This test basically iterates over each line in the tags file creating<br /><span class="lineNumber">105 </span> a list of lines for each unique symbol it finds. It then compares this<br /><span class="lineNumber">106 </span> list to that returned by the TagFile binary search. 
<br /><span class="lineNumber">107 </span><br /><span class="lineNumber">108 </span> """</span><br /><span class="lineNumber">109 </span><br /><span class="lineNumber">110 </span> <span class='comment'># Successfully passed test on 10MB+ tags file<br /><span class="lineNumber">111 </span></span> <span class='comment'># tags = r"C:\Python25\Lib\tags"<br /><span class="lineNumber">112 </span></span> <br /><span class="lineNumber">113 </span> tags <span class='keyword'>=</span> <span class='string'>'tags'</span><br /><span class="lineNumber">114 </span> tag_file <span class='keyword'>=</span> <span class='metaFunctionCallPy'>TagFile(tags, SYMBOL)</span><br /><span class="lineNumber">115 </span><br /><span class="lineNumber">116 </span> <span class='keyword'>with</span> <span class='support'>open</span><span class='metaFunctionCallPy'>(tags, </span><span class='string'>'r'</span><span class='metaFunctionCallPy'>)</span> <span class='keyword'>as</span> fh:<br /><span class="lineNumber">117 </span> latest <span class='keyword'>=</span> <span class='string'>''</span><br /><span class="lineNumber">118 </span> lines <span class='keyword'>=</span> []<br /><span class="lineNumber">119 </span><br /><span class="lineNumber">120 </span> <span class='keyword'>for</span> l <span class='keyword'>in</span> fh:<br /><span class="lineNumber">121 </span> symbol <span class='keyword'>=</span> <span class='metaFunctionCallPy'>l.split(</span><span class='string'>'</span><span class='constant'>\t</span><span class='string'>'</span><span class='metaFunctionCallPy'>)</span>[SYMBOL]<br /><span class="lineNumber">122 </span><br /><span class="lineNumber">123 </span> <span class='keyword'>if</span> symbol !<span class='keyword'>=</span> latest:<br /><span class="lineNumber">124 </span><br /><span class="lineNumber">125 </span> <span class='keyword'>if</span> latest:<br /><span class="lineNumber">126 </span> tags <span class='keyword'>=</span> <span class='support'>list</span><span 
class='metaFunctionCallPy'>(tag_file.get(latest))</span><br /><span class="lineNumber">127 </span> <span class='variable'>self</span><span class='metaFunctionCallPy'>.assertEqual(lines, tags)</span><br /><span class="lineNumber">128 </span><br /><span class="lineNumber">129 </span> lines <span class='keyword'>=</span> []<br /><span class="lineNumber">130 </span><br /><span class="lineNumber">131 </span> latest <span class='keyword'>=</span> symbol<br /><span class="lineNumber">132 </span><br /><span class="lineNumber">133 </span> lines <span class='keyword'>+=</span> [l]<br /><span class="lineNumber">134 </span><br /><span class="lineNumber">135 </span><span class='keyword'>if</span> <span class='support'>__name__</span> <span class='keyword'>==</span> <span class='string'>'__main__'</span>:<br /><span class="lineNumber">136 </span> <span class='metaFunctionCallPy'>unittest.main()</span><br /><span class="lineNumber">137 </span><br /><span class="lineNumber">138 </span><span class='comment'>################################################################################</pre><div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7254386064439146371-1969850719368768092?l=akalias.blogspot.com'/></div>akaliasnoreply@blogger.com0tag:blogger.com,1999:blog-7254386064439146371.post-89440391989878363952009-03-29T20:21:00.000-07:002009-04-02T18:25:53.733-07:00CTags 1: Captain Bisect And His Rag (c)Tag Crew<h2> CTags<br /></h2><blockquote> Ctags generates an index (or tag) file of language objects found in source files<br />that allows these items to be quickly and easily located by a text editor or<br />other utility. 
A tag signifies a language object for which an index entry is<br />available (or, alternatively, the index entry created for that object).<br /></blockquote><p>Python has a philosophy of `Batteries Included` (and `Designer Straight Jacket`)<br />and its standard library has many useful modules to allow you to zoom out and<br />fly. Say what? What pixie dust have you been snorting!?<br /></p><p>The higher level you are the higher level you can go. If you're stuck in the<br />trenches of detail it's hard to see emergent patterns to exploit which can help<br />you simplify code. Refactoring and simplifying is generally an iterative process<br />of simplifying, getting a higher level view and simplifying again.<br /></p><p>Taking a bottom up approach in this helps as you know what building blocks you<br />need to create along the way. Don't want to be *too* simplistic. How can you do<br />things `top down` if you aren't `on top`?<br />Maybe you have flown over many times already and are just implementing an old<br />innovation from a time honoured map. That's exactly the spirit of prototyping.<br />Sending in the scouts to survey the terrain and report back.<br /></p><p>However, those troops get mighty mutinous really quick if you don't take the<br />time to at least make a spot before you send em out like a swarm of jellyfish<br />(especially when they are being fired at)<br /></p><p>Strategy? Fly over, send in the scouts and maintain communication.</p><p>What the hell has all that got to do with CTags? Not much I admit. But we does<br />love to babble.</p><p>Python std libraries are your `A Team` of special operatives that can do the<br />work you tell em without constant supervision and no need to worry about the<br />messy details. Go nuke PHP!
"Yes Sir!"<br /></p><p>One of these crackshot ninjas is Captain `bisect` with his special move<br />`bisect_left`.<br /></p><p>He's a great leader because he's absolutely fastidious in keeping his company of<br />troops in perfect sorted order at all times.<br /></p><p>Some of his duties involves training new recruits and he'll do a trick where<br />upon meeting them all for the first time he will get them to silently line up in<br />alphabetical order while he has his back turned. (we'll say alphabetical by<br />name, but well, these guys *are* macho army men so..)<br /></p><pre class="blackboard"><span class="keyword">>>></span> scum <span class="keyword">=</span> <span class="string">"omewEDjyFapAdxslfhbgBcnCtkqzuvri"</span><br /><br /><span class="keyword">>>></span> <span class="support">len</span><span class="metaFunctionCallPy">(scum)</span> <span class="keyword">==</span> <span class="constant">32</span><br /><span class="constant">True</span><br /><br /><span class="keyword">>>></span> men <span class="keyword">=</span> <span class="string">''</span>.<span class="metaFunctionCallPy">join(</span><span class="support">sorted</span><span class="metaFunctionCallPy">(scum))</span><br /><span class="keyword">>>></span> men<br /><span class="string">'ABCDEFabcdefghijklmnopqrstuvwxyz'</span></pre>( Isn't it funny how the ones that try and make them selves seem big are in fact the smallest? )<br /><br /><div style="padding: 1em; background-color: black; color: white;">" I bet I can find the position in the line of any one of you maggots after at most 5 guesses "<br /><p></p><p>A, Yeah right, he thinks. "Find q sir" </p><p>bisect, "man 16 are you a lesser man than q?" k, "Yes"<br />bisect, "man 24 are you a lesser man than q?" s, "No"<br />bisect, "man 20 are you a lesser man than q?" o, "Yes"<br />bisect, "man 22 are you a lesser man than q?" q, "No" (smiles)<br />bisect, "man 21 are you a lesser man than q?" 
p, "Yes"<br /></p><p>bisect, "man 22 you are q!"<br /><br />troops, "Bravo Sir!"</p></div><p>( bisect is a bit of a kookoo and uses 0 based indexing. Lucky the scum never seem to get confused )<br /></p><p>How does he do it!? </p><p>He starts in the middle (32 / 2 = 16) and compares what he was searching for with what he finds there. What he finds, k, is less than q so he knows he can rule out k and all the men before him, as they too would be less than q.<br /></p><p>This only works for him because all the men are in order.<br /></p><p>He then subdivides again. The man at position 16, k, is less than q so his area will start at<br />17 and extend up to the same end point as before, 32 (one past the greatest man, who stands at 31). He then looks for his next midpoint with an eye to ruling out another half of the remaining men.</p><p>( He repeats this simple process until his start point is no longer less than the<br />end point )<br /></p><pre class="blackboard">((17 + 32) // 2 = 24)<br />men[24] (s) is not less than q so his end point becomes 24<br /><br />((17 + 24) // 2 = 20)<br />men[20] (o) is less than q so his start point becomes 21<br /><br />((21 + 24) // 2 = 22)<br />men[22] (q) is not less than q :) so his end point becomes 22<br /><br />((21 + 22) // 2 = 21)<br />men[21] (p) is less than q so his start point becomes 22<br /></pre>Start is 22 and end is 22 so he has a match!<p>With 64 men (double 32) he would only take 1 more guess, as each guess rules<br />out half and halving is the inverse of doubling.<br /><br />What about 128 (or some arbitrary number?)
How many times can you halve 128<br />before you get one (the right `one`).<br /><br />Or in other words, two to the power of what makes 128?<br /><br />log2(32) = 5<br />log2(64) = 6<br />log2(128) = 7<br /></p><pre class="blackboard"><span class="keyword">import</span> math<br /><span class="metaFunctionCallPy">math.log(</span><span class="constant">128</span><span class="metaFunctionCallPy">, </span><span class="constant">2</span><span class="metaFunctionCallPy">)</span><br /></pre>Captain bisect's talent scales exceedingly well and in fact he's ready to put<br />forth his talent to whatever use you come up with.<br /><pre class="blackboard"><span class="keyword">import</span> bisect<br /><span class="metaFunctionCallPy">bisect.bisect_left(some_sequence, search)</span><br /></pre><p></p><p>WOAH What a verbose explanation! Too much detail! Couldn't you just have said<br />use `bisect.bisect_left` to find the index of the leftmost occurrence of an item in a sorted sequence?<br /><br />Yeah, that's kind of the point. Python is chock full of handy high level<br />abstractions you can trust to do the job without worrying about the details.<br />You are already zooming.<br /></p><p>"Ok you admit you are babbling but what the heck has this got to do with<br />CTags?" </p><p>To paraphrase, Ctags generates an index (in sorted order) of symbols to<br />be quickly and easily located. Sounds like a job for Captain bisect. He is<br />actually made of this stern stuff called `C` that makes him faster than anything<br />you could genetically engineer in your laboratories. </p><p>Python also has this useful thing going called duck typing. bisect will work<br />on any class that exposes `__getitem__` and `__len__` methods (it calls len() to find<br />the upper bound unless you pass `hi` yourself).<br /><br />But what if you wanted Captain bisect to search a 50 MB ctags file? You can't<br />sub class a file object... can you? In any case each index would just return<br />a character wouldn't it?</p><p>A sneak preview of how to use bisect and mmap to binary search CTags `tags` files.
Explanation to follow.<br /></p><pre class="blackboard"><span class="lineNumber"> 1 </span><span class="comment">#################################### IMPORTS ###################################<br /><span class="lineNumber"> 2 </span></span><br /><span class="lineNumber"> 3 </span><span class="keyword">from</span> __future__ <span class="keyword">import</span> with_statement<br /><span class="lineNumber"> 4 </span><br /><span class="lineNumber"> 5 </span><span class="keyword">import</span> os<br /><span class="lineNumber"> 6 </span><span class="keyword">import</span> bisect<br /><span class="lineNumber"> 7 </span><span class="keyword">import</span> mmap<br /><span class="lineNumber"> 8 </span><span class="keyword">import</span> unittest<br /><span class="lineNumber"> 9 </span><br /><span class="lineNumber">10 </span><span class="comment">################################### CONSTANTS ##################################<br /><span class="lineNumber">11 </span></span><br /><span class="lineNumber">12 </span><span class="comment"># CSV Column in tag file<br /><span class="lineNumber">13 </span></span>SYMBOL <span class="keyword">=</span> <span class="constant">0</span><br /><span class="lineNumber">14 </span>FILENAME <span class="keyword">=</span> <span class="constant">1</span><br /><span class="lineNumber">15 </span><br /><span class="lineNumber">16 </span><span class="comment">################################################################################<br /><span class="lineNumber">17 </span></span><br /><span class="lineNumber">18 </span><span class="storage">class</span> <span class="entity">TagFile</span>(<span class="support">object</span>):<br /><span class="lineNumber">19 </span> <span class="storage">def</span> <span class="support">__init__</span>(<span class="variable">self</span>, <span class="variable">p</span>, <span class="variable">column</span>):<br /><span class="lineNumber">20 </span> <span class="variable">self</span>.p <span 
class="keyword">=</span> p<br /><span class="lineNumber">21 </span> <span class="variable">self</span>.column <span class="keyword">=</span> column<br /><span class="lineNumber">22 </span><br /><span class="lineNumber">23 </span> <span class="storage">def</span> <span class="support">__getitem__</span>(<span class="variable">self</span>, <span class="variable">index</span>):<br /><span class="lineNumber">24 </span> <span class="variable">self</span><span class="metaFunctionCallPy">.fh.seek(index)</span><br /><span class="lineNumber">25 </span> <span class="variable">self</span><span class="metaFunctionCallPy">.fh.readline()</span><br /><span class="lineNumber">26 </span> <span class="keyword">return</span> <span class="variable">self</span><span class="metaFunctionCallPy">.fh.readline()</span>.<span class="metaFunctionCallPy">split(</span><span class="string">'</span><span class="constant">\t</span><span class="string">'</span><span class="metaFunctionCallPy">)</span>[<span class="variable">self</span>.column]<br /><span class="lineNumber">27 </span><br /><span class="lineNumber">28 </span> <span class="storage">def</span> <span class="support">__len__</span>(<span class="variable">self</span>):<br /><span class="lineNumber">29 </span> <span class="keyword">return</span> <span class="metaFunctionCallPy">os.stat(</span><span class="variable">self</span><span class="metaFunctionCallPy">.p)</span>.st_size<br /><span class="lineNumber">30 </span><br /><span class="lineNumber">31 </span> <span class="storage">def</span> <span class="entity">get</span>(<span class="variable">self</span>, *<span class="variable">tags</span>):<br /><span class="lineNumber">32 </span> <span class="keyword">with</span> <span class="support">open</span><span class="metaFunctionCallPy">(</span><span class="variable">self</span><span class="metaFunctionCallPy">.p, </span><span class="string">'r+'</span><span class="metaFunctionCallPy">)</span> <span class="keyword">as</span> fh:<br /><span 
class="lineNumber">33 </span> <span class="variable">self</span>.fh <span class="keyword">=</span> <span class="metaFunctionCallPy">mmap.mmap(fh.fileno(), </span><span class="constant">0</span><span class="metaFunctionCallPy">)</span><br /><span class="lineNumber">34 </span><br /><span class="lineNumber">35 </span> <span class="keyword">for</span> tag <span class="keyword">in</span> tags:<br /><span class="lineNumber">36 </span> b4 <span class="keyword">=</span> <span class="metaFunctionCallPy">bisect.bisect_left(</span><span class="variable">self</span><span class="metaFunctionCallPy">, tag)</span><br /><span class="lineNumber">37 </span> <span class="metaFunctionCallPy">fh.seek(b4)</span><br /><span class="lineNumber">38 </span><br /><span class="lineNumber">39 </span> <span class="keyword">for</span> l <span class="keyword">in</span> fh:<br /><span class="lineNumber">40 </span> comp <span class="keyword">=</span> <span class="support">cmp</span><span class="metaFunctionCallPy">(l.split(</span><span class="string">'</span><span class="constant">\t</span><span class="string">'</span><span class="metaFunctionCallPy">)[</span><span class="variable">self</span><span class="metaFunctionCallPy">.column], tag)</span><br /><span class="lineNumber">41 </span><br /><span class="lineNumber">42 </span> <span class="keyword">if</span> comp <span class="keyword">==</span> <span class="keyword">-</span><span class="constant">1</span>: <span class="keyword">continue</span><br /><span class="lineNumber">43 </span> <span class="keyword">elif</span> comp: <span class="keyword">break</span><br /><span class="lineNumber">44 </span><br /><span class="lineNumber">45 </span> <span class="keyword">yield</span> l<br /><span class="lineNumber">46 </span><br /><span class="lineNumber">47 </span> <span class="variable">self</span><span class="metaFunctionCallPy">.fh.close()</span><br /><span class="lineNumber">48 </span><br /><span class="lineNumber">49 </span><span 
class="comment">##################################### TESTS ####################################<br /><span class="lineNumber">50 </span></span><br /><span class="lineNumber">51 </span><span class="storage">class</span> <span class="entity">CTagsTest</span>(<span class="superclass">unittest.TestCase</span>):<br /><span class="lineNumber">52 </span> <span class="storage">def</span> <span class="entity">test_tags_files</span>(<span class="variable">self</span>):<br /><span class="lineNumber">53 </span> tags <span class="keyword">=</span> <span class="string">r"tags"</span><br /><span class="lineNumber">54 </span> tag_file <span class="keyword">=</span> <span class="metaFunctionCallPy">TagFile(tags, SYMBOL)</span><br /><span class="lineNumber">55 </span><br /><span class="lineNumber">56 </span> <span class="keyword">with</span> <span class="support">open</span><span class="metaFunctionCallPy">(tags, </span><span class="string">'r'</span><span class="metaFunctionCallPy">)</span> <span class="keyword">as</span> fh:<br /><span class="lineNumber">57 </span> latest <span class="keyword">=</span> <span class="string">''</span><br /><span class="lineNumber">58 </span> lines <span class="keyword">=</span> []<br /><span class="lineNumber">59 </span><br /><span class="lineNumber">60 </span> <span class="keyword">for</span> l <span class="keyword">in</span> fh:<br /><span class="lineNumber">61 </span> symbol <span class="keyword">=</span> <span class="metaFunctionCallPy">l.split(</span><span class="string">'</span><span class="constant">\t</span><span class="string">'</span><span class="metaFunctionCallPy">)</span>[SYMBOL]<br /><span class="lineNumber">62 </span><br /><span class="lineNumber">63 </span> <span class="keyword">if</span> symbol !<span class="keyword">=</span> latest:<br /><span class="lineNumber">64 </span><br /><span class="lineNumber">65 </span> <span class="keyword">if</span> latest:<br /><span class="lineNumber">66 </span> tags <span class="keyword">=</span> 
<span class="support">list</span><span class="metaFunctionCallPy">(tag_file.get(latest))</span><br /><span class="lineNumber">67 </span> <span class="variable">self</span><span class="metaFunctionCallPy">.assertEqual(lines, tags)</span><br /><span class="lineNumber">68 </span><br /><span class="lineNumber">69 </span> lines <span class="keyword">=</span> []<br /><span class="lineNumber">70 </span><br /><span class="lineNumber">71 </span> latest <span class="keyword">=</span> symbol<br /><span class="lineNumber">72 </span><br /><span class="lineNumber">73 </span> lines <span class="keyword">+=</span> [l]<br /><span class="lineNumber">74 </span><br /><span class="lineNumber">75 </span><span class="keyword">if</span> <span class="support">__name__</span> <span class="keyword">==</span> <span class="string">'__main__'</span>:<br /><span class="lineNumber">76 </span> <span class="metaFunctionCallPy">unittest.main()</span><br /><span class="lineNumber">77 </span><br /><span class="lineNumber">78 </span><span class="comment">################################################################################</span></pre><div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7254386064439146371-8944039198987836395?l=akalias.blogspot.com'/></div>akaliasnoreply@blogger.com0tag:blogger.com,1999:blog-7254386064439146371.post-21584212905958667342008-08-28T18:41:00.001-07:002008-08-28T18:47:00.381-07:00GSOC OverViewMy GSOC project was all about testing for PyGame;<br /><ul><br /> <li> I wrote lots of tests; Almost every module in PyGame now has at least one test </li><br /> <li> Test modules can now be isolated in subprocesses; one segfault no longer brings down the whole test suite </li><br /> <li> Can now test for speed regressions; important for real time software such as games </li><br /> <li> PyGame Automated Build Page extended<br /> <ul> <li> Shows / Collects more info </li> <li> Runs tests in subprocesses </li></ul><br /> 
</li><br /> <li> Test Stubbing Utility: A Testing "Todo List" </li><br /> <li> Optional Interactive Tests / Test Tagging </li><br /></ul><br /><br /><p>For writing the tests I wrote a small utility that inspects the PyGame package, finds all the untested callables (functions, properties) and creates test stubs, including documentation for each so you don't have to leave the editor. The stubber knows which functions have already been tested by using a naming scheme for all of the tests. The scheme is essentially "test_$callable__$comment", namespaced by having TestCase[s] per Class and a test module per module.<br /></p><p>In this way I could create stubs for each module, essentially a TODO list, and cycle through all the modules looking for tests that were easy to write. The functions in PyGame are many and greatly varied, each requiring somewhat specialised knowledge to test. I wasn't able to write tests for all of them but hopefully the test stubbing utility will help enable some testing sprints. I intend to develop a testing website where people can submit bugs/tests in the form of a unittest.<br /></p><p>PyGame has a somewhat unique set of requirements compared to most python libraries in that most of the framework is actually written in C. C code, when it goes awry, can do some very strange things. We had a test runner running all of the tests in one single process so if one failed hard it would bring down the whole suite. This can be a bit of a pain so I developed a test runner that isolates each module in a subprocess.<br /></p><p>Some of the tests in PyGame have requirements that make them unsuitable for running as part of the main test suite. For example some require a CDRom, a JoyStick, take way too long or need interaction with a human. With the test runner script I extended unittest with the ability to exclude certain tests by tags.
The tags can be applied at module, class or individual test level and are inheritable / overridable.<br /></p><p>Another extension to the test runner was the ability to randomize the run order of tests; the seed is printed out along with the test results. If there are failures you can seed the randomizer with the failure-inducing seed. We also wanted to be able to record the timings of each individual test so we could make comparisons between revisions / platforms. I again extended the test runner with that ability.<br /></p><p>I worked with Brian Fisher to extend the PyGame automated build page to record the test results in a ZODB and utilize the new test runner to run tests in subprocesses. We will be able to use this information for detecting speed regressions amongst other things.</p><div class="blogger-post-footer"></div>akaliasnoreply@blogger.com0tag:blogger.com,1999:blog-7254386064439146371.post-25302540588857501972008-08-23T08:25:00.000-07:002008-08-23T08:29:27.669-07:00Johnny, Kick A Hole Right In The SkyJohnny, Kick a hole right in the sky! Won't somebody testify? Poke a lion in its eye!<br /><br />I bought pygame-testify.net today, and set up a python/cgi based form that takes a zip and enumerates the results + adds the (safe evaled) test results dict to a ZODB. <br /><br />I found a multi-part python snippet for POST[ing] of test results. <br /><br />The test/build page is starting to come together.
<br /><br />I am using htpasswd for security.<div class="blogger-post-footer"></div>akaliasnoreply@blogger.com0tag:blogger.com,1999:blog-7254386064439146371.post-90249736100842324942008-08-02T16:22:00.000-07:002008-08-02T16:55:14.039-07:00todo_xxxxxxxI recently altered the "fail incomplete tests" mechanism we use in the pygame test runner. Before, we were doing assertions on test_utils.test_not_implemented(). This would check a module level variable test_utils.fail_incomplete_tests, which we would set as desired depending on whether we wanted to fail incomplete tests.<br /><br />This was a fairly non-invasive technique but as I was already hijacking the test loading mechanism for filtering tests by tags, I realized I could alter the TestLoader class to pick up tests starting with the prefix "todo_" as well as "test_". I would call TestCase.fail directly which would only run if picking up todo_ tests.<br /><br />This of course meant altering all the stubs. I pondered briefly doing a mass search and replace, completely automating it but I don't really trust that for tests.<br /><br />For the test stubs I have been including the documentation so it's really easy to walk through a test file writing tests without having to leave the editor. I was just using inspect.getdoc to get the __doc__ string.<br /><br />It seems the documentation included in the .doc files is different to that contained in the __doc__ for each function. The __doc__ seems to be the function signature and a very brief, usually one sentence description. The .doc files contain much more detailed descriptions that can be very useful when writing tests. <br /><br />I quickly added a docs_as_dict() function to makeref.py, then added it to the stub generator.
The stub generator will add both the __doc__ and the .doc file documentation to each stub.<br /><br />I went through semi-manually updating all the unfilled stubs for each test file with the more complete docs and the new todo_xxxxx test naming. It took about an hour but I feel more confident than if I had just grep'd it.<br /><br />Everything is pretty much now in place for the test site I wanted to create.<br /><br />Test Timing<br />Test Tagging<br />Isolated Tests<div class="blogger-post-footer"></div>akaliasnoreply@blogger.com0tag:blogger.com,1999:blog-7254386064439146371.post-72629913789465564612008-07-18T21:14:00.000-07:002008-07-18T21:37:48.991-07:00import test.unittest as unittestI split the test runner further, now into three files, with all the monkey business in unittest_patch. <br /><br />The patching is done by a patch() function taking an optparse options object as the sole argument, which drives the decisions behind which parts of unittest are patched.<br /><br />With the features we wanted I had to override some methods in a quite drastic way. I even needed to override TestCase.run, a many many line method. The only way I could do this was to basically copy/paste, alter and monkey-patch in. This meant sometimes calling private members.<br /><br />Unfortunately, the author of unittest had decided somewhere between python 2.4 and 2.5 that he would rename all the private members from the double underscore preceding __name_mangling convention to a single underscore _caution. <br /><br />As my mentor said (or something like it), "using an underscore is a warning, that said member is an implementation detail not an interface".<br /><br />What to do? We now include a 2.5 version of unittest in the test directory.
Apparently pygame has come full circle; it was included way back in the day before PyUnit was part of the standard library.<br /><br />Our individual test files, typically $module_test.py, all import an unpatched unittest and run unittest.main() to make the module "conveniently executable". Only when running the complete suite is unittest enhanced with extra functionality.<br /><br />While I was in there tinkering with the internals, recording timings of individual tests, I moved the std(err|out) redirection from per module to per test. I then patched the TextTestRunner to dump stderr/stdout on error.<br /><br /><pre class='blackboard'><span class='storage'>def</span> <span class='entity'>printErrorList</span>(<span class='variable'>self</span>, <span class='variable'>flavour</span>, <span class='variable'>errors</span>):<br /> <span class='keyword'>for</span> test, err <span class='keyword'>in</span> ((e[<span class='constant'>0</span>], e[<span class='constant'>1</span>]) <span class='keyword'>for</span> e <span class='keyword'>in</span> errors):<br /> <span class='variable'>self</span>.stream.writeln(<span class='variable'>self</span>.separator1)<br /> <span class='variable'>self</span>.stream.writeln(<span class='string'>"</span><span class='stringInterpolation'>%s</span><span class='string'>: </span><span class='stringInterpolation'>%s</span><span class='string'>"</span> <span class='keyword'>%</span> (flavour, test))<br /> <span class='variable'>self</span>.stream.writeln(<span class='variable'>self</span>.separator2)<br /> <span class='variable'>self</span>.stream.writeln(<span class='string'>"</span><span class='stringInterpolation'>%s</span><span class='string'>"</span> <span class='keyword'>%</span> err)<br /><br /> <span class='comment'># DUMP REDIRECTED STDERR / STDOUT ON ERROR / FAILURE<br /></span> <span class='keyword'>if</span> <span class='variable'>self</span>.show_redirected_on_errors:<br /> stderr, stdout <span class='keyword'>=</span> <span 
class='support'>map</span>(<span class='variable'>self</span>.tests[test].get, (<span class='string'>'stderr'</span>,<span class='string'>'stdout'</span>))<br /> <span class='keyword'>if</span> stderr: <span class='variable'>self</span>.stream.writeln(<span class='string'>"STDERR:</span><span class='constant'>\n</span><span class='stringInterpolation'>%s</span><span class='string'>"</span> <span class='keyword'>%</span> stderr)<br /> <span class='keyword'>if</span> stdout: <span class='variable'>self</span>.stream.writeln(<span class='string'>"STDOUT:</span><span class='constant'>\n</span><span class='stringInterpolation'>%s</span><span class='string'>"</span> <span class='keyword'>%</span> stdout)<br /></pre><br /><br />It would be relatively easy to add in support for show locals() etc.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7254386064439146371-7262991378946556461?l=akalias.blogspot.com'/></div>akaliasnoreply@blogger.com0tag:blogger.com,1999:blog-7254386064439146371.post-69295385161573357562008-07-15T18:12:00.000-07:002008-07-15T18:49:21.895-07:00RedesignI decided to (had to) redesign the test runner, this time cutting more directly to the root of matters, overriding select methods of unittest classes.<br /><br />Before, in subprocess mode, I was calling the individual test modules, which would in turn run unittest.main() with all the attendant pains of cmd line options conflicting and output parsing. (we have to add profiling, exclusion by tags etc). One major design change I made was to unify the single / subprocess modes to use one test runner, (test_runner.py). <br /><br />In it, along with a lot of utility functions, is defined a run_test() function. It takes a list of modules and an options object as arguments. It compiles a dictionary of the test results and on completion either returns the dict or in subprocess mode pretty prints it to stdout. 
(This is then eval'd for an all_results.update(result))<br /><br /><pre class='blackboard'>RESULTS_TEMPLATE = {<br /> 'output' : '', # unittest.TextTestRunner output<br /> 'stderr' : '', # stderr output <br /> 'stdout' : '', # stdout output<br /> 'num_tests' : 0, # taken directly from the unittest results object<br /> 'failures' : [], # ditto<br /> 'errors' : [], # ditto<br />}<br /></pre><br /><br />In single process mode run_tests.py just imports the run_test() function from test_runner.py and passes it the optparse options object and the list of modules to search for tests. <br /><br />Both run_tests.py and test_runner.py share the same optparse cmd line parser options. In subprocess mode, run_tests.py calls test_runner.py with essentially the same sys.argv it was initiated with. When run as __main__, test_runner.py calls run_test() on a list of [args[0]]. Now all the extra functionality and cmd line parsing is in one place.<br /><br />There were quite a few extra little changes that have made it not perfect but a lot better. Adding exclusion by tagging functionality took 10 minutes, most of the time being spent on picking a format. <br /><br /><pre class='blackboard'>|Tags:display|</pre><br /><br />Adding profiling decorators or whatever other functionality is desired will also be a lot easier now.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7254386064439146371-6929538516157335756?l=akalias.blogspot.com'/></div>akaliasnoreply@blogger.com0tag:blogger.com,1999:blog-7254386064439146371.post-25683905332862328902008-07-10T18:09:00.000-07:002008-07-15T18:51:40.340-07:00Comedy Of Errors** Build Page / Testing **<br />==========================<br /><br />As reported earlier, in reaction to the crashing tests rendering the build page ineffective, I have been working on creating a script to isolate test modules in subprocesses. 
The approach I took was to compile the results of each isolated test into the same form as the old test runner. A quick hack, or so I hoped.<br /><br />I realised that subprocess out of the box has no cross platform non-blocking calls, so you can't time out on hung tests. I had to find a recipe for this which unfortunately required win32 extensions. Not really a big deal but still time spent and dependencies.<br /><br />So what we have is a test runner parsing the results of a unittest text report, meant for human consumption, which is then in turn parsed by the automated build page's regexes. This seems pretty ridiculous, especially as the form is not exactly machine friendly. I could have (should have?) hacked into the build page code and modularised the test parsing code there, sharing between the test runner and build page.<br /><br />But then if you are going to do that why not just replace the TextTestRunner class with something completely customised for the job? Replace unittest bit by bit in an ad hoc as-needed fashion? Slowly building a framework? I didn't want to. I'm not really supposed to be and that was the psychology in play.<br /><br />Another (foolish?) design decision I made, based on a shallow visual aesthetic of fewer LOC, was to parse unittest results in a way that only worked when there was no "test noise". What do I mean by test noise? print statements left in source code. C extensions that don't respect sys.stderr, sys.stdout redirection/suppression.<br /><br />See below exhibit A, a specimen from a sunny day of testing.<br /><br />...............<br />---------------------------------------------------------------------<br />Ran 15 tests in 1.234s<br /><br />As the tests run, unittest prints either a dot, an E or an F (mapping to pass, error or fail) to a stream, by default stderr, though it can be any file-like object of choice. I used a simple regex ^[.EF]*$ to find any "dots" in the return output. 
If there were any, I would take a slice from the length of the dots. From there I would take the first of a split at the "Ran xxx tests" boundary, defined as '%s\nRan' % (70 * '-'). In between the DOTS and the RAN_TEST_DIV (thus named) would lie the failures.<br /><br />To piece it all together as if it was the output of one run I would "join the dots" and join the failures. Then at the end count the total length of DOTS (., E, F combined), E)rrors and F)ailures. Voila. Worked a charm.<br /><br />What the hell was I thinking? The whole point of the exercise was to create a reliable test runner. I suppose I thought I was. I wrote a few tests for some spectacularly unimaginative cases. I compared output of single process mode and subprocess mode running some fake test suites, zero assertions, all passing, some failures, some errors. The subprocess mode was character for character perfect in its mime artistry. In fact it was for this easy, pull apart, bind together, compare automated testing that I did it in the first place. <br /><br />All was simple and peaceful, until I finally got a linux test box working again. (My laptop fan died.) I used ssh to log in and run the tests from my friend's windows machine. Of course one of the tests that required initiating the display failed.<br /><br />single process mode: 504 tests, FAIL (failures=1)<br />subprocess mode: 495 tests OK<br /><br />What the hell was going on? With horror I realized what I had done. Something was wreaking havoc with the fragile little regex. On failure a huge amount of debugging output was put out by one of the SDL functions, interrupting the DOTS. I thought about rewriting it using some more substantive regular expressions. I tossed up between doing that and redirecting sys.(stdout|stderr) and passing a StringIO to unittest for test results. I figured by doing that I would be able to keep the comparison tests I had in place, and for that matter the same degree of mimicry. 
I opted for redirecting std(err|out). I imagined other uses for this at the same time, none all that compelling upon reflection and only useful if implemented in another manner. (only show stdout/stderr on failure of test, can leave print statement debugging in there, I did global redirection)<br /><br />Of course to do that I needed to create a command line option for each individual test module to call from the "master" script in subprocess mode. Because unittest.main() is running with its own getopt parsing, you can't just add an option and check sys.argv or use optparse. You have to do either and then clear those options from sys.argv which would otherwise cause unittest to error. So more fun hamfisting around with unittest. I realised that I would need to do that at some stage for profiling cmd line options so there was another push in that direction. All the time wondering whether I should just completely override the parseArgs method.<br /><br />I replaced the test_utils.get_fail_incomplete_option();unittest.main() in each module with a test_utils.get_command_line_options(). unittest.main() always calls sys.exit() on completion of tests so I had to subclass it, overriding one of its methods. I did this because after catching the unittest result stream to a StringIO, I would restore stderr and write the results to it.<br /><br />I added in some test cases, print_stdout and print_stderr, comparing the results (I of course had to put a redirect mode onto single process mode for purposes of testing). Everything was OK again, until I ran it again on the linux box through ssh.<br /><br />495 tests OK. (should have been 504 with one failure)<br /><br />Damn it! So it seems that some stderr, stdout is not redirected. I imagine it's mostly C extensions (or system calls) and the like that would do this but then that is pygame all over. 
Briefly I pondered printing results back out on stdout, and just PrayingTM that any such noise would always be stderr.<br /><br />So what did I do? What any fool already invested would do. I decided to mark up the results with lines like:<br /><br /><pre class='blackboard'><!-- UNITTEST_RESULTS_START_HERE --!></pre><br /><br /> I created 3 sections using 2 dividers. The first is all the noise output, anything not respecting redirection. The second is the unittest results and the last is the multiplexed results, what you would see if running the script in a shell. I overrode the write method on a StringIO collecting unittest results and made it also write to (a previously redirected) stdout. Using subprocess.Popen(...., stdout=subprocess.PIPE, stderr = subprocess.STDOUT) everything is muxed together. I then wrote a function that regex splits the 3, keeping the results for compiling DOTS. It's a long way from Kansas though isn't it Toto.<br /><br />What a PITA? That's not even the half of it. I ended up having to rewrite all the command lines I was passing to subprocess.Popen from string templates to lists so it would work cross platform. Also, the way subprocess multiplexes stderr and stdout when you use the same file object for both is inconsistent cross platform. What you would see is not necessarily what you get. On windows it would suffice to just "print compiled_test_results", but on linux there was a need to print >> sys.stderr.<br /><br />All in all, a lot of tipsy toeing around unittest. I really made a complete tangled webby mess of the whole job. A black comedy of errors. I'm not sure whether to remove the stderr/stdout redirection and replace the regexes with something less fragile. It's already been too much of a hole, sucking in time. I would have to update the run_tests__tests also.<br /><br />What would I do differently looking back? What would I do if I had no constraints? 
Unfortunately, probably two very different questions.<br /><br />** What I would do differently? **<br />==================================<br /><br />This much I do know: the build page and the test runner script require intersecting functionality. They both parse the results of a unittest TextTestRunner output to gather statistics on test results. I could have modularised this parsing functionality, sharing between the two of them. This really raises the question though, why parse something designed for human consumption at all? Why not pass a customised test runner class into unittest?<br /><br />Still there is the problem of communication across process boundaries, solved by using an asynchronous extension class of subprocess.Popen. Would you log the result of each process's output to a file using something like xml? Or maybe pickle the results and then join them back together? You could even have a client / server architecture, using sockets to transfer pickled test results as native python objects back to the server to piece together.<br /><br />As well as the requirement for isolation of tests, we want to add profiling functionality and tagging to split tests into different groups.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7254386064439146371-2568390533286232890?l=akalias.blogspot.com'/></div>akaliasnoreply@blogger.com0tag:blogger.com,1999:blog-7254386064439146371.post-61272526752809886052008-07-08T19:58:00.000-07:002008-07-08T20:14:23.818-07:00killer reduxI was laboring under the bastard conception that when using subprocess.Popen(), shell=True is required for a subprocess executable to have access to the environment variables. Where the hell did I get that idea? Stupid unquestioned assumption that almost gave birth to a lasting bug. <br /><br />For the test runner I was using system calls to taskkill or pskill for process control under windows. 
The idea was to try executing each and if one was on the %PATH% the return code would not indicate an error. If this was the case then the search was over and a Popen wrapper of (taskkill|pskill) would suffice as an os.kill().<br /><br />This worked fine and dandy except that on windows98, there would be no error code when a task killer wasn't on the path. It would define a useless os.kill.<br /><br />Lenard, the windows maintainer of PyGame, questioned why use a hacky wrapper of pskill or one of its ilk when there was already a reliance on pywin32. Why not use win32api.TerminateProcess?<br /><br />That works fine but does not kill process trees, something I thought was a requirement due to using shell = True as a Popen constructor argument. Using shell = 1 calls cmd.exe etc which in turn calls the subprocess of choice.<br /><br />Once I realized that only one process needed killing, and that it would also avoid problems with differing return codes on older versions of windows, TerminateProcess was given the job.<br /><br />Long live TerminateProcess.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7254386064439146371-6127252675280988605?l=akalias.blogspot.com'/></div>akaliasnoreply@blogger.com0tag:blogger.com,1999:blog-7254386064439146371.post-45042804558115082742008-07-04T23:28:00.000-07:002008-07-06T23:35:48.277-07:00dot points on build page extensionsI have been thinking about making some extensions to the build page.<br /><br /><h3>Raw Data<br /></h3><ul><li>Keep raw_data to process at any time. No need to discount old data collected from buggy analysis.</li></ul><br /><h3>Profiling<br /></h3><ul><li>Use function wrappers that log profiling of each test and multiple calls. 
</li><br /><li> -p|--profile command line mode</li></ul><h3>Tests<br /></h3><ul><li> Use subprocess mode by default for run_tests.py </li><br /><li> Web interface for ticketing off tests </li><br /></ul><h3>Build information</h3><ul><li>Post compiler version</li><br /><li>Post complete Setup file</li><br /><li>Post complete build output</li><br /><li>Post complete test output</li><br /><li>Python sys.path</li><br /><li>Environment variables</li><br /><li>As much as possible, unprocessed for archives</li></ul><h3>Machine information<br /></h3><ul><li> Processor speed</li><br /><li>CDRom availability</li><br /><li>etc, etc.</li><br /></ul><h3>Breaking up tests </h3>Should the tests fail if a machine doesn't have a CD drive (assuming stubs were filled out) for example?<br /><br />Should tests that require Numeric or NumPy fail if neither available?<br /><br />There are some classes of tests that it seems to make sense to split apart from the main "base" group of tests. What should be the "base" group of tests to automate with the run_tests.py test runner?<br /><br />What about tests that require human verification? For the build page a "base" group of tests should be specified.<br /><br />What should be the requirements for machines sending results to the build page? Numeric, Numpy? win32 extensions on windows? A CD rom drive? 
32 bit color display?<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7254386064439146371-4504280455811508274?l=akalias.blogspot.com'/></div>akaliasnoreply@blogger.com0tag:blogger.com,1999:blog-7254386064439146371.post-42043653576916167592008-07-03T21:10:00.000-07:002008-07-07T01:05:46.463-07:00test_not_implemented()<pre class='blackboard'>def test_get_arraytypes(self):<br /><br /> # __doc__ (as of 2008-06-25) for pygame.sndarray.get_arraytypes:<br /><br /> # pygame.sndarray.get_arraytypes (): return tuple<br /> # <br /> # Gets the array system types currently supported.<br /> # <br /> # Checks, which array system types are available and returns them as a<br /> # tuple of strings. The values of the tuple can be used directly in<br /> # the use_arraytype () method.<br /> # <br /> # If no supported array system could be found, None will be returned.<br /><br /> self.assert_(test_not_implemented()) <br /></pre><br /><br />test_not_implemented() will fail if any test suite is run with a "(-i|--incomplete)" command line option.<br /><br />As mentioned in previous posts, I developed a unittest stub generator that will output stubs for any untested units. It is supported by a naming scheme for the tests. The stubber will inspect the xxxx_test.py modules and, based upon the names of the unittest.TestCase classes and their test_xxxx methods, will determine what is already tested.<br /><br />For each public callable there is a corresponding test named test_$callable_name. Comments or descriptions will be appended to this, separated by a double underscore.<br /><br /><pre class='blackboard'>test_quit__returns_None_if_not_already_init</pre><br /><br />What if there is a module.quit and a module.class.quit? Each class has its own TestCase (and thus namespace) named $classTypeTest. 
This is typically the case anyway with setUp()'s specific to the class tested.<br /><br /><pre class='blackboard'><span class='storage'>def</span> <span class='entity'>get_callables</span>(<span class='variable'>obj</span>, <span class='variable'>if_of</span> <span class='keyword'>=</span> <span class='constant'>None</span>, <span class='variable'>check_where_defined</span><span class='keyword'>=</span><span class='constant'>False</span>):<br /> publics <span class='keyword'>=</span> (<span class='support'>getattr</span>(obj, x) <span class='keyword'>for</span> x <span class='keyword'>in</span> <span class='support'>dir</span>(obj) <span class='keyword'>if</span> is_public(x))<br /> callables <span class='keyword'>=</span> (x <span class='keyword'>for</span> x <span class='keyword'>in</span> publics <span class='keyword'>if</span> <span class='support'>callable</span>(x) <span class='keyword'>or</span> isgetsetdescriptor(x))<br /><br /> <span class='keyword'>if</span> check_where_defined:<br /> callables <span class='keyword'>=</span> (c <span class='keyword'>for</span> c <span class='keyword'>in</span> callables <span class='keyword'>if</span> ( <span class='string'>'pygame'</span> <span class='keyword'>in</span> c.__module__ <span class='keyword'>or</span><br /> (<span class='string'>'__builtin__'</span> <span class='keyword'>==</span> c.__module__ <span class='keyword'>and</span> isclass(c)) )<br /> <span class='keyword'>and</span> REAL_HOMES.get(c, <span class='constant'>0</span>) <span class='keyword'>in</span> (<span class='constant'>0</span>, obj))<br /><br /> <span class='keyword'>if</span> if_of:<br /> callables <span class='keyword'>=</span> (x <span class='keyword'>for</span> x <span class='keyword'>in</span> callables <span class='keyword'>if</span> if_of(x)) <span class='comment'># isclass, ismethod etc<br /></span><br /> <span class='keyword'>return</span> <span class='support'>set</span>(callables)<br /></pre><br /><br />The script uses inspection to 
find all testables in pygame but there were a few complications, for example getter/setter properties and the fact that some objects need to be instantiated before inspection reveals their innards. Also, filtering out non-pygame callables and after that callables that appeared in more than one module.<br /><br />eg pygame.rect.Rect led a double life as pygame.sprite.Rect. Just check the __module__ attribute ?<br /><br /><pre class='blackboard'>In [<span class='constant'>4</span>]: pygame.sprite.Rect.__module__<br />Out[<span class='constant'>4</span>]: <span class='string'>'pygame'</span><br /></pre><br /><br />The workaround was to make a mapping of object to the place where it was defined. There were only 9 of these.<br /><br /><pre class='blackboard'>REAL_HOMES <span class='keyword'>=</span> {<br /> pygame.rect.Rect : pygame.rect,<br /> pygame.mask.from_surface : pygame.mask,<br /> pygame.time.get_ticks : pygame.time,<br /> .....<br /></pre><br /><br />On some of the classes the __module__ attribute was __builtin__ so I needed put an exception for them in the filtering out of non pygame callables.<br /><br /><pre class='blackboard'>In [<span class='constant'>7</span>]: pygame.cdrom.CDType.__module__<br />Out[<span class='constant'>7</span>]: <span class='string'>'__builtin__'</span><br /></pre><br /><br /><pre class='blackboard'><span class='storage'>def</span> <span class='entity'>module_stubs</span>(<span class='variable'>module</span>):<br /> stubs <span class='keyword'>=</span> {}<br /> all_callables <span class='keyword'>=</span> get_callables(module, <span class='variable'>check_where_defined</span> <span class='keyword'>=</span> <span class='constant'>True</span>) <span class='keyword'>-</span> IGNORES<br /> classes <span class='keyword'>=</span> <span class='support'>set</span> (<br /> c <span class='keyword'>for</span> c <span class='keyword'>in</span> all_callables <span class='keyword'>if</span> isclass(c) <span class='keyword'>or</span> c <span 
class='keyword'>in</span> MUST_INSTANTIATE<br /> )<br /><br /> <span class='keyword'>for</span> class_ <span class='keyword'>in</span> classes:<br /> base_type <span class='keyword'>=</span> class_<br /><br /> <span class='keyword'>if</span> class_ <span class='keyword'>in</span> MUST_INSTANTIATE:<br /> class_ <span class='keyword'>=</span> get_instance(class_)<br /><br /> stubs.update (<br /> make_stubs(get_callables(class_) <span class='keyword'>-</span> IGNORES, module, base_type)<br /> )<br /><br /> stubs.update(make_stubs(all_callables <span class='keyword'>-</span> classes, module))<br /><br /> <span class='keyword'>return</span> stubs<br /></pre><br /><br />The stubber finds all modules in the pygame package. For each module it uses inspection to create a set of all the callables minus those set in the IGNORE setting. This is here for any exceptions to the filtering and also for tests that have been grouped under one test name. These objects will not be stubbed.<br /><br /><pre class='blackboard'>IGNORES <span class='keyword'>=</span> <span class='support'>set</span>([<br /><br /> pygame.rect.Rect.h, pygame.rect.Rect.w,<br /> pygame.rect.Rect.x, pygame.rect.Rect.y,<br /><br /> pygame.color.Color.a, pygame.color.Color.b,<br /> pygame.color.Color.g, pygame.color.Color.r,<br /><br />......</pre><br /><br /><br />From that it creates a subset of "classes", the criteria being that for each element "inspect.isclass(element)" or that the element is in the manually set MUST_INSTANTIATE dict. 
This is a mapping of class to helper function, and instantiation args required to return an instance.<br /><br /><pre class='blackboard'>MUST_INSTANTIATE <span class='keyword'>=</span> {<br /> <br /> <span class='comment'># BaseType / Helper # (Instantiator / Args) / Callable<br /></span><br /> pygame.cdrom.CDType : (pygame.cdrom.CD, (<span class='constant'>0</span>,)),<br /> pygame.mixer.ChannelType : (pygame.mixer.Channel, (<span class='constant'>0</span>,)),<br /> pygame.time.Clock : (pygame.time.Clock, ()),<br /><span class="lineNumber"><br /><br />..<br /><br /></span>}</pre><br /><br />Inspecting the xxxxType would reveal no methods, and they needed to be instantiated, but then the returned object lacked other attributes, one example being __name__, needed later for determining the test name. Therefore the xxxxType was sent to the stub generation function as the "parent class" for each callable that was gathered by inspecting the instantiation.<br /><br />Any callables not in the "classes" set are assumed to be module level functions and a stub is created for each.<br /><br /><br />The test stubber is used from the command line:<br /> <br /><pre class='blackboard'>$ gen_stubs.py --help<br />Usage:<br />$ gen_stubs.py ROOT<br /><br />eg.<br /><br />$ gen_stubs.py sprite.Sprite<br /><br />def test_add(self):<br /><br /> # Doc string for pygame.sprite.Sprite:<br /><br /> ...<br /><br /><br />Options:<br /> -h, --help show this help message and exit<br /> -l, --list list callable names not stubs<br /> -t, --test_names list test names not stubs</pre><br /> <br /><pre class='blackboard'>$ gen_stubs.py pygame -l<br />pygame.base.error.args,<br />pygame.bufferproxy.BufferProxy.length,<br />pygame.bufferproxy.BufferProxy.raw,<br />pygame.event.Event,<br />pygame.image.tostring,<br />pygame.joystick.Joystick,<br />pygame.key.get_repeat,<br />pygame.mask.Mask,<br />pygame.mixer.Channel,<br />pygame.movie.Movie,<br />pygame.overlay.overlay.display,<br 
/>pygame.overlay.overlay.get_hardware,<br />pygame.overlay.overlay.set_location,<br />pygame.pixelarray.PixelArray.surface,<br />pygame.sprite.AbstractGroup.add,<br />pygame.sprite.AbstractGroup.add_internal,<br />pygame.sprite.AbstractGroup.clear,<br />pygame.sprite.AbstractGroup.copy,<br />pygame.sprite.AbstractGroup.draw,<br />pygame.sprite.AbstractGroup.empty,<br />pygame.sprite.AbstractGroup.has_internal,<br />pygame.sprite.AbstractGroup.remove,<br />pygame.sprite.AbstractGroup.remove_internal,<br />pygame.sprite.AbstractGroup.sprites,<br />pygame.sprite.AbstractGroup.update,<br />pygame.sprite.collide_rect,</pre><br /><br />Commas are appended for easy copy/paste into IGNORE list.<br /><br />gen_stubs.py is an integral part of the plan to make it extremely easy for people to contribute to unittests. One man can only do so much.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7254386064439146371-4204365357691616759?l=akalias.blogspot.com'/></div>akaliasnoreply@blogger.com0tag:blogger.com,1999:blog-7254386064439146371.post-5311998778444709692008-06-30T20:05:00.000-07:002008-07-07T00:14:26.642-07:00subprocessedPyGame tests are structured in such a way that for each module in the pygame package (eg pygame.sprite, pygame.color) there is a test/xxxx_test.py file containing corresponding unittests. PyGame has an automated build page that shows build and test results for the latest svn version of PyGame on a variety of platforms and versions of python. It uses regular expressions to parse the results of the test runner script.<br /><br />The test runner script compiles tests from each of the xxxx_test.py files and runs them in a single process. Advantage: speed, disadvantage: instability. PyGame uses a lot of c code, and where there is c code there is potential for strange errors.<br />As one example, there was a test for the ability to save OpenGl surfaces which would segfault on windows. 
This would stop the test runner half way through, leaving its output in a form the automated build page could not decipher. <br /><br />"Build Successful, Invalid Test Results"<br /><br />Another issue with running all tests in one process is the need to restore a "fresh" state for tests that rely on it. Conflicts can cause the test runner script to crash completely. On the other hand, some obscure bugs have been uncovered due to them.<br /><br />Besides writing tests for individual units I have recently been working on adding a subprocess mode to the python test runner script. It processes the output of each module's test script and outputs the results in the same form as the single process mode.<br /><br />There is a library called subunit that uses os.fork() to run unittest suites in subprocesses, which seemed like it would have been a perfect candidate for the job. Unfortunately windows doesn't have the fork system call so it was not an option. Windows python does not even provide os.kill().<br /><br />What good is running all tests in subprocesses if one of them hangs and python is using a blocking call to retrieve its output?<br />As I was going to the trouble of making a subprocess mode, I realized I should deal with this possibility. Unfortunately the python subprocess module doesn't ship with async calls but I found a recipe on the ActiveState Python CookBook site.<br /><br />On windows it relies on win32pipe and win32file from the pywin32 package. I worked around the lack of os.kill on windows by using system calls to "taskkill" or "pskill". If a wayward test suite running in a subprocess doesn't finish up in a specified allowance of time then it will be os.kill'd. 
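<br /><br />For comparison, current Python versions ship a timeout in the standard library, so the simple case no longer needs a recipe. What follows is a minimal sketch, not the runner's actual code: run_with_timeout is a hypothetical helper name, and it assumes Python 3.5+ where subprocess.run() accepts a timeout argument and kills the direct child when it expires.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_seconds):
    """Run cmd (a list of arguments); return (return_code, output, timed_out).

    Hypothetical helper for illustration only.
    """
    try:
        completed = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # mux stderr into stdout, as in the posts
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired as exc:
        # subprocess.run() has already killed the child before re-raising.
        return None, exc.output or b"", True
    return completed.returncode, completed.stdout, False


if __name__ == "__main__":
    # Stand-in for a hung test module: a child that sleeps far too long.
    hung = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(hung, timeout_seconds=1))
```

Only the direct child is killed this way; a test module that spawns its own children would still need extra handling.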
<br /><br /><pre class='blackboard'>COMPLETE_FAILURE_TEMPLATE <span class='keyword'>=</span> <span class='string'>"""<br />======================================================================<br />ERROR: all_tests_for (</span><span class='stringInterpolation'>%s</span><span class='string'>.AllTestCases)<br />----------------------------------------------------------------------<br />Traceback (most recent call last):<br /> File "test\</span><span class='stringInterpolation'>%s</span><span class='string'>.py", line 1, in all_tests_for<br /><br />subprocess completely failed with return code of </span><span class='stringInterpolation'>%s</span><span class='string'><br /><br />cmd: </span><span class='stringInterpolation'>%s</span><span class='string'><br /><br />return (abbrv):<br /></span><span class='stringInterpolation'>%s</span><span class='string'><br /><br />"""</span> <span class='comment'># Leave that last empty line else build page regex won't match<br /></span></pre><br /><br />Running each test suite in a subprocess is a huge performance hit. I think for the automated build page the performance hit won't really affect the experience as it's all running headlessly from cron jobs. Nevertheless, I added the ability to run subprocessed tests simultaneously in multiple threads. Also, the single process mode is still available, as is running module specific test suites.<br /><br />I wrote some tests comparing the output of (single|sub)process modes running a group of fake test suites, some all OK, some with errors and failures. 
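<br /><br />A comparison of this kind can be sketched with difflib once the timing line is normalized. This is an illustrative sketch rather than the contents of run_tests__test.py: normalize, diff_reports and TIMING_RE are made-up names, and it assumes the elapsed time in the "Ran N tests in X.XXXs" line is the only difference worth masking.

```python
import difflib
import re

# Assumption: only the elapsed time that unittest prints differs between
# runs, e.g. "Ran 15 tests in 1.234s".
TIMING_RE = re.compile(r"^(Ran \d+ tests? in )\d+\.\d+s$", re.MULTILINE)


def normalize(report):
    """Replace the elapsed time in a unittest text report with a fixed token."""
    return TIMING_RE.sub(r"\1<time>s", report)


def diff_reports(single, sub):
    """Return unified-diff lines between two normalized reports ([] if equal)."""
    return list(difflib.unified_diff(
        normalize(single).splitlines(),
        normalize(sub).splitlines(),
        fromfile="single process",
        tofile="subprocess",
        lineterm="",
    ))
```

An empty list means the two modes produced the same report apart from timing; anything else is the diff to print.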
<br /><br /><pre class='blackboard'>$ run_tests__test.py<br />all_ok suite OK<br />failures1 suite OK<br /><br />2/2 passes<br /><br />-h for help<br /><br />$ run_tests__test.py -h<br /><br />-v, to output diffs even on success<br />-u, to output diffs of unnormalized tests</pre><br /><br />The standard library module difflib is very good, and extremely well documented.<br /><br />Other than the obvious differences such as timing, which are normalized before comparison, all is OK :)<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7254386064439146371-531199877844470969?l=akalias.blogspot.com'/></div>akaliasnoreply@blogger.com0tag:blogger.com,1999:blog-7254386064439146371.post-12749317879029600582008-06-25T18:48:00.000-07:002008-07-06T18:52:48.994-07:00TestingJust a quick note,<br /><br />Still waiting on my fan, it's getting shipped in from Amurricah.<br /><br />I wrote a run_tests_sub.py the other day that uses subprocess to run each xxxx_test.py in the trunk/test directory.<br /><br />It will run with an optional threads parameter: -t num_threads<br /><br />$ run_tests_sub.py -t 4<br /><br />Apparently this runs faster on multi-core machines.<br /><br /><br /><br />It should output results similar to run_tests.py though may need tweaking to get it to run transparently in place of run_tests.py for the build page.<br /><br />Speaking of the build page, Rene and I have had a few ideas for a combined build / test web app that collected builds and test statistics (profiling / passes etc). Also, a means to distribute the writing of tests. Many hands make light work.<br /><br />If it was possible to be assigned a stub of a test to fill out and then post it back painlessly we could quite quickly increase the coverage of our tests. 
If twenty people filled out 1 test a week, then over a month that would be 80 extra unit tests.<br /><br />ATM there are "FAILED (failures=232)" unimplemented tests, and possibly that many again that haven't been stubbed out, waiting to be written.<br /><br />$ run_tests.py -i<br /><br /><br />Will show tests that need fleshing out.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7254386064439146371-1274931787902960058?l=akalias.blogspot.com'/></div>akaliasnoreply@blogger.com0tag:blogger.com,1999:blog-7254386064439146371.post-86938707938391539372008-06-20T22:41:00.000-07:002008-06-20T22:51:12.152-07:00Aha!I realised why the change from CONSTANT = (expr) to CONSTANT = [expr] fixed the bug in the color_test.py<br /><br />A generator expression is only good for one iteration and after that it will act as an empty sequence. I thought it would be a reusable, lazily evaluated equivalent of a list comp. Turns out I was dead wrong.<br /><br />I went over the stub generator recently and it's pretty much in its finalized form as far as the naming scheme is concerned.<br /><br />Can't wait to get my own computer back in action.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7254386064439146371-8693870793839153937?l=akalias.blogspot.com'/></div>akaliasnoreply@blogger.com2tag:blogger.com,1999:blog-7254386064439146371.post-26712567944202132902008-06-18T03:26:00.000-07:002008-06-18T03:30:21.453-07:00FanDamn fan on my laptop packed it in. I will have to convince my friend to let me install linux on his windows box while I wait for a replacement. Can't get a windows build of development pygame at the moment due to failing tests... or can I? 
Temporarily disable the failing tests and let the build farm run?<br /><br />What a pain in the arse.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7254386064439146371-2671256794420213290?l=akalias.blogspot.com'/></div>akaliasnoreply@blogger.com0tag:blogger.com,1999:blog-7254386064439146371.post-82556755548847866822008-06-12T21:05:00.000-07:002008-07-06T18:41:26.763-07:00HappeningsI have been writing unittests using the naming scheme (see below) keeping to it as much as possible.<br /><br />There have been a few modifications but that's fine as long as I am consistent. I haven't yet written the part of the test stub generator that filters from the generated tests any tests for units that have already been written. I am letting the writing of tests dictate the naming schemes evolution.<br /><br />Will post some more thoughts on the naming scheme in days to come. Also thoughts on one to one test names.<br /><br />Thoughts on speed of test suites, isolating "dangerous" tests that can crash the whole test suite.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7254386064439146371-8255675554884786682?l=akalias.blogspot.com'/></div>akaliasnoreply@blogger.com0