Nerd Food: On the UbuntuBox (Marco Craveiro, 2008-06-19)<div style="text-align: justify;"><span style="font-weight: bold;"><br />The Regular User</span><br /><br />As with many other geeks, I find myself in the unofficial position of "computer guy" for family and friends (F&F). This entails sorting out broken computers, providing advice on new purchases for a variety of devices and software packages, installing said devices and packages, doing security updates and giving security advice, providing mini-tutorials on applications, teaching basic programming - the list goes on and on. The funny thing is, much like all other nerds, I may moan about my duties but secretly I enjoy performing them. Sometimes I get to set people up with cheap Ubuntu boxen, which they may complain about a little in the beginning but eventually use in extremely productive ways; and even on Windows setups, I get to understand what drives people to Microsoft and what its weak and strong points are. It's a very instructive job.<br /><br />Things have got even more interesting since I bought my eee PC, a device that is selling like fire in the polar winter. The eee PC helped me understand a bit better how most people think. All along we, the Linux community, have focused on providing a user experience that is very similar to Windows: you may not have a Start Menu but you have the Ubuntu logo; toolbars and menus are very similar and so on. Even the newest eye-candy is similar to Mac and Vista's way of doing things (although there's always the chicken-and-egg problem). 
The end result is interesting: developers and regular Linux users are now convinced that no one should have any difficulties at all moving from Windows to KDE or Gnome; on the other hand, as soon as you sit a user down on a Linux box, he or she immediately tells you that something is not right. Objectively, the average user will probably not be able to point out what's wrong, if anything at all, other than "this is not Windows, can we not have Windows please".<br /><br />The fantastic thing about the eee PC was that, of all the people I showed it to, not a single one said: bah, this is not Windows. Most of them got on with the user interface immediately, and found it really intuitive. In all my years of advocating Linux I had never before seen a reaction like this. I did absolutely no advocating whatsoever, no mention of freedom or the superiority of free software. Just letting them play with it was enough. As an example, my girlfriend has been using Linux for over 5 years, and I get the periodic complaints of "why can't we just use Windows" whenever I have some difficulties installing a device, or I break the world on a dist-upgrade. But within minutes of playing with the eee, her reaction was: "I want one of these!!".<br /><br />The reactions I've seen towards the eee PC are almost the opposite of those of the few Vista users I've spoken to. Sure, Vista looks nice, but have you tried installing a one-click wireless router? That's when F&F call me out, when it all goes wrong with the "one-click" cheap product they bought. But the thing is, I can't say much in Ubuntu's defense either. For example, I spent several days installing a Huawei E220 modem to provide 3G Internet access to my nephews, and let me just tell you, trivial would not have been a word one could apply to any part of the process. Vodafone's new clever GUI may be good for Vodafone users but I never got the damn thing to cooperate. 
True, the whole exercise wasn't taxing for a nerd - hey, it's fun to look at AT commands now and then - but there is no way, just noooo waaaay a regular user would have gone through the pain, even with the brilliant Ubuntu forums to hand.<br /><br />Now, before we go any further, I can already hear the complaints: "so you've chatted to what, twenty people, and now you think you understand the market?". Well, that much is true, I cannot claim any statistical accuracy for my diagnosis. These are my opinions; the entire article is based on empiricism and small samples. However, if my line of argumentation is sound and rightly interprets the success of the eee, then there must be some truth to my views because I've tried to align them with the eee. The market has given a verdict on this gadget, loudly and unequivocally.<br /><br />The eee PC is also a brilliant illustration of the huge divide between regular users and the developers who are tasked with providing software for them. At a moment in time where the Gnome community is yet again rethinking the future of Gnome, not a single regular user would find this debate interesting. This should set all the alarm bells ringing, but unfortunately that doesn't seem to be the case. The truth is, regular users don't want flashy 3D desktops, although they can eventually cope with them; they don't need spinning cubes although they may start using them once they understand them. What they really want is simplicity. They have a simple set of tasks to perform, and they want to do so cheaply and reliably, and they truly do not understand why everything has to be so complicated and why computers have to change so much so often.<br /><br />So what made the eee popular? In my opinion, there are two key points:<br /><ul><li>It's cheap. No one would even have a look at it if it was 400 GBP</li><li>It's easy.<br /></li></ul>These are the key selling points to a regular user. 
To illustrate the second point, when I said to my girlfriend I was thinking about installing Ubuntu Hardy on the eee, she replied in dismay: "Why would you do that??".<br /><br /><span style="font-weight: bold;">The Regular User Use Cases</span><br /><br />The key thing to notice about the eee is that most users don't even know it's not running Windows. It's just an appliance, a bit like a PlayStation, and thus there is no need to enquire about its operating system. Like an appliance, it is also expected to be switched on and just work - the fast boot reinforces this idea. The interface provided is also designed for the tasks common to the vast majority of regular computer users, and allows them to find things fast. But, looking at the wider problem, what do our regular users do with their computers? I compiled a list of all use cases I found in my user base:<br /><ul><li>Internet: email, browsing, playing on-line games and YouTube;</li><li>Listen to music, sync with their music player;</li><li>Watch local video content;</li><li>Talk with their friends: IM, VoIP;</li><li>Play (basic) games: in all cases, real gaming is done on the PlayStation;</li><li>Work: word-processor by far, some spreadsheet use but "it's quite hard";</li><li>Burning and ripping;</li><li>Downloading: torrents, etc. Not very popular because "it's complicated";</li><li>Digital photo management: storage, some very basic manipulation (make it smaller for emailing);</li><li>Printing: mainly for school/University; pictures in very few cases.<br /></li></ul>In addition to these, some additional requirements crop up:<br /><ul><li>Windows users all have proprietary firewalls and virus scanners;</li><li>All machines are multi-user, and data must be kept private - especially with the youngsters;</li><li>Machines must withstand battering: switched off at any point, banged about, dropped, etc;</li><li>Internet connectivity is vital; ADSL, cable and 3G are used. 
Computers are useless without the Internet;</li><li>Wireless around the house is vital. External wireless is nice, but not frequently used because "it's too complicated";</li><li>Costs must be kept exceedingly low as IT budget is normally very low;<br /></li></ul>That's it. You'd be amazed at the percentage of the market one covers with only these use cases; not just doing them, but doing them well, like a PlayStation plays games.<br /><br />And what are the biggest complaints about computers?<br /><ul><li> They're really hard. Installing hardware and software is a nightmare, and they'd be stuffed without the local nerd;</li><li>They break easily. One of my Vista users is still in disbelief that installing wireless drivers could cause the DVD drive to stop working;</li><li>They're expensive. Sure you can get a cheapish box but then everything else is expensive (software, peripherals, etc);</li><li>They change far too frequently. Most users had only just got to grips with XP's user interface when they saw it all change again;</li><li>They're insecure. They don't know how or why but that's what they've heard. That and the constant popups that look like viruses.<br /></li></ul>On one hand, the regular user is quite advanced, making multi-user and networking a central part of his or her computer experience. On the other hand, he/she is very naive: the vast majority of computing power goes under-utilised - the OS gobbling most of the resources for no good reason - and the majority of software expenses are easily avoidable by using freely available applications. Regular users have got nowhere near using Media Centres, "clever" media management software, or even connecting their PCs to the TVs. All these things they consider "advanced" and yet nerds and more savvy users have been doing them for years. 
One cannot help but feel that there is a massive market out there for the taking - a market that Vista cannot aim to grab because it's diametrically opposed to its needs - and yet, no one else seems to find the path to its door.<br /><br /><span style="font-weight: bold;">UbuntuBox: The Hardware Platform</span><br /><br />The rest of this article is an Ubuntero Gedankenexperiment: if I were a manufacturer, what sort of box would I like my F&F to have? What would make my life and their life easier? The short answer to that question is a PlayStation 2-like box but with PC-like functionality. The long answer is, well, long.<br /><br />I'm not going to bother with engineering reality here - I'm sure some requirements will be so conflicting they cannot possibly be implemented. However, I've got zero experience in hardware manufacturing, weights, cooling, large scale deployment and so on - so much so that I'm not even going to bother pretending; any assumptions I'd make would be wrong anyway. So, to make matters easy, I'll just ask for it all - impossible or not - and wait for the reality check to come in.<br /><br />The first, very different thing about our box is that it's not a computer. Well, inside it is a regular PC of course, but it doesn't look like one. It is designed to look exactly like a DVD player, and to fit your living room. A bog-standard black box with a basic LED display would do. Inside, it has:<br /><ul><li>Multiple cores: four would be ideal, but at least two. They don't have to be particularly fast (1.x GHz would do, but I guess 2 GHz would be easier to find);</li><li>4 GB of RAM: can be the slowest around, but we need at least 4; the more the merrier, of course;</li><li>250 to 500 GB hard drive: the more the merrier. 
Doesn't have to be fast, we just need the space;</li><li>Average video card: key things are RGB/HDMI and TV out; resolution decent enough to play most games (not the latest);</li><li>Loads of USB ports;</li><li>RW DVD drive;</li><li>Analog TV + DVB card (for FreeView in England);</li><li>Wired and Wireless Ethernet;</li><li>Sound card with 5.1 surround sound: doesn't have to be a super card, just an entry level one would do;</li><li>SD card, compact flash readers;</li><li>Ability to control the box with a remote control;</li></ul>And now the key limiting factor:<br /><ul><li>The overall cost of the box must not exceed 200 GBP. This may require some tweaking, e.g. if raising it to 299 means we can put all features in, it may be worthwhile.</li></ul>Notice that all the hardware will be standard on all boxes of the same generation. This is all commodity hardware - certainly nothing proprietary - but without the heterogeneity that is associated with it. Note that control is a key feature - the limiting of user and vendor freedom to swap things at will. We'll return to the topic later on, as I'm sure it will prove controversial.<br /><br />Now, how does the box behave for the regular use case? Well, you buy it, plug it in, set all the cables up and start it up. You will see only two things on boot: the logo (say the Ubuntu logo) fading in and out, and the console password. That's it. No BIOS, no flashing X-Server, nothing else. Within a few seconds you'll be prompted for the console password and given an option of not needing a password in the future (Note: console is _not_ root). Let's leave the desktop at that for the moment as we'll cover it properly in the next section.<br /><br />What about Internet access, you ask? Well, you will need to buy one of the available modems:<br /><ul><li>3G;</li><li>ADSL;</li><li>Cable.<br /></li></ul>Each of these modems is made available at market prices (i.e. 
as cheap as possible); however, they will have been officially and exhaustively tested and stamped with a "UbuntuBox compliant vX" or some such, where vX is the box's generation. To be compliant means that your hardware has been thoroughly tested and is known to work with the hardware and software in a given generation. When you plug in any of these devices after console login, a simple wizard will appear asking you to choose a provider. Each provider will also have been part of a certification program before inclusion.<br /><br />The other networking device is an Ethernet Switch. This is only required if your modem does not come with switching abilities (maybe in the 3G case). Network Manager already does a pretty good job of this, so all you'll need to do is set up the network on your console session (SSID, etc). You can use a USB keyboard for this or just endure typing from the remote control.<br /><br />Note that the certification requirement is extended to all hardware used with the box. In other words, there is a pretty draconian control on the hardware platform. Users are, of course, free to do as they wish with the device they bought, but if they go down the uncertified route, all support contracts are rendered void (more on this later). The truth is, it's impossible to provide cost-effective support to all possible permutations of off-the-shelf hardware - a fact all Linux and Windows nerds are all too aware of, as are Mac engineers. There will always be some weird combination that makes things break, and it can take many, many man-days to fix it; when you have 1M boxen out there, this cost would be prohibitive. The only way is to control the standard platform.<br /><br />For all of its closed nature, the certification process is actually open when compared with other companies. 
All the criteria involved are made available on public websites, APIs with all the hooks required to extend wizards are public (with examples), companies are free to do public dry runs and any company can request a slot for validation. Perhaps some cost needs to be associated with the process (time is money after all, and we must discourage the less serious companies), but in general, the process is fair and public. The tests, however, are stringent; hardware that passes _cannot_ fail when deployed in the wild.<br /><br />One final note with regards to entry level hardware. Some people may not be aware, but the computing power available as standard today is incredibly high. For example, one of the PCs I maintain has a 1 GHz CPU, 512 MB of RAM, 10 GB hard drive and an average ATI card; I bought it for 60 GBP. This machine runs Ubuntu Hardy and sometimes has to cope with as many as 3 users logged on. It doesn't do any of the 3D Compiz special effects due to the dodgy ATI card, but it does pretty much everything else. You'd be surprised at what you can do with the slowest RAM, cheapest sound-card and so on.<br /><br /><span style="font-weight: bold;">UbuntuBox: The Software Platform</span><br /><br />By now you must have guessed that the box would be running Ubuntu; but this is not your average Ubuntu. Using an interface along the lines of <a href="http://www.markshuttleworth.com/wp-content/uploads/2008/06/nb-remix-launcher.png">Remix</a>, we would make a clear statement that this is an appliance - not a PC. As the eee has demonstrated, perceptions matter the most. 
Remix's <a href="http://www.markshuttleworth.com/archives/151">interface</a> will remind no one of Windows, whilst at the same time making the most common tasks really easy to locate.<br /><br />In addition to regular Ubuntu, the software platform would provide, out of the box, complete media support. This entails having GStreamer with all the proprietary plugins, Adobe's Flash and any other plug-ins that may be required for it to play all the media one can throw at it.<br /><br />The UbuntuBox is mainly a clever Media Centre, and, as such, applications such as <a href="http://elisa.fluendo.com/home/">Elisa</a>, <a href="http://www.gnome.org/projects/rhythmbox/">Rhythmbox</a>/<a href="http://wiki.banshee-project.org/Main_Page">Banshee</a>, <a href="http://f-spot.org/Main_Page">F-Spot</a>, etc. are at the core of the user experience. These applications would need to be modified slightly to allow for a better multi-user experience (e.g. shared photo/music collections, good PVR and DVB support, etc), but on the whole the functionality they already provide is more than sufficient for most users.<br /><br />As with the hardware side, the software platform is tightly controlled. Only official Ubuntu repositories are allowed, and all software is tested and known to work with the current generation of boxen. And, as with hardware, the software platform is made available to third parties who want to deploy their wares. An apt interface similar to click 'n run is made available so that commercial companies can sell their wares on the platform and charge for them. They would have to go through compliance first, of course, but if the number of boxes out there is large enough, there will be companies interested in doing so. 
This would mean, for example, that a games market could begin to emerge based on Wine; instead of having each user test each Windows application for their particular setup, with many users having mixed results, this would put the onus of the testing on the company owning the platform and on the software vendor. Games would have to be repackaged as debs and be made installable just like any other Debian package. Of course, the same logic could be applied to any Windows application.<br /><br />As I mentioned previously, boxen come with support contracts. A standard support contract should provide:<br /><ul><li>Access to all security fixes;</li><li>Troubleshooting of problems, including someone remotely accessing your machine to help you sort it out.<br /></li></ul>Due to its homogeneity, UbuntuBox is very vulnerable to attacks. If an exploit is out in the wild, large numbers of boxen can be compromised very quickly. To make things a bit safer, the platform has the following features:<br /><ul><li>SELinux is used throughout;</li><li>All remote access is done via SSH and is only enabled on demand (e.g. when tech support needs access);</li><li>All users have passwords and must change them regularly;</li><li>There is an encrypted folder (or vault) for important documents, available from each user's desktop.<br /></li></ul>Finally, notice that binary drivers and proprietary applications are avoided when possible - e.g. Intel drivers would be preferable to nVidia, provided they have the same feature-set. However, where the proprietary solutions are technically superior, they should be used. Skype springs to mind.<br /><br /><span style="font-weight: bold;">UbuntuTerm</span><br /><br />Readers may be left wondering, "this is all very nice and dandy, but am I supposed to do my word processing using a TV?". Well, not quite. Whilst the TV is central, its use is focused on the gaming and Media Centre aspects of the box. 
If you want to use UbuntuBox as a regular PC, you will need to buy a UbuntuTerm. Just what is a UbuntuTerm? It is a dumb terminal of "old" in disguise (e.g. <a href="http://www.ltsp.org/">LTSP</a>). It is nothing but an LCD display of a moderately decent size (19" say), with an attached PC - the back of the monitor or the base would do, as the hardware is minimal. The PC has a basic single-core chip with low power consumption (to avoid fans), plus on-board video, sound and wireless Ethernet. It is designed to boot off the network if <a href="http://en.wikipedia.org/wiki/BOOTP">BOOTP</a> can be used over wireless; if not, from flash. Whichever way it boots, it's configured to find the mothership and start an <a href="http://probing.csx.cam.ac.uk/about/xdmcp.html">XDMCP</a> session on it. Its price should hover around the 100 GBP mark.<br /><br />As with any decent terminal these days, UbuntuTerm is designed to fool you into believing you are sitting at the server. X already does most of the magic required, but we need to take it one level further: if you start playing music, the audio will come out of your local speakers via pulseaudio; if you plug your iPod into its USB port, the device will show up on your desktop; if you start playing a game, the FPS you get remotely will be comparable to playing it on the server. 
As with everything else mentioned in this article, all of these technologies are readily available in the wider community; it's a matter of packaging them in a format that regular users can digest (see Dave's <a href="http://davelargo.blogspot.com/">blog</a> for example).<br /><br />The standard hardware on a UbuntuTerm is as follows:<br /><ul><li>Low RAM, basic video card;</li><li>Speakers attached to monitor;</li><li>SD Card, compact flash readers;</li><li>WebCam, headset;</li><li>Lots of USB ports<br /></li></ul>A house can have as many UbuntuTerms as required, and the server should easily cope with at least 6 of them without too much trouble, depending on what sort of activities the users get up to.<br /><br />Finally, in addition to the UbuntuTerm in hardware, there is also a UbuntuSoftTerm. This is nothing but a basic Cygwin install with X.org, allowing owners of PCs to connect to their UbuntuBox without having to buy an entire UbuntuTerm.<br /><br /><span style="font-weight: bold;">Conclusions</span><br /><br />UbuntuBox is an attempt to ride the wave of netbooks; it also tries to make strengths out of Linux's weaknesses. The box may not live up to everyone's ideals of Free Software, but its main objective is to increase Ubuntu's installed base, allowing us to start applying leverage against the hardware and software manufacturers. The design of the box takes into account the needs of a very large segment of the market: people who have basic computing needs but don't want to become experts - just like a PlayStation owner does not want to know the ins-and-outs of the PowerPC chips.<br /><br />The UbuntuBox is an appliance, and as such is designed to be used in a fairly rigid number of ways, but that cannot be avoided if one wants to stay true to its nature. 
The more freedom one gives to users, the worse the end product will be for the Regular User, who cares not for intricate technical detail.<br /><br />Note also I haven't spent much time talking about business models for the company providing UbuntuBoxen. The opportunities should be there to create a sustainable business, based on revenue streams such as monthly payments for support, fees from OEMs, payments to access the platform (content providers). However, I don't know too much about making money so I leave that as an exercise to the reader. The other interesting aspect is community leverage. If managed properly, a project of this nature could enjoy large amounts of community participation: in testing, packaging, marketing, support - in fact, pretty much all areas can be shared with the community, reducing costs greatly.<br /><br />All in all, if there was an UbuntuBox out there for sale, I'd buy it. I think such a device would have a good chance of capturing this elusive segment of the market, giving Linux a foothold, however small, on the desktop.<br /><br /></div>Nerd Food: On Evolutionary Methodology (Marco Craveiro, 2008-06-11)<div style="text-align: right;">Unix's durability and adaptability have been nothing short of astonishing. Other technologies have come and gone like mayflies. 
Machines have increased a thousand-fold in power, languages have mutated, industry practice has gone through multiple revolutions - and Unix hangs in there, still producing, still paying the bills, and still commanding loyalty from many of the best and brightest software technologists on the planet.<span style="font-style: italic;"> -- ESR </span><br /></div><br /><div style="text-align: right;">Unix...is not so much a product as it is a painstakingly compiled oral history of the hacker subculture<span style="font-style: italic;">. -- Neal Stephenson </span><br /></div><div style="text-align: justify;"><br /><span style="font-weight: bold;">The Impossibly Scalable System </span><br /><br />If development in general is an art or a craft, its finest hour is perhaps the maintenance of existing systems which have high availability requirements but are still experiencing high rates of change. As we covered previously, maintenance in general is a task much neglected in the majority of commercial shops, and many products suffer from entropic development; that is, the piling on of changes which continuously raise the complexity bar, up to a point where it is no longer cost-effective to continue running the existing system. The word "legacy" is in itself filled with predestination, implying old systems cannot avoid time-decay and will eventually rot into oblivion.<br /><br />The story is rather different when one looks at a few successful Free and Open Source Software (FOSS) systems out there. For starters, "legacy" is not something one often hears on that side of the fence; projects are either maintained or not maintained, and can freely flip from one state to the other. Age is not only _not_ a bad thing, but, in many cases, it is a remarkable advantage. Many projects that survived their first decade are now stronger than ever: the Linux kernel, X.org, Samba, PostgreSQL, Apache, gcc, gdb, Subversion, GTK, and many, many others. 
Some, like Wine, took a decade to mature and are now showing great promise.<br /><br />Each of these old timers has its fair share of lessons to teach, all of them incredibly valuable; but the project I'm particularly interested in is the Linux kernel. I'll abbreviate it to Linux or "the kernel" from now on. <br /><br />As published <a href="https://www.linux-foundation.org/publications/linuxkerneldevelopment.php">recently</a> in a study by Kroah-Hartman, Corbet and McPherson, the kernel suffers a daily onslaught of unimaginable proportions. Recent kernels are a joint effort of thousands of kernel hackers in dozens of countries, a fair portion of whom work for well over 100 companies. On average, these developers added or modified around 5K lines per day during the 2.6.24 release cycle and, crucially, removed some 1.5K lines per day - and "day" here includes weekends too. Kernel development is carried out in hundreds of different kernel trees, and the merge paths between these trees obey no strictly enforced rules - they do follow convention, but rules get bent when the situation requires it.<br /><br />It is incredibly difficult to convey in words just how much of a technical and social achievement the kernel is, but one is still compelled to try. The absolute master of scalability, it ranges from the tiniest embedded processor with no MMU to the largest of the large systems - some spanning as many as 4096 processors - and covering pretty much everything else in between: mobile phones, Set-Top Boxes (STBs), game consoles, PCs, large servers, supercomputers. It supports more hardware architectures than any other kernel ever engineered, a number which seemingly keeps on growing at the same rate new hardware is being invented. Linux is increasingly the kernel of choice for new architectures, mainly because it is extremely easy to port. 
Even real time - long considered the unassailable domain of special-purpose systems - is beginning to cave in, unable to resist the relentless march of the penguin. And the same is happening in many other niches.<br /><br />The most amazing thing about Linux may not even be its current state, but its pace, as clearly demonstrated by Kroah-Hartman, Corbet and McPherson's analysis of kernel source size: it has displayed a near constant growth rate between 2.6.11 and 2.6.24, hovering at around 10% a year. Figures on this scale can only be supported by a catalytic development process. And in effect, that is what Linux provides: by getting better it implicitly lowers the entry barrier to new adopters, who find it closer and closer to their needs; thus more and more people join in and fix what they perceive to be the limitations of the kernel, making it even more accessible to the next batch of adopters.<br /><br />Although some won't admit it now, the truth is none of the practitioners or academicians believed that such a system could ever be delivered. After all, Linux commits every single schoolboy error: started by an "inexperienced" undergrad, it did not have much of an upfront design, architecture and purpose; it originally had the firm objective of supporting only a single processor on x86; it follows the age-old monolithic approach rather than the "established" micro-kernel; it is written in C instead of a modern, object-oriented language; its processes appear to be haphazard, including a clear disregard for Brooks's law; it lacks a rigorous QA process and until very recently even a basic kernel debugger; version control was first introduced over a decade after the project was started; there is no clear commercial (or even centralised) ownership; there is no "vision" and no centralised decision making (Linus may be the final arbiter, but he relies on the opinions of a lot of people). 
The list continues ad infinitum.<br /><br />And yet, against all expert advice, against all odds, Linux is the little kernel that could. If one were to write a spec covering the capabilities of vanilla 2.6.25, it would run thousands of pages long; its cost would be monstrous; and no company or government department would dare to take on such an immense undertaking. Whichever way you look at it, Linux is a software engineering singularity.<br /><br />But how on earth can Linux work at all, and how did it make it thus far?<br /><br /><span style="font-weight: bold;">Linus' Way </span><br /></div><br /><div style="text-align: right;">I'm basically a very lazy person who likes to get credit for things other people actually do. <span style="font-style: italic;">-- Linus Torvalds </span><br /></div><div style="text-align: justify;"><br />The engine of Linux's growth is deeply rooted in the kernel's methodology of software development, but it manifests itself as a set of core values - a culture. As with any other school of thought, not all kernel hackers share all values, but the group as a whole displays some obvious homogeneous characteristics. These we shall call Linus' Way, and are loosely summarised below (apologies for some redundancy, but some aspects are very interrelated).<br /><br /><span style="font-style: italic;">Small is beautiful </span><br /></div><ul style="text-align: justify;"><li>Design is only useful on the small scale; there is no need to worry about the big picture - if anything, worrying about the big picture is considered harmful. Focus on the little decisions and ensure they are done correctly. From these, a system will emerge that _appears_ to have had a grand design and purpose.</li><li>At a small scale, do not spend too long designing and do not be overambitious. Rapid prototyping is the key. Think simple and do not over design. 
If you spend too much time thinking about all the possible permutations and solutions, you will create messy and unmaintainable code which is very likely to be wrong. Best implement a small subset of functionality that works well, is easy to understand and can be evolved over time to cover any additional requirements.<br /></li></ul><div style="text-align: justify;"><br /><span style="font-style: italic;">Show me the Code </span><br /></div><ul style="text-align: justify;"><li>Experimentation is much more important than theory by several orders of magnitude. You may know everything there is to know about coding practice and theory, but your opinion will only be heard if you have solid code in the wild to back it up.</li><li>Specifications and class diagrams are frowned upon; you can do them for your own benefit, but they won't sell any ideas by themselves.</li><li>Coding is a messy business and is full of compromises. Accept that and get on with it. Do not search for perfection before showing code to a wider audience. Better to have a crap system (sub-system, module, algorithm, etc.) that works somewhat today than a perfect one in a year or two. Crap systems can be made slightly less crappy; vapourware has no redeeming features. </li><li>Merit is important, and merit is measured by code. Your ability to do boring tasks well can also earn a lot of brownie points (testing, documentation, bug hunting, etc.) and will have a large positive impact on your status. The more you are known and trusted in the community, the easier it will be for you to merge new code in and the more responsibilities you will end up having. Nothing is more important than merit as gauged by the previous indicators; it matters not what position you hold in your company, how important your company is or how many billions of dollars are at stake - nor does it matter how many academic titles you hold. 
However, past actions do not last forever: you must continue to talk sense to have the support of the community. </li><li>Testing is crucial, but not just in the conventional sense. The key is to release things into a wider population ("Release early, release often"). The more exposure code has, the more likely it is that bugs will be found and fixed. As ESR put it, "Given enough eyeballs, all bugs are shallow" (dubbed Linus' law). Conventional testing is also welcome (the more the merrier), but it's no substitute for releasing into the wild.</li><li>Read the source, Luke. The latest code is the only authoritative and unambiguous source of understanding. This attitude does not in any way devalue additional documentation; it just means that the kernel's source code overrides any such document. Thus there is a great impetus towards making code readable, easy to understand and conformant to standards. It is also very much in line with Jack Reeves' view that source code is the only real specification a software system has.</li><li>Make it work first, then make it better. When taking on existing code, one should always first make it work as intended by the original coders; then a set of cleanup patches can be written to make it better. Never start by rewriting existing code.<br /></li></ul><div style="text-align: justify;"><span style="font-style: italic;">No sacred cows </span><br /></div><ul style="text-align: justify;"><li> _Anything_ related to the kernel can change, including processes, code, tools, fundamental algorithms, interfaces, people. Nothing is done "just because". Everything can be improved, and no change is deemed too risky. It may have to be scheduled, and it may take a long time to be merged in; but if a change is of "good taste" and, when required, provided the originator displays the traits of a good maintainer, it will eventually be accepted. 
Nothing can stand in the way of progress.</li><li>As a kernel hacker, you have no doubts that you are right - but you actively encourage others to prove you wrong and accept their findings once they have been a) implemented (a prototype would do, as long as it is complete enough for the purpose) b) peer reviewed and validated. In the majority of cases you gracefully accept defeat. This may imply a turn-around of 180 degrees; Linus has done this on many occasions.</li><li>Processes are made to serve development. When a process is found wanting - regardless of how ingrained it is or how useful it has been in the past - it can and will be changed. This is often done very aggressively. Processes only exist while they provide visible benefits to developers or, in very few cases, due to external requirements (ownership attribution comes to mind). Processes are continuously fine-tuned so that they add the smallest possible amount of overhead to real work. A process that improves things dramatically but adds a large overhead is not accepted until the overhead is shaved off to the bare bones.<br /></li></ul><div style="text-align: justify;"><span style="font-style: italic;">Tools </span><br /></div><ul style="text-align: justify;"><li>Must fit the development model - the development model should not have to change to fit tools;</li><li>Must not dumb down developers (e.g. debuggers); a tool must be an aid and never a replacement for hard thinking;</li><li>Must be incredibly flexible; ease of use can never come at the expense of raw, unadulterated power;<br /></li><li>Must not force everyone else to use that tool; some exceptions can be made, but on the whole a tool should not add dependencies. 
Developers should be free to develop with whatever tools they know best.<br /></li></ul><div style="text-align: justify;"><span style="font-style: italic;">The Lieutenants: </span><br /><br />One may come up with clever ways of doing things, and even provide conclusive experimental evidence on how a change would improve matters; however, if one's change will disrupt existing code and requires specialised knowledge, then it is important to display the characteristics of a good maintainer in order to get the changes merged in. Some of these traits are:</div><ul style="text-align: justify;"><li>Good understanding of the kernel's processes;</li><li>Good social interaction: an ability to listen to other kernel hackers, and be ready to change your code;</li><li>An ability to do boring tasks well, such as patch reviews and integration work;</li><li>An understanding of how to implement disruptive changes, striving to contain disruption to the absolute minimum, and a deep understanding of fault isolation.<br /></li></ul><div style="text-align: justify;"><span style="font-style: italic;">Patches<br /><br /></span></div>Patches have been used for eons. However, the kernel fine-tuned the notion to the extreme, putting it at the very core of software development. Thus all changes to be merged in are split into patches and each patch has a fairly concise objective, against which a review can be performed. This has forced all kernel hackers to _think_ in terms of patches, making changes smaller and more concise, splitting out scaffolding and clean up work, and decoupling features from each other. The end result is a ridiculously large amount of positive externalities - unanticipated side-effects - such as technologies that get developed for one purpose but find uses that were never dreamt of by their creators. The benefits of this approach are far too great to discuss here but hopefully we'll have a dedicated article on the subject. 
<div style="text-align: justify;"><br /><span style="font-style: italic;">Other </span><br /></div><ul style="text-align: justify;"><li>Keep politics out. The vast majority of decisions are taken on technical merits alone, and very rarely for political reasons. Sometimes the two coincide (such as the dislike for binary modules in the kernel), but one must not forget that the key driver is always the technical reasoning. For instance, the kernel uses the GNU GPL v2 purely because it's the best way to ensure its openness, a key building block of the development process.</li><li>Experience trumps fashion. Whenever choosing an approach or a technology, kernel hackers tend to go for the beaten track rather than new and exciting ones. This is not to say there is no innovation in the kernel; but innovators have the onus of proving that their approach is better. After all, there is a solid body of over 30 years of experience in developing UNIX kernels; it's best to stand on the shoulders of giants whenever possible.</li><li>An aggressive attitude towards bad code, or code that does not follow the standards. People attempting to add bad code are told so in no uncertain terms, in full public view. This discourages many a developer, but also ensures that the entry bar is raised to avoid lowering the signal-to-noise (S/N) ratio.<br /></li></ul><div style="text-align: justify;"><br />If there ever was a single word that could describe a kernel hacker, that word would have to be "pragmatic". A kernel hacker sees development as a hard activity that should remain hard. Any other view of the world would result in lower quality code.<br /><br /><span style="font-weight: bold;">Navigating Complexity</span><br /><br />Linus has stated on many occasions that he is a big believer in development by evolution rather than the more traditional methodologies. In a way, he is the father of the evolutionary approach when applied to software design and maintenance. 
I'll just call this the evolutionary methodology (EM) for want of a better name. EM's properties make it strikingly different from everything that has preceded it. In particular, it appears to remove most forms of centralised control. For instance:<br /><br /></div><ul style="text-align: justify;"><li>It does not allow you to know where you're heading in the long run; all it can tell you is that if you're currently in a favourable state, a small, gradual increment is _likely_ to take you to another, slightly more favourable state. When measured over a large timescale it will appear as if you have designed the system as a whole with a clear direction; in reality, this "clearness" is an emergent property (a side-effect) of thousands of small decisions.</li><li>It exploits parallelism by trying lots of different gradual increments in lots of members of its population and selecting the ones which appear to be the most promising.</li><li>It favours promiscuity (or diversity): code coming from anywhere can intermix with any other code.<br /></li></ul><div style="text-align: justify;"><br />But how exactly does EM work? And why does it seem to be better than the traditional approaches? The search for these answers takes us right back to the fundamentals. And by "fundamentals", I really mean the absolute fundamentals - you'll have to grin and bear it, I'm afraid. I'll attempt to borrow some ideas from Popper, Taleb, and Dawkins to make the argument less nonsensical.<br /><br />That which we call reality can be imagined as a space with a really, really large number of variables. Just how large one cannot know, as the number of variables is unknowable - it could even be infinite - and it is subject to change (new variables can be created; existing ones can be destroyed, and so on). 
With regard to the variables themselves, they change value every so often but this frequency varies; some change so slowly they could be better described as constants, others so rapidly they cannot be measured. And the frequency itself can be subject to change.<br /><br />When seen over time, these variables are curves, and reality is the space where all these curves live. To make matters more interesting, changes to one variable can cause changes to other variables, which in turn can also change other variables and so on. The changes can take many forms and display subtle correlations.<br /><br />As you can see, reality is the stuff of pure, unadulterated complexity and thus, by definition, any attempt to describe it in its entirety cannot be accurate. However, this simple view suffices for the purposes of our exercise. <br /><br />Now imagine, if you will, a model. A model is effectively a) the grabbing of a small subset of variables detected in reality; b) the analysis of the behaviour of these variables over time; c) the issuing of statements regarding their behaviour - statements which have not been proven to be false during the analysis period; d) the validation of the model's predictions against past events (calibration). Where the model is found wanting, it needs to be changed to accommodate the new data. This may mean adding new variables, removing existing ones that were not found useful, tweaking variables, and so on. Rinse, repeat. These are very much the basics of the scientific method.<br /><br />Models are rather fragile things, and it's easy to demonstrate empirically why. First and foremost, they will always be incomplete; exactly how incomplete one cannot know. You never know when you are going to end up outside the model until you are there, so it must be treated with distrust. Second, the longer it takes you to create a model - a period during which validation is severely impaired - the higher the likelihood of it being wrong when it's "finished". 
For very much the same reasons, the larger the changes you make in one go, the higher the likelihood of breaking the model. Thirdly, the longer a model has been producing correct results, the higher the probability that the next result will be correct. But the exact probability cannot be known. Finally, a model must endure constant change to remain useful - it may have to change as frequently as the behaviour of the variables it models.<br /><br />In such an environment, one has no option but to leave certainty and absolutes behind. It is just not possible to "prove" anything, because there is a large component of randomness and unknown-ability that cannot be removed. Reality is a messy affair. The only certainty one can hold on to is that of fallibility: a statement is held to be possibly true until proven false. Nothing else can be said. In addition, empiricism is highly favoured here; that is, the ability to look at the data, formulate a hypothesis without too much theoretical background and put it to the test in the wild.<br /><br />So how does this relate to code? Well, every software system ever designed is a model. Source code is nothing but a set of statements regarding variables and the rules and relationships that bind them. It may model conceptual things or physical things - but they all inhabit a reality similar to the one described above. Software systems have become increasingly complex over time - in other words, taking on more and more variables. An operating system such as Multics, deemed phenomenally complex for its time, would be considered normal by today's standards - even taking into account the difficult environment at the time with non-standard hardware, lack of experience in that problem domain, and so on.<br /><br />In effect, it is this increase in complexity that breaks down older software development methodologies. 
For example, the waterfall method is not "wrong" per se; it can work extremely well in a problem domain that covers a small number of variables which are not expected to change very often. You can still use it today to create perfectly valid systems, just as long as these caveats apply. The same can be said for the iterative model, with its focus on rapid cycles of design, implementation and testing. It certainly copes with much larger (and faster moving) problem domains than the waterfall model, but it too breaks down as we start cranking up the complexity dial. There is a point where your development cycles cannot be made any smaller, testers cannot augment their coverage, etc. EM, however, is at its best in absurdly complex problem domains - places where no other methodology could aim to go. <br /><br />In short, EM's greatest advantages in taming complexity are as follows:<br /></div><ul style="text-align: justify;"><li><span style="font-style: italic;">Move from one known good point to another known good point.</span> Patches are the key here, since they provide us with small units of reviewable code that can be checked by any experienced developer with a bit of time. By forcing all changes to be split into manageable patches, developers are forced to think in terms of small, incremental changes. This is precisely the sort of behaviour one would want in a complex environment.</li><li><span style="font-style: italic;">Validate, validate and then validate some more.</span> In other words, Release Early, Release Often. Whilst Linus has allowed testing and QA infrastructure to be put in place by interested parties, the main emphasis has always been placed on putting code out there in the wild as quickly as possible. 
The incredibly diverse environments on which the kernel runs provide a very harsh and unforgiving validation that brings out a great number of bugs that could not possibly have been found otherwise.</li><li><span style="font-style: italic;">No one knows what the right thing is, so try as many avenues as possible simultaneously. </span>Diversity is the key, not only in terms of hardware (number of architectures, endless permutations within the same architecture, etc.), but also in terms of agendas. Everyone involved in Linux development has their own agenda and is working towards their own goal. These individual requirements, often conflicting, go through the kernel development process and end up being converted into a number of fundamental architectural changes (in the design sense, not the hardware sense) that effectively are the superset of all requirements, and provide the building blocks needed to implement them. The process of integrating a large change to the kernel can take a very long time, and be broken into a sequence of never ending patches; but many a time it has been found that one patch that adds infrastructure for a given feature also provides a much better way of doing things in parts of the kernel that are entirely unrelated.<br /></li></ul><div style="text-align: justify;"><br />Not only does EM manage complexity really well but it actually thrives on it. The pulling of the code base in multiple directions makes it stronger because it forces it to be really plastic and maintainable. It should also be quite clear by now that EM can only be deployed successfully under somewhat limited (but well defined) circumstances, and it requires a very strong commitment to openness. It is important to build a community to generate the diversity that propels development, otherwise it's nothing but the iterative method in disguise done out in the open. 
And building a community entails relinquishing the traditional notions of ownership; people have to feel empowered if one is to maximise their contributions. Furthermore, it is almost impossible to direct this engine to attain specific goals - conventional software companies would struggle to understand this way of thinking.<br /><br />Just to be clear, I would like to stress the point: it is not right to say that the methodologies that put emphasis on design and centralised control are wrong, just like a hammer is not a bad tool. Moreover, it's futile to promote one programming paradigm over another, such as Object-Orientation over Procedural programming; one may be superior to the other on the small, but on the large - the real world - they cannot <span style="font-style: italic;">by themselves</span> make any significant difference (class libraries, however, are an entirely different beast).<br /><br />I'm not sure if there was ever any doubt; but to me, the kernel proves conclusively that the human factor dwarfs any other in the production of large scale software.<br /><br /></div>Marco Craveirohttp://www.blogger.com/profile/01039195055988254979noreply@blogger.comtag:blogger.com,1999:blog-2672427473119923109.post-23001907110086737932008-01-28T22:43:00.000-08:002008-01-28T22:51:04.111-08:00Super Angola!!!<div style="text-align: justify;">Incredible. Amazing. We actually did it. We managed to beat Senegal. Our stars <span class="blsp-spelling-error" id="SPELLING_ERROR_0">Flavio</span> and especially the new Manchester United player <span class="blsp-spelling-error" id="SPELLING_ERROR_1">Manucho</span> did the job and the end result was an amazing 3-1. Now we're only one draw away from going past the group stages for the first time ever. 
So all fingers crossed for Thursday 17:00 UK time, when we face the very difficult obstacle of Tunisia.<br /></div><br /><br /><div style="text-align: center;"><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp0.blogger.com/_nEck2BGjqOg/R57LlQVDzuI/AAAAAAAAAE8/idXIRCngF3Y/s1600-h/flavio_manucho.jpg"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://bp0.blogger.com/_nEck2BGjqOg/R57LlQVDzuI/AAAAAAAAAE8/idXIRCngF3Y/s320/flavio_manucho.jpg" alt="" id="BLOGGER_PHOTO_ID_5160786063912062690" border="0" /></a><span style="font-style: italic;">(C) 2008 Associated Press</span><br /></div>Marco Craveirohttp://www.blogger.com/profile/01039195055988254979noreply@blogger.comtag:blogger.com,1999:blog-2672427473119923109.post-45886301220520326162008-01-27T04:54:00.000-08:002008-01-27T05:08:46.180-08:00Ghana 2008 - Forca Palancas!!<div style="text-align: justify;">Emotions are running high at the African Cup! Angola started well against our regional rivals South Africa, but yielded at the end. To be fair, South Africa was dominant for periods of the game, and did deserve the draw. Today we have a rather difficult game against Senegal (UK 17:00). The coverage in the UK has been superb, with all the games available on BBC interactive (on BBC1 just press the Red Button).<br /><br />A positive note for Ghana too: the stadiums are superb, and things have been rather well organised, if we ignore minor glitches (like the electricity going, or playing two games on the same stadium without allowing the grass to recover, or the disorganisation with regards to granting press credentials). The camera work has been top notch, at European level. The sound could perhaps be a bit better. All in all, the best CAN ever, methinks. One lesson Angola should learn for 2010 is to ensure all tickets get sold. 
It's much more important to have all stadiums full than to profit from the event.<br /><br />The main site for the event is <a href="http://www.ghanacan2008.com/">http://www.ghanacan2008.com/</a>. Not the best (can't find any pictures or live results, and the content is rather limited), but not the worst either, showing how far things have come and how much the quality bar has been raised.<br /></div><br /><div style="text-align: center;"><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp2.blogger.com/_nEck2BGjqOg/R5x_XQVDztI/AAAAAAAAAE0/GKRqHt346Rs/s1600-h/palancas.jpg"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://bp2.blogger.com/_nEck2BGjqOg/R5x_XQVDztI/AAAAAAAAAE0/GKRqHt346Rs/s320/palancas.jpg" alt="" id="BLOGGER_PHOTO_ID_5160139310556761810" border="0" /></a><span style="font-style: italic;">(C) MTNFootball.com</span><br /></div>Marco Craveirohttp://www.blogger.com/profile/01039195055988254979noreply@blogger.comtag:blogger.com,1999:blog-2672427473119923109.post-72749906023899303302007-10-20T05:24:00.000-07:002007-10-20T14:22:24.061-07:00.signature<div style="text-align: right;"><span style="font-style: italic;">One man's constant is another man's variable. -- Alan Perlis</span><br /></div><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp3.blogger.com/_nEck2BGjqOg/Rxnz1IogxVI/AAAAAAAAAEA/2wzuQkMUe28/s1600-h/alan_perlis.jpg"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://bp3.blogger.com/_nEck2BGjqOg/Rxnz1IogxVI/AAAAAAAAAEA/2wzuQkMUe28/s320/alan_perlis.jpg" alt="" id="BLOGGER_PHOTO_ID_5123394145286669650" border="0" /></a><br /><a href="http://en.wikipedia.org/wiki/Alan_Perlis">Alan Perlis</a> was one of the finest specimens of the Real Programmer breed. 
Back in the days when Computer Scientists didn't exist, he and his kind were responsible for making many of the decisions that shape our view of computers today. I'm particularly fond of Perlis because of his views on Computer Science:<br /><br /><blockquote>I think that it's extraordinarily important that we in computer science keep fun in computing. When it started out, it was an awful lot of fun. Of course, the paying customers got shafted every now and then, and after a while we began to take their complaints seriously. We began to feel as if we really were responsible for the successful, error-free perfect use of these machines. I don't think we are. I think we're responsible for stretching them, setting them off in new directions, and keeping fun in the house. I hope the field of computer science never loses its sense of fun. Above all, I hope we don't become missionaries. Don't feel as if you're Bible salesmen. The world has too many of those already. What you know about computing other people will learn. Don't feel as if the key to successful computing is only in your hands. What's in your hands, I think and hope, is intelligence: the ability to see the machine as more than when you were first led up to it, that you can make it more.</blockquote><div style="text-align: right;"><span style="font-style: italic;">The Structure and Interpretation of Computer Programs</span> by Abelson, Sussman, and Sussman<br /></div><br />Unfortunately, things haven't quite turned out like Perlis would have wanted.<br /><br />Besides his many contributions to Computer Science, such as his work on ALGOL, Perlis is very well known for his <a href="http://www.cs.yale.edu/quotes.html">Epigrams on Programming</a>, of which our quote is the first one. 
I like this quote because it reminds me that there can never be an ultimate truth in programming due to our human condition.Marco Craveirohttp://www.blogger.com/profile/01039195055988254979noreply@blogger.comtag:blogger.com,1999:blog-2672427473119923109.post-72657968651775732932007-10-03T15:16:00.000-07:002007-10-04T12:56:38.640-07:00Nerd Food: Interview with Federico Mena-Quintero<span style="FONT-STYLE: italic">Pretty much anyone who is involved with Free Software - even just as a lowly user like myself - has heard of Federico. His </span><a style="FONT-STYLE: italic" href="http://www.gnome.org/~federico/news.html">blog </a><span style="FONT-STYLE: italic">is a source of insightful ideas on Gnome, and lately, on performance - combined with a healthy dose of interest in politics and, more importantly,<a href="http://www.gnome.org/~federico/news-2007-08.html#cilantro-chutney"> good food</a>. I decided to send a few questions to Federico, mainly on the topics I was most curious about, and he kindly replied to my questions - and did so in record time! Many thanks to Federico for taking time off his busy hacking schedule for this interview. </span><br /><div style="TEXT-ALIGN: center"><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.gnome.org/"><img id="BLOGGER_PHOTO_ID_5117239882353263938" style="DISPLAY: block; MARGIN: 0px auto 10px; CURSOR: pointer; TEXT-ALIGN: center" alt="" src="http://bp1.blogger.com/_nEck2BGjqOg/RwQWj4ogxUI/AAAAAAAAAD4/9n675li0amY/s320/Gnome-logo.jpg" border="0" /></a><span style="font-size:+0;">(C) Gnome Foundation</span><br /></div><span style="FONT-STYLE: italic"><br />1. You are one of the founders of the Gnome project, which is currently celebrating ten years of existence. In a recent interview you gave to Fosdem, you considered the platform to be maturing. However, as we all know, the last 10% normally takes 90% of the time, and it's considered to be boring work. 
What do you think the Gnome project needs to do to get people to focus on those remaining 10%?</span><span style="FONT-STYLE: italic"> </span><br /><br />Basically, to provide an incentive to get that last 10% of the work done :) Instead of smacking people with a stick for not writing documentation, you could have a web page with a bar chart of "percentage of documentation coverage". Then it becomes a competition: use a carrot instead of a stick.<br /><br />I'd also like companies to get more involved in this. If they want to ship GNOME as a development platform they support, then they could very well employ people to do those missing bits.<br /><br /><span style="FONT-STYLE: italic">2. You have been one of the champions of performance in Gnome for a while now. As functionality increased, Gnome started suffering more and more from performance problems, particularly when looked at from a low end perspective. You have been trying to explain to the masses that performance work is interesting. What do you think can be done to increase developer focus on this neglected area?<br /></span><br />The thing about fixing performance problems is that nobody teaches you how to do it. There is very little documentation out there on how to generically approach an optimization problem (I intend to do something about this, but oh, time, time, time!) :)<br /><br />Also, sometimes you fix a performance problem, but it reappears in the future. This happens when you don't leave an infrastructure in place to let you run a benchmark periodically. You need to be able to see if there are performance regressions.<br /><br />Our tools are slowly getting better, but there are really very few people working on optimization and profiling tools. 
It takes a *ton* of time and skill to write a good tool; maybe that's why there are so few of them.<br /><br />Finally, profiling and optimizing is really about following the scientific method ("make a hypothesis, change one thing at a time, measure, confirm your hypothesis, etc."). This requires discipline and a lot of patience.<br /><br />Basically, it's a problem of education :)<br /><br /><span style="FONT-STYLE: italic">3. Earlier on this year, Gnome users and developers met for GUADEC. Did you find the conference as productive as in previous years? How important is GUADEC for the Gnome user and developer community?<br /><br /></span>Yes, this GUADEC was tremendously productive! I think the venue helped a lot; the Birmingham Conservatoire is rather compact and has nice practice rooms that anyone can use. So, you could grab a couple of hackers and go to a room to hack peacefully.<br /><br />GUADEC has always been important, even more so now that our community is large and widespread. It is about the only time in the year when most of the GNOME contributors get together in a single place and are able to talk in person. Do not underestimate the productivity of talking over a beer :)<br /><br /><span style="FONT-STYLE: italic">4. From the outside world, it appears Novell is a company that has regained its soul and direction with Linux. How was the transition from Ximian into Novell?<br /></span><br />Like all acquisitions, it was a bit rough at first. It's what you get when you switch from being in a small company where you know all of the employees, to one with several thousands of people. You have to adjust to bigger processes, more layers of management, new locations, new paperwork...<br /><br />It has been very interesting to see the mindset of the old-time Novell people change over time. At first they seemed reluctant to touch Linux and free software, since they were of course Windows users. 
Then we had a period with lots of questions, lots of bugs that needed to be fixed, lots of re-training... and now we are in a very nice period, when people have accepted that we must all use our own free software. People seem to be productive with it and happy.<br /><br />I miss the monkeys, though.<br /><br /><span style="FONT-STYLE: italic">5. You are currently telecommuting from Mexico, a position envied by most developers out there. Do you find that telecommuting helps improve your productivity? Are there any downsides to it?<br /><br /></span>It has good things and bad things. Good things: working in your pajamas if you feel like it, not having to commute, taking a pause when you are stuck on a hard problem to do a bit of gardening. Bad things: you can't talk to people in person. You must fix all your networking problems yourself. Sometimes, when you are uninspired, it's nice to be able to look over someone else's shoulder or talk to them.<br /><br /><span style="FONT-STYLE: italic">6. Can you describe your typical day at work?</span><br /><br />Well, since I work from home... :)<br /><br />I wake up. If my wife and I are hungry, we make breakfast while my email gets downloaded. If we are not hungry, I'll just check for super-urgent email and then start programming (fixing bugs, doing new development, reviewing patches, etc.).<br /><br />I usually try to get some programming done in the morning, while my brain is fresh. Processing your email in the morning is a really bad idea; it will take you up to the afternoon and by then you'll be too tired to really write code.<br /><br />We have lunch at really irregular hours. Sometimes it's more like an early dinner. I have the bad habit of not stopping working until I'm exhausted or my wife is angry that we haven't gone out to the supermarket yet, but I'm trying to fix that :)<br /><br />In the afternoon I tend to do "light" work... maintaining wikis, answering email, coordinating people. 
I don't really have a fixed work schedule.<br /><br /><span style="FONT-STYLE: italic">7. Many developing countries are increasingly looking at Free Software as a way to bring down the digital divide. Do you find that Mexico is taking advantage of Free Software - particularly since it has two lead Free Software developers? Are there any lessons to be learned from Mexico's experience?</span><br /><br />Mexico is blessed and cursed to be so close to the USA. There is plenty of basic usage of free software by individuals (often enthusiastic students), but relatively little usage in the public and private sectors.<br /><br />People in Mexico get very impressed by rich people; most Mexicans want to be like the rich people from the USA they see on TV. It's very easy to woo us into accepting their ways.<br /><br />So, every time there has been some noise about using free software in the public sector, Bill Gates has flown down, organized a big business lunch with government officials, and made sure that they keep using Microsoft products. If you are an ignorant politician, you will love to gloat that you had lunch (imagine, lunch!) with Bill Gates, the richest man in the world --- and whatever he says must be correct, of course. The problem we have is that most of our politicians don't have the faintest idea of the economic and cultural implications of free software, unlike those in the European Union (see the recent report on the economic impact of free software there!).<br /><br />Thanks for the interview!<br /><br /><span style="font-weight: bold;">.signature</span> (posted 2007-09-29, updated 2007-10-19)<br /><br /><div style="text-align: right;"><span style="font-style: italic;">"We must know, we shall know." 
-- David Hilbert</span><br /></div><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp3.blogger.com/_nEck2BGjqOg/Rv6xkIlDizI/AAAAAAAAADw/vOROUFQaGUw/s1600-h/hilbert.jpg"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://bp3.blogger.com/_nEck2BGjqOg/Rv6xkIlDizI/AAAAAAAAADw/vOROUFQaGUw/s320/hilbert.jpg" alt="" id="BLOGGER_PHOTO_ID_5115721461075774258" border="0" /></a><br /><div style="text-align: justify;"><a href="http://en.wikipedia.org/wiki/David_Hilbert">David Hilbert</a> was a great German mathematician. What I appreciate the most about him is his quixotic personality and single-mindedness, going along with <a href="http://en.wikipedia.org/wiki/Bertrand_Russell">Bertrand Russell</a> on their impossible quest to clean mathematics of all doubt and uncertainty, always searching for strict solutions through pure thought. In 1900, Hilbert came up with a list of <a href="http://en.wikipedia.org/wiki/Hilbert%27s_problems">23 fundamental problems</a>, many of which are still being investigated to this day. In 1930, Hilbert finished a famous speech in Königsberg with the words "We must know, we shall know", a phrase that fits perfectly the life-long devotion he had for mathematics.<br /></div><br /><br /><span style="font-weight: bold;">Mighty Monty is Down</span> (posted 2007-09-28, updated 2007-10-20)<br /><br /><div align="justify">We knew it had to happen one day, but never this soon. The day had started badly, a drizzly sort of day, greyness and cold everywhere. To make matters worse, London transport was yet again against me: trains were cancelled, trains were overflowing with people, the human drones bent on one thing only: to get to their destination at any cost. I was one of them. 
In the madness of rush hour, a distress call reached me: Shahin and Monty were in big trouble.<br /><br /></div><div align="justify"></div><div align="justify"></div><div align="justify">Monty, our faithful Rover Metro, has been with us for just under six months, and in this period, it has been the definition of reliability itself. Its name comes from the licence plate - who needs vanity plates when sheer randomness is trying to tell you something? - and its character is as English as the brand: not particularly pretty but very functional and reliable. Never once did it break down, never once did it chug - a real trooper, always ready for the next long haul trip. When we came back from Africa, Monty took us from London to Southampton and back several times a week. It took us from Hertfordshire to London almost weekly. And he took Shahin to work and back every day. Ah, but not Friday.<br /><br /></div><div align="justify"></div><div align="justify">Shahin was driving Monty along on the motorway as usual, seventy miles per hour or more, when Monty started to lose speed and make noises of all sorts; suddenly from the fast lane she had to move to the middle lane; soon after, from the middle lane down to the slow lane; and from the slow lane, having nowhere else to go, she had to get off the motorway. She remembered the wise words of Jay to our friend Stacey, also involved in an unfortunate breakdown: "Whatever you do, get the hell off the motorway!!!". The lights were flashing, smoke was coming out of the engine, Stacey was scared, but she managed to impose her will on the unruly metal. And so did Shahin, inspired by Stacey's brave behaviour in combat, and by the heavy cost of towing cars off the motorway.<br /><br /></div><div align="justify"></div><div align="justify">Since, unwisely, we didn't have any coverage of any kind - we were going to do it, I swear! just never had the time! - we had no option but to tow the car ourselves. 
Shahin first tried it with her sister and the brother-in-law, but their car didn't have the required apparatus. Then she rang Stacey for help, and her boyfriend Jay agreed to come to the rescue later on at night.<br /><br /></div><div align="justify"></div><div align="justify">Night came and we all met down at Stacey's house for the operation. In our innocence, we were entirely unconcerned - how difficult can it be, right? Then Shahin had a warning call from her brother, telling her how hard towing would be, asking whether we had done it before and so on. Even then I still remained unconcerned. That changed only when we got to Monty and Jay started giving us instructions, in that mellow but grave voice of his: "whatever you do, make sure you keep the rope taut or you'll end up running into the back of the van. And remember, I won't brake so you have to brake for me. If I brake you won't have enough time to react and you'll crash into me.". OK then, I thought, other than the fact that we were going to die, it's a dead easy job.<br /><br /></div><div align="justify"></div><div align="justify">Taut was a word I learned then, but which will undoubtedly stay with me forever. The cars got hooked up just outside of Welwyn, our target being Arlesey - twenty minutes of straight driving at a good speed. Miles away. And that's when it dawned on me how hard this was going to be. Shahin was driving - I was nowhere near brave enough.<br /><br /></div><div align="justify"></div><div align="justify">We drove in the dark, cold English countryside lanes, barely able to see anything but the white van one metre in front of us and its flashing lights. I thought ten miles per hour or so was going to be our top speed, but the speedometer just wouldn't obey and kept on going higher and higher until it settled at thirty or so. It felt like the fastest ride we'd ever had. Trees were rushing by us, darkness was rushing fast. 
Like good soldiers, we focused on the rope and kept it as tight as possible, as tight as it had ever been before. But to keep it tight, we had to brake often; and knowing the precise amount of braking required is nigh impossible. Every time Shahin pressed the brakes, time froze for a split second; then the van would yank us, making us bounce like a ball. We would then do the same to the van, pulling it backwards, until the whole process would settle and we'd be on a straight line again. Perfectly within the laws of physics, but extremely scary nonetheless.<br /><br /></div><div align="justify"></div><div align="justify">We stared intensely at the rope, to the exclusion of everything else. Not much we could see anyway. But then, braking took its toll and a brake pad died with an awful grinding noise - hell itself and its horsemen coming after us. We panicked at the noise, but kept on going straight on. The worst was still to come. As we passed one strangely named locality after another, we suddenly noticed we weren't going the right way. It could be that Jay knew a shortcut, or even a long cut, anything that would just get us there. But no, we were really, truly lost. All cars stopped, maps were taken out. We had crossed the county border, and were now in the strange land of Bedfordshire - effectively, off the map. On the good side, it appeared we were not that far away.<br /><br /></div><div align="justify"></div><div align="justify">Eventually we settled on a plan of attack; but then, as we started the cars and went past a hump, the rope snapped. Jay kept on going, but we got left behind. I thought it was the end of our adventure, somewhere in the barren lands of Bedfordshire, all was lost and we'd have to call some towing company. But resourceful Jay got rid of the metal bits, tied a simple knot and we were on our way again. All the excitement was a bit too much for Shahin, she was getting really scared by this point, but kept on going. 
There was nothing we could do but keep on going till the end.<br /><br /></div><div align="justify"></div><div align="justify">It's a strange feeling, being behind a car, two metres or less, at thirty miles per hour; your brain is fully aware that any braking, any braking at all and you will crash. It's a simple equation really.<br /><br /></div><div align="justify"></div><div align="justify">Some time later we found ourselves driving through Arlesey town centre, past all the pubs, past all the shops, excitedly looking for the garage. Shahin spotted it, screaming. We had made it alive. But we learned our lesson. Next time, we'll pay the hundred pounds for towing gladly - and probably even add a tenner for the chap.<br /><br /></div><div align="justify"></div><br /><br /><span style="font-weight: bold;">Blogosphere</span> (posted 2007-09-22, updated 2007-09-22)<br /><br />OK, it appears one of my favourite blogs has really ended: <a href="http://sdblog.wordpress.com/">Sem Destino</a>. This blog had a great atmosphere and was always the place to go to when one needed to get closer to Angola. All the best to Miguel, and a speedy return to activity! In particular, we all want his photobook, as he has some incredible pictures on that blog.<br /><br />The good news is the crowd around the blog decided to create another blog, with the creative title of <a href="http://lifegoesonaguardandooregressodochefe.blogspot.com/">Life Goes On - Aguardando o Regresso do Chefe</a> (Waiting for the Boss's Return) :-) it's a great read too. 
In particular, the posts about <a href="http://lifegoesonaguardandooregressodochefe.blogspot.com/2007/09/para-pp-uma-blue.html">Blue</a> and <a href="http://lifegoesonaguardandooregressodochefe.blogspot.com/2007/09/havemos-de-voltar-agostinho-neto.html">Agostinho Neto</a> made me homesick :-)<br /><br />Another blog that is always interesting to read is <a href="http://davelargo.blogspot.com/">Dave Richards</a>'s. Totally techie. It's great to see what a large-scale Linux desktop deployment looks like, the problems it faces, the solutions they come up with.<br /><br />I haven't had much time to read other people's blogs of late - other than the usual nerdy ones - but I will make it up this weekend...<br /><br /><span style="font-weight: bold;">Nerd Food: Take a Walk on the Server Side</span> (posted 2007-09-22, updated 2007-10-11)<br /><br /><div style="text-align: justify;">When it comes to programming, for me there isn't much of a choice: the place to be is the server side. I may work a lot on the client side these days, but GUIs and chrome never had much of an attraction for me. I do have a healthy dose of respect for those who love it: client side work is a mixture of coding mastery, design skills and a big dollop of human psychology. For some reason when I visualise the client side I always imagine nice, pristine offices with lots of light and huge amounts of human interaction between programmers as well as between programmers and users.<br /><br />The server side is a totally different beast. I always visualise it as the dark place of green terminals and server rooms, of never ending performance charts and monitor applications, the land of <span style="font-style: italic;">blinken</span> lights. 
Of course, these days we all have to share the same desks and deal with the same layers of managerial PHBs - and with HR and their latest social experiments - but the fundamental point is that these are two very different crafts.<br /><br />Thing is, I find that the server side is extremely misunderstood because the vast majority of developers out there come from a client background. When developers cross over, their bias introduces many, many problems on server side applications, simply because they are not used to the server way of thinking.<br /><br />This article covers many mistakes I've seen over the years and offers some tentative solutions, in the hope you may avoid them.<br /><br /><span style="font-weight: bold;">The Languages</span><br /><br />There really is only one language to do server side work: C++. Yes, I'm a zealot. Yes, I know that both .Net and Java are much easier to get along with, and have none of the tricky memory allocation problems that riddle C++ applications (those that haven't discovered shared pointers, at any rate). I agree that, in theory, both Java and C# are better options. In practice, however, they become problematic.<br /><br /><span style="font-weight: bold;">The right staff.</span> It's difficult to find a good Java/C# programmer, just like it was difficult to find a good VB programmer. The client side is a very forgiving land, and not only can bad programmers get away with it for years but you also have to remember that great client side programmers don't need to know their tools to the level of detail that server side programmers do. How many times do you need to read up on scheduling to do a GUI? Or on TCP flags? Not often, I'd wager. So the reality is, if you have been doing any of these languages for a while, you can talk all the right TLAs and describe all the right concepts with ease and fly through most interviews. 
But when it comes to doing the job, you will probably be reading manuals for days trying to figure out which subset of technologies on your stack are good for server side and which ones are just plain evil performance killers. A good server side Java/C# programmer will use only the smallest set of features of the language when programming, knowing exactly the cost of those features.<br /><br />It is, of course, really hard to find a good C++ programmer too. But here, there are two things that help us. There are not that many left doing C++ work - most of them have migrated to greener pastures by now, in particular those that always felt uncomfortable with the language. The few that are left are doing server side work. The second thing is, due to C++'s lower level of abstraction, even a bad C++ programmer is well aware of the bare metal. It basically forces you to think harder, rather than just pick up a manual and copy an example.<br /><br /><span style="font-weight: bold;">Minimise layers of indirection.</span> Another problem I have with Java/C# is indirection, which is another way of saying performance. Now, I know all you Java and .Net heads have many benchmarks proving how your JIT compilers optimise on the fly and make your code even faster than native code, or how your VM is much better at understanding an application's run time behaviour and optimising itself for it. And the fact that you never worry about memory leaks goes without saying. Well, that's all fine and dandy as far as the lab is concerned.<br /><br />What I found out in the field is different. Resource allocation is still a massive problem, either due to complex cyclical referencing, or just plain programmer incompetence. Memory consumption is massive, because programmers don't really understand the costs involved in using APIs, and thus just use whatever is easier. This, of course, also impacts performance badly. 
And to make things even worse, you then have to deal with the non-deterministic behaviour of the VM. It's bad enough not knowing what the kernel will decide and when, but when you put in a VM - and god forbid, an application server! - then it's nigh impossible. It could be a VM bug. Or it could be that you are not using a certain API properly. Or it's just your complex code. Or it's the OS's fault. Who knows. That's when you have to fork out mega-bucks and pay an expensive Java/.Net consultant to sort it all out. And pray he/she knows what he/she is talking about.<br /><br />The truth is, I've never heard of a Java/.Net application in the field that was found to be more performant than its C++ counterpart. In part, this is because we are comparing apples with oranges - the rewrites seldom cover the same functionality, adding large amounts of new features and making straight comparisons impossible. But there must be more to it too, since, from experience, Java/.Net engineers seem to spend an inordinate amount of time trying to improve performance.<br /><br />Now, before you go and start rewriting your apps in C++, keep this in mind: the biggest factor in deciding on a language is the competence of your staff. If you have a Java/.Net house, and you ain't going to hire, don't use C++. It will only lead to tears and frustration, and in the end you will conclude C++ is crap. If you are really serious about C++, you will need a team of very strong, experienced C++ developers leading the charge. If you haven't got that, best use whatever language you are most competent at.<br /><br />Another very important thing to keep in mind is the greatest C++ shortcoming: its small standard class library. It is perhaps the language's biggest problem (and probably the biggest reason for Java/C#'s success). 
This means you either end up writing things from scratch, buying a third party product (vendor lock-in) or using one or several open source products, each with their own conventions, styles, etc. At present Boost is a must have in any C++ shop, but it does not cover the entire problem domain of server side development. These are the things to look for in any library:<br /></div><ul style="text-align: justify;"><li>Networking</li><li>Database access</li><li>Threading</li><li>Logging</li><li>Configuration</li><li>Serialisation<br /></li></ul><div style="text-align: justify;"><span style="font-weight: bold;"><br />The Hardware Platform </span><br /><br />As far as the client side is concerned, platform is almost a non-issue: you will most likely only support Windows on x86. After all, Linux and Mac are so far behind in terms of market share it's not even funny. The cautious developer will point out that a Web application is a safer bet, although you may lose much richness due to the limitations of the technology. AJAX is nice, but not quite the same as a solid GUI. If kiosks and POS are some or all of your target market, you will be forced to look at cross-platform since Linux is making inroads in this market. And you can always use Java.<br /><br />With regards to the server side, one must look at the world in a totally different light. Because you never know what your scalability requirements are, there is no such thing as an ideal hardware platform. Today, one 32-bit Windows server with 2 processors and 4 gigs of RAM may be more than enough; tomorrow you may need to run apps that require 20 gigs of RAM and 16 processors, and big iron is your only option.<br /><br />So the most important aspect in terms of the hardware platform is this: whatever you do, _never_ commit yourself to one. Write a cross-platform application from the start, and ensure it remains one. 
Even in a Windows-only shop, it's not hard to use a cross-platform toolkit and have a PowerPC Linux box on the side to run tests on. It's actually not much harder to write cross-platform _server side_ code, as long as you have a library you can trust to abstract things properly. And as long as you take cross-platform testing seriously.<br /><br />Think of it as an insurance policy. One day, when your boss asks you for a 10-fold increase in deal volume, you know you can always run to the shop and buy some really, really big boxen to do the job. Tying yourself to a hardware platform is like putting all of your eggs in one basket; better not drop it.<br /><br /><span style="font-weight: bold;">The Architecture</span><br /><br />The single most important lesson to learn on the server side is that architecture is everything. No server side project should start without first having a top notch architect, known to have built at least two large scale systems. You can always do it on the cheap, save the money and get more programmers instead, but remember: you will pay the cost later. Things would be different if <a href="http://mcraveiro.blogspot.com/2007/05/nerd-food-on-maintenance.html">maintenance</a> was taken seriously; but don't kid yourself, it's not.<br /><br />When the business suddenly tells you that you need to double up capacity, or support Asia and America, or add some products that are radically different from the ones your system now processes - that's when you'll feel the pain. And that's when you'll have to start designing v2.0 of your system, starting mainly from scratch.<br /><br />One of the key differences between client side and server side work is this focus on scalability. After all, there is only so much work a single person can do, so many simultaneous instances of a client side application that can be started on any one machine, and so many trades that can be loaded into a single PC. Not so with the server side. 
You may think that processing N trades is more than enough, but that is today; tomorrow, who knows, 10xN could be the average.<br /><br />A good architect will probably look at the problem and find ways to distribute it. That is, to design a very large number of small, well-defined servers, each with a small subset of responsibilities - all talking to each other over a messaging bus of some kind. The system will use a narrow point of access to the database, and huge amounts of caching on each server. This will allow the system to scale as demand grows, just by adding more servers. Hardware is cheap; software engineers are expensive.<br /><br />The ideal architect will also be clever enough to allow client tools to be written in Java or C#, and let someone with more experience on these matters lead their development.<br /><br />In summary, the key components of a system will be along these lines:<br /></div><ul style="text-align: justify;"><li>A solid, cross-platform, scalable relational database. Oracle and Sybase are likely candidates, and PostgreSQL on the free software side of things;</li><li>A solid, cross-platform, scalable messaging bus. Tibco, Talarian, etc. Choose something you have experience with. Never, ever, under any circumstances write your own. (at present, I'm not aware of any free software alternatives for messaging);<br /></li><li>A large number of small servers, communicating over the messaging bus.<br /></li></ul><div style="text-align: justify;">Getting the architecture right is essential; but once you're there, you must work hard to maintain it.<br /><br /><span style="font-weight: bold;">The Database</span><br /><br />Just as you need an architect, you also need a DBA. You may be a hotshot when it comes to databases, you think, but the truth is a good DBA will take your optimal code and optimise it ten times over. Minimum. It's what they do for a living. 
It's important to get the DBA early into the system design process to ensure no crass mistakes are made in the early stages. These are much harder to fix afterwards. And make sure the schema is designed by him/her, with large input from developers - minimising the impedance mismatch between the C++ datamodel and the database schema.<br /><br />If your DBA hasn't got the bandwidth to write all the stored procs directly, at least make sure he/she sets down the guidelines on how to write the stored procs, and if at all possible reviews code before check-ins.<br /><br />You should also create a repeatable testing framework for performance on all procs, to detect quickly when somebody makes a change that impacts performance. But a good DBA will tell you all about it, and many things more.<br /><br /><span style="font-weight: bold;">A Catalogue of Mistakes</span><br /><br />There are many small mistakes to be found on server side apps, some at the architectural level, others at the implementation. This is a summary of a few I've seen over the years.<br /><br /><span style="font-weight: bold;">Overusing XML.</span> Whilst XML is a brilliant technology to enable cross-platform communication, and it has many benefits for client side development, it is of very limited use on the server side. Pretty much the only things it should be considered for are:<br /></div><ul style="text-align: justify;"><li>Allow Java / .Net clients to talk to the server side;</li><li>Allow external parties to send data into our system;</li><li>Save the configuration settings for servers.<br /></li></ul><div style="text-align: justify;">It should not be used for anything else. (And even then, you should still think really hard about each of these cases). It certainly should not be used for communication between servers within the server side, nor should it be used, god forbid, in any kind of way within the database. 
De-serialising XML in a stored proc is an aberration of server side nature.<br /><br />Bear in mind the following XML constraints:<br /></div><ul style="text-align: justify;"><li>The vast majority of the message is redundant information, making messages unnecessarily large. This will clog up your pipes, and have particularly nasty effects in terms of throughput on high-latency links (any large message will).</li><li>XML messages normally have an associated schema or DTD. Servers that you yourself wrote will use the same serialisation code, so there shouldn't be any need to validate these messages against a DTD/schema (you will of course have some sanity checks in C++).</li><li>Serialising and de-serialising from XML is horrendously expensive. In particular, if all your servers are running on the same hardware platform, there are absolutely no benefits - and the costs are massive.</li><li>Compressed XML is a solution in need of a problem. You may save costs on transport, but these have been transferred to an intensive CPU bound process (decompressing and compressing).<br /></li></ul><div style="text-align: justify;">In conclusion, XML is not cheap. As your deal volumes increase, you will find that you're spending more and more of your absolute time transporting, serialising, de-serialising and validating. It's fine for one-offs, for sure, but not for volume.<br /><br />The only type of serialisation permitted in the server room is binary serialisation. 
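<br /><br />To make that concrete, here is a minimal sketch of my own, not taken from any particular product: the Trade record and the helper names (write_u32, read_u32, serialise, deserialise) are hypothetical, made up for illustration. Every field is written at a fixed width and in a fixed byte order, so there are no tags to parse and nothing to validate beyond a couple of length checks:

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record, for illustration only.
struct Trade {
    std::uint32_t id;
    std::uint32_t quantity;
    std::string   symbol;
};

typedef std::vector<std::uint8_t> Buffer;

// Append a 32-bit value, most significant byte first, so the bytes on the
// wire do not depend on the endianness of the machine that wrote them.
void write_u32(Buffer& out, std::uint32_t v) {
    out.push_back(static_cast<std::uint8_t>((v >> 24) & 0xFF));
    out.push_back(static_cast<std::uint8_t>((v >> 16) & 0xFF));
    out.push_back(static_cast<std::uint8_t>((v >> 8) & 0xFF));
    out.push_back(static_cast<std::uint8_t>(v & 0xFF));
}

// Read back a value written by write_u32 and advance the read position.
std::uint32_t read_u32(const Buffer& in, std::size_t& pos) {
    if (pos > in.size() || in.size() - pos < 4)
        throw std::runtime_error("truncated message");
    const std::uint32_t v =
        (static_cast<std::uint32_t>(in[pos]) << 24) |
        (static_cast<std::uint32_t>(in[pos + 1]) << 16) |
        (static_cast<std::uint32_t>(in[pos + 2]) << 8) |
        static_cast<std::uint32_t>(in[pos + 3]);
    pos += 4;
    return v;
}

// Fixed-width fields first, then the string as a length prefix plus raw bytes.
Buffer serialise(const Trade& t) {
    Buffer out;
    write_u32(out, t.id);
    write_u32(out, t.quantity);
    write_u32(out, static_cast<std::uint32_t>(t.symbol.size()));
    out.insert(out.end(), t.symbol.begin(), t.symbol.end());
    return out;
}

// Inverse of serialise; throws if the buffer is shorter than it claims to be.
Trade deserialise(const Buffer& in) {
    std::size_t pos = 0;
    Trade t;
    t.id = read_u32(in, pos);
    t.quantity = read_u32(in, pos);
    const std::uint32_t len = read_u32(in, pos);
    if (in.size() - pos < len)
        throw std::runtime_error("truncated message");
    t.symbol.assign(in.begin() + pos, in.begin() + pos + len);
    return t;
}
```

A trade with a five character symbol comes out at seventeen bytes: three four-byte fields plus the symbol itself. This toy format only deals with byte order and integer widths; floating point, padding and versioning are left out, which is exactly the sort of thing a real encoding standard sorts out for you.<br /><br />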
You can make it cross-platform using something along the lines of XDR or X.409.<br /><br />The lesson we learn from XML is applicable everywhere else on the server side: always evaluate a technology cautiously and make sure you fully understand its costs - in particular with regards to increases in volume.<br /><br />XML is a brilliant technology, and fit for purpose; that purpose is not efficiency.<br /><br /><span style="font-weight: bold;">Cool technologies.</span> If you didn't listen to my point on how C++ is the only option and insisted on using Java or C# - or, god forbid, you found a way of doing it in C++ - you may have started using reflection. This, and many other technologies, are utterly forbidden on the server side.<br /><br />Very much like XML, the problem with such technologies is that in 99% of cases they are used to solve problems that never existed in the first place. I mean, do you really need to dynamically determine the database driver you are going to use? How often do you change relational database providers without making any code changes? Of course, those calls would be cached, but still, it's the principle that matters. And does it really help application design to determine at run-time which method to call, and its parameters and their types? This is several orders of magnitude more expensive than virtual functions. Does it really make coding any simpler? Because the cost is huge, and the scalability is poor. If you are using reflection because there is a large amount of repetitive code, which can be factored out with reflection, consider using a text processing language to generate the repetitive code. This is a clean, maintainable and performant solution.<br /><br />Another pet peeve is components and distributed technologies. Do you really need complex technologies such as (D)COM and CORBA? 
Components are nice in theory, but in reality they add huge maintenance problems and configuration costs, debugging becomes much harder and performance is hindered in mysterious ways.<br /><br />In the vast majority of cases, you can create your own little messaging layer in extremely simple C++ - code that anyone understands and can debug in seconds - built on top of a serialisation framework such as Boost.Serialization. Whilst Boost.Serialization is not the most performant of them all, nor does it have great support for cross-platform binary serialisation, it is good enough for a large number of cases; and you can extend its binary serialisation to fit your needs.<br /><br />The server side is not the place to experiment. Cool and hip are bad. Pretty much all technologies that are required to make large-scale, scalable applications were invented decades ago - they just need to be used properly. When choosing a server side technology, always go down the proven path.<br /><br /><span style="font-weight: bold;">Performance testing.</span> One thing many people do is to create servers that can only be loaded up from a database or another server, and can only send their results to a database or another server. This is a crushing limitation, introduced for no reason other than laziness or bad project planning ("test tools? no time for them!"). The whole point of server side development is to be able to offer guarantees in terms of scalability. Those guarantees can only be offered if there is a reliable way of stress testing your components independently, and of creating a baseline of such tests so that regressions can be found quickly.<br /><br />Having to set up an entire environment to test a given server is not just troublesome, it hinders fault isolation. It may also mean that there are only a few test systems available. 
Each developer should be able to have their own development environment.<br /><br />Of course, don't get me wrong: one should have system-wide performance tests; but these are only relevant if all components passed their individual load tests.<br /><br /><span style="font-weight: bold;">GUI tools.</span> One thing you should consider from the beginning is the ecosystem of GUI tools that are required to manage your system, ideally written in a high-level language such as Java/C#. Here, in the vast majority of cases, usability is more important than performance, and this is where Java/C# are at their best.<br /><br />The GUI tools should focus on things like:<br /></div><ol style="text-align: justify;"><li>Account administration: adding new users, deleting them, etc.</li><li> Monitoring and diagnostics: graphs on deal volume, health checks to ensure servers are still alive, memory usage, CPU usage.</li><li>Maintenance, deployment, configuration: restarting servers when they die, easy deployment and configuration of servers.</li><li>Data administration: special functions to perform on the data to resolve cases where duff data was inserted, etc. This is sort of a client for power users.<br /></li></ol><div style="text-align: justify;">The biggest problem of not having a good ecosystem of GUI management tools is that your development work will become more and more operational, since the system is too complex to give to real operators.<br /><br /><span style="font-weight: bold;">Database Serialisation.</span> This is one of the most important aspects of any server side system, and has to be carefully thought out. You should keep to a bare minimum the number of servers that touch the database directly, and make sure they are physically located as close as possible to the database - but no closer; never on the same machine. 
All other servers must go to these data servers to read from and write to the database.<br /><br />The second important point is to try to "automate" the serialisation as much as possible. All objects that are serialisable to the database should have auto-generated code (never reflection!) responsible for reading/writing the data. They should also interface with the database via stored procedures - never reading tables directly - all making sensible use of transactions.<br /><br /><span style="font-weight: bold;">Keep it simple and Know Your Costs.</span> Optimal code is normally very simple; sub-optimal code is non-performant due to its complexity. This simple truism underlies pretty much all performance work. It's very rare that one needs to increase complexity to improve performance. In the majority of cases, the easiest way is to ask the simple question: do we really need to do this? And when you decide you really need to do something, make sure you are fully aware of its big-O cost. Choosing an O(N) approach (or worse) should never be taken lightly because it's a scalability time bomb and it will always blow up when you need it the least - i.e. when the system is overloaded.<br /><br />I have found that Object Orientation is in many cases detrimental to performance, because people are so focused on APIs and abstraction that they forget about the hidden costs. For instance, it's common to see a call-stack five levels deep (or more) just to do something as simple as changing the value of a variable. Inheritance is particularly evil because it breaks encapsulation and creates tight coupling. 
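<br /><br />As a small illustration of what being aware of the O cost means in practice, here is a sketch that looks up a hypothetical account by id in two ways: a linear scan over a vector, which is O(N), and a hash table lookup, which is O(1) on average. The account type and the function names are invented for this example, and the steps counter exists only to make the cost of the scan visible.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical data model: an account identified by a numeric id.
struct account {
    int id;
    std::string owner;
};

// O(N): in the worst case every lookup walks the whole container.
// 'steps' is set to the number of elements inspected.
const account* find_linear(const std::vector<account>& accounts, int id,
                           std::size_t& steps) {
    steps = 0;
    for (const account& a : accounts) {
        ++steps;
        if (a.id == id) return &a;
    }
    return nullptr;
}

// Average O(1): the hash table goes almost straight to the element,
// regardless of how many accounts are stored.
const account* find_hashed(const std::unordered_map<int, account>& accounts,
                           int id) {
    const auto it = accounts.find(id);
    return it == accounts.end() ? nullptr : &it->second;
}
```

With 1000 accounts, finding the last one with find_linear inspects all 1000 elements, and a miss costs the same; with a million accounts it would inspect a million. The hashed version does roughly the same amount of work in both cases, which is the kind of difference that only shows up once the system is under load.<br /><br />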
When you think in terms of algorithms and data structures, the costs are much more obvious.<br /><br />In designing a modern OO system, it's best to:<br /></div><ul style="text-align: justify;"><li>keep inheritance to an absolute minimum, using interfaces or client-supplier relationships instead;</li><li>keep behaviour to a minimum in the objects of your data model - probably best if they are but glorified data structures with getters/setters, on which other, more specialised classes operate.<br /></li></ul><div style="text-align: justify;"><span style="font-weight: bold;">Do not optimise early.</span> One classic case of early optimisation in C++ is not using virtual functions because of performance. The cost may matter in certain cases, but you need to be coding really close to the metal to start suffering from it. However, many programmers refuse to consider inheritance or interfaces at design time - even in systems where microsecond performance will never be an issue - limiting their options dramatically, for no real gain whatsoever. There are many, many other such examples - like designing your own string class before you have proved the standard one to be a bottleneck.<br /><br /><span style="font-weight: bold;">Misuse of threads.</span> Another classic case in server side programming is thread misuse. Many developers look at every bit of code and think: "I'll stick a thread pool in there; this will scale really neatly when we have more processors". The end result of this sort of thinking was apparent at one customer site, where they had over 170 threads (!!!) for one single server application. 
This application was running on boxes with 64 processors, and sharing them with other instances as well as other servers which also made liberal use of threads.<br /><br />The problems with this approach are obvious:<br /></div><ul style="text-align: justify;"><li>very rarely is there a need to have more threads than processors (unless you're doing IO-bound work; and even then, threading may not be the best solution; consider multiplexing);</li><li>really thread-safe code requires lots of locking; when you finally make your code multithread-safe you may find it performs as badly as single-threaded code - if not worse!</li><li>having a ridiculous number of threads hinders performance even if they are doing nothing (as was the case with our application above) because threads consume resources and take time to construct and destroy.<br /></li></ul><div style="text-align: justify;">Server side and threading go hand in hand, like bread and butter. But threads should only be used in cases where little or no locking is required - and that requires large amounts of experience in application design.<br /><br /><span style="font-weight: bold;">Conclusion</span><br /><br />Designing large-scale, server side systems is a very difficult job and should not be taken lightly. Lack of experience normally leads to using the wrong technologies and making wrong fundamental architectural decisions, which cannot be fixed at a later date. When designing a large system from scratch, one should always prefer the proven approaches to the new ideas the market keeps churning out.<br /><br /></div>Marco Craveirohttp://www.blogger.com/profile/01039195055988254979noreply@blogger.comtag:blogger.com,1999:blog-2672427473119923109.post-48956076848933958672007-08-27T13:50:00.000-07:002007-08-27T13:57:29.339-07:00Afrobasket: Angola wins yet again!<div style="text-align: justify;">Incredible. Angola wins Afrobasket for an amazing ninth time. We're incredibly proud of each and every one of you boys! 
And of the organisers, who demonstrated the capabilities of our country.<br /></div><br /><div style="text-align: center;"><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp0.blogger.com/_nEck2BGjqOg/RtM6KVgWSHI/AAAAAAAAADo/IbtDO2QZEu8/s1600-h/final.JPG"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://bp0.blogger.com/_nEck2BGjqOg/RtM6KVgWSHI/AAAAAAAAADo/IbtDO2QZEu8/s320/final.JPG" alt="" id="BLOGGER_PHOTO_ID_5103486751986829426" border="0" /></a><span style="font-weight: bold;">Angola v Cameroon, Afrobasket Final. (C) 2007 Afrobasket.com</span><br /></div>Marco Craveirohttp://www.blogger.com/profile/01039195055988254979noreply@blogger.comtag:blogger.com,1999:blog-2672427473119923109.post-16935488351415132722007-08-24T12:33:00.001-07:002007-08-24T12:44:16.546-07:00Afrobasket 2007<div style="text-align: justify;">Oh. My. God. What can I say about Afrobasket 2007 in Angola? Perhaps just: <span style="font-weight: bold;">WE ROCK!!!!</span> :-) Not only did the stadiums get finished on time (incredible, since some of them were started less than 6 months before the beginning of the competition), but they actually look pretty good, and stood the test of the first few games without falling over :-) In addition, although there have been a few glitches, and whilst the <a href="http://afrobasket2007.com/">website</a> is definitely not the fastest or the most professional in the world, it does the job. For all of its faults, this has been one of the most organised events in Africa, and comments like <a href="http://www.fiba.com/pages/eng/fc/news/lateNews/fibaEven/fibaAfriCham/p/newsid/21462/arti.html">these</a> are extremely encouraging. Maybe one day we will actually see the World Cup on Angolan soil.<br /><br />I'm extremely proud of the work the entire country has put in, and I think that every Angolan shares the same feeling. 
Now all we need is for the boys to bring it back home tomorrow.<br /></div><br /><div style="text-align: center;"><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp0.blogger.com/_nEck2BGjqOg/Rs8yyVgWSGI/AAAAAAAAADg/_igXZlF3tvM/s1600-h/Angola.jpg"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://bp0.blogger.com/_nEck2BGjqOg/Rs8yyVgWSGI/AAAAAAAAADg/_igXZlF3tvM/s320/Angola.jpg" alt="" id="BLOGGER_PHOTO_ID_5102352743181731938" border="0" /></a><span style="font-weight: bold;">(C) 2007 Afrobasket.com</span><br /></div>Marco Craveirohttp://www.blogger.com/profile/01039195055988254979noreply@blogger.comtag:blogger.com,1999:blog-2672427473119923109.post-45939798421204925262007-07-24T13:47:00.000-07:002007-07-24T13:54:36.115-07:00Interview with Con Kolivas<div style="text-align: justify;">As everyone knows, Ingo's scheduler is now in mainline. Many have been curious as to why CFS made it when SD didn't. This and much more is now explained <a href="http://apcmag.com/6735/interview_con_kolivas">here</a>, and even though it's a one-sided account of the events, one cannot but feel that most of it is a truthful representation of what happened. A must-read for anyone interested in the kernel, and free software development in general. The great thing about this interview is, perhaps, the bravery and frankness with which Con expresses himself - as well as how he conveys the dog-eat-dog world of kernel hacking.<br /><br />Perfect read for those days when you get depressed about working for a bespoke company. 
Free software hacking has its downsides too.</div>Marco Craveirohttp://www.blogger.com/profile/01039195055988254979noreply@blogger.comtag:blogger.com,1999:blog-2672427473119923109.post-78389963851123012232007-07-08T11:06:00.000-07:002007-07-08T12:30:03.743-07:00Interesting...<div style="text-align: justify;"><span style="font-weight: bold;">Books</span><br /><br /><span class="sans"><a href="http://www.amazon.co.uk/Capitalist-Nigger-Success-Spider-Doctrine/dp/0967846099">Capitalist Nigger: The Road to Success: a Spider Web Doctrine</a>: Didn't really need to swear on its cover; and it is a bit patchy in parts, as if the author started with the intention of writing a business book but ended up going in a different direction; but, for all of its faults, it's still a book worth reading. It's an attempt to sound the wake-up call the black race needed for the 21st century, and it goes a long way towards doing that. If the author had spent more time polishing the structure of the book, and made the name and the style less antagonistic, this could be Walter Rodney's <a href="http://www.amazon.co.uk/Europe-Underdeveloped-Africa-Walter-Rodney/dp/0882580965">successor</a>. As it is, it's not up to that standard, and you may find it a bit crude in places. </span><br /><span class="sans"></span><br /><span class="sans"></span><span id="lblTitulo" class="bigGrayText"><a href="http://oficinadolivro.pt/site/bookDetails.aspx?BookID=345">África Acima</a>: (Portuguese) After attempting to write a travel book (admittedly only for friends, but nevertheless...), I began to understand a little of the difficulties involved. This made me appreciate Gonçalo's book all the more. It does a great job of transporting the reader to Africa, and taking us along with him. 
</span><span id="lblTitulo" class="bigGrayText">Gonçalo does sound a bit like a public school boy at times, but overall he does a remarkable job of presenting Africa.</span><br /><span id="lblTitulo" class="bigGrayText"></span><br /><span id="lblTitulo" class="bigGrayText"><a href="http://www.amazon.co.uk/Bang-bang-Club-Making-South-Africa/dp/009928149X">The Bang-Bang Club</a>: I have no words to describe this book, other than absolutely brilliant. It narrates the painful birth of the new South Africa through the lenses of four photographers. Amazing.</span><br /><span id="lblTitulo" class="bigGrayText"></span><br /><span id="lblTitulo" class="bigGrayText"><span style="font-weight: bold;">Movies</span></span><br /><span id="lblTitulo" class="bigGrayText"></span><br /><span id="lblTitulo" class="bigGrayText"><a href="http://www.google.co.uk/movies/reviews?cid=b08f0ce8eae5fe0c&fq=rise+of+the+silver+surfer&sa=X&oi=showtimes&ct=reviews&cd=1">Fantastic Four: Rise of the Silver Surfer</a>: Hated it. I was a big fan of comic books when I was a kid, and I still buy the occasional Neil Gaiman boo