<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss'><id>tag:blogger.com,1999:blog-6193377</id><updated>2009-07-01T15:44:02.979+02:00</updated><title type='text'>Rainer's Blog</title><subtitle type='html'>This Blog is about many things Rainer is interested in. This happens to include syslog, astronomy and other fun things.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://blog.gerhards.net/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/'/><link rel='next' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default?start-index=26&amp;max-results=25'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>317</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-6193377.post-1491916093326901043</id><published>2009-06-15T16:33:00.007+02:00</published><updated>2009-06-15T17:29:47.412+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rsyslog'/><title type='text'>high-peformance, low-precision time API under linux?</title><content type='html'>This time, I raise a question in my blog. Suggestions, tips and full answers are very welcome.&lt;br /&gt;&lt;br /&gt;In &lt;a href="http://www.rsyslog.com/"&gt;rsyslog&lt;/a&gt;, there are various situations where I only need low resolution timestamps. By low resolution, I mean precise to within a second. 
Of course, this is provided by the time() API. However, time() is very slow - far too slow for many things I do in rsyslog. So far, I have been able to work around this problem by doing a time() call only every n-th iteration where I run in tight loops and know that this will not take me outside of my 1-second window (well, to be precise, this is at least very unlikely and thus acceptable).&lt;br /&gt;&lt;br /&gt;However, this approach does not work everywhere. Now I am facing the challenge once again, but this time in an area where the "query only every n-th time" approach does not work. I need the time in order to schedule asynchronous activities (like writing so far unwritten buffers to disk). With them, there is no tight loop that provides me with some sense of timing, and so I simply do not know if half a second or half an hour has elapsed between calls - except when I do one of these costly time() calls.&lt;br /&gt;&lt;br /&gt;A good work-around would be to define my own interval timer, waking me e.g. every second. So I would not need absolute time but could do things based on these timer ticks. &lt;b&gt;However&lt;/b&gt;, there is a lot of evil in this approach, too. Most importantly, it means rsyslogd will be active whenever the system is up, and running on a tick will prevent the operating system from switching the CPU to power saving modes. So this option looks very dirty, too.&lt;br /&gt;&lt;br /&gt;So what to do now? 
Is there any (decently portable) way to get a second-resolution current timestamp (or a tick counter) &lt;b&gt;without&lt;/b&gt; actually running on a tick?&lt;br /&gt;&lt;br /&gt;If I don't find a better solution, I'll probably be forced to run rsyslogd on a tick, which would not be a good thing from a power consumption point of view.&lt;br /&gt;&lt;br /&gt;As I already said, feedback is greatly appreciated...&lt;br /&gt;&lt;br /&gt;Edit: in case my description was a bit unclear: it is not so important that the timestamp is of low resolution. Of course, I prefer higher resolution, but I would be OK with lower resolution if that is faster.&lt;br /&gt;&lt;br /&gt;The problem with time() and gettimeofday() is that they are quite slow. As an example, I can only do around 250,000 time()/gettimeofday() calls per second on my current development system. So each API call takes around 4 microseconds on that system. While this may not sound like much, it adds considerable runtime to each message being processed - especially if multiple calls are required thanks to the modular structure.&lt;br /&gt;&lt;br /&gt;I have also thought about a single "lowres system time getter" inside rsyslog. However, that brings up problems with multi-threading. If one would like to be on the safe side, its entry points need to be guarded by mutexes, another inherently slow operation (depending on circumstances, overhead can be even worse than time()). With atomic operations, things may improve. 
But even then, we run into the issue that we do not know if the last call was half a second or half an hour ago...&lt;br /&gt;&lt;br /&gt;Another edit:&lt;br /&gt;This is a recording from a basic test I did on one lab system:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;[rgerhards@rf10up tests]# cat timecaller.c&lt;br /&gt;#include &amp;lt;stdio.h&amp;gt;&lt;br /&gt;#include &amp;lt;time.h&amp;gt;&lt;br /&gt;#include &amp;lt;sys/time.h&amp;gt;&lt;br /&gt;&lt;br /&gt;int main(int argc, char* argv[])&lt;br /&gt;{&lt;br /&gt; time_t tt;&lt;br /&gt; struct timeval tp;&lt;br /&gt; int i;&lt;br /&gt;&lt;br /&gt; for(i = 0 ; i &amp;lt; atoi(argv[1]) ; ++i) {&lt;br /&gt; // time(&amp;tt);&lt;br /&gt;  gettimeofday(&amp;tp, NULL);&lt;br /&gt; }&lt;br /&gt;}&lt;br /&gt;[rgerhards@rf10up tests]# cc timecaller.c&lt;br /&gt;[rgerhards@rf10up tests]# time ./a.out 100000&lt;br /&gt;&lt;br /&gt;real 0m0.309s&lt;br /&gt;user 0m0.004s&lt;br /&gt;sys 0m0.294s&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;The runtime for the time() call is roughly equivalent (especially given the limited precision of the instrumentation). Please also note that we identified the slowness of the time() calls in autumn 2008, when doing performance optimization with the help of David Lang. David was the first to point to the time-consuming time() calls in strace. 
Reducing them made quite a difference.&lt;br /&gt;&lt;br /&gt;Since then, I have tried to avoid time() calls at all costs.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-1491916093326901043?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/1491916093326901043/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=1491916093326901043' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/1491916093326901043'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/1491916093326901043'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/06/high-peformance-low-precision-time-api.html' title='high-peformance, low-precision time API under linux?'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='03130076873660943451'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6193377.post-1866860941714760787</id><published>2009-05-29T10:16:00.002+02:00</published><updated>2009-05-29T11:23:23.016+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rsyslog'/><title type='text'>introducing rsyslog v5</title><content type='html'>&lt;span style="font-weight: bold;"&gt;A new v5 version of &lt;/span&gt;&lt;a style="font-weight: bold;" href="http://www.rsyslog.com"&gt;rsyslog&lt;/a&gt;&lt;span style="font-weight: bold;"&gt; will be released today.&lt;/span&gt; Originally, I did not plan to start the v5 version before the end of the year (2009). 
But then we received sponsorship to enhance queue performance. And then we saw that an audit-grade queue subsystem was needed (audit-grade means that no message is ever lost, not even in fatal failure cases like sudden power loss).&lt;br /&gt;&lt;br /&gt;The audit-grade queue subsystem in particular resulted in very large design changes to the queue engine. Their magnitude is so large that I assume we need some time to stabilize it. Thus, I have decided to start a new v5 branch, which will feature the redesigned queue engine.&lt;br /&gt;&lt;br /&gt;When we introduced the queue engine in early 2008 (in rsyslog v3), it took roughly three to five months until it got decently stable. With the magnitude of changes we have done now, it will probably take some time, again. It depends a bit on the actual feedback we receive from practice. Also, this time I have added lots of automated tests, so a lot of bugs should already have been caught. Also, during the next weeks I will focus on actual deployment scenarios, rather than things that theoretically may happen (the testbench covers many of those). So, all in all, I expect that the new queue engine will become production-ready faster than the v3 engine.&lt;br /&gt;&lt;br /&gt;Still, I think it is desirable to create a new major version branch for this change. So here we are, at v5. &lt;span style="font-weight: bold;"&gt;I will continue to develop functionality that does not necessarily need the new queue engine inside the v4-devel.&lt;/span&gt; That way, we will have this functionality available both with the proven queue engine as well as with the new experimental one. Note that I cannot do this with a stable branch: by definition, stable branches never receive enhancements (as that would potentially destabilize the branch). So, for the time being and probably a couple of months, &lt;span style="font-weight: bold;"&gt;we will have two development branches&lt;/span&gt;: the v4 as well as the v5 branch. 
With that, v5 will focus on the new queue engine and will also pick up any other additions made in v4.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-1866860941714760787?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/1866860941714760787/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=1866860941714760787' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/1866860941714760787'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/1866860941714760787'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/05/introducing-rsyslog-v5.html' title='introducing rsyslog v5'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='03130076873660943451'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6193377.post-4829049384061517363</id><published>2009-05-19T17:38:00.007+02:00</published><updated>2009-05-19T18:42:48.363+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rsyslog'/><title type='text'>rsyslog queue enhancements  - status report</title><content type='html'>I thought I would post a few thoughts about how far the &lt;a href="http://www.rsyslog.com/"&gt;rsyslog&lt;/a&gt; queue enhancements have evolved.&lt;br /&gt;&lt;br /&gt;We started with the goal to increase performance, especially for database outputs. As part of that endeavor, we designed and implemented message batches as the new processing entity. 
This approach was suggested by David Lang, who also offered very valuable feedback, suggestions and review of the relevant papers (not to mention actual testing) during the process. Then, we came to the conclusion that we need to have a truly ultra-reliable queue. One that does not even lose messages in case of a sudden fatal failure (like a power failure without a UPS - or a failing UPS!). That led to further redesign and a lot of design work. All of this is very exciting.&lt;br /&gt;&lt;br /&gt;Since last Friday, I have worked on the actual code. I now have updated code for the queue, the queue storage drivers and action processing. Most importantly, the rsyslog testbench does once again successfully run, even in DA queue mode. There are still a couple of things that need to be looked at, but I think most of the bulk work is done. What follows now is a careful look at the open issues plus a LOT more testing.&lt;br /&gt;&lt;br /&gt;The testbench has improved much in the past three months, but it still is far from covering even the most important code areas. The various queueing scenarios in particular are not very well covered by it, mostly because it is rather complex to do so. Anyhow, I will now try not to do so many ad-hoc manual tests but rather see that I can create more automated tests. While this is a lot more work, even the current testbench has proven to be extremely valuable during this major code change effort (which, let me re-iterate, is far from being fully completed). Without it, it would have been much harder to find those bugs that came up during the testbench run. I think that the time I invested into it has already paid off.&lt;br /&gt;&lt;br /&gt;Let me end with a list of things I need to look at. 
That will at least help me keep focused and let you know what is extremely weak right now:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;more tests&lt;/li&gt;&lt;li&gt;so far, the last batch is not freed until at least one more message comes in (permit DeleteProcessedBatch() to be called de-coupled)&lt;/li&gt;&lt;li&gt;cancel processing cleanup, decision if we should still support cancel processing entry points&lt;/li&gt;&lt;li&gt;configured discarding of messages on queue-full condition [at least add extra nElem counter]&lt;br /&gt;&lt;/li&gt;&lt;li&gt;make output actions support message-permanent failures (at least PostgreSQL output plugin) [also think about test cases for this]&lt;/li&gt;&lt;li&gt;double-check of action and action unit state processing&lt;/li&gt;&lt;li&gt;persisting of messages from memory queues during shutdown (testing)&lt;/li&gt;&lt;li&gt;Think about a new way of handling iDeqSlowdown (maybe during batch processing?)&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-4829049384061517363?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/4829049384061517363/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=4829049384061517363' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/4829049384061517363'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/4829049384061517363'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/05/rsyslog-queue-enhancements-status.html' title='rsyslog queue enhancements  - status 
report'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='03130076873660943451'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6193377.post-2175649153294986358</id><published>2009-05-13T10:48:00.002+02:00</published><updated>2009-05-13T10:56:16.740+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rsyslog'/><title type='text'>ultra-reliable queueing in rsyslog</title><content type='html'>As part of the ongoing mailing list discussion on ultra-reliable queueing in &lt;a href="http://www.rsyslog.com"&gt;rsyslog&lt;/a&gt;, I'd like to create another blogpost from discussion content (again, I hope this reference is handy in the future).&lt;br /&gt;&lt;br /&gt;The key point with ultra-reliable queues is that no message can be lost once it has been enqueued. In the current (v2,v3,v4 &lt;= 4.1.2) releases of rsyslog, this is ensured as long as the system is guarded against a sudden loss of power (or similar disaster) and even then all but the last messages dequeued are safe.&lt;br /&gt;&lt;br /&gt;To make queue operations ultra-reliable in that case, the queue needs to be run as a pure disk queue and a checkpoint interval of one needs to be used. This makes the queue reliable at the expense of performance. 
Note also that with a disk queue only a single queue worker is permitted.&lt;br /&gt;&lt;br /&gt;Now let's look at a simplified scenario:&lt;br /&gt;&lt;br /&gt;input -&gt; queue -&gt; output&lt;br /&gt;&lt;br /&gt;This is not correct in that inputs never connect directly to outputs, but this detail is irrelevant for what I intend to say (replace "input" by "producer" and "output" by "consumer" if you'd prefer to have a fully consistent version).&lt;br /&gt;&lt;br /&gt;Let's say the processing time is the cost we incur. If we look at it, the queue's cost dominates by far the combined cost of input and output. In most cases, it dominates input+output cost so much, that you can express the total cost as just the cost of the queue operation, without looking at anything else.&lt;br /&gt;&lt;br /&gt;So the input needs to wait until the queue is ready to accept a new message. Once it has done so, the output is notified and immediately acquires the queue lock and begins the dequeue operation. At the same time, the input has already finished input processing (as I said, this happens in virtually "no time" compared to the queue operation). So it needs to wait for the queue lock. Once the dequeue operation is finished, the output releases the lock, and processes the message in virtually no time, too. The input acquired the queue lock, and the whole story begins right from the start.&lt;br /&gt;&lt;br /&gt;A small queue may build up depending on the OS scheduler, but I think most often, input and output will just wait for the queue to complete. In that sense, this mode is similar to DIRECT mode, except that a queue can build up when the action needs to be retried.&lt;br /&gt;&lt;br /&gt;So to optimize such a scenario, the best thing to do is a totally new queue storage driver for such cases. 
Sequential files do not really work well if we have multiple producers running.&lt;br /&gt;&lt;br /&gt;This is a major effort and even then we need to think about the implications I raised in regard to processing cost above.&lt;br /&gt;&lt;br /&gt;First of all, rsyslog was never designed for this use case (preserve every message EVEN in case of sudden power failure). When I introduced purely disk-based queues, this was done to support disk-assisted mode. I needed a queue type to permit me to store things on disk, if we run out of memory. As a "side-effect", a pure disk mode was available also (I would never have implemented it for its own sake). As it was there, I decided to expose this mode and made it user-configurable. I thought (probably correctly) that it could solve some need - a need that I'd consider "very exotic" (think about the reliance on an audit-grade protocol for this to really make sense). And I added the checkpoint capability because it seemed useful, even with disk-based queues, which could be guarded from total loss of messages by using a reasonable checkpoint interval. Again, a checkpoint interval of one is permitted just because this capability came "for free" and could be handy in some use cases. &lt;br /&gt;&lt;br /&gt;The kiosk example we discussed in 2008 (?) on the mailing list looked like a good match for such an exotic environment. Sudden power loss was an option, and we had low traffic volume. Bingo, perfect match.&lt;br /&gt;&lt;br /&gt;However, I'd never thought about a reasonable high-volume system using disk-only queues. Think about the cost functions: such a system boils down to a DIRECT mode queue which just takes an exceptional lot of time for processing messages.&lt;br /&gt;&lt;br /&gt;So probably the best approach for this situation would be to run the queue actually in direct mode. That removes the overwhelming cost of queue operations. 
Direct mode also ensures that the input receives an ack from the output [but there may be subtle issues which I need to check to make sure this is always the case, so do not take this for granted - but if it is not yet so, this should not be too complex to change]. With this approach, we have two issues left:&lt;br /&gt;&lt;br /&gt;a) the output action may be so slow, that it actually is the dominating cost factor and not disk queue operation&lt;br /&gt;&lt;br /&gt;b) the output action may block for an extended period of time (e.g. during a retry)&lt;br /&gt;&lt;br /&gt;In case a), a disk-queue makes sense, because its cost is irrelevant in this scenario. Indeed, it is irrelevant under all circumstances. As such, we can configure a disk-only action queue in that case. Note that this implies a *very* slow output.&lt;br /&gt;&lt;br /&gt;Case b) is more complicated. We do NOT have any proper way to address it with current code. The solution IMHO is to introduce a new queue mode "Disk Queue on Delay" which starts an ultra-reliable disk queue (preferably with a faster queue store driver) if and only if the action indicates that it will need extended processing time. This requires some changes to action processing, but the action state machine should be capable of handling that with relatively slight modification [again, an educated guess, not a guarantee]. &lt;br /&gt;&lt;br /&gt;In that scenario, we run the action immediately whenever possible. Only if that is not possible do we take on the (considerable) extra effort of buffering messages into a much slower on-disk queue. Note that such a mode only makes sense with audit-grade protocols and senders (which hold processing until the ACK has been received). As such, a busy system automatically slows down to the rate that the queue writer can handle. In this sense, the overall system (e.g. a financial trading system!) 
may be slowed down by the unavailability of a failing output (which in turn causes the extra and very high cost of disk queue operations). It needs to be considered if that is an acceptable price.&lt;br /&gt;&lt;br /&gt;The faster an ultra-reliable queue disk store driver performs, the more cases we can handle in the spirit of a) above. In theory, this can lead to elimination of b) cases. &lt;br /&gt;&lt;br /&gt;Nevertheless, I hope I have shown that re-designing the queue (drivers) to support high throughput AND ultra-reliable operations AT THE SAME TIME is far from being a trivial task. To do it right, it involves some other changes too.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-2175649153294986358?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/2175649153294986358/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=2175649153294986358' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/2175649153294986358'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/2175649153294986358'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/05/ultra-reliable-queueing-in-rsyslog.html' title='ultra-reliable queueing in rsyslog'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='03130076873660943451'/></author><thr:total 
xmlns:thr='http://purl.org/syndication/thread/1.0'>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6193377.post-8794538698384152330</id><published>2009-05-11T17:43:00.010+02:00</published><updated>2009-05-11T17:58:27.770+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rsyslog'/><title type='text'>rsyslog configuration graphs</title><content type='html'>&lt;b&gt;I worked today on adding a configuration graphing capability to &lt;a href="http://www.rsyslog.com"&gt;rsyslog&lt;/a&gt;.&lt;/b&gt; This was inspired by many discussions about how the rule engine works. From a high-level perspective, rsyslog is "just" a configurable message router that routes messages from a set of inputs to a set of outputs, potentially with transformations applied to the messages. Rsyslog does so via the rule set, which is the most important part of the configuration file. In that sense, rsyslog is a configurable state machine and the rule set is its configuration.&lt;br /&gt;&lt;br /&gt;While typical syslog configurations are rather simple and easy to understand, complex ones can be challenging. The graphing capability we now have provides a high-level, human-readable representation of rsyslogd's internal control structures. The beauty of this is that every user can create an exact diagram from his own configuration.&lt;br /&gt;&lt;br /&gt;I hope this is a useful tool for documenting a system setup, but I also think it is a very valuable tool for learning to understand rsyslog as well as for troubleshooting problems with message processing.&lt;br /&gt;&lt;br /&gt;With that said, I now send you to the new &lt;a href="http://www.rsyslog.com/doc-rsconf1_generateconfiggraph.html"&gt;graphing feature manual page&lt;/a&gt;, which I hope provides sufficient insight into how this feature is used.&lt;br /&gt;&lt;br /&gt;But... 
here is a sample graph to whet your appetite:&lt;br /&gt;&lt;center&gt;&lt;img src="http://www.rsyslog.com/modules/Static_Docs/data/rsyslog_confgraph_complex.png"&gt;&lt;/center&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-8794538698384152330?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/8794538698384152330/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=8794538698384152330' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/8794538698384152330'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/8794538698384152330'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/05/rsyslog-configuration-graphs.html' title='rsyslog configuration graphs'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='03130076873660943451'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6193377.post-1449341093625132791</id><published>2009-05-08T13:34:00.004+02:00</published><updated>2009-05-08T14:16:37.070+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rsyslog'/><title type='text'>Can "more reliable" actually mean "less reliable"?</title><content type='html'>On the rsyslog mailing list, we currently have a discussion about how reliable rsyslog should be. It circles about a small potential window of message loss in the case of sudden power failure. 
Rsyslog can be configured to put all messages into a disk queue (instead of main memory), so these messages survive such a powerfail condition. However, messages dequeued and scheduled for processing during the power outage may be lost. &lt;br /&gt;&lt;br /&gt;I now consider a case where we have bursty UDP traffic and rsyslog is configured to use a disk-only queue (which obviously is much slower than an in-memory queue). Looking at processing speeds, the max burst rate is limited by using an ultra-reliable queue. To avoid losing UDP messages, a second instance could be run that uses an in-memory queue and forwards received messages to the one in ultra-reliable mode (that is with the disk-only queue). So that second instance queues in memory until the (slower) reliable rsyslogd can accept the message and put it into the reliable queue. Let's say that you have a burst of r messages and that from this burst only r/2 can be enqueued (because the ultra reliable queue is so slow). So you lose r/2 messages.&lt;br /&gt;&lt;br /&gt;Now consider the case that you run rsyslog with just a reliable queue, one that is kept in memory but not able to cover the power failure scenario. Obviously, all messages in that queue are lost when power fails (or almost all to be precise). However, that system has a much broader bandwidth. So with it, there would never have been r messages inside the queue, because that system has a much higher sustained message rate (and thus the burst causes much less trouble). Let's say the system is just twice as fast in this setup (I guess it usually would be *much* faster). Then, it would be able to process all r records.&lt;br /&gt;&lt;br /&gt;In that scenario, the ultra-reliable system loses r/2 messages, whereas the somewhat more "unreliable" system loses none - by virtue of being able to process messages as they arrive. 
&lt;br /&gt;&lt;br /&gt;Now extend that picture to messages residing inside the OS buffers or even those that are still queued in their sources because a stream transport blocked sending them.&lt;br /&gt;&lt;br /&gt;I know that each detail of this picture can be argued about at length.&lt;br /&gt;&lt;br /&gt;However, my opinion is that there is no "ultra-reliable" system in life, only various probabilities of losing messages. These probabilities often depend on each other, which makes calculating them very hard to impossible. Still, to a first approximation (treating the components as independent), the reliability of the system at large is just the product of the probabilities that each of its components does not lose a message. And the overall probability of message loss is just the complement of that reliability.&lt;br /&gt;&lt;br /&gt;This is where *I* conclude that it can make sense to permit a system to lose some messages under certain circumstances, if that influences the overall probability calculation towards the desired end result. In that sense, I tend to think that a fast, memory-queuing rsyslogd instance can be much more reliable compared to one that is configured as being ultra-reliable, where the rest of the system at large is badly influenced by this (the scenario above).&lt;br /&gt;&lt;br /&gt;However, I also know that for regulatory requirements, you often seem to need to prove that a system may not lose messages once it has received them, even at the cost of an overall increased probability of message loss.&lt;br /&gt;&lt;br /&gt;My view of reliability is much the same as my view of security: there is no such thing as "being totally secure", you can just reduce the probability that something bad happens. The worst thing in security is someone who thinks he is "totally secure" and as such is no longer actively looking at potential issues.&lt;br /&gt;&lt;br /&gt;I see the same for reliability. There is no such thing as "being totally reliable" and it is a really bad idea to think you could ever be. 
Knowing this, one may begin to think about how to decrease the overall probability of message loss AND think about what rate is acceptable (and what to do with these cases, e.g. "how can they hurt").&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-1449341093625132791?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/1449341093625132791/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=1449341093625132791' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/1449341093625132791'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/1449341093625132791'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/05/can-more-reliable-actually-mean-less.html' title='Can &quot;more reliable&quot; actually mean &quot;less reliable&quot;?'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='03130076873660943451'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6193377.post-3995873715899678004</id><published>2009-04-30T16:51:00.003+02:00</published><updated>2009-04-30T17:15:26.718+02:00</updated><title type='text'>A batch output handling algorithm</title><content type='html'>With this post, I'd like to reproduce &lt;a href="http://lists.adiscon.net/pipermail/rsyslog/2009-April/002003.html"&gt;a posting from David Lang on the rsyslog mailing list&lt;/a&gt;. 
I consider this to be important information and would like to have it available for easy reference.&lt;br /&gt;&lt;br /&gt;Here we go:&lt;br /&gt;&lt;br /&gt;&lt;i&gt;&lt;br /&gt;the company that I work for has decided to sponser multi-message queue &lt;br /&gt;output capability, they have chosen to remain anonomous (I am posting from &lt;br /&gt;my personal account)&lt;br /&gt;&lt;br /&gt;there are two parts to this.&lt;br /&gt;&lt;br /&gt;1. the interaction between the output module and the queue&lt;br /&gt;&lt;br /&gt;2. the configuration of the output module for it's interaction with the &lt;br /&gt;database&lt;br /&gt;&lt;br /&gt;On for the first part (how the output module interacts with the queue), the &lt;br /&gt;criteria are that&lt;br /&gt;&lt;br /&gt;1. it needs to be able to maintain guarenteed delivery (even in the face &lt;br /&gt;of crashes, assuming rsyslog is configured appropriately)&lt;br /&gt;&lt;br /&gt;2. at low-volume times it must not wait for 'enough' messages to &lt;br /&gt;accumulate, messages should be processed with as little latency as &lt;br /&gt;possible&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;to meet these criteria, what is being proposed is the following&lt;br /&gt;&lt;br /&gt;a configuration option to define the max number of messages to be &lt;br /&gt;processed at once.&lt;br /&gt;&lt;br /&gt;the output module goes through the following loop&lt;br /&gt;&lt;br /&gt;X=max_messages&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;if (messages in queue)&lt;br /&gt;   mark that it is going to process the next X messages&lt;br /&gt;   grab the messages&lt;br /&gt;   format them for output&lt;br /&gt;   attempt to deliver the messages&lt;br /&gt;   if (message delived sucessfully)&lt;br /&gt;     mark messages in the queue as delivered&lt;br /&gt;     X=max_messages (reset X in case it was reduced due to delivery errors)&lt;br /&gt;   else (delivering this batch failed, reset and try to deliver the first half)&lt;br /&gt;     unmark 
the messages that it tried to deliver (putting them back into the status where no delivery has been attempted)&lt;br /&gt;     X=int(# messages attempted / 2)&lt;br /&gt;     if (X=0)&lt;br /&gt;       unable to deliver a single message, do existing message error &lt;br /&gt;process&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;this approach is more complex than a simple 'wait for X messages, then &lt;br /&gt;insert them all', but it has some significant advantages&lt;br /&gt;&lt;br /&gt;1. no waiting for 'enough' things to happen before something gets written&lt;br /&gt;&lt;br /&gt;2. if you have one bad message, it will transmit all the good messages &lt;br /&gt;before the bad one, then error out only on the bad one before picking up &lt;br /&gt;with the ones after the bad one.&lt;br /&gt;&lt;br /&gt;3. nothing is marked as delivered before delivery is confirmed.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;an example of how this would work&lt;br /&gt;&lt;br /&gt;max_messages=15&lt;br /&gt;&lt;br /&gt;messages arrive 1/sec&lt;br /&gt;&lt;br /&gt;it takes 2+(# messages/2) seconds to process each message (in reality the &lt;br /&gt;time to insert things into a database is more like 10 + (# messages / 100) &lt;br /&gt;or even more drastic)&lt;br /&gt;&lt;br /&gt;with the traditional rsyslog output, this would require multiple output &lt;br /&gt;threads to keep up (processing a single message takes 1.5 seconds with &lt;br /&gt;messages arriving 1/sec)&lt;br /&gt;&lt;br /&gt;with the new approach and a cold start you would see&lt;br /&gt;&lt;br /&gt;message arrives (Q=1) at T=0&lt;br /&gt;om starts processing message a T=0 (expected to take 2.5)&lt;br /&gt;message arrives (Q=2) at T=1&lt;br /&gt;message arrives (Q=3) at T=2&lt;br /&gt;om finishes processing message (Q=2) at T=2.5&lt;br /&gt;om starts processing 2 messages at T=2.5 (expected to take 3)&lt;br /&gt;message arrives (Q=4) at T=3&lt;br /&gt;message arrives (Q=5) at T=4&lt;br /&gt;message 
arrives (Q=6) at T=5&lt;br /&gt;om finishes processing 2 messages  (Q=4) at T=5.5&lt;br /&gt;om starts processing 4 messages at T=5.5 (expected to take 4)&lt;br /&gt;message arrives (Q=5) at T=6&lt;br /&gt;message arrives (Q=6) at T=7&lt;br /&gt;message arrives (Q=7) at T=8&lt;br /&gt;message arrives (Q=8) at T=9&lt;br /&gt;om finishes processing 4 messages  (Q=4) at T=9.5&lt;br /&gt;om starts processing 4 messages at T=9.5 (expected to take 4)&lt;br /&gt;&lt;br /&gt;the system is now in a steady state&lt;br /&gt;&lt;br /&gt;message arrives (Q=5) at T=10&lt;br /&gt;message arrives (Q=6) at T=11&lt;br /&gt;message arrives (Q=7) at T=12&lt;br /&gt;message arrives (Q=8) at T=13&lt;br /&gt;om finishes processing 4 messages  (Q=4) at T=13.5&lt;br /&gt;om starts processing 4 messages at T=13.5 (expected to take 4)&lt;br /&gt;&lt;br /&gt;if a burst of 10 extra messages arrived at time 13.5 this last item would &lt;br /&gt;become&lt;br /&gt;&lt;br /&gt;11 messages arrive at (Q=14) at T=13.5&lt;br /&gt;om starts processing 14 messages at T=13.5 (expected to take 9)&lt;br /&gt;message arrives (Q=15) at T=14&lt;br /&gt;message arrives (Q=16) at T=15&lt;br /&gt;message arrives (Q=17) at T=16&lt;br /&gt;message arrives (Q=18) at T=17&lt;br /&gt;message arrives (Q=19) at T=18&lt;br /&gt;message arrives (Q=20) at T=19&lt;br /&gt;message arrives (Q=21) at T=20&lt;br /&gt;message arrives (Q=22) at T=21&lt;br /&gt;message arrives (Q=23) at T=22&lt;br /&gt;om finishes processing 14 messages (Q=9) at T=22.5&lt;br /&gt;om starts processing 9 messages at T=22.5 (expected to take 6.5)&lt;br /&gt;&lt;/i&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-3995873715899678004?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/3995873715899678004/comments/default' title='Post Comments'/><link rel='replies' type='text/html' 
href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=3995873715899678004' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/3995873715899678004'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/3995873715899678004'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/04/batch-output-handling-algorithm.html' title='A batch output handling algorithm'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='03130076873660943451'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6193377.post-5575130864634860229</id><published>2009-04-27T09:53:00.003+02:00</published><updated>2009-04-27T10:42:21.584+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='reliable'/><category scheme='http://www.blogger.com/atom/ns#' term='rsyslog'/><title type='text'>Levels of reliabilty</title><content type='html'>We had a good discussion about reliability in &lt;a href="http://www.rsyslog.com"&gt;rsyslog&lt;/a&gt; this morning. On the mailing list, it started with a question about the dynafile cache, but quickly morphed into something else. As the &lt;a href="http://lists.adiscon.net/pipermail/rsyslog/2009-April/002082.html"&gt;mailing list thread&lt;/a&gt; is rather long, I'll try to do a quick excerpt of those things that I consider vital.&lt;br /&gt;&lt;br /&gt;First a note on RELP, which is a reliable transport protocol. 
This was the relevant thought from the discussion:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;i&gt;I've got relp set up for transfer - but apparently I discovered&lt;br /&gt;that relp doesnt take care of a "disk full" situation on the receiver&lt;br /&gt;end? I would have expected my old entries to come in once I had cleared the disk space, but no... I'm not complaining btw - just remarking that this was an unexpected behaviour for me.&lt;/i&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;That has nothing to do with RELP. The issue here is that the file output writer (in v3) uses the sysklogd concept of "if I can't write it, I'll throw it away". This is another issue that was "fixed" in v4 (not really a fix, but a conceptual change).&lt;br /&gt;&lt;br /&gt;If RELP gets an ack from the receiver, the message is delivered from RELP's point of view. The receiving end acks, so everything is done for RELP. The same applies if you queue at the receiver and for some reason lose the queue.&lt;br /&gt;&lt;br /&gt;RELP is a reliable transport, but not more than that. However, if you need end-to-end reliability, you can get that by running the receiver totally synchronously, that is, with all queues (including the main message queue!) in direct mode. You'll have awful performance and will lose messages if you use anything other than RELP for message reception (well, plain tcp works mostly correctly, too), but you'll have synchronous end-to-end. Usually, reliable queuing is sufficient, but then the sender does NOT know when the message was actually processed (just that the receiver enqueued it, think about the difference!).&lt;br /&gt;&lt;br /&gt;&lt;i&gt;This explanation triggered further questions about the difference in end-to-end reliability between direct queue mode and disk-based queues:&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;The core idea is that a disk-based queue should provide sufficient reliability for most use cases. One may even question if there is a reliability difference at all. 
However, there is a subtle difference:&lt;br /&gt;&lt;br /&gt;If you don't use direct mode, then processing is no longer synchronous. Think about the street analogy:&lt;br /&gt;&lt;br /&gt;&lt;a href=" http://www.rsyslog.com/doc-queues_analogy.html"&gt;&lt;br /&gt;http://www.rsyslog.com/doc-queues_analogy.html&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;For synchronous processing, you need the U-turn-like structure.&lt;br /&gt;&lt;br /&gt;If you use a disk-based queue, I'd say it is sufficiently reliable, but it is no longer an end-to-end acknowledgement. If I had this scenario, I'd go for the disk queue, but it is not the same level of reliability. &lt;br /&gt;&lt;br /&gt;Wild example: sender and receiver at two different geographical locations. Receiver writes to database, database is down. &lt;br /&gt;&lt;br /&gt;Direct queue case: sender blocks because it does not receive the ultimate ack (until the database is back online and records are committed).&lt;br /&gt;&lt;br /&gt;Disk queue case: sender spools to receiver disk, then considers records committed. Receiver ensures that records are actually committed when the database is back up again. You use ultra-reliable hardware for the disk queues.&lt;br /&gt;&lt;br /&gt;Level of reliability is the same under almost all circumstances (and I'd expect "good enough" for almost all cases). But now consider we have a disaster at the receiver's side (let's say a flood) that causes physical loss of the receiver.&lt;br /&gt;&lt;br /&gt;Now, in the disk queue case, messages are lost without the sender knowing. In the direct queue case we have no message loss.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;And then David Lang provided a perfect explanation (with which I fully agree) of why in practice a disk-based queue can be considered mostly as reliable as direct mode:&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;i&gt;&lt;br /&gt;&gt; Level of reliability is the same under almost all circumstances (and I'd&lt;br /&gt;&gt; expect "good enough" for almost all cases). 
But now consider we have a&lt;br /&gt;&gt; disaster at the receiver's side (let's say a flood) that causes physical loss&lt;br /&gt;&gt; of reciver.&lt;br /&gt;&lt;br /&gt;no worse than a disaster on the sender side that causes physical loss of the sender.&lt;br /&gt;&lt;br /&gt;you are just picking which end to have the vunerability on, not picking if you will have the vunerability or not (although it's probably cheaper to put reliable hardware on the upstream reciever than it is to do so on all senders)&lt;br /&gt;&lt;br /&gt;&gt; Now, in the disk queue case, messages are lost without sender knowing. In &lt;br /&gt;&gt; direct queue case we have no message loss.&lt;br /&gt;&lt;br /&gt;true, but you then also need to have the sender wait until all hops have been completed. that can add a _lot_ of delay without nessasarily adding noticably to the reliability. the difference between getting the message stored in a disk-based queue (assuming it's on redundant disks with fsync) one hop away vs the message going a couple more hops and then being stored in it's final destination (again assuming it's on redundant disks with fsync) is really not much in terms of reliability, but it can be a huge difference in terms of latency (and unless you have configured many worker threads to allow you to have the messages in flight at the same time, throughput also drops)&lt;br /&gt;&lt;br /&gt;besides which, this would also assume that the ultimate destination is somehow less likely to be affected by the disaster on the recieving side than the rsyslog box. 
this can be the case, but usually isn't.&lt;br /&gt;&lt;/i&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;That leaves me with nothing more to say ;)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-5575130864634860229?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/5575130864634860229/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=5575130864634860229' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/5575130864634860229'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/5575130864634860229'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/04/levels-of-reliabilty.html' title='Levels of reliabilty'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='03130076873660943451'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6193377.post-9004294394299253933</id><published>2009-04-08T14:00:00.002+02:00</published><updated>2009-04-08T14:05:22.061+02:00</updated><title type='text'>what is "nextmaster" good for?</title><content type='html'>People that looked at &lt;a href="http://git.adiscon.com/?p=rsyslog.git;a=summary"&gt;rsyslog's git&lt;/a&gt; may have wondered what the branch "nextmaster" is good for. This actually is an indication that the next &lt;a href="http://www.rsyslog.com"&gt;rsyslog&lt;/a&gt; stable/beta/devel rollover will happen soon. With it, the current beta becomes the next v3-stable. 
At the same time, the current (v4) devel becomes the next beta (which means there won't be any beta any longer in v3). In order to facilitate this, I have branched off "nextmaster", which I am currently working on. The "master" branch will no longer be touched and will soon become beta. Then, I will merge "nextmaster" back into the "master" branch and continue to work with it.&lt;br /&gt;&lt;br /&gt;The bottom line is that you currently need to pull nextmaster if you would like to stay current on the edge of development. Sorry for any inconvenience this causes, but this is the best approach I see to go through the migration (and I've done the same in the past with good success, just that then nobody noticed it ;)).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-9004294394299253933?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/9004294394299253933/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=9004294394299253933' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/9004294394299253933'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/9004294394299253933'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/04/what-is-nextmaster-good-for.html' title='what is &quot;nextmaster&quot; good for?'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='03130076873660943451'/></author><thr:total 
xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6193377.post-4966254493738283843</id><published>2009-04-01T14:24:00.008+02:00</published><updated>2009-04-01T15:04:58.060+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rsyslog'/><title type='text'>rsyslog going to outer space</title><content type='html'>&lt;span style="font-weight:bold;"&gt;&lt;a href="http://www.ryslog.com"&gt;Rsyslog&lt;/a&gt; was designed to be a flexible and ultra-reliable platform for demanding applications.&lt;/span&gt; Among other things, it is designed to work very well in occasionally connected systems. &lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;There are some systems that are inherently occasionally connected - space ships&lt;/span&gt;. And while we are still a bit away from the Star Trek way of doing things, current space technology needs a "captain's star log". Even for spacecraft, it is important to know when and why systems were powered up, over- or under-utilized or malfunctioned (for example, due to an "attack", not by a Klingon, but by a cosmic ray). And all of this information needs to be communicated back to Earth, where it can be reviewed and analyzed. For all of this, systems capable of reliable transmission in a disconnected environment are needed.&lt;br /&gt;&lt;br /&gt;Inspired by NASA's needs, the &lt;a href="http://www.irtf.org/"&gt;Internet Research Task Force&lt;/a&gt; (the research branch of the &lt;a href="http://www.ietf.org"&gt;IETF&lt;/a&gt;) is working on a protocol named &lt;a href="http://www.irtf.org/charter?gtype=rg&amp;group=dtnrg"&gt;DTN&lt;/a&gt;, usually called the interplanetary Internet.&lt;br /&gt;&lt;br /&gt;As we probably all know, &lt;a href="http://spacelaunch.gerhards.net/2007/10/windows-xp-will-go-into-space.html"&gt;Microsoft Windows flies on the Space Shuttle&lt;/a&gt;. 
And, more importantly, &lt;a href="http://www.linuxjournal.com/article/2186"&gt;Linux did, too&lt;/a&gt;. With the growing robustness of Open Source, future space missions will most probably contain more Linux components.&lt;br /&gt;&lt;br /&gt;This overall trend will also be present in &lt;a href="http://www.jpl.nasa.gov/news/features.cfm?feature=2035"&gt;NASA's and ESA's future Jupiter mission&lt;/a&gt;. There is a lot of information technology on the upcoming spacecraft, and so there are a lot of things worth logging. While specialized software is usually required for spacecraft operations, it is being considered that rsyslog, as the leading provider of reliable occasionally connected logging infrastructures, may extend its range into the solar system. It seems only logical to use all the technology we already have in place for reliable logging even under strange conditions (see "&lt;a href="http://www.rsyslog.com/doc-rsyslog_reliable_forwarding.html"&gt;reliable forwarding&lt;/a&gt;"). Also of importance are rsyslog's speed and robustness.&lt;br /&gt;&lt;br /&gt;As a consequence, we have today begun to implement the DTN protocol for the interplanetary Internet. That will be "omdtn" and is available as part of the &lt;a href="http://git.adiscon.com/?p=rsyslog.git;a=shortlog;h=refs/heads/spaceship"&gt;rsyslog spaceship git branch&lt;/a&gt;. This branch is available as of now from the public git repository.&lt;br /&gt;&lt;br /&gt;We could also envision that mission controllers will utilize &lt;a href="http://www.phplogcon.org"&gt;phpLogCon&lt;/a&gt; to help analyze spacecraft malfunctions. A very interesting feature is also rsyslog's modular architecture, which could be used to radiate a new communication plugin up to the spaceship, in case this is required to support some alien format. This also enables the rsyslog team to provide an upgrade to the Interstellar Internet, should this finally be standardized in the IETF. 
If so, and provided the probe has enough consumables, it may be in the best spot to work as a stellar relay between us and whoever else.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-4966254493738283843?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/4966254493738283843/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=4966254493738283843' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/4966254493738283843'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/4966254493738283843'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/04/rsyslog-going-to-outer-space.html' title='rsyslog going to outer space'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='03130076873660943451'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6193377.post-3303127260369453605</id><published>2009-03-27T10:34:00.003+01:00</published><updated>2009-03-27T10:45:38.054+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rsyslog'/><title type='text'>is freshmeat now dead?</title><content type='html'>I used &lt;a href="http://www.freshmeat.net"&gt;freshmeat.net&lt;/a&gt; - both as a user and a project author - for several years and liked the clean and efficient interface. 
Now, they have revamped the whole thing and I have to admit I personally think they screwed up while doing so.&lt;br /&gt;&lt;br /&gt;First of all, a project has a structure that consists of various branches, each of them coming in different versions (see my post on the &lt;a href="http://blog.gerhards.net/2009/03/rsyslog-family-tree.html"&gt;rsyslog family tree&lt;/a&gt;). In the old interface, you had branches and versions, and everyone could clearly see what belonged where. In the new interface (as I understand it), you have a bunch of links that you can label. So I now have to deal with a flat structure and labels. This is NOT how software grows. And as this no longer is a real-world abstraction, it has become quite complicated to assign meaningful values. Not to mention that the big bunch of links is probably quite confusing to users.&lt;br /&gt;&lt;br /&gt;I'll probably deal with that by removing all but the development branches. Better to have consistent information than to have everything...&lt;br /&gt;&lt;br /&gt;I also miss the statistics counters. They provided some good insight into what users were interested in and what effect releases had. Very valuable for me as an author, but also valuable for me as a user, for example, when I want to judge how active a project is. Freshmeat promised (on March 15th) to bring back statistics "in a few days", but today (March 27th), they are still missing. And if they eventually appear and follow the rest of the design paradigm, I am skeptical whether there is really value in them.&lt;br /&gt;&lt;br /&gt;All in all, I am very dissatisfied. I am sad to have lost a valuable open source resource. So what to do now? Sourceforge again - I don't like that either. Ohloh? Maybe. Probably it's best to concentrate on our own web resources... But first of all, I'll wait a couple of days/weeks and hope that freshmeat will become usable again. 
But please don't expect too many announcements on freshmeat from me for the time being.&lt;br /&gt;&lt;br /&gt;There is also an interesting &lt;a href="http://freshmeat.net/articles/welcome-to-freshmeatnet-30"&gt;discussion thread on the new freshmeat design&lt;/a&gt;, I suggest to give it a read (you'll also find that others like it!)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-3303127260369453605?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/3303127260369453605/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=3303127260369453605' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/3303127260369453605'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/3303127260369453605'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/03/is-freshmeat-now-dead.html' title='is freshmeat now dead?'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='03130076873660943451'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6193377.post-1974693206174217812</id><published>2009-03-23T18:34:00.006+01:00</published><updated>2009-03-27T13:05:35.993+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rsyslog'/><title type='text'>rsyslog "family tree"</title><content type='html'>I have created a &lt;a href="http://www.rsyslog.com"&gt;rsyslog &lt;/a&gt;"family tree" showcasing how the various branches and versions go 
together. It is a condensed graph of the git DAG and shows a few feature branches as an example. I personally think it provides a good overview of how rsyslog work progresses (click picture for larger version).&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.gerhards.net/Gallery-rsyslog-picfull-rsyslog_vers.phtml"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 384px; height: 640px;" src="http://www.gerhards.net/albums/rsyslog/rsyslog_vers.sized.png" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;The git master branch is in red; currently supported stable branches are in blue. Branch head "v1-stable" is dotted, because it is no longer officially supported. Dashed nodes are versions on feature branches, solid nodes are versions on main branches. Solid lines are direct ancestors, dashed lines indicate that there are some versions in between. Lots of feature branches are not shown. Bug fixes are typically applied to the oldest code experiencing the problem and then merged into the more recent versions, so the code flow for bug fixes runs in reverse. This bug-fixing code flow is not shown inside the graph.&lt;br /&gt;&lt;br /&gt;Note that you can use gitk to create the full DAG from the git archive. 
The purpose of my effort is to show the relationships that are well-hidden in gitk's detailled view.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;I have written a much more elaborate post about the "&lt;a href="http://www.wissenslogs.de/wblogs/blog/mehr-als-bits-und-bytes/allgemein/2009-03-24/software-evolution"&gt;evolution of software&lt;/a&gt;", unfortunately, it is available currently only in German (with very questionable results by &lt;a href="http://translate.google.com/translate?prev=hp&amp;hl=en&amp;js=n&amp;u=http%3A%2F%2Fwww.wissenslogs.de%2Fwblogs%2Fblog%2Fmehr-als-bits-und-bytes%2Fallgemein%2F2009-03-24%2Fsoftware-evolution&amp;sl=de&amp;tl=en"&gt;Google Translate&lt;/a&gt;).&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-1974693206174217812?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/1974693206174217812/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=1974693206174217812' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/1974693206174217812'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/1974693206174217812'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/03/rsyslog-family-tree.html' title='rsyslog &quot;family tree&quot;'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='03130076873660943451'/></author><thr:total 
xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6193377.post-6603434572766362948</id><published>2009-03-17T12:39:00.006+01:00</published><updated>2009-03-17T13:02:41.204+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='syslog'/><category scheme='http://www.blogger.com/atom/ns#' term='ietf'/><category scheme='http://www.blogger.com/atom/ns#' term='rsyslog'/><title type='text'>Why is there still PRI in a syslog message?</title><content type='html'>This is the first of a couple of blog posts I intend to do in response to Raffy's post on &lt;a href="http://raffy.ch/blog/2008/12/18/comments-on-the-syslog-protocol-internet-draft/"&gt;syslog-protocol&lt;/a&gt;. I am very late, but better late than never. Raffy raised some good points. With some I agree, with some I do not, and for others it is probably interesting to see why things are as they are.&lt;br /&gt;&lt;br /&gt;The bottom line is that this standard - as probably every standard - is a compromise of what could be agreed on by a larger group of people and corporate interests. Reading the IETF mailing list archives will teach you much about this process, but I will dig out the interesting entry points into the mass of posts for you.&lt;br /&gt;&lt;br /&gt;I originally thought I would reply to Raffy with a single blog post. However, this has turned out to be undoable - every time I intend to start, something bigger and more important gets in my way. So I am now resorting to more granular answers - hopefully this works.&lt;br /&gt;&lt;br /&gt;Enough said, on to the meat. Raffy said:&lt;br /&gt;&lt;i&gt;&lt;blockquote&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Syslog message facility&lt;/strong&gt;: Why still keeping this? The only reason that I see people using the facility is to filter messages. There are better ways to do that. Some of the pre-assigned groups are fairly arbitrary and not even really implemented in most OSs. UUCP subsystem? 
Who is still using that? I guess the reason for keeping it is backwards compatibility? If possible, I would really like this to be gone.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Priority&lt;/strong&gt; calculation: The whole priority field is funky. The priority does not really have any meaning. The order does not imply importance. Why having this at all?&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;/blockquote&gt;&lt;/i&gt;&lt;br /&gt;And I couldn't agree more with this. In my personal view, keeping with the old-style facility is a large debt, but it was necessary to make the standard happen. Over time, I have to admit, I even tend to think it was a good idea to stick with this format: it actually eases transition.&lt;br /&gt;&lt;br /&gt;Syslog-protocol has a long history. We thought several times we were done, and the first time this happened was in November 2005. Everything was finalized and then there was a quite unfortunate (or fortunate, as you may say now ;)) IETF meeting. I couldn't attend (too much effort to travel around the world for a 30-minute meeting...) and many other WG participants also could not.&lt;br /&gt;&lt;br /&gt;It took us by surprise that the meeting agreed the standard was far from ready for publishing (read the &lt;a href="http://www.mail-archive.com/syslog@lists.ietf.org/msg00099.html"&gt;meeting minutes&lt;/a&gt;). The objection triggered a very long (and productive, I have to admit) WG mailing list discussion. To really understand the spirit of what happened later, it would be useful to read the &lt;a href="http://www.mail-archive.com/syslog@lists.ietf.org/mail7.html"&gt;mailing list archives&lt;/a&gt; starting with November 14th. &lt;br /&gt;&lt;br /&gt;However, this is lots of stuff, so let me pick out some posts that I find important. 
The most important fact is that &lt;a href="http://www.mail-archive.com/syslog@lists.ietf.org/msg00226.html"&gt;backward compatibility became the WG charter's top priority&lt;/a&gt; (&lt;a href="http://www.mail-archive.com/syslog@lists.ietf.org/msg00222.html"&gt;one more post to prove the point&lt;/a&gt;). Among others, it was strongly suggested that both the PRI as well as the RFC 3164 timestamp be preserved. Thankfully, I was able to prove that there was &lt;a href="http://www.syslog.cc/ietf/existing-syslog.html"&gt;no common understanding on the date part in different syslog servers&lt;/a&gt; (actually, the research showed that nothing but PRI is common among syslogds...). So we went ahead and decided that PRI must be kept as is - to favor compatibility.&lt;br /&gt;&lt;br /&gt;As I said, I did not like the decision at that time and I still do not like the very limited number of facilities that it provides to us (actually, I think facility is mostly useless). However, I have accepted that there is wisdom in trying to remain compatible with existing receivers - we will stick with them for a long time.&lt;br /&gt;&lt;br /&gt;So I have to admit that I think it was a good decision to demand that PRI be compatible. With structured data and the other header fields, we do have ways of specifying different "facilities", that is, originating processes. Take this approach: look at facility as a down-level filtering capability. If you have a new syslogd (or write one!) 
make sure you can filter on all the other rich properties and not just facility.&lt;br /&gt;&lt;br /&gt;In essence, I think this is the story why, in 2009, we still have the old-style PRI inside syslog messages...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-6603434572766362948?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/6603434572766362948/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=6603434572766362948' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/6603434572766362948'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/6603434572766362948'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/03/why-is-there-still-pri-in-syslog.html' title='Why is there still PRI in a syslog message?'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='03130076873660943451'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6193377.post-988231397021302838</id><published>2009-03-12T18:22:00.000+01:00</published><updated>2009-03-12T18:22:47.890+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='software'/><category scheme='http://www.blogger.com/atom/ns#' term='rsyslog'/><title type='text'>How Software gets stable...</title><content type='html'>&lt;span style="font-weight: bold;"&gt;I have received a couple of questions the past days if this or that &lt;a href="http://www.rsyslog.com/"&gt;rsyslog&lt;/a&gt; 
feature can be introduced into the stable branch soon.&lt;/span&gt; So I thought it is time to blog about what makes software stable - and what not...&lt;br /&gt;&lt;br /&gt;But let me first start with something apparently unrelated: let me confess that, from time to time, I like to enjoy some good wine (Californian Merlot and Cabernet especially - ask me for my mailing address if you would like to contribute some! ;)). And on some special occasions, I spend way too much money just to get the "old stuff": those nice wines that have aged in oak barriques. To cut a long story short, those wines are stored in barrels not only for storage, but because the exposure to the oak, as well as some properties of the storage container, interact with the wine and make it taste better. Wikipedia has the &lt;a href="http://en.wikipedia.org/wiki/Oak_%28wine%29"&gt;full story&lt;/a&gt;, and also this interesting quote:&lt;br /&gt;&lt;blockquote&gt;&lt;span style="font-style: italic;"&gt;The length of time that a wine spends in the barrel is dependent on the varietal and style of wine that the winemaker wishes to make. The majority of oak flavoring is imparted in the first few months that the wine is in contact with oak but a longer term exposure can affect the wine through the light aeration that the barrel allows which helps to precipitate the phenolic compounds and quickens the aging process of the wine.[8] New World Pinot noir may spend less than a year in oak. Premium Cabernet Sauvignon may spend two years. The very tannic Nebbiolo grape may spend four or more years in oak. 
High end Rioja producers will sometimes age their wines up to ten years in American oak to get a desired earthy, vanilla character.&lt;/span&gt;&lt;br /&gt;&lt;/blockquote&gt;Read it again: "&lt;i&gt;High end Rioja producers will sometimes age their wines up to &lt;b&gt;ten years&lt;/b&gt; in American oak to get a desired earthy, vanilla character.&lt;/i&gt;"&lt;br /&gt;&lt;br /&gt;So what would the Riojan winemaker probably say if you asked him for a great 2008 wine (we are in early 2009 currently, just for the record)? How about "&lt;span style="font-style: italic;"&gt;Be patient, my friend - wait another 9 years, and you can enjoy it!&lt;/span&gt;" And what if you begged him you need it now, immediately? "&lt;span style="font-style: italic;"&gt;I am sorry, but I can't accelerate time...&lt;/span&gt;". And if you told him you really, really need it because otherwise you cannot close an important business deal? Maybe he says "&lt;span style="font-style: italic;"&gt;Listen my friend. Some things simply need time. You can't hurry them. But if you need to have something that can't really exist, I can get you a bottle of that wine and label it as 'Famous Riojan 10-year aged Wine from 2008' - but we both know what is in the bottle!&lt;/span&gt;". Technically speaking, the winemaker is not even cheating - he claims that the wine is from 2008, and so how can it be aged 10 years? If anyone buys that (today), that buyer is probably very much at fault.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;As a side-note, all too often our society works in that way: someone requests something that is impossible to do, someone begs long enough until someone else cheats, everybody knows - and we all are happy&lt;/span&gt; (at least up to the point where the cheat gets us into real trouble... - pick your favorite economic crisis to elaborate).&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;The moral of the story? Some things need time&lt;/span&gt;. 
And you can't replace time by anything else. If you want to have the real taste of a wine aged 10 years in oak... you need 10 years.&lt;br /&gt;&lt;br /&gt;By now &lt;span style="font-weight: bold;"&gt;you probably wonder what all of this has to do with software. A lot!&lt;/span&gt; Have you ever thought what makes software stable? In closed source, you hopefully have a large testing department that helps you nail down bugs. In open source, you usually do not have many of these folks, but you have something much better: a community of loyal users eager to break their systems with the latest and greatest of what you happen to have thrown together ;)&lt;br /&gt;&lt;br /&gt;In either case, you start with a relatively unstable program and with each bug report (assuming you fix it), the software gets more stable. While fixing bugs, however, you may introduce new instabilities. The larger the fix, the larger the risk. So the more you change, the larger the need to re-test and the larger the probability that while one issue is fixed one (or more!) issues have been newly created. For very large fixes, you may even end with a much worse version of the software than you had before.&lt;br /&gt;&lt;br /&gt;Thankfully, a patch to fix a bug is usually much smaller than what was fixed. Often, it is just a few lines of code, so the risk to worsen things is low. Why is the patch usually just a few lines long? Simply because you fix some larger thing that usually works quite well. 
So you need to change some details which were not properly thought out and thus resulted in wrong behavior (if you made a design error, that's a different story...).&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;So the more bug reports you get, and the more of them you fix, the more stable the software gets.&lt;/span&gt; You may have seen some formal verifications in computer science, but in practice, for most applications, this is the simple truth about how things work.&lt;br /&gt;&lt;br /&gt;Now to new features: a new feature is usually the opposite of a bugfix: introducing a new feature tends to be a larger effort, touching much more code and adding code where code never has been ;) If you add new features, chances are high that you introduce new bugs. So &lt;span style="font-weight: bold;"&gt;with each feature added, you should expect that the stability of your code decreases&lt;/span&gt; (and, oh boy, it does!). So how to iron out these newly introduced bugs? Simply wait for bug reports, fix them, wait for more - until you have reached at least a decent level of stability (aka "no new/serious bug reports received for a period of n days", whatever you have defined n to be).&lt;br /&gt;&lt;br /&gt;And what if you then introduce a new feature? I guess by now you know: that'll decrease stability so you need to iterate through the bugfixing process ... and so on.&lt;br /&gt;&lt;br /&gt;But, hey, we are doing open source. I *love* to add features every day! Umm... I guess my program will never reach a decent level of stability. Bad...&lt;br /&gt;&lt;br /&gt;What to do? Taking a long vacation (tempting...) is not a real solution. Who will fix bugs while I am away (shame on me for mentioning this...)? 
But a pattern appears if you follow this thought: &lt;span style="font-weight: bold;"&gt;what you need to do to make a program stable is fix bugs for a period of time but refrain from adding new features!&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Thanks to &lt;a href="http://en.wikipedia.org/wiki/Git_%28software%29"&gt;git&lt;/a&gt;, this can easily be done: you simply create one code branch for a version that shall become stable, and create another branch for the version where you create new features (the development branch). With a bit of git voodoo, you can even import fixes from your stabilizing branch to the development branch. Once you are happy with the stability of your code (in the stabilizing branch), you are ready to declare it to be stable! For that, you'll probably have a separate branch. Then, you can start the game again: copy the state of your development branch to the stabilizing branch, do not touch that branch except for bug fixes and continue adding new features to the development branch. Iterate this as long as you are interested in your project.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;This, in short form, is how rsyslog is created.&lt;/span&gt; Currently, there are four main branches, plus a number of utility branches that aid the development of specific features (let's ignore them in this context here): we have the development (also called "master") branch which equates to the ... yes... development branch from the sample above ;). The stabilizing branch is called "beta" in rsyslog terms. Then, we have a v2-stable and a v3-stable branch. Both are actually stable, but v2-stable is probably even more stable because it has - except for bug fixes - not been touched for many months more. It also has the fewest features, so it is probably the best choice if you are primarily interested in stability and do not need any of the new features. As rsyslog is further developed, we will add extra stable branches (e.g. 
there will probably be a v4- and v5-stable branch - but we may also no longer maintain v2-stable at this point because nobody uses it any longer [just like dinosaurs are no longer maintained ;)]).&lt;br /&gt;&lt;br /&gt;Did you read carefully? Did you get the message?  So let me ask:&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;What makes software stable?&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Bug fixes? Testing? Money (yes, yes, please throw at me!)?&lt;br /&gt;&lt;br /&gt;REALLY? Let me repeat:&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;WHAT MAKES SOFTWARE STABLE?&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;There is only one real ingredient and that is: &lt;span style="font-weight: bold;"&gt;TIME!&lt;/span&gt; &lt;span style="font-weight: bold; font-style: italic;"&gt;Just like good wine, software needs to age. &lt;/span&gt;Thankfully, age, for software, is measured in the number of different test cases. So money can accelerate aging of software (as some chemistry guru may be able to do for wine, probably with the same side-effects...). But for the typical open source project, stability simply goes along with the rate at which the community adopts new releases, tests them AND submits bugs, so that the authors can work on fixing broken things.&lt;br /&gt;&lt;br /&gt;And what is the moral of the story? Finally, I am coming back to the opening questions: there is nothing but time that makes rsyslog stable. &lt;span style="font-weight: bold;"&gt;So if you ask me to add a feature today, and I do, you cannot expect it to be immediately stable - simply because this is not how things work&lt;/span&gt; (thanks, btw, for trusting so much in my programming abilities ;)). The new feature needs to go through all the stages, that is, it must first be applied to the current development build (otherwise we would de-stabilize the current beta, which is not desirable). 
Then, this is migrated to the beta build over time, where it can finally fully stabilize and, whenever the bug rate seems to justify this, it can move on to the stable build. For rsyslog, this typically means three to four months, sometimes more, are needed before a new feature hits the stable branches. And there is little you can do against that.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;"But... hey, I need a stable version of that cool feature now! My manager demands it. Hey, I'll also pay you for it..."&lt;/span&gt; Guess what? I can do the same the winemaker did. Of course, if you ask really nicely, I can create a v3-stable-cool version for you, which is a version with the cool feature that I have declared immediately stable (btw, it's mostly the same thing that all others just call "the beta").  If that satisfies your boss, I'll be happy to do so. But we both know what you have gotten... ;)&lt;br /&gt;&lt;br /&gt;Of course, I am exaggerating a bit here: in software, we can somewhat increase the speed of stabilizing by adding testers. Money (and even more motivation) can do that. &lt;span style="font-weight: bold;"&gt;We can also backport single new features to so-far stable branches&lt;/span&gt; (note the fine print!). This reduces the stability a bit, but obviously not as much as for the development version. However, this requires effort (read: time and/or money) and it may be impractical for many features. Some features simply rely on others that were newly introduced in that development version and if you backport the whole bunch of them, you'll have something changed as much as the development version, but in an environment where the component integration is not as well tested and understood. &lt;span style="font-weight: bold;"&gt;Of course, some company policies (seem to) force you to do that. 
If so, the end result is that you have a system that is much less stable than the development version, but has a seemingly "stable" label. &lt;/span&gt;Wow, how cool! As the common sense says: "everyone gets what one asks for" ;)&lt;br /&gt;&lt;br /&gt;So what is the bottom line? &lt;span style="font-weight: bold;"&gt;Good software and good wine have something in common: time to ripen!&lt;/span&gt; Think about this the next time you ask me to offer a new feature as part of a stable branch. It's simply impossible. But, of course, you can bribe me to stick that "stable" label onto a mangled-with version...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-988231397021302838?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/988231397021302838/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=988231397021302838' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/988231397021302838'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/988231397021302838'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/03/how-software-gets-stable.html' title='How Software gets stable...'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='03130076873660943451'/></author><thr:total 
xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6193377.post-1585178233767050883</id><published>2009-03-12T17:09:00.006+01:00</published><updated>2009-03-12T18:05:22.572+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='iss'/><category scheme='http://www.blogger.com/atom/ns#' term='space'/><title type='text'>ISS under debris hit threat!</title><content type='html'>In case you have not yet heard it on the twittersphere, here is something you should really look into: there is a so-called "red" threat of the ISS being hit by debris. The ISS crew is currently closing hatches and preparing to move to the attached Soyuz return vehicle, in case this should be required. The full story is at &lt;a href="http://www.nasaspaceflight.com/2009/03/threat-to-iss-crew-soyuz/"&gt;nasaspaceflight.com&lt;/a&gt;. I also strongly recommend dialing in to &lt;a href="http://www.nasa.gov/multimedia/isslivestream.asx "&gt;NASA mission audio&lt;/a&gt;. The critical time is 5 minutes around 11:39am CDT.&lt;br /&gt;&lt;br /&gt;I think I found the following two interesting links to track the &lt;a href="http://www.n2yo.com/?s=25090" target="PAM-D25090"&gt;debris&lt;/a&gt; and the &lt;a href="http://www.n2yo.com/?s=25544" target="iss"&gt;International Space Station&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Thankfully, the event is now over and nothing happened (no news is good news :-)).&lt;br /&gt;&lt;br /&gt;Here is a picture of the two satellite trackers around the time of the close encounter. 
Have a look at latitude, longitude and elevation in the trackers.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_SY7HcOPO2vE/SblAic9kOhI/AAAAAAAAAEY/Y8npL7h_24I/s1600-h/issDebris20090312.jpg"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 324px;" src="http://1.bp.blogspot.com/_SY7HcOPO2vE/SblAic9kOhI/AAAAAAAAAEY/Y8npL7h_24I/s400/issDebris20090312.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5312348196094360082" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-1585178233767050883?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/1585178233767050883/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=1585178233767050883' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/1585178233767050883'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/1585178233767050883'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/03/iss-unter-debris-hit-threat.html' title='ISS under debris hit threat!'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='03130076873660943451'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_SY7HcOPO2vE/SblAic9kOhI/AAAAAAAAAEY/Y8npL7h_24I/s72-c/issDebris20090312.jpg' height='72' width='72'/><thr:total 
xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6193377.post-2980159807245150638</id><published>2009-03-12T12:50:00.002+01:00</published><updated>2009-03-12T13:11:58.099+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rsyslog'/><title type='text'>Platform importance for rsyslog</title><content type='html'>If you follow my blog or the &lt;a href="http://lists.adiscon.net/mailman/listinfo/rsyslog"&gt;rsyslog mailing list&lt;/a&gt;, you probably already know that rsyslog is available on a number of platforms. Thanks to contributors, &lt;a href="http://www.rsyslog.com"&gt;rsyslog&lt;/a&gt; runs on BSD and is seen on Solaris and HP-UX too. The latter two are not real ports yet and each has its restrictions. Also, I'd like to see support for AIX, but have not yet even been able to obtain a compile platform.&lt;br /&gt;&lt;br /&gt;HOWEVER... as much as I desire multi-platform support, it is the truth that rsyslog stems from and is fueled by the Linux community. This is where the major contributions come from and this is also where the major interest originates. Plus, this is the only truly free platform, so it lives up to the same spirit that rsyslog has.&lt;br /&gt;&lt;br /&gt;When it comes to putting effort into the project, I have limited resources. Naturally, I put those resources where they have the most effect. For that reason, most of the development is focused on Linux (followed by BSD, where there is also an active community). Solaris and friends live mostly in the corporate world and so questions asking for rsyslog on these platforms mostly come from for-profit organizations. And there are very few of these requests. So I cannot give them priority, because they do not benefit the project sufficiently. 
HOWEVER, if the corporations put some money up and &lt;a href="http://www.rsyslog.com/Article312.phtml"&gt;sponsor development&lt;/a&gt;, that is definitely in the interest of the project, because it allows us to grow and the sponsorship will probably allow us to do other things as well. Everyone benefits.&lt;br /&gt;&lt;br /&gt;Once a platform is implemented, it must be maintained. Obviously, there is little point in orphaning a platform that we already run on. But for platforms with little interest, it is probably not justified to test each and every new release (just think of the testing time required). I'd call those platforms "tier 2" platforms and think I can look at them only in response to a problem report. Of course, we offer &lt;a href="http://www.rsyslog.com/doc-professional_support.html"&gt;rsyslog support contracts&lt;/a&gt; and if a sufficiently large number of users decide to purchase these contracts (extremely low numbers today, to phrase it politely) and these purchasers are interested in e.g. Solaris, we will most probably change priorities and all of a sudden Solaris will become "tier 1". Of course, this may push away some community-requested work, but again I think this is in the overall interest of the project: if we can secure continuous funding, not only from one source (&lt;a href="http://www.adiscon.com"&gt;Adiscon&lt;/a&gt;), but many, we can be much more sure we can implement more and more cool things in the future.&lt;br /&gt;&lt;br /&gt;I hope this clarifies my position on the importance of the various platforms for rsyslog and how I will handle them. &lt;br /&gt;&lt;br /&gt;Oh, and one final note: if a platform requires me to even purchase hardware (Solaris/Sparc for example), I will not do that unless someone donates a machine (NOT LEND it, but donate, so that at least for the next three years I can ensure maintaining rsyslog on it - a virtual machine, of course, is sufficient if you happen to have some inside a cloud ;)). 
It would be just plainly silly to put real money at supporting a community that does not contribute back ;)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-2980159807245150638?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/2980159807245150638/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=2980159807245150638' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/2980159807245150638'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/2980159807245150638'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/03/platform-importance-for-rsyslog.html' title='Platform importance for rsyslog'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='03130076873660943451'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6193377.post-5978727225349319712</id><published>2009-03-12T10:36:00.003+01:00</published><updated>2009-03-12T10:47:31.267+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rsyslog'/><title type='text'>rsyslog video tutorials...</title><content type='html'>I started thinking about video tutorials a few days ago. Videos are cool and more and more people use them. So why not create a couple of them for rsyslog?&lt;br /&gt;&lt;br /&gt;The idea is simple and I think it will work equally well for teaching both conceptual topics as well as practical "how to" types of problems. 
The latter probably works even better...&lt;br /&gt;&lt;br /&gt;I could investigate, design and build my tutorial in a perfect way. The result would obviously be very useful and perfect - but most probably there never would be any result due to time constraints and priorities. With this in mind, I created a very first trial tutorial this morning, all in all in less than an hour. It took me some more minutes to get it up on the web site, but this effort will never again be required.&lt;br /&gt;&lt;br /&gt;The question this trial shall answer is: is it possible to create something useful (not perfect) in little time? My personal feeling is mixed. I think one notices quickly that the material is not as well organized as you would expect from a talk. Also, some additional slides would definitely have enhanced the usefulness - but also increased production time very much. On the other hand, I think some information is conveyed by the presentation. And, even better, information that you cannot obtain with reasonable effort from any other place.&lt;br /&gt;&lt;br /&gt;So: is it useful or not? What could improve the usefulness without causing a large increase in production time? Does it make sense to create sub-optimal content if that means it can be created quickly? 
If so, which other topics would you like to see covered?&lt;br /&gt;&lt;br /&gt;Please have a look at the &lt;a href="http://www.rsyslog.com/Article350.phtml"&gt;rsyslog message flow video tutorial&lt;/a&gt; and let me know your thoughts!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-5978727225349319712?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/5978727225349319712/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=5978727225349319712' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/5978727225349319712'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/5978727225349319712'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/03/rsyslog-video-tutorials.html' title='rsyslog video tutorials...'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='03130076873660943451'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6193377.post-5557292714523753758</id><published>2009-03-06T15:28:00.005+01:00</published><updated>2009-03-06T16:01:21.228+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rsyslog'/><title type='text'>rsyslog and solaris</title><content type='html'>This week, I had the opportunity to work a bit on &lt;a href="http://www.rsyslog.com"&gt;rsyslog&lt;/a&gt; on Solaris. 
Most importantly, I could set up a compile and test environment (*not* that easy if you don't know your way around Solaris...) and have integrated those patches that folks have sent over time (unfortunately I have lost many of the contributor names, so if you are among them please let me know for proper credits!).&lt;br /&gt;&lt;br /&gt;I was able to integrate those patches and make sure that they don't break the Linux build (I am still in the verification process, but it looks good). I have created a solaris branch in git and will in the future keep Solaris-specific additions in that branch. I will merge that branch back into the master branches every time I am confident enough that it doesn't break anything in the mainstream build.&lt;br /&gt;&lt;br /&gt;I was satisfied to see that not that many changes were required for a Solaris build. So the initial effort, some months ago, seems to have paid off. I have seen that the solaris git branch compiles, but I have not done any serious testing on Solaris. Still, I am short on time and I have to admit I have spent more time on it this week than I should. So testing is off-limits for now...&lt;br /&gt;&lt;br /&gt;However, I got a good impression of what it takes to make rsyslog really run on Solaris. First of all, even gcc4 does not provide the atomic instructions that it provides on Linux. This case is not really handled in the code, so the end result is that the binary will be racy. I guess it will run, but it will have subtle issues on high-volume log servers and/or servers that run asynchronous action queues. Especially if the latter is used, I'd expect rsyslogd to segfault every now and then (but without async actions it should not be that bad, at least I think).&lt;br /&gt;&lt;br /&gt;There also still is no kernel input plugin (or an imklog driver). I also guess there may be issues with the local log socket. 
I'd still caution everybody to be very, very careful when experimenting with the local log socket. I remember earlier testing where rsyslogd simply destroyed the socket but never was able to re-create it. Some other tweaks are probably required to core and runtime files. Some compiler messages point in that direction (and part of that may even be nasty).&lt;br /&gt;&lt;br /&gt;I have compiled only the bare essentials, without TLS, database drivers or anything else fancy. I expect some mild to moderate problems with them, too.&lt;br /&gt;&lt;br /&gt;So in short, the current code base can probably be used to run a relatively stable syslog relay or file-only receiver. I wouldn't rely on it too much in production, though. For folks interested in rsyslog on Solaris, we now at least have a version again that can be built and can serve as a basis for extension. I am glad I could do that.&lt;br /&gt;&lt;br /&gt;As a side-note, I am still looking for sponsors of a full &lt;a href="http://www.rsyslog.com/Article313.phtml"&gt;rsyslog Solaris porting effort&lt;/a&gt;. 
If you would like to sponsor (or know someone who does), just &lt;a href="mailto:rgerhards%40adiscon.com"&gt;mail me&lt;/a&gt; and I'll help settle the dirty details ;)&lt;br /&gt;&lt;br /&gt;I hope this update - and the progress made - on rsyslog on Solaris is useful for a couple of folks.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-5557292714523753758?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/5557292714523753758/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=5557292714523753758' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/5557292714523753758'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/5557292714523753758'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/03/rsyslog-and-solaris.html' title='rsyslog and solaris'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='03130076873660943451'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6193377.post-6725664015373927655</id><published>2009-03-03T16:35:00.027+01:00</published><updated>2009-03-03T18:49:18.615+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='cologne'/><category scheme='http://www.blogger.com/atom/ns#' term='building collapse'/><title type='text'>cologne municipal archive building collapsed</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" 
href="http://2.bp.blogspot.com/_SY7HcOPO2vE/Sa1P20XPStI/AAAAAAAAAEA/OGPMLdjuQk8/s1600-h/CologneArchive.jpg"&gt;&lt;img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; width: 200px; height: 150px;" src="http://2.bp.blogspot.com/_SY7HcOPO2vE/Sa1P20XPStI/AAAAAAAAAEA/OGPMLdjuQk8/s200/CologneArchive.jpg" alt="" id="BLOGGER_PHOTO_ID_5308987338927327954" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;In Cologne, Germany, the municipal archive collapsed today at around 2pm. People are believed to be trapped in the building. It is feared that lives have been lost (according to Cologne newspaper Express, no known deaths at this time [7:10p], but 9 people missing [5:40p]). It was a typical business day and there were both clerks and customers inside the building. However, no official statement exists yet. According to Reuters (4:55p), officials said at least one person was injured and others were possibly trapped in the collapsed building.&lt;br /&gt;&lt;br /&gt;As some people told German media, there have been subway construction works close to the collapsed buildings. Sources say subway workers ran out of the construction site and yelled. That led to some people fleeing the building. According to one eyewitness, some other, smaller buildings have also collapsed in the meantime (4:40p). The witness says the road sagged. According to Cologne radio station WDR, the building actually collapsed into a newly-built subway tunnel. The road is said to be wide open, having also collapsed into it (~5p).&lt;br /&gt;&lt;br /&gt;While this is speculation, it looks like the subway construction caused shifts of earth masses, which ultimately resulted in the collapse of the building. Cologne subway operator KVB says there was no larger construction work below the building at this moment. 
If that is true, it may be the result of a larger chain of events (and hopefully the last in that chain...).&lt;br /&gt;&lt;br /&gt;On German radio station SWR3, a neighbor said that a nearby church was close to collapse due to subway work. This situation seems to have been resolved in the meantime.&lt;br /&gt;&lt;br /&gt;Last week, the site was part of the large Cologne carnival parade. One cannot imagine what might have happened if the collapse had occurred at that time.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Some pictures of the site before the incident:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The webcam I quote below seems to have been right inside the collapsed building (speculation on my part). I was able to connect to the webcam server five times now; the picture is always the one below. I guess that was the last picture the webcam ever took. If so, the collapse happened shortly after 2:20pm:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_SY7HcOPO2vE/Sa1qjCpC-UI/AAAAAAAAAEQ/TGBoODlOnx8/s1600-h/Cologne-Severinstrasse-Before.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 400px; height: 300px;" src="http://2.bp.blogspot.com/_SY7HcOPO2vE/Sa1qjCpC-UI/AAAAAAAAAEQ/TGBoODlOnx8/s400/Cologne-Severinstrasse-Before.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5309016685976680770" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Google maps for orientation (you see it happened right in a densely populated area):&lt;br /&gt;&lt;iframe width="425" height="350" frameborder="0" scrolling="no" marginheight="0" marginwidth="0" src="http://maps.google.com/maps/ms?msa=0&amp;amp;msid=101651981393006609179.00046437e418e9d707673&amp;amp;ie=UTF8&amp;amp;t=h&amp;amp;ll=50.930921,6.957149&amp;amp;spn=0.000906,0.001717&amp;amp;output=embed&amp;amp;s=AARTsJr1n-TQw7_b0EAUxCbZJgL9cLblUA"&gt;&lt;/iframe&gt;&lt;br /&gt;&lt;small&gt;&lt;a 
href="http://maps.google.com/maps/ms?msa=0&amp;amp;msid=101651981393006609179.00046437e418e9d707673&amp;amp;ie=UTF8&amp;amp;t=h&amp;amp;ll=50.930921,6.957149&amp;amp;spn=0.000906,0.001717&amp;amp;source=embed" style="color:#0000FF;text-align:left"&gt;View Larger Map&lt;/a&gt;&lt;/small&gt;&lt;br /&gt;&lt;br /&gt;&lt;iframe width="425" height="350" frameborder="0" scrolling="no" marginheight="0" marginwidth="0" src="http://maps.google.com/maps?f=q&amp;amp;source=s_q&amp;amp;hl=en&amp;amp;geocode=&amp;amp;q=stadtarchiv,+k%C3%B6ln+severinstrasse,+germany&amp;amp;sll=37.0625,-95.677068&amp;amp;sspn=37.273371,56.25&amp;amp;ie=UTF8&amp;amp;ll=50.938311,6.961298&amp;amp;spn=0.004287,0.008583&amp;amp;t=h&amp;amp;z=14&amp;amp;iwloc=A&amp;amp;cid=1965398558302046233&amp;amp;output=embed&amp;amp;s=AARTsJq5cEVELX8v0Tbotj9TJdxumFxu8w"&gt;&lt;/iframe&gt;&lt;br /&gt;&lt;small&gt;&lt;a href="http://maps.google.com/maps?f=q&amp;amp;source=embed&amp;amp;hl=en&amp;amp;geocode=&amp;amp;q=stadtarchiv,+k%C3%B6ln+severinstrasse,+germany&amp;amp;sll=37.0625,-95.677068&amp;amp;sspn=37.273371,56.25&amp;amp;ie=UTF8&amp;amp;ll=50.938311,6.961298&amp;amp;spn=0.004287,0.008583&amp;amp;t=h&amp;amp;z=14&amp;amp;iwloc=A&amp;amp;cid=1965398558302046233" style="color:#0000FF;text-align:left"&gt;View Larger Map&lt;/a&gt;&lt;/small&gt;&lt;br /&gt;&lt;br /&gt;The municipal archive was not only a historical building; it also held important historical documents (see description below). I guess that many of these documents have been lost, but hope that many can be recovered. According to Cologne's official web site, it was one of the largest municipal archives in Germany, holding original documents from over a thousand years of history. As it looks, there do not seem to have been any Roman artifacts inside the building.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Correction:&lt;/b&gt; the building itself was not historical; it was erected in 1971. 
There is a picture of it available at the &lt;a href="http://www.spiegel.de/fotostrecke/fotostrecke-40276-10.html"&gt;German news site Spiegel online&lt;/a&gt; (you may need to go back and forth as they add pictures - this does not look like a permanent link).&lt;br /&gt;&lt;br /&gt;Links:&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;&lt;a href="http://www.wdr.de/mediathek/html/regional/2009/03/03/wdr-extra-stadtarchiv-einsturz.xml"&gt;German Television&lt;/a&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.spiegel.de/fotostrecke/fotostrecke-40276.html#backToArticle=611081"&gt;German news site (Picture Story)&lt;/a&gt;&lt;br /&gt;&lt;li&gt;&lt;a href="http://www.flickr.com/photos/cedricmay/"&gt;Flickr photo stream&lt;/a&gt;&lt;br /&gt;&lt;li&gt;&lt;a href="http://stadtbahn.relaunch.net/webcams/kamera4.html"&gt;Webcam&lt;/a&gt; - currently not operational...&lt;br /&gt;&lt;li&gt;&lt;a href="http://www.wdr.de/themen/panorama/26/koeln_hauseinsturz/index.jhtml"&gt;Cologne Radio Station site&lt;/a&gt;&lt;br /&gt;&lt;li&gt;&lt;a href="http://www.kingg.de/?p=1341"&gt;Blogger in a building opposite the site&lt;/a&gt; (in German)&lt;br /&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;a href="http://translate.google.com/translate?prev=_t&amp;hl=en&amp;ie=UTF-8&amp;u=http%3A%2F%2Fwww.stadt-koeln.de%2F5%2Fkulturstadt%2Fhistorisches-archiv%2F&amp;sl=de&amp;tl=en&amp;history_state0=&amp;swap=1"&gt;Description of Cologne historical archives&lt;/a&gt; (translated via Google Translate)&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;I'll stop compiling news now (6:20p); nothing really new has appeared in the past hour. I guess the situation needs to clear up first. Mainstream media will probably have good coverage tomorrow. If you hear anything interesting, please let me know (e.g. 
by commenting).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-6725664015373927655?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/6725664015373927655/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=6725664015373927655' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/6725664015373927655'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/6725664015373927655'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/03/cologne-municipal-archive-collapsed.html' title='cologne municipal archive building collapsed'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='03130076873660943451'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_SY7HcOPO2vE/Sa1P20XPStI/AAAAAAAAAEA/OGPMLdjuQk8/s72-c/CologneArchive.jpg' height='72' width='72'/><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6193377.post-784946317464884609</id><published>2009-03-02T10:43:00.004+01:00</published><updated>2009-03-02T11:04:01.979+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rsyslog'/><title type='text'>rsyslog doc - state of the art...</title><content type='html'>&lt;span style="font-weight:bold;"&gt;Most people agree that &lt;a href="http://www.rsyslog.com"&gt;rsyslog&lt;/a&gt; is a decent and useful piece of software. 
However, most people (including me) also agree that the &lt;a href="http://www.rsyslog.com/doc"&gt;rsyslog documentation&lt;/a&gt; is, ahem, sub-optimal.&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;When I code, I always think "I'll do the doc soon". But when "soon" arrives, something else is in the way. Yet another (justified) feature request, articles and other projects (yes, they exist ;)). At least I try to convey the important concepts and backgrounds here in the blog, but you have a hard time if you try to look up a specific feature in the blog. So: the doc is in bad shape. &lt;br /&gt;&lt;br /&gt;I just got an offer from a volunteer who would like to help with the doc. That may even be the start of a rsyslog doc team. In any case, that's a fantastic opportunity. First of all, more doc means more and happier users. Secondly, I think it is very useful when someone other than me writes user doc. I can't even envision the questions that a regular user may ask, and this is a problem for any manual I write.&lt;br /&gt;&lt;br /&gt;I hope this collaboration materializes. In order to aid it, let me briefly describe what currently exists: &lt;a href="http://www.rsyslog.com"&gt;www.rsyslog.com&lt;/a&gt; is driven by &lt;a href="http://www.postnuke.com/"&gt;Postnuke&lt;/a&gt; for various reasons, the most important one being that I have a Postnuke wiz at hand, so I do not need to dig into any dirty details if I need something extra ;) Postnuke is a CMS, so dynamic content can be added and is easy to edit by anyone else. So far, we use the web site itself primarily for news announcements. &lt;br /&gt;&lt;br /&gt;The real doc set is kept as HTML. We use a Postnuke module to integrate that static HTML into the CMS. The HTML doc set exists only once, right inside the rsyslog git tree. When I make changes, they automatically go into git, go into the tarball and I also copy them over to the web site. All of this takes hardly any effort, which is good. 
The bottom line is that the HTML doc set needs to be modified by patches or by me pulling from someone else's git archive (both of which I will happily do). I think it is good to have the HTML pages available in the tarball; previous discussion on the rsyslog mailing list showed that package maintainers think so, too.&lt;br /&gt;&lt;br /&gt;There exist two man pages. They are extremely bad. They need to be hand-synced with the HTML pages and I almost always forget to do so. Man pages do not go onto the web (besides some very old copies I produced in a clumsy way). But they live in git and the tarball, too.&lt;br /&gt;&lt;br /&gt;A partial effort was made to internationalize the doc set, based on the usage of docbook. I think this is a good approach and the work done so far is kept in the &lt;a href="http://git.adiscon.com/?p=rsyslog.git;a=shortlog;h=refs/heads/docbook"&gt;rsyslog docbook branch&lt;/a&gt;. However, the approach currently focuses on the man pages. I do not know if it will work for the HTML doc, too.&lt;br /&gt;&lt;br /&gt;I find docbook a very interesting concept, but the learning curve is steep. I simply have not had enough time yet to dig deeply into it and start any serious work with it (html and &lt;a href="http://www.latex-project.org/"&gt;LaTeX&lt;/a&gt; are still king for me ;)).&lt;br /&gt;&lt;br /&gt;We also have a few places for user-contributed content, the most important one being the &lt;a href="http://wiki.rsyslog.com"&gt;rsyslog wiki&lt;/a&gt;. It contains many useful things, among others config samples. The bad thing about the wiki is that there is only a single one. So it probably is not the place to describe things that are very version dependent. Or is it, and I just have the wrong approach? Correct me!&lt;br /&gt;&lt;br /&gt;Also worth mentioning is the &lt;a href="http://kb.monitorware.com/rsyslog-f40.html"&gt;rsyslog knowledge base&lt;/a&gt;, which primarily focuses on dynamic content and discussions. 
But its search function is a very useful tool. Also, part of the larger knowledge base is devoted to gathering information on how to configure syslog devices, how to best react to messages and how to consolidate e.g. Windows events. This obviously is not direct rsyslog documentation, but I hope it is useful and will continue to grow even more useful.&lt;br /&gt;&lt;br /&gt;Finally, there is the mailing list and most importantly the &lt;a href="http://lists.adiscon.net/pipermail/rsyslog/"&gt;mailing list archive&lt;/a&gt;. While this is definitely not considered a documentation resource, the archive has a lot of valuable information and it may even be a starting point for creating "real" doc.&lt;br /&gt;&lt;br /&gt;I hope this is a good and complete wrap-up of the doc situation. If I have forgotten anything or you'd like to tell me your thoughts: just use the comment function! :)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-784946317464884609?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/784946317464884609/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=784946317464884609' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/784946317464884609'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/784946317464884609'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/03/rsyslog-doc-state-of-art.html' title='rsyslog doc - state of the art...'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' 
value='03130076873660943451'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6193377.post-6960739687227903359</id><published>2009-02-19T14:13:00.005+01:00</published><updated>2009-02-19T14:49:40.109+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='syslog'/><category scheme='http://www.blogger.com/atom/ns#' term='rsyslog'/><title type='text'>calling for log samples!</title><content type='html'>Now I join the mass of people who are asking for log samples. But I do so for a good reason :) Also, I do not need a lot; a single log message works well for my needs. I need them to improve rsyslog so that the parser can handle exotic message formats even better. So the short story is: if you have a syslog message, please provide it to me. &lt;br /&gt;&lt;br /&gt;And here is the long story:&lt;br /&gt;&lt;br /&gt;One of the strengths of &lt;a href="http://www.rsyslog.com"&gt;rsyslog&lt;/a&gt; is that it is very much focused on standards. That also means it tries to parse syslog messages according to the relevant RFCs. Unfortunately, syslog has been standardized only recently and so there is no real standard for what to expect inside the header. So rsyslog's strength is also its weakness: if messages are ill-formed, results are often suboptimal.&lt;br /&gt;&lt;br /&gt;I am working around this by doing smart guesswork inside the legacy syslog parser. However, every now and then some folks pop up with problems. And, more importantly, some others do not even ask. &lt;a href="http://twitter.com/rgerhards"&gt;On my twitter account&lt;/a&gt;, I recently saw one such frustration. In that case, timestamps were duplicated. I guess that was caused by something unexpected inside the timestamp. However, I was not able to get down to the real problem, because I did not have access to the raw message. 
That's an important point: I need the &lt;b&gt;raw message content&lt;/b&gt;, not what usually ends up in the logfile. The latter is already parsed, processed and recombined, so it does not tell me what the actual message is. But I need the actual message to improve the parser.&lt;br /&gt;&lt;br /&gt;What I would like to do is create a very broad test suite with a vast number of real-life syslog formats. The message text itself is actually not so important to me at this stage. It is the header format. If I get this, I'd like to analyze the different ways in which the format is malformed and then try to find ways to handle them inside the parser. If I find out that I cannot detect the right format in all cases automatically, I may find ways to configure the different formats. The end result, I hope, will be far more plug-and-play message detection, something that should be of great benefit for all users.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Please contribute your logs!&lt;/span&gt; I need logs from many different devices, with many different versions. But I need only a few lines from each one. For each individual contributor, there is not a lot of effort required. Even a single log line would be great (ten or so would be even greater). Just please don't mangle the logs and provide me with &lt;span style="font-weight:bold;"&gt;raw&lt;/span&gt; log messages. That's probably the hardest part. One way to do it is to sniff them off the wire, for example with &lt;a href="http://www.wireshark.org/"&gt;Wireshark&lt;/a&gt;. Another way is to use rsyslog itself. All you need is a special template and an output file using it:&lt;br /&gt;&lt;br /&gt;$template rawmsg,"%rawmsg%\n"&lt;br /&gt;*.* /path/to/raw-file.log;rawmsg&lt;br /&gt;&lt;br /&gt;Add this to your rsyslog.conf, restart rsyslog, make the device emit a few lines and mail me the result to &lt;a href="mailto:rgerhards@gmail.com"&gt;rgerhards@gmail.com&lt;/a&gt;. 
You may also simply post the log sample to the &lt;a href="http://kb.monitorware.com/syslog-log-samples-t8934.html"&gt;sample log thread on the rsyslog forum&lt;/a&gt; - whatever you prefer. After you have done that, you can remove the lines from rsyslog.conf again. Before you mail me, it is a good idea to check if there is any sensitive information inside the log file. Feel free to delete any lines you wish, but I would appreciate it if you did not modify line contents. Also, it would be useful for me if you let me know which device, vendor and version produced the log.&lt;br /&gt;&lt;br /&gt;I hope that you can help me improve the rsyslog parser even more. Besides, it will probably be a very interesting experiment to see how different syslog messages really are.&lt;br /&gt;&lt;br /&gt;Thanks in advance for all contributions. Please let them flow!&lt;br /&gt;&lt;br /&gt;Rainer&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-6960739687227903359?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/6960739687227903359/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=6960739687227903359' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/6960739687227903359'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/6960739687227903359'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/02/calling-for-log-samples.html' title='calling for log samples!'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' 
value='03130076873660943451'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6193377.post-3979909149478231562</id><published>2009-02-16T18:39:00.003+01:00</published><updated>2009-02-16T18:50:18.765+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rsyslog'/><title type='text'>rsyslog now default on stable Debian</title><content type='html'>Hi all,&lt;br /&gt;&lt;br /&gt;good news today. Actually, the good news already happened last Saturday. The &lt;a href="http://www.debian.org"&gt;Debian&lt;/a&gt; project &lt;a href="http://www.debian.org/News/2009/20090214"&gt;announced the new stable Debian 5.0&lt;/a&gt; release.&lt;br /&gt;&lt;br /&gt;Finally having a new stable Debian is very good news in itself - congrats, Debian team. Your work is much appreciated!&lt;br /&gt;&lt;br /&gt;But this time, this was even better news for me. Have a look at the &lt;a href="http://debian.org/releases/lenny/i386/release-notes/ch-whats-new.en.html#system-changes"&gt;detailed release notes&lt;/a&gt; and you will know why: Debian now comes with a new syslogd, finally replacing sysklogd. And, guess what - &lt;a href="http://www.rsyslog.com"&gt;rsyslog&lt;/a&gt; is the daemon of choice! So it is time to celebrate for the rsyslog community, too. &lt;br /&gt;&lt;br /&gt;There were a couple of &lt;a href="http://wiki.debian.org/Rsyslog"&gt;good reasons for Debian to switch to rsyslog&lt;/a&gt;. Among others, an "active upstream" was part of the success, thanks for that, folks (though I tend to think that after the more or less unmaintained sysklogd package it did not take much to be considered "active and responsive" ;)).&lt;br /&gt;&lt;br /&gt;Special thanks go to Michael Biebl, who worked really hard to make rsyslog available on Debian. It is one thing to write a great syslogd, it is a totally different one to integrate it into a distro's infrastructure. 
Michael has done a tremendous job, and I think this is his success at least as much as it is mine. He is very eager to get all the details right and has provided excellent advice to me very often. Michael, thanks for all of this and I hope you'll share a virtual bottle of Champagne with me ;)&lt;br /&gt;&lt;br /&gt;Also, the rsyslog community deserves sincere thanks. Without folks that spread the word and help others get rsyslog going this project wouldn't see the success it experiences today.&lt;br /&gt;&lt;br /&gt;I am very happy to have rsyslog now running by default on Fedora and Debian, as well as a myriad of derivatives. Thanks to everyone who helped make this happen. So on to a nice, little celebration!&lt;br /&gt;&lt;br /&gt;Thanks again,&lt;br /&gt;Rainer&lt;br /&gt;&lt;br /&gt;PS: promise: we'll keep rsyslog in excellent shape and continue in our quest for a world-class syslog and event processing subsystem!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-3979909149478231562?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/3979909149478231562/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=3979909149478231562' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/3979909149478231562'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/3979909149478231562'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/02/rsyslog-now-default-on-stable-debian.html' title='rsyslog now default on stable Debian'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty 
xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='03130076873660943451'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6193377.post-571568708191721389</id><published>2009-02-10T16:36:00.004+01:00</published><updated>2009-02-10T16:45:07.302+01:00</updated><title type='text'>screwed up on LinkedIn ;)</title><content type='html'>A couple of days ago, I created a &lt;a href="http://www.linkedin.com/groups?gid=1761607&amp;trk=hb_side_g"&gt;rsyslog group on LinkedIn&lt;/a&gt;. Then I was curious what would happen. Well, nothing. Nothing at all. So I thought it was probably not the right time for such a thing.&lt;br /&gt;&lt;br /&gt;And, surprise, surprise, today I browsed through LinkedIn and saw there were 16 join requests. Oops... there seem to be no email notifications for them. Bad... Well, I approved all folks. If you were one of them and now read this blog post: please accept my apologies! Obviously, this was just another time I screwed up on the Internet...&lt;br /&gt;&lt;br /&gt;To prevent any further such incidents, I have now set the group to automatically approve everyone who is interested in joining. 
That's great for this type of group, actually I am happy for everyone who comes along ;)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-571568708191721389?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/571568708191721389/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=571568708191721389' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/571568708191721389'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/571568708191721389'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/02/screwed-up-on-linked-in.html' title='screwed up on LinkedIn ;)'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='03130076873660943451'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6193377.post-8748328467642291416</id><published>2009-02-06T11:54:00.002+01:00</published><updated>2009-02-06T12:00:54.102+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rsyslog'/><title type='text'>When does rsyslog close output files?</title><content type='html'>I had an interesting question on the rsyslog mailing list that boils down to &lt;span style="font-weight: bold;"&gt;w&lt;/span&gt;&lt;span style="font-weight: bold;"&gt;hen &lt;a href="http://www.rsyslog.com/"&gt;rsyslog&lt;/a&gt; closes output files&lt;/span&gt;. 
So I thought I'd talk a bit about it in my blog, too.&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;&lt;br /&gt;What we need to look at is when a file is closed.&lt;/span&gt; It is closed when there is a need to. So, when is there a need? There are currently three cases where the need arises:&lt;br /&gt;&lt;br /&gt;a) HUP or restart&lt;br /&gt;b) output channel max size logic&lt;br /&gt;c) change in filename (for dynafiles, only)&lt;br /&gt;&lt;br /&gt;I think a) needs no further explanation. Case b) should also be self-explanatory: if an output channel is set to a maximum size, and that size is reached, the file is closed and a new one is opened. So for the time being let's focus on case c):&lt;br /&gt;&lt;br /&gt;I simplified a bit. Actually, the file is not closed immediately when the file name changes. The file is kept open, in a kind of cache. So when the very same file name is used again, the file descriptor is taken from the cache and there is no need to call the open and close APIs (very time consuming). The usual case is that something like HOSTNAME or TAG is used in dynamic filename generation. In these cases, it is quite common that a small set of different filenames is written to. So with the cache logic, we can ensure that we have good performance no matter in what order messages come in (generally, they arrive in random order and thus there is a large probability that the next message will go to a different file on a sufficiently busy system). A file is actually closed only if the cache runs out of space (or cases a) or b) above happen).&lt;br /&gt;&lt;br /&gt;Let's look at how this works. We have the following message sequence:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Host Msg&lt;br /&gt;A    M1&lt;br /&gt;A    M2&lt;br /&gt;B    Ma&lt;br /&gt;A    M3&lt;br /&gt;B    Mb&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;and we have a filename template that, for simplicity, consists of only %HOSTNAME%. 
What now happens is that with the first message the file "A" is opened. Obviously, messages M1 and M2 are written to file "A". Now, Ma comes in from host B. The file name is evaluated anew, so Ma is written to file B. Then M3 goes to file A again and Mb to file B.&lt;br /&gt;&lt;br /&gt;As you can see, the messages are put into the right files, and these files are only opened once. So far, they have not been closed (and will not be until either a) or b) happens), because we have just two file descriptors and those can easily be kept in the cache (the current default for the cache size is, I think, 100).&lt;br /&gt;&lt;br /&gt;I hope this is useful information.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-8748328467642291416?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/8748328467642291416/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=8748328467642291416' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/8748328467642291416'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/8748328467642291416'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/02/when-does-rsyslog-close-output-files.html' title='When does rsyslog close output files?'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='03130076873660943451'/></author><thr:total 
xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6193377.post-2885120729147265382</id><published>2009-02-05T18:45:00.003+01:00</published><updated>2009-02-05T18:52:11.225+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='reliable'/><category scheme='http://www.blogger.com/atom/ns#' term='logging'/><category scheme='http://www.blogger.com/atom/ns#' term='rsyslog'/><title type='text'>On the reliable plain tcp syslog issue ... again</title><content type='html'>&lt;span style="font-weight:bold;"&gt;Today, I thought hard about the reliable plain TCP syslog issue.&lt;/span&gt; Remember? I have ranted numerous times on why "&lt;a href="http://blog.gerhards.net/2008/04/on-unreliability-of-plain-tcp-syslog.html"&gt;plain tcp syslog is not reliable&lt;/a&gt;" (this link points to the initial entry), and I have shown that by design it is not possible to build a 100% reliable logging system without application level acks.&lt;br /&gt;&lt;br /&gt;However, it hit me during my morning shower (when else?) that &lt;span style="font-weight:bold;"&gt;we can at least reduce the issue we have with the plain TCP syslog protocol&lt;/span&gt;. At the core of the issue is the local TCP stack's send buffer. It enhances performance but also causes our app to not know exactly what has been transmitted and what has not. The larger the send buffer, the larger our "window of uncertainty" (WoU) about which messages made it to the remote end. So if we are prepared to sacrifice some performance, we can shrink this WoU. And we can simply do that by shrinking the send buffer. It's so simple that I wonder why a shower was required...&lt;br /&gt;&lt;br /&gt;In any case, I'll follow that route in rsyslog in the next few days. 
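&lt;br /&gt;&lt;br /&gt;To illustrate the idea, here is a minimal sketch in Python (not rsyslog's actual C code) of shrinking a TCP socket's send buffer via the standard SO_SNDBUF socket option. The function name and the 4096-byte default are made up for this example. The operating system may round or clamp the requested value, so the sketch reads the effective size back:

```python
import socket


def open_tcp_socket_with_small_sndbuf(requested_bytes=4096):
    """Create an unconnected TCP socket with a reduced send buffer.

    Returns the socket and the buffer size the OS actually applied.
    The name and the default size are hypothetical, for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask the kernel for a smaller send buffer; this is only a request.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested_bytes)
    # Read the value back, since the OS may round or clamp it
    # (Linux, for example, doubles the requested value).
    effective_bytes = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return sock, effective_bytes


if __name__ == "__main__":
    sock, effective = open_tcp_socket_with_small_sndbuf()
    print("effective send buffer:", effective, "bytes")
    sock.close()
```

The smaller the effective buffer, the fewer messages can sit unsent in the local stack when the connection breaks, at the cost of some throughput.&lt;br /&gt;&lt;br /&gt;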
&lt;span style="font-weight:bold;"&gt;But please don't get me wrong: plain TCP syslog will not be reliable if the idea works.&lt;/span&gt; It will just be less unreliable - but much less ;)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6193377-2885120729147265382?l=blog.gerhards.net'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.gerhards.net/feeds/2885120729147265382/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=6193377&amp;postID=2885120729147265382' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/2885120729147265382'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6193377/posts/default/2885120729147265382'/><link rel='alternate' type='text/html' href='http://blog.gerhards.net/2009/02/on-reliable-plain-tcp-syslog-issue.html' title='On the reliable plain tcp syslog issue ... again'/><author><name>Rainer</name><uri>http://www.blogger.com/profile/12765720626924376847</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='03130076873660943451'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry></feed>