<span style="font-weight: bold;">ASA9</span><br />adj. - of or related to 9 donkeys and/or David Donohue, MD<br />David Donohue (http://www.blogger.com/profile/00556308076450398293)<br /><br /><span style="font-weight: bold;">INQLE 0.1.4</span> (2008-10-01)<br /><br />Finally uploading the next release. Skipped 0.1.3 to emphasize the grandeur of 0.1.4. And doesn't it have an air of glory about it? "0.1.4".<br /><br />Bugs notwithstanding, this version has quite an advanced spreadsheet importer. I have not seen an RDF importer wizard of this sophistication. It can do all of the following things:<br /><br /><ul><li>It captures as RDF not only explicit data (i.e., that which sits in the spreadsheet) but also implicit data (i.e., data about the spreadsheet as a whole, like "this spreadsheet is a bunch of data collected in Wilmington, Delaware by so-and-so organization"). </li><li>It does not require users to enter RDF URIs, as I considered this to be a major usability impediment. </li><li>It has a lookup capability for finding subject classes. </li><li>It automatically retrieves known properties for the subject in question and provides form elements for entering literal values of those properties or for specifying which column in the spreadsheet contains the values in question. </li><li>It permits users to add their own subjects and/or properties. </li><li>It registers new subjects and properties in the Central INQLE Server, and makes these discoverable by other INQLE users running the importer wizard.</li></ul>The other big step forward, of course, is the release of the Central INQLE Server (CIS), for looking up RDF subjects and properties. The CIS will play a larger role in the future, in coordinating the cooperation between INQLE servers.<br /><br />Quite a nifty bit o' work if I do say so myself! And this is my excuse as to why it was almost 4 months in coming. 
Hopefully all this reengineering will pay off and make the process of importing data as RDF as easy and comprehensive as possible. My reasoning is that if you can import most any structure of CSV data, then you make it a whole lot easier for folks to use INQLE.<br /><br />Future work: hook the thing up to the full UMBEL subject database, so we can look up subjects and properties from a richer source. Plus all the billions of other things in the INQLE roadmap.<br />http://code.google.com/p/inqle/wiki/INQLE_Roadmap<br />David Donohue<br /><br /><span style="font-weight: bold;">Releasing INQLE version 0.1.2</span> (2008-07-07)<br /><br />This version of INQLE fixes a few minor bugs. It adds relatively little new end-user functionality. However, we have added a new plugin type: org.inqle.datasets. This plugin type permits plugins to specify a new internal dataset to contain the data associated with them. This facilitates quicker lookups. It was also a prerequisite for storing incoming info in the Central INQLE Server, an important feature which will come in the next release in a few weeks. More later.<br />Davo Donohue<br />David Donohue<br /><br /><span style="font-weight: bold;">INQLE version 0.1 is born!</span> (2008-06-22)<br /><br />My open source project <a href="http://code.google.com/p/inqle/">INQLE</a> (Intelligent Network of Querying and Learning Engines) has reached the ripe old age of 0.1! I had intended this version to be very bare bones, such that it would barely work. But I found that a few features were needed to make the bloody thing usable. Most notably I added 3 big features:<br /><ul><li>a wizard for loading spreadsheet data as RDF. 
This is a pretty powerful feature.<br /></li><li>a setup wizard which runs upon starting INQLE for the first time</li><li>an embedded database. This dramatically simplifies the process of installing INQLE. To my delight, I discovered that <a href="http://jena.sourceforge.net/">Jena</a> has recently begun supporting the fantastic <a href="http://www.h2database.com/">H2 database</a>, which performs very well as an embedded database (in fact it outperforms non-embedded databases significantly). I find INQLE runs much faster using an embedded H2 database than an external PostgreSQL database.</li></ul>Next up: I will probably add security, likely using OpenID and Google accounts.<br />David Donohue<br /><br /><span style="font-weight: bold;">INQLE Scores 8.5 out of 10 on killer app scale</span> (2008-06-18, updated 2008-08-07)<br /><br />I stumbled across <a href="http://asa9.org/2008/06/scoring-system-to-assess-semantic-web.html">this post on scoring a semantic web application on a 10 point killer scale</a>. [OK, well actually I wrote it.]<br />So let me score my project <a href="http://code.google.com/p/inqle/">INQLE</a> on this scale.<br /><br />At this writing, INQLE is very "early doors" (version 0.0.9). Currently, INQLE scores 6 on the 10 point scale. However, our vision/roadmap puts INQLE on a path to score about 8.5:<br /><br /><span style="font-style: italic;">Immediate Value to User</span><br /><span style="font-weight: bold;">+1</span>: The tool adds immediate value to the human user. INQLE permits automated machine learning experiments. 
Users need only load data, and they can then immediately start running experiments.<br /><span style="font-weight: bold;">+1</span>: We are aware of no product that does this.<br /><span style="font-weight: bold;">+1</span>: The tool is free.<br /><br /><span style="font-style: italic;">Generation of Semantic (RDF) Data</span><br /><span style="font-weight: bold;">+0.5</span>: INQLE allows users to generate data in spreadsheets (as they are wont to do). Users must then use the INQLE interface to import that data.<br /><span style="font-weight: bold;">+1</span>: The new semantic data that INQLE generates are assertions about the correlations that exist between different things in the universe.<br /><span style="font-weight: bold;">+1</span>: Those things which INQLE correlates are real world objects. Some of INQLE's future sampling algorithms will combine local data with remote, pre-existing RDF entities.<br /><br /><span style="font-style: italic;">Consumption of Semantic (RDF) Data</span><br /><span style="font-weight: bold;">+1</span>: In future versions of INQLE, users will be able to annotate how valid or trivial or novel or spurious a correlation is.<br /><span style="font-weight: bold;">+0</span>: Such human annotation will require use of INQLE's interface.<br /><span style="font-weight: bold;">+1</span>: Future INQLE algorithms will be able to discover the results of past experiments.<br /><span style="font-weight: bold;">+1</span>: INQLE can then use the power of linked data and semantic reasoning to perform repeated or related experiments. INQLE servers can therefore accrue an expanding body of knowledge.<br /><br />So 8.5 out of 10 is pretty decent. But we have to remember that we (and by "we", we mean "I") wrote the damn thing, thru our own myopic specs.<br /><br />So is INQLE the killer app for the semantic web? Um, if your standard for a killer app is Google, then probably not. 
But if you could live with lower expectations and if INQLE could really deliver on its ambition to effect true artificial intelligence and/or revolutionize the way research &amp; discovery is done, then it could deliver some degree of killerness.<br />David Donohue<br /><br /><span style="font-weight: bold;">Semantic Web Killer App Scale</span> (2008-06-18)<br /><br />Many smart people have asked this question:<br />"<a href="http://www.google.com/search?q=killer+app+for+the+semantic+web">What is the killer app for the semantic web?</a>" Well, I do not have the answer to that question. But I can tell you some of the attributes that characterize a killer semantic web application.<br /><br />I came up with a scoring system you can use for evaluating semantic web technologies. The maximum score is 10.<br /><br /><span style="font-style: italic;">Immediate Value to User</span><br /><span style="font-weight: bold;">1 point</span>: The tool adds immediate value to the human user.<br /><span style="font-weight: bold;">1 point</span>: That immediate value to the user is novel functionality that is not available for free elsewhere.<br /><span style="font-weight: bold;">1 point</span>: The tool is free.<br /><br /><span style="font-style: italic;">Generation of Semantic (RDF) Data</span><br /><span style="font-weight: bold;">1 point</span>: The tool uses existing human workflows to generate new semantic data.<br /><span style="font-weight: bold;">1 point</span>: An automated computer process generates new semantic data, without direct human involvement.<br /><span style="font-weight: bold;">1 point</span>: Generated semantic data links extensively to pre-existing semantic data, hosted remotely.<br /><br /><span style="font-style: italic;">Consumption of Semantic (RDF) Data</span><br /><span style="font-weight: bold;">1 point</span>: Humans may annotate the semantic data 
through a simple procedure, increasing the value thereof.<br /><span style="font-weight: bold;">1 point</span>: Such human annotation occurs automatically, using existing human workflows.<br /><span style="font-weight: bold;">1 point</span>: An automated computer process can consume the generated semantic data in some useful way. That is, humans are not the sole consumers of the generated semantic data.<br /><span style="font-weight: bold;">1 point</span>: Such automated processing increases the value of the body of semantic data, thereby facilitating cumulative accrual of value by the computer.<br /><br />Not sure how accurate the above model is for capturing the key features of a semantic web application. For example, maybe it puts too much emphasis on machine processing of data. But that's what the semantic web is all about, right? Most agree that it's not just another paradigm for presentation.<br /><br />So assuming that the above scoring system is good enough, let's try to answer: "What is the killer app for the semantic web?"<br />Well, it will be a tool for generating semantic data, of immediate value, using simple, human + automated methods. Such semantic data is processable by automated agents, in such a way that its value grows with time.<br />David Donohue<br /><br /><span style="font-weight: bold;">What's the best creative medium?</span> (2008-04-10)<br /><br />Great question, Dave. Well, let's break it down. What attributes might distinguish the creative media? <br /><br />Most features of creative media don't seem to differentiate one from another. For example, most creative media give you a significant buzz (that is, they are fun). True, some are more fun than others. E.g., stamp collecting, I would wager, is not lighting up many of the elation centers of the brain. Learning Klingon may be down on the list as well. 
Just above learning Klingon I would rank my first profession: a lab researcher in a molecular biology lab. This was creative work, albeit slow. It seemed that one made a creative decision about once per year. But for the most part, creative media are a wash on this attribute.<br /><br />All creative endeavors have some form of environmental toll as well. I suppose ones that generate a lot of byproduct would lose in this attribute. Like nickel smelting-4-fun.<br /><br />What about the value of the end product to society? Now here is a point of differentiation. Clearly some creative exploits are not worth a hill of beans save for the transient PET pattern they create in the beholder for about 5 seconds. I would put most modern art in this category. What about good art? Or musical performance? Well, the benefits are subtle, and I would argue small.<br /><br />What about its potential to make you money? This is a good point of differentiation because it indicates how much the person is completely wasting his time. Measured by this standard, most creative exercises are a fool's errand. But some media clearly outperform others. I suspect that here again, art fares poorly relative to more technology-related exploits.<br /><br />Where the hell are you going with this, Dave? I hear you ask. Well, my thesis is that the finest creative medium ever is [envelope please]. Computer programming!<br /><br />Eh? A hush fell over the Readership like a choking cloud of chlorine gas. Good thing nobody would ever read this. Well, here is why software development is such a fabulous exploit for the few who are lucky/squashy enough to do it.<br /><br /><ul><li>High buzz per unit time invested. Imagine scientific advancement sped up 1000-fold. That's what computer programming feels like to me. It is pretty easy to get yourself humming on a project in which you are tinkering with code and running a new experiment every half minute or so. 
No pesky gels or radioactive phosphorus or carcasses either.<br /></li></ul><ul><li>High reward to society [I think]. This is hard to figure. Well, what is the internet worth? Now throw in the value of non-internet devices. Costly. And all this wealth was created in just the last few decades. I bet well over half of the value of the internet was programmed in just the last few years. Yes, I hear your point that the actual <span style="font-style: italic;">content </span>on the internet gets some credit. If this post is at all representative, then I think it's clear that content is overrated.</li></ul><ul><li>High potential for financial reward. Yes, the days of everybody-who-knows-html-gets-rich are over. But stack the median joe programmer against the median seth actor, and I think the former has a ranch house and a kid and 2 cars, whereas the latter is sleeping on his friend's couch, still chasing a forlorn dream.</li></ul><br />So in sum, I retract all above statements.<br />Dave<br />David Donohue<br /><br /><span style="font-weight: bold;">Evidence-Based Everything</span> (2007-10-19)<br /><br />Subtitle: <span style="font-weight: bold; font-style: italic;">Everything </span>is about <span style="font-weight: bold; font-style: italic;">prediction </span>is about <span style="font-weight: bold; font-style: italic;">data mining</span> is about <span style="font-weight: bold; font-style: italic;">everything</span>.<br /><br />Here's what I mean:<br /><br />1. "<span style="font-weight: bold; font-style: italic;">Everything </span>is about <span style="font-weight: bold; font-style: italic;">prediction</span>"<br />All information (books, articles, lectures, even conversations) is basically intended to provide you with a model from which to make future predictions. 
In some cases, the predictive models are explicitly spelled out, as in "the moral of the story". Other predictive models are more subtle. When your coworker says she thinks your employer is short-sighted, she is predicting that in the near future, that employer will do short-sighted things. When a self-help book tells you that people ate half the number of candies when the candy jar was moved 6 feet away from their desk, that book is providing you with a predictive model that says that mindless snacking goes down as snack food is moved away.<br /><br />2. "<span style="font-weight: bold; font-style: italic;">prediction </span>is about <span style="font-weight: bold; font-style: italic;">data mining</span>"<br />All predictions are measurable using data mining. <a href="http://en.wikipedia.org/wiki/Data_mining">What is data mining?</a> It is basically <a href="http://en.wikipedia.org/wiki/Scientific_method">the scientific method</a>, with particular focus on finding statistical correlations from tables of data. Data mining and associated scientific methods can be used to make just about any prediction. The best route to developing predictions about the real world is to use scientific methods, armed with -- you guessed it -- data mining.<br /><br />3. "<span style="font-weight: bold; font-style: italic;">data mining</span> is about <span style="font-weight: bold; font-style: italic;">everything.</span>"<br />Whether the subject matter is <a href="http://www.gottman.com/marriage/relationship_quiz/quiz2/">interpersonal relationships</a>, <a href="http://mindlesseating.org/">dietary habits</a>, or <a href="http://en.wikipedia.org/wiki/Freakonomics">social hierarchies among crack dealers</a>, it can be measured with data. And it should be! Because hiding within data everywhere are shocking findings that often argue against the conventional wisdom. 
Thus, data mining can dispel our misconceptions and enable us to better predict and manage our future, in all walks of life.<br /><br />My broader point is that we are very subjective creatures, with an amazing capacity to see the world through tinted glasses. In my field of internal medicine, we have only in the last few decades appreciated the critical need to practice <a href="http://en.wikipedia.org/wiki/Evidence-based_medicine">evidence-based medicine</a>. This basically means that we rely on scientific proof that an intervention is justified, and that in the absence of that proof we should proceed with caution.<br /><br />Before the evidence-based medicine movement, doctors relied exclusively on intuition and basic research into the pathophysiology (the nuts and bolts) behind diseases. The trouble with this approach is that both intuition and the science of pathophysiology can and do lie, frequently. Medicine has done many famous about-faces when it finally got around to studying whether an intervention is helpful. Examples of such historic practices include useless low-protein diets for kidney health, toxic vitamin E to prevent cancer, and widespread use of toxic anti-arrhythmic drugs and estrogen replacement therapy.<br /><br />Bottom line: everything, not just medicine, would benefit from increased use of scientific data mining, to provide us with new and improved predictive models about... everything.<br />David Donohue<br /><br /><span style="font-weight: bold;">The Global Environment: Let Me Go On Record</span> (2007-10-09)<br /><br />Some things are important enough to warrant restating the obvious. 
I write this for the benefit of some distant future reader combing thru ancient posts, trying to understand why we ruined the planet.<br /><br />For the record, I understand and firmly believe the following environmental facts:<br /><ol><li>We are increasing CO2 concentrations to levels never before seen in at least 400,000 years.</li><li>This, plus other human factors, is contributing to major environmental damage in the form of global warming and other forms of global climate change.</li><li>Sea levels will almost certainly rise to their maximum levels within the next century or 2 (100 feet above current level) with catastrophic consequences (imagine all of Florida, trash and all, being washed out to sea).</li><li>Through climate change and through habitat destruction, we are visiting upon the earth one of the greatest extinctions ever. The pace of this mass extinction surely matches those from prior meteor impacts on a geologic scale. We will soon live on a planet without cheetahs, pandas, large primates, many species of whale, most smaller primates, etc., etc.</li><li>There is a vast overpopulation of humans contributing to this. I recognize it is fashionable to say that world populations will level off around 11 billion +/- several billion. However: (a) there is no proof of this, (b) even if populations flatten, the environmental impact per person continues to rise, (c) 40,000 children starve to death every day, (d) habitat destruction and other consequences of human activity have wiped out most wild habitats, with the remainder to be consumed within the next few decades.<br /></li></ol>So obviously I am only touching on a fraction of the damages. 
As for policy, and the United States &amp; Bush administration's role in exacerbating most of these problems, the <a href="http://www.nrdc.org/bushrecord/">NRDC's "Bush Record"</a> speaks volumes.<br /><br />As for solutions, let me go on record as endorsing massive diversion of resources to protect environments, incentivising rainforest countries to preserve what remains, massive effort to invent new energy technologies, etc. For my small part I contribute to the <a href="http://www.worldwildlife.org/">World Wildlife Fund</a> and others.<br />David Donohue<br /><br /><span style="font-weight: bold;">link on vitamin d</span> (2007-10-06)<br /><br /><a href="http://ods.od.nih.gov/factsheets/vitamind.asp">http://ods.od.nih.gov/factsheets/vitamind.asp</a><p>Great stuff from the NIH on vitamin D.</p><p>How much vitamin D can children safely take? According to this document and the NIH, the answer is a whopping 1,000 I.U. for infants 0-12 months, and 2,000 I.U. for all people over 12 months!</p><p>I expect we will be seeing recommendations from AAP that kids should get more vitamin D, to prevent later Multiple Sclerosis plus cancer.</p>David Donohue<br /><br /><span style="font-weight: bold;">Vitamin D Totally Rocks</span> (2007-09-26)<br /><br />"Totally rocks" is not your everyday medical jargon. But the research behind vitamin D is getting to be highly cool. Here's the deal. Not very much of what we do in medicine has hard-hitting research behind it. And where research does exist, it seldom demonstrates a clear benefit to survival or longevity.<br /><br />Well, Vitamin D is different! 
Check out this meta-analysis.<br /><a href="http://archinte.ama-assn.org/cgi/content/short/167/16/1730">http://archinte.ama-assn.org/cgi/content/short/167/16/1730</a><br /><br />The authors looked at every study ever done in which patients took either vitamin D or placebo, and summed up all that data. Bottom line: vitamin D increases longevity.<br /><br />2007 is really shaping up to be the year of vitamin D. Earlier this year, 2 meta-analyses came out demonstrating that vitamin D at a daily dose of 1000-2000 I.U. appears to decrease the risk of breast cancer by 50% and colon cancer by 2/3.<br /><a href="http://www.physorg.com/news89984353.html">http://www.physorg.com/news89984353.html</a><br />And now add to this a 3rd meta-analysis showing lower death rate.<br /><br />I don't believe that ever before in human history have we had a pill that we could give you and confidently tell you: "Here, take this. It will prevent cancer and make you live longer."<br /><br />I know that medicine tends to reverse itself on frequent occasions. However, these are 3 large studies, summing the results of many other studies. This provides considerable assurance that we are on solid ground taking vitamin D. I officially signed on to vitamin D earlier this year, and I have been giving the stuff out like candy. Now that we can add this meta-analysis showing improved survival, I'm even more sold!<br />David Donohue<br /><br /><span style="font-weight: bold;">Healthcare Matrix</span> (2007-09-26, updated 2007-09-28)<br /><br />On September 26, 2007, I delivered a talk on the US health care system entitled "<a href="http://docs.google.com/Present?docid=dcn9zh3q_2czbbv2&amp;fs=true">Healthcare Matrix: Inferior Quality for Twice the Price</a>". 
The audience was the <a href="http://www.ardenclub.com/georgists.htm">Arden Georgist Gild</a> in Delaware.<br /><br />(Georgists are those who promulgate the economic philosophy of <a href="http://en.wikipedia.org/wiki/Henry_George">Henry George</a>, a 19th century American economist. More on that lovable bunch another time perhaps).<br /><br />While I find activism about as appealing as door-to-door sales, the thing about the predominantly corporate US Health Care System is that it is so outrageously <a href="http://en.wikipedia.org/wiki/Health_care_in_the_United_States">expensive</a>, <a href="http://content.nejm.org/cgi/content/abstract/349/8/768">wasteful</a>, <a href="http://content.healthaffairs.org/cgi/content/full/hlthaff.w5.63/DC1">inequitable</a>, and <a href="http://www.annals.org/cgi/content/full/141/12/938">ineffective</a> that the story tells itself.<br /><br />To me the biggest surprise of my research into this topic was the Veterans Health Administration (VHA). Roll back the clock to 1995. The VA was the laughingstock of our health care system. <a href="http://content.nejm.org/cgi/content/abstract/348/22/2218">But around 1995, the VA embarked on a major restructuring</a>, which included release of its vaunted VistA electronic health record, and aggressive measurement of performance. The results speak for themselves: every single health outcome measure jumped dramatically from 1995 to 1997, and continued to improve thereafter. The same study stacks the VA's performance up against Medicare's from 1997 to 2001. The surprise to me was that not only did the VA outperform Medicare in all measures but 1 (annual eye exam for diabetics in year 2000), but it bested Medicare by such wide margins. 
Parenthetically: Medicare outperforms HMOs (specifically, fee-for-service Medicare outperforms HMO-mediated "Medicare Advantage") and Medicare is much more cost efficient than HMOs.<br /><br />So if you're keeping score,<br /><ul><li>VA beats Medicare</li><li>Medicare beats HMOs</li></ul><br />How could the socialistic health system of the Veterans Administration outperform the free-market driven remainder of our health care system? Many Americans are hard-wired to think that free markets are <span style="font-weight: bold;">always</span> more cost effective than government programs (even in cases like this one, where the profit incentive runs counter to the mandate to deliver health to society). Rather than try to argue against people's ideology to explain why the socialistic VA is supreme over the capitalistic HMOs and private doctors of Medicare, I will say this: It just is.<br />David Donohue