Oracle Data Mining and Analytics
Author: Marcos (http://www.blogger.com/profile/14756167848125664628)
Feed updated: 2008-07-16T11:46:19.514-04:00
Generator: Blogger (57 posts in total; entries 1 to 25 listed here)

Recap Post
Published 2007-12-06T13:11:00.000-05:00; updated 2007-12-06T13:18:25.288-05:00
For the past couple of months the blog took a back seat. Basically, since KDD, I have had very little time to write. I have been on the road quite a bit, and my trip to KDD unleashed a number of research ideas that I have been following up on. I will post on the latter over time as the results mature. I have also dropped the ball on answering many of the emails and comments I have received. I have ...

KDD 2007
Published 2007-08-12T09:34:00.001-04:00; updated 2007-08-12T09:34:16.700-04:00
For the next couple of days I am going to be attending the KDD (Knowledge Discovery in Databases) 2007 conference (conference website) along with some other Oracle colleagues. KDD is one of the primary conferences on data mining. This year it will take place in San Jose, CA, from August 12 to 15. Oracle is a Gold sponsor for the event and will have a large presence at the conference. Among other ...

On the Road and Upcoming Talks
Published 2007-07-09T21:51:00.001-04:00; updated 2007-07-09T21:51:26.089-04:00
This week I am going to be in San Francisco. I have been invited to give a talk at the San Francisco Bay ACM Data Mining SIG on Wednesday. The title of the talk is In-Database Analytics: A Disruptive Technology. Here is a link with information on the talk. On Friday morning, I am presenting at the ST Seminar at Oracle's headquarters. The title of that talk is In-Database Mining: The I in BI.
If ...

Way Cooler: PCA and Visualization - Linear Algebra in the Oracle Database 2
Published 2007-06-04T13:05:00.000-04:00; updated 2007-06-04T21:06:36.874-04:00
This post shows how to implement Principal Components Analysis (PCA) with the UTL_NLA package. It covers some of the uses of PCA for data reduction and visualization with a series of examples. It also provides details on how to build attribute maps and chromaticity diagrams, two powerful visualization techniques. This is the second post in a series on how to do Linear Algebra in the Oracle ...

Webcast Announcement: Oracle's In-Database Statistics
Published 2007-05-02T01:11:00.000-04:00; updated 2007-05-02T01:16:53.265-04:00
Today (Wednesday), May 2, 2007 at 12:00 PM EST, the Oracle Business Intelligence, Warehousing and Analytics (BIWA) Special Interest Group (SIG) will host another interesting free webcast: Oracle's In-Database Statistics. Speaker: Charlie Berger. Session Abstract: Oracle Database 10g embeds a range of SQL-based basic statistical functions including: summary statistics, hypothesis testing, correlation ...

Webcast Announcement: A Simple Fraud Detection Application using Oracle Data Mining, SQL Developer and Oracle BI EE
Published 2007-04-24T13:18:00.001-04:00; updated 2007-05-02T01:14:30.107-04:00
Tomorrow, April 25, 2007, at 11:45 AM EDT, the Oracle Business Intelligence, Warehousing and Analytics (BIWA) Special Interest Group (SIG) will host the following free webcast: A Simple Fraud Detection Application using ODM, BIEE, and SQL Developer. Speaker: Bob Haberstroh. Session Abstract: Classification is an often-used methodology in data mining that creates a
predictive model ...

Way Cool: Linear Algebra in the Oracle Database 1
Published 2007-04-20T16:32:00.001-04:00; updated 2007-05-02T01:32:50.530-04:00
New in Oracle Database 10g Release 2 is a hidden gem, the UTL_NLA package. This little-known package (you don't get many hits for it in Google) brings linear algebra functionality to the Oracle Database. It makes the Oracle Database an even better platform for scientific and advanced analytics programming. Now it is possible to write performant matrix code in the database easily and ...

Wikipedia and Oracle Data Mining
Published 2007-02-12T16:44:00.000-05:00; updated 2007-02-13T13:43:31.567-05:00
Wikipedia has a nice page on Oracle Data Mining (link). It provides a good overview of the features and history of the product. Here is a snippet of the text: Oracle Data Mining (ODM) is a software product distributed as an option to Oracle Corporation's Relational Database Management System (RDBMS) Enterprise Edition (EE). This product supports a collection of data mining and data analysis ...

New Oracle Statistical Functions Page
Published 2007-02-12T15:55:00.001-05:00; updated 2007-02-12T15:40:08.442-05:00
OTN has a new page (link) describing the statistical functions in the Oracle 10g Database. These functions are available in all versions of the database at no extra cost.
Features include: Descriptive statistics; Hypothesis testing; Correlations analysis (parametric and nonparametric); Ranking functions; Cross Tabulations with Chi-square statistics; Linear regression; ANOVA; Test Distribution fit; Window ...

Welcome BIWA
Published 2007-02-12T15:37:00.001-05:00; updated 2007-02-21T22:22:52.159-05:00
The Business Intelligence, Warehousing and Analytics Special Interest Group (BIWA SIG, BIWA for short) has recently been created. Although it has strong participation from Oracle employees, BIWA is an organization independent of Oracle. BIWA is a community in the making. It provides a number of benefits to its members (membership is free): Get the latest information about Business ...

Merry Christmas, Happy New Year, and a Poll
Published 2006-12-24T05:30:00.000-05:00; updated 2007-01-05T12:23:53.920-05:00
It has been a great year. My daughter was born, and so was this blog. I launched this blog at the beginning of the year (January first, to be more precise) and the readership has been great. Among the posts, Time Series and Automatic Pivoting were probably the most viewed. I am on vacation in Brazil right now enjoying a family reunion. I have a big family and it is hard to get everyone ...

Announcement: Oracle Data Mining Consultants Partnership Program
Published 2006-12-15T11:03:00.000-05:00; updated 2007-04-24T13:22:10.488-04:00
We're starting a program to work with qualified data mining consultants. You and your colleagues are invited to participate in a 2-day hands-on session designed for data mining consultants here in the Oracle Burlington, MA office on February 7 & 8, 2007.
It is also possible to attend remotely via webinar. Space is limited, so please RSVP ASAP. The Oracle Data Mining Consultants Partnership Program ...

Free Webinar: Competing on Analytics
Published 2006-10-31T07:37:00.000-05:00; updated 2007-04-24T13:22:10.489-04:00
I blogged some time ago (link) about an article in the Harvard Business Review by Babson College's Tom H. Davenport on how analytics are becoming a key competitive factor for companies. I have just learned that Prof. Davenport is giving a free webinar today. The theme is "Competing on Analytics." What participants will learn: What data-driven marketing is (and isn't); How marketing visionaries like ...

Oracle Data Mining in Argentina
Published 2006-10-30T20:22:00.000-05:00; updated 2006-10-31T15:08:21.110-05:00
I spent the week of the 18th in Buenos Aires spreading the word on Oracle Data Mining. I was invited by Snoop Consulting as a keynote speaker at their Update '06 (warning, the site is in Spanish) event. Snoop Consulting has a very capable technical team. They are positioning themselves to become a leading company of value-added services for information technologies in the region, focused mainly ...

Time Series Revisited
Published 2006-10-28T18:55:00.000-04:00; updated 2006-10-31T15:10:04.926-05:00
I have been asked a couple of times for a script that would reproduce the results in the time series forecasting series. I finally managed to do it. In the process I found out that a couple of the queries needed to be tuned: In the airline example described in Part 2, the normalization shift and scale parameters were computed using the whole data.
A better methodology would be to use only the ...

Oracle Data Mining 10gR2 Code Generation Release Now Available on OTN
Published 2006-10-10T20:24:00.000-04:00; updated 2006-10-10T21:03:20.860-04:00
I have just received this from Product Management: We are pleased to announce the new Oracle Data Mining 10gR2 Code Generation release is now available for download (RTM) on OTN. This new ability to go directly from a data analyst building predictive models to having working in-database PL/SQL code for implementing a complete data mining solution is unrivaled in the industry. There is no data ...

KDD 2006 - Day 2
Published 2006-09-07T06:56:00.000-04:00; updated 2006-09-07T07:07:48.733-04:00
My initial plan was to write posts from KDD as the conference unfolded. So much for that plan. There was not much time or energy left after talks, time at the Oracle booth, and talking to people. Upon returning from KDD I left for vacation and did not have a chance to write about the other KDD until now. On the second day I spent most of my time at the Oracle booth talking to visitors and saw ...

KDD 2006 - Day One
Published 2006-08-21T09:07:00.000-04:00; updated 2006-09-28T09:04:03.663-04:00
KDD concentrates most of the tutorials and workshops on the first day. In previous years I usually jumped around from room to room trying to catch interesting talks. This year I decided to follow a different strategy. I picked a full day workshop and stuck with it for the day.
I chose the Data Mining for Business Applications Workshop organized by Rayid Ghani (Accenture Technology Labs) and ...

New Code Generator for Oracle Data Miner
Published 2006-08-17T23:54:00.000-04:00; updated 2006-09-07T07:06:01.443-04:00
The following was announced today: The beta release of Oracle Data Miner, a graphic user interface for Oracle Data Mining Release 10.1 and above, adds Oracle Data Miner PL/SQL Code Generator and is now available on OTN. The ODM PL/SQL Code generator enables companies to easily transform a data mining predictive model into an automated business process within an enterprise. The ODM "analytical" ...

Oracle at KDD 2006
Published 2006-08-03T21:43:00.000-04:00; updated 2006-09-07T07:07:20.856-04:00
The KDD (Knowledge Discovery in Databases) 2006 conference (conference website) is quickly approaching. KDD is one of the primary conferences on data mining. It will take place in Philadelphia from August 20 to 23. This year Oracle is a Gold sponsor for the event and will have a significantly larger presence at the conference. I have heard that, among other things, Oracle will be sponsoring an ...

Finding the Most Typical Record in a Group
Published 2006-07-30T23:48:00.000-04:00; updated 2006-07-30T23:54:51.510-04:00
I recently came across the following question: How can I find the most typical record in a group or cluster of records? For example, suppose we have a set of customer records, what is the customer that best typifies the group or cluster? The answer to this question can be used for characterizing groups of records of all types.
For example, it can be used for characterizing multimedia collections ...

Blog Favicon
Published 2006-07-17T23:36:00.000-04:00; updated 2006-07-17T23:41:47.146-04:00
Some of you might have noticed that the blog now has a favicon. For those interested in creating their own favicon, I used Mridul's tips and the program Iconographer to create it. Iconographer is a great program for the Mac.

High Performance Scoring with Oracle Data Mining
Published 2006-07-10T06:32:00.000-04:00; updated 2006-07-10T06:42:07.180-04:00
A recent white paper at the Oracle Data Mining website describes how Oracle Data Mining can scale to score millions of records with modest off-the-shelf hardware. The paper shows some results that complement those in a paper presented at VLDB last year. This type of capability is what makes real-time scoring possible, as described in this series of posts. I have heard that a competitor ...

New Oracle Data Mining JDeveloper Extension on OTN
Published 2006-07-01T09:28:00.000-04:00; updated 2006-07-01T09:32:38.196-04:00
A new release of the OJDM extension for Oracle 10.2.0.2 Database is available as an Official JDeveloper Extension. For more details about the extension go here.

Blog Changes
Published 2006-07-01T09:12:00.000-04:00; updated 2006-07-01T09:21:17.023-04:00
I have added a "Series" section to the sidebar. This section will have links to pages that group all the posts of a series in a single place. I found that this would be helpful, as I usually don't write all the posts of a series at the same time.
I have also created an "All Posts" page that can be accessed from the sidebar in the Posts section. Finally, the Newsletter section has been renamed ...