
Post a Comment On: Steve Sailer: iSteve

"Will Big Data change the world?"

20 Comments -

Anonymous Anonymous said...

We recruited 2,500 households ... so we could track every single consumer packaged goods purchase they made. We could finally answer quantitatively an endless number of questions

But ... Severe selection bias! :-)

4/16/13, 8:52 PM

Anonymous Anonymous said...

It does seem that a lot of the applications of data mining including incentives and strategies based on an understanding of said data lead in a new direction only to run into diminishing returns or to find that this new direction leads to horrible outcomes.

In sports it leads to new and unexciting play. In police work it leads to focus on easy revenue raising, and appears to lead to a reduction in the reporting of crime. The easiest way to increase homicide clearance stats is to ensure that as many difficult cases as possible are not classed as homicides.

In management this has led to the importation of cheap labor with no thought as to the cost and whether the outcomes are even any better than they were before. Companies still have the same number of competitors, and if costs are now lower, competition has made sale prices lower commensurately.

These developments in turn lead to new laws, consumer backlash, and management backlash. And backlash from the public, or at least the remnant public still left over from before the importations.

4/16/13, 9:18 PM

Anonymous Anonymous said...

But the real-time diversion of self-evident "discoveries" provided by incessant analysis of big-time data will entertain us as we orbit with increasing velocity around a vanishing point.

Voyeuristic, narcissistic, and limited.

Neil Templeton

4/16/13, 10:03 PM

Blogger DR said...

I'm sure what you were doing was fairly cutting edge for the time. But the notion that your career had anything to do with modern big data/machine learning is laughable.

Here's a clue: IBM's Watson doesn't run in Lotus 123 (which you mentioned as being one of the tools you used in this job in a previous post).

I think people really fail to grasp how close computers are to wiping out pretty much all the jobs that the bottom half of the curve does. The vast majority of this advancement is driven by improvements in machine learning from the past decade.

4/16/13, 10:04 PM

Blogger Steve Sailer said...

"But ... Severe selection bias! :-)"

We always had a lot of nice white ladies wanting to sign up for the panel, and fewer of other demographics, just like the Reuters-Ipsos election panel in 2012. But, that's who consumer packaged goods marketers wanted to target anyway, so it was fine.

4/16/13, 10:12 PM

Blogger Steve Sailer said...

"But the notion that your career had anything to do with modern big data/machine learning is laughable. Here's a clue IBM's Watson doesn't run in Lotus 123"

We didn't process 10% of all supermarket purchases in the country in the later 1980s on PCs. We did it on IBM mainframes, and they used various "machine learning" artificial intelligence techniques like neural networks for estimating missing data.

4/16/13, 10:18 PM

Anonymous Jim said...

As a Braves fan, the most devastating blow to me as a fan of the team was the death of long time team broadcaster Skip Caray and the retirement of his partner Pete Van Wieren. I could barely listen to the replacements and indeed lost interest in baseball for years. The Voice of Summer is the most important part of the game.

4/16/13, 10:33 PM

Anonymous wren said...

Big Data will be someone with those damn google glasses looking at you on the street and knowing EVERYTHING about you two seconds later.

That day is coming unfortunately.

4/16/13, 11:51 PM

Anonymous Anonymous said...

But the notion that your career had anything to do with modern big data/machine learning is laughable. Here's a clue IBM's Watson doesn't run in Lotus 123

The fundamental principles at work are the same.

4/17/13, 2:22 AM

Blogger Leon Kautsky said...

Fascinating.

"machine learning" artificial intelligence techniques like neural networks for estimating missing data."

No you didn't. Or at least, it's extremely unlikely. Neural nets are among the most computationally expensive learning algos to use today (because you can't take adv. of global convexity) and they're 10x faster (1000x if you count computer speed-ups) than they were in 198x.

Nevertheless, I'm impressed by your intuition: I work in so-called big data and it is correct that the vast majority of our work is pretty useless. Once in a while, though, we strike gold: i.e. stuff like being able to predict your race/sexual orientation/sociopathy/intelligence/buying habits from your facebook likes.

There are AI applications like Watson/Siri or w/e that are big data powered because you need the machine to have vast amounts of knowledge before it becomes useful, let alone commercializable.

4/17/13, 2:35 AM

Anonymous Anonymous said...

I just checked, and Willie Mays came in 6th in the MVP voting that year.

4/17/13, 3:17 AM

Anonymous astorian said...

The revolution in baseball stats, as Steve says, hasn't really changed things very much. After all, even thirty years ago, if you'd asked an old-school, innumerate fan to name the greatest hitters of all time, he'd probably have said "Babe Ruth, Lou Gehrig and Ted Williams," and he'd have been right!

The stats revolution hasn't shown that any perceived superstars really sucked, or that Mario Mendoza was really a Hall of Famer. Rather, it's shown that a handful of guys who were perceived as borderline Hall of Famers were really just very good, while some guys who were perceived as very good were actually borderline Hall of Famers.

4/17/13, 6:13 AM

Anonymous Anonymous said...

Apparently they even have college courses called "Introduction to Data Science" now offered by statistics departments:

http://columbiadatascience.com/about-the-class/about-the-course/

Doesn't seem like anything different from what's been traditionally taught. Just marketed differently. The data probably just shows that people are suckers for marketing.

4/17/13, 7:19 AM

Anonymous countenance said...

How many World Series or AL Pennants did the Oakland A's win during the Sabermetrics era?

I'm thinking of a number larger than -1 and smaller than 1.

4/17/13, 7:38 AM

Anonymous countenance said...

One more thing: While I was initially sympathetic to the notion that Big Data is how Obama won re-election, I'm souring on that theory as time goes on. Not only that, I'm souring on all but the Occam's Razor explanation of why and how Obama won re-election. I suppose floating all the other theories was worth doing, and the theories all have some combination of truth and nonsense to them, some more of one than the other.

But sometimes, you just have to accept the Occam's Razor simple and obvious explanation.

In this case, Obama won re-election because he was the incumbent President. It's hard as hell to knock off incumbent politicians.

4/17/13, 7:45 AM

Anonymous stari__momak said...

"I think people really fail to grasp how close computers are to wiping out pretty much all the jobs that the bottom half of the curve does"

Funny, but I don't see a lot of machine innovation in the occupations that the bottom half fill. In fact, probably the opposite. A guy in my neighborhood had his front yard redone (xeriscape). This being SoCal, the workers were Mexican (the landscaper an old-timer Japanese/Japanese American). The thing is, they took two weeks, digging trenches with picks and shovels, moving dirt around by wheelbarrow, etc. No Bobcat or even Ditch Witch or Rototiller that I saw, let alone any sort of intelligent machine.

When you can hire cheap labor, you don't develop capital goods to replace labor.

4/17/13, 9:04 AM

Anonymous helene edwards said...

Ron Santo had a nice year in '64, but I don't think he should be in the Hall.

4/17/13, 11:33 AM

Blogger Steve Sailer said...

"I work in so-called big data and it is correct that the vast majority of our work is pretty useless. Once in a while, though, we strike gold: i.e. stuff like being able to predict your race/sexual orientation/sociopathy/intelligence/buying habits from your facebook likes."

We were doing stuff like that involving supermarket shopping a long time ago. It wasn't useless; smart corporations paid a fair amount of money for our services. It just gets incorporated into how things are done and the world goes on, a little bit different than before, but not all that different.

4/17/13, 1:44 PM

Anonymous Anon87 said...

countenance - MIT's Technology Review had a recent article where they go into great detail about how "analytics" and "data" had a huge hand in the re-election. The Best And Brightest working on a whole different level than the stodgy old GOP. If you read between the lines though, I think you can see the obvious answer: Find voters, figure out what they want, and then promise it to them for free.

4/17/13, 4:07 PM

Anonymous Anonymous said...

"No you didn't. Or at least, it's extremely unlikely."

I'm completely with Steve on this. I think you are out in the weeds or very young and callow. (I'm envious!) I was involved in the neural networks field throughout the 80s. There was a big revolution around maybe 1982-1983, when Bob Hecht-Nielsen got a significant amount of DARPA funding, organized the first conferences and some workshops, did a start-up, etc.

The big feeling in the air was "AI was back", but out from under the domination of the symbolic AI types (rightly or wrongly). And a lot of people didn't want complete AI, they just wanted to develop classifiers or recognizers that they didn't need to program. It all sort of came out of the woodwork at about the same time. Backprop, Kohonen maps, the whole first generation.

By the late 80s people were trying to use neural nets for a lot of applications. They were also beginning to realize they couldn't debug them. Around the end of the decade folks began to realize how close backprop was mathematically to hidden Markov models and the stats folks began to say, hey, you guys should have talked to us about all this first... (or maybe they said, hey, we can do this non-linear stuff too...)

So I think you guys today do now have more theory motivating the algorithms. Things are on a more formal basis. You are getting demonstrably better results. They didn't have support vector machines, for instance, (a common algorithm now) back in the day. And of course you do have more compute power and disk space... Compute power's a funny thing: most computers today spend 90% of their time idle.

The application domains don't seem to have changed that much and I'm not really sure the overall effectiveness has... What has changed is that folks like Google now do have access to all the web-pages in the world and they want to crunch on them all the time, find out what's hot, where the money is... Hi, Google!

I think the plain old stats folks had a bit of revenge, the PCA and factor analysis types. The stuff the Oil companies run. It's all come a long way. I'm not knocking the current Big Data stuff, just saying that Steve could well have been doing large scale applications in the 80s, where large scale means all the data that's available. I also worked on the first laser cash registers collecting all this info in the very early 80s (not on the analytics side)... And if you collect the data, someone will have to do something to justify it all.

4/17/13, 9:56 PM
