<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss'><id>tag:blogger.com,1999:blog-22121434</id><updated>2009-05-31T13:06:10.644-05:00</updated><title type='text'>Seth Maislin's Indexing Blog</title><subtitle type='html'>A public exploration of indexing and information storage, retrieval, naming, and categorization. A spontaneous collection of ideas about ideas. Yet another website by Seth Maislin (like http://taxonomist.tripod.com).</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><link rel='next' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default?start-index=26&amp;max-results=25'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>52</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-22121434.post-487793039172570554</id><published>2007-10-14T20:58:00.000-05:00</published><updated>2007-10-14T21:10:37.624-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='future of indexing'/><category scheme='http://www.blogger.com/atom/ns#' term='indexing tools'/><category scheme='http://www.blogger.com/atom/ns#' term='business of indexing'/><category scheme='http://www.blogger.com/atom/ns#' 
term='keywording'/><title type='text'>Human-computer hybrids, in indexing</title><content type='html'>I recently completed an (as-of-yet unreviewed) article for &lt;em&gt;Information - Wissenschaft &amp;amp; Praxis &lt;/em&gt;(IWP), the premier German journal on information science. The topic was the intersection of computer-based indexing and human indexing, and how these two approaches to indexing are unequal but in many ways compatible.&lt;br /&gt;&lt;br /&gt;The biggest challenge in writing the article comes from the simple fact that I'm a human indexer, and that I believe that automatic indexing fails every important quality test. On the other hand, since it's unlikely we're going to have people typing away to index the World Wide Web (see my entry &lt;a href="http://maislin.blogspot.com/2006/12/needle-in-haystack-with-100000000.html"&gt;"A needle in a haystack with 100,000,000 blades"&lt;/a&gt;), it seems we're going to need something faster than human fingers and brains to get the job done.&lt;br /&gt;&lt;br /&gt;I'm not going to repeat the article's ideas here, except to say that I tried to give an even-handed view of automatic indexing -- even as I tend to rip it to shreds in this blog when I can. The distinction is that people constantly overestimate when it's necessary. 
Automatic indexing is overused and misused.&lt;br /&gt;&lt;br /&gt;Still, I thought my loyal readers might like knowing that even on this, there are two valid opinions.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-487793039172570554?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/487793039172570554/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=487793039172570554&amp;isPopup=true' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/487793039172570554'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/487793039172570554'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/2007/10/human-computer-hybrids-in-indexing.html' title='Human-computer hybrids, in indexing'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='06392069473165889148'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22121434.post-7791023395235051827</id><published>2007-09-11T11:33:00.000-05:00</published><updated>2007-09-11T11:48:46.119-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='business of indexing'/><title type='text'>External deadline forces</title><content type='html'>&lt;p&gt;"I need this book in hand by Friday because..."&lt;/p&gt;&lt;p&gt;Outside of the natural production process there are, definitely, many different kinds of external circumstances that impact the timing and schedules of indexing. 
Many are sector- or medium-specific, but of course there are indexers who work among several of these and thus feel the impact all year. Here are some examples that I know: &lt;/p&gt;&lt;ul&gt;&lt;li&gt;Textbooks that are used in American public schools tend to appear in time for the Texas and California state adoption processes. If a book isn't published on time to be reviewed by the school officials in these states, it's unlikely that the book will be used in public schools at all.&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;College textbooks need to be on the shelves in time for traditional semester beginnings, in September and January.&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Books that are budgeted for one year are pushed to get finished during that budget (fiscal) year, to avoid (a) losing the opportunity to spend money already allocated for the publishing process, and (b) spending money needed in the next year. This impacts the indexers around U.S. Thanksgiving.&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Software books targeted toward the general public need to be first to market to catch the wave of early sales; these schedules are irregular but can be predicted by looking at the various technologies that are coming out. For example, we're still near the beginning of the Windows Vista wave, since the new operating system was only recently released.&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;General-readership books based on cultural events (news items, holidays, anniversaries) are similar to software books, in that being first to market matters equally.&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Politics is a special kind of cultural event in that it's ongoing. Books on politics tend to appear in advance of events that are potentially influential in the political world. 
For example, books about presidential candidates tend to appear in parallel with their campaigns: early books to define the brand, later books to strengthen the message, and post-election books to analyze the results and consequences. Other than elections, the same holds for books related to policy making, international relations, and larger political issues (like national security and environmental conservation). Corporate politics can fall into this category as well, though these publications may double as marketing and promotion documents.&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Professional conferences occur in clusters (lots in the summer, for example), and so publications that are relevant to conference events tend to get published (and re-published) in clusters.&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;New printing and publishing technologies, which the layperson doesn't hear about, can drive new publications in a way similar to first-to-market publishing. For example, when CD envelopes were first made available in books, there was a market-driven desire to include CDs with more books. Most printing technologies are small variations on what exists today, but when a new possibility exists, it's a trend that some publishers chase right away. For example, if the quality of color rendering took a small leap forward, books where color is particularly critical (art, medical imaging, etc.) 
would appear more frequently for a while.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;If you have more to suggest, let me know.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-7791023395235051827?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/7791023395235051827/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=7791023395235051827&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/7791023395235051827'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/7791023395235051827'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/2007/09/external-deadline-forces.html' title='External deadline forces'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='06392069473165889148'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22121434.post-591218288468026181</id><published>2007-07-24T23:48:00.001-05:00</published><updated>2007-07-25T00:40:38.428-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='fun with indexing'/><category scheme='http://www.blogger.com/atom/ns#' term='books'/><title type='text'>Books I really, really, really want to index</title><content type='html'>Perhaps because most of my indexing work is on books that are rather typical for nonfiction reference books -- technical titles like &lt;em&gt;Measurement, Analysis, and Control Using JMP&lt;/em&gt; and resource guides like 
&lt;em&gt;Dx/Rx: Colorectal Cancer &lt;/em&gt;-- I jump for joy when I get something so off the beaten path that I renew my love for this job. For example, I recently completed the index for &lt;em&gt;First Position, &lt;/em&gt;a collection of biographies of ballet dancers; more recently, I indexed &lt;em&gt;Sensual Knits &lt;/em&gt;and &lt;em&gt;Sensual Crochet,&lt;/em&gt; both beautifully photographed books of designs and patterns.&lt;br /&gt;&lt;br /&gt;But now, working in the wee hours of the night, I find myself fantasizing about the books that I really, really, really want to index, books that are just asking to be written so that I, Seth Maislin, can be assigned their indexes. So here's my wish list:&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;u&gt;Chihuahuas for Dummies&lt;/u&gt;&lt;/em&gt;&lt;br /&gt;Don't laugh. You probably have no idea just how far-reaching &lt;a href="http://www.dummies.com/"&gt;the Dummies series&lt;/a&gt; has become since its long-ago inception as a series for computer use. There's &lt;em&gt;Fantasy Football for Dummies, &lt;/em&gt;a book about imaginary sports playing; &lt;em&gt;Stretching for Dummies, &lt;/em&gt;a book about limbering up, perhaps in advance of reading &lt;em&gt;Sex for Dummies&lt;/em&gt;; &lt;em&gt;Guitar for Dummies, Bass Guitar for Dummies, &lt;/em&gt;and the upcoming &lt;em&gt;Rock Guitar for Dummies, &lt;/em&gt;which I have to believe compete with each other somehow; and &lt;em&gt;Jewish Cooking for Dummies, &lt;/em&gt;a book that, dare I say it, would make me feel guilty to own. Nevertheless, let me make myself clear here. 
&lt;a href="http://www.dummies.com/WileyCDA/DummiesTitle/productCd-0764552848,subcat-PETS.html"&gt;&lt;em&gt;Chihuahuas for Dummies&lt;/em&gt;&lt;/a&gt;&lt;em&gt; &lt;/em&gt;&lt;strong&gt;is a real book.&lt;/strong&gt; I want to index the next edition, you see, because I'm dying to see what changes.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;u&gt;10.9 Seconds: The Joey Chestnut Story&lt;/u&gt;&lt;/em&gt;&lt;br /&gt;(see &lt;a href="http://origin.mercurynews.com/valley/ci_6297731"&gt;http://origin.mercurynews.com/valley/ci_6297731&lt;/a&gt; to get the joke) Yes, this book is my own invention, but the fun part about indexing sports books is that they are so completely self-reverential. (Yes, &lt;em&gt;reverential,&lt;/em&gt; not &lt;em&gt;referential.&lt;/em&gt;) Written by sports geeks for sports geeks, the authors' language captures the awe-hubris-humor combination achieved by fans and record-breakers when it comes to the sport that is most of their life. It doesn't matter what the sport is, either, so I'm all for those esoteric things like Ultimate Frisbee (I was offered such a book once) and so on. I recently indexed the comprehensive &lt;em&gt;Chasing the Hunter's Dream, &lt;/em&gt;a directory of hunting opportunities around the world. This book included both descriptions of "dream hunts" -- think lion hunts in Africa -- and an entire section in the back dedicated to recipes, including a few meals for squirrels -- I mean, &lt;em&gt;of &lt;/em&gt;squirrels. And I mentioned &lt;em&gt;First Position &lt;/em&gt;in my intro, where at times I felt like I was reading an artist's diary.&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;u&gt;How to Work My Body: A Manual&lt;/u&gt;&lt;/em&gt;&lt;br /&gt;There are a number of sex books out there -- including one for Dummies -- and most of them have indexes. I just finished indexing &lt;em&gt;Him &lt;/em&gt;and &lt;em&gt;Her, &lt;/em&gt;short and photograph-filled manuals of the sexes, along with instructions to make them work. 
And I do mean "work": the book about men attempts to explain why they tend not to do chores around the house. (Oh come on, you didn't think I'd use an erotic example of "work", did you? :-) These books, produced by the same group of people who made &lt;em&gt;Sensual Crochet,&lt;/em&gt; were a joy of sex to index, especially once I realized that most of the anatomy-filled books that I index are about abnormal anatomy: prostate disorders (&lt;em&gt;100 Questions and Answers About Prostate Diseases&lt;/em&gt;), gunshot wounds (&lt;em&gt;Criminal Investigation, 2nd edition&lt;/em&gt;), and the like. And unlike the traditionally polite sex-instruction book, &lt;em&gt;Him &lt;/em&gt;and &lt;em&gt;Her &lt;/em&gt;are more about the art than the words -- something that, for eunuchs at least, would make the indexing go much faster.&lt;br /&gt;&lt;br /&gt;&lt;u&gt;The user manual to anything only cool people own&lt;/u&gt;&lt;br /&gt;I had the honor of indexing the user manual to the Class E series Mercedes-Benz automobile. This full-color production was totally awesome; I spent a lot of time trying to convince myself that reading the manual long before the car's official release was as envy-worthy as owning the car itself. (For many months my friends and family joked that I should be paid in cars instead of dollars.) I've indexed the manuals to software applications before, but I have more memories from editing the user guide to a long-since-extinct universal remote control ... and I'm talking back when these things were large control panels. So what other cutting-edge production is taking place? I missed indexing the iPhone manual, but maybe someday I'll get to index the field guide for a nasty-looking military weapon.&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;u&gt;Instructions to the 1040 Form&lt;/u&gt;&lt;/em&gt;&lt;br /&gt;Indexing gets so little press, but that doesn't stop me from wanting to index something that's so popular or high-profile that I can't help but feel proud. 
I'll never be a household name, but if I had landed that one magical indexing project with the U.S. Internal Revenue Service, my work might have reached every household. They really were looking for someone, at least for a little while. Even the newest Harry Potter book isn't as popular. Which reminds me: is someone out there indexing Rowling's books? If not, someone ought to be. &lt;em&gt;The Unauthorized Index of Harry Potter &lt;/em&gt;would be a big seller ... despite use of the word &lt;em&gt;index&lt;/em&gt; in the title. Move over, back-of-the-book indexing. We're on the cover now.&lt;br /&gt;&lt;br /&gt;Okay, I'm starting to drool.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-591218288468026181?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/591218288468026181/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=591218288468026181&amp;isPopup=true' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/591218288468026181'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/591218288468026181'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/2007/07/books-i-really-really-really-want-to.html' title='Books I really, really, really want to index'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='06392069473165889148'/></author><thr:total 
xmlns:thr='http://purl.org/syndication/thread/1.0'>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22121434.post-6537869959092364704</id><published>2007-07-09T16:28:00.000-05:00</published><updated>2007-07-09T16:31:21.377-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Microsoft Word indexing'/><title type='text'>Printing Word documents with XE fields visible</title><content type='html'>&lt;span style="font-size:78%;"&gt;(This is taken from my &lt;a href="http://taxonomist.tripod.com/indexing/wordproblems.html"&gt;Word Indexing FAQ&lt;/a&gt;.)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;Can I Print My Documents with the XE Fields Visible?&lt;br /&gt;&lt;/em&gt;&lt;br /&gt;Yes, you can. Microsoft Word can make all hidden-text codes visible, whether they're for indexing or not. Go into Page Setup (available from the File menu) and look for something that says "print display codes" or "print hidden text" or something like that. Until you uncheck that box in the future, all of your codes will show up in your printouts.&lt;br /&gt;&lt;br /&gt;Be aware that printing with your indexing fields visible will affect the pagination. Don't write an index using your hard copy this way.&lt;br /&gt;&lt;br /&gt;On a related note, remember that you can track changes when you work. Every time you insert, edit, or delete an XE field, you'll get a note in the margins. These marginal callouts can speed up your ability to find your XEs, although it might also clutter up your work. Use Tools &gt; Track Changes to turn that feature on.  Additionally, these changes can be made visible when printing as well, using a similar process as described above. Keeping XEs invisible but marginal notes visible allows you to see the index pointers without messing up the pagination. Be warned, however, that if changes are already being tracked, don't turn that feature off! You could lose that information for good. 
Instead, use the View menu to make those changes visible.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-6537869959092364704?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/6537869959092364704/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=6537869959092364704&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/6537869959092364704'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/6537869959092364704'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/2007/07/printing-word-documents-with-xe-fields.html' title='Printing Word documents with XE fields visible'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='06392069473165889148'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22121434.post-8906507508503122907</id><published>2007-06-27T09:33:00.000-05:00</published><updated>2007-06-27T09:43:23.817-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='human factors'/><category scheme='http://www.blogger.com/atom/ns#' term='fun with indexing'/><title type='text'>Technology that ignores indexing</title><content type='html'>It's true, I'm including a link to this video because, frankly, it made me laugh. Witness a &lt;a href="http://www.tubearoo.com/articles/87148/Microsoft_Surface_Parody.html"&gt;satire of Microsoft Surface&lt;/a&gt;. 
Yes, it's funny, but I found myself thinking about how technologies are so often designed to create needs, not to meet existing needs. I mean, it's fascinating to imagine a table that has a computer screen as a surface, but what about building height adjusters into the legs so they don't wobble? And as I continued to think about this, I discovered I have two reasons to put this video in my blog.&lt;br /&gt;&lt;br /&gt;First, there's the opening sequence in which someone is looking at digital photos and videos scattered across the tabletop. With his fingers, the Surface user can move them around, open them, and even view the videos. In other words, the engineers of this expensive table have managed to reproduce the &lt;em&gt;worst part of photographs:&lt;/em&gt; the pile of undifferentiated images. If someone came to you and dumped a box of photographs on your table, would you be happy? Now, what if all those photographs were digital? This is technology that completely ignores indexing. Compared to tools like Picasa, which puts &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_0"&gt;metadata&lt;/span&gt; to work, this product does its best to create an interface option where &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_1"&gt;metadata&lt;/span&gt; is ignored. And if you're one of those "old-timers" who longs for the physical-contact nostalgia of long-ago days of printed photographs, such that you might think shuffling through a pile of photographs would be fun, think again. Remember, these are &lt;em&gt;digital &lt;/em&gt;photographs. They have no width and no weight.&lt;br /&gt;&lt;br /&gt;Second, there's the reality that in real life, we use table tops as horizontal storage surfaces. 
Whether you're a neat freak who has only a magazine or a coaster rack on top, or you're more like me and live with your tables essentially &lt;span class="blsp-spelling-corrected" id="SPELLING_ERROR_2"&gt;camouflaged&lt;/span&gt; by life's detritus, either way you've essentially buried your workspace. In other words, this tool seems to forget the environment in which we look things up. The &lt;span class="blsp-spelling-corrected" id="SPELLING_ERROR_3"&gt;voiceover&lt;/span&gt; in the ad jokes about the convenience of a handheld machine in comparison to this table, but I'd like to suggest that this table would make more sense as a vertical hang-on-the-wall &lt;span class="blsp-spelling-corrected" id="SPELLING_ERROR_4"&gt;flat-screen TV&lt;/span&gt;. Take a lesson from the many-years-old television industry: there is no market for a horizontal television.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-8906507508503122907?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/8906507508503122907/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=8906507508503122907&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/8906507508503122907'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/8906507508503122907'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/2007/06/technology-that-ignores-indexing.html' title='Technology that ignores indexing'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' 
name='OpenSocialUserId' value='06392069473165889148'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22121434.post-904273756011483380</id><published>2007-06-20T10:11:00.000-05:00</published><updated>2007-06-20T10:37:24.531-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='fun with indexing'/><title type='text'>Indexing has at least one fan</title><content type='html'>It seems &lt;a href="http://www.soltys.ca/coredump/2007/06/indexing-blog.html"&gt;my blog got noticed&lt;/a&gt; just the other day, which is pretty neat. It seems I have at least one fan ... other than myself, of course.&lt;br /&gt;&lt;br /&gt;Why does the field of indexing have so few fans outside of the profession itself? I heard many stories from and about thankful authors who swear by the quality of the indexes written by professionals. My favorite is something like this: "Until I looked at that index, I didn't even know I &lt;em&gt;wrote &lt;/em&gt;all that!" I'm talking about something more general.&lt;br /&gt;&lt;br /&gt;No, we're not firefighters, bursting through burning walls to save people we've never met, but I like to think that we make the world a better place anyway. We're the traffic cops of information, tour guides for books, instructors and librarians, a taut rope in the rough seas of data storms....&lt;br /&gt;&lt;br /&gt;If you know anything about indexing -- not index&lt;em&gt;es,&lt;/em&gt; but index&lt;em&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_0"&gt;ing&lt;/span&gt; &lt;/em&gt;-- then you know it's not a boring profession. Think about the public perception of lawyers, and how we don't consider that profession boring, and yet the reality of law is lots of books, lots of reading, lots of research. 
Those "exciting moments" brought to you on the television, along with the anxiety and intrigue of any moral or ethical battles regarding the implementation of law, represent only a small piece of the whole system. There is a lot of boredom in &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_1"&gt;lawyering&lt;/span&gt;. No, the part of the law that brings so many students into the law schools (other than the potential for income, perhaps) is the idea that law governs our everyday lives, the sociological analogy to science.&lt;br /&gt;&lt;br /&gt;Indexing is the process of analyzing and re-representing information, the lifeblood of everything we do. It's the &lt;em&gt;Matrix; &lt;/em&gt;we hold the &lt;em&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_2"&gt;Da&lt;/span&gt; &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_3"&gt;Vinci&lt;/span&gt; Code. &lt;/em&gt;We are responsible for getting data from one place to another in an efficient format, so that we can actually talk. Whenever you get frustrated by a failure to communicate, remember that an indexer can change that.&lt;br /&gt;&lt;br /&gt;Maybe the reason indexing seems so boring is that the word is so inextricably tied to that alphabetized, indented thing you see in the back of books. Despite the applications indexing has for the Web, in search, and with taxonomy, people associate what we do with good old-fashioned paper. And gosh, they've been around, like, forever, so of course they're as boring as dirt -- note that dirt is not boring to some people -- and a whole lot less inspiring of nostalgia. Such a shame.&lt;br /&gt;&lt;br /&gt;Maybe we need a movie, the way &lt;em&gt;Top Gun &lt;/em&gt;got people signing up for the U.S. Navy. Here's one. It's called &lt;em&gt;Cross.&lt;/em&gt; Jack Hannah, Agency "prep consultant" who can find out anything about anyone, is double-crossed when the subject of a routine inside investigation turns out to have the exact same life Jack has. 
Part &lt;em&gt;No Way Out &lt;/em&gt;and&lt;em&gt; &lt;/em&gt;part &lt;em&gt;Blow Up, Cross&lt;/em&gt; follows Jack on a dangerous journey into government archives to answer what should have been a simple question: Who is the real Jack Hannah?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-904273756011483380?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/904273756011483380/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=904273756011483380&amp;isPopup=true' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/904273756011483380'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/904273756011483380'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/2007/06/indexing-has-at-least-one-fan.html' title='Indexing has at least one fan'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='06392069473165889148'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22121434.post-8963768725914588437</id><published>2007-06-09T13:40:00.000-05:00</published><updated>2007-06-09T13:58:53.828-05:00</updated><title type='text'>Exemplary indexes</title><content type='html'>Historically, the &lt;a href="http://www.asindexing.org/site/WilsonAward.shtml"&gt;H. W. Wilson Award&lt;/a&gt; has been given to indexers of scholarly books, often because of the very complications and challenges you're talking about. 
What makes Do Mi &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_0"&gt;Stauber&lt;/span&gt;, who won &lt;a href="http://www.asindexing.org/site/PR20070531.shtml"&gt;this year's award&lt;/a&gt;, so great at her work is that she has a knack for doing this without slowing down very much. I like to think I have the same knack when it comes to technical and reference books.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.asindexing.org/site/sigs.shtml#scholar"&gt;Scholarly indexing&lt;/a&gt; is WAY hard. I recently accepted a book I didn't realize was scholarly, tried to index it myself, and realized almost immediately that I was in way over my head. (Note that I'm not talking about that irrational fear an indexer experiences at the start of every project, but rather something quite objective: an inability to understand the sentences and paragraphs well enough to parse them into &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_1"&gt;indexable&lt;/span&gt; ideas.) I subcontracted the index to another indexer, someone who specializes in (or at least doesn't mind) scholarly works, and the result was great.&lt;br /&gt;&lt;br /&gt;By the way, you need to see the &lt;a href="http://www.amazon.com/gp/sitbv3/reader/002-4081374-7163235?ie=UTF8&amp;p=S0JH&amp;amp;asin=0231137486"&gt;award-winning book's index&lt;/a&gt; to really understand what I'm talking about.&lt;br /&gt;&lt;br /&gt;Scholarly works are exceptionally difficult, even if you know the basic subject matter, because of how they are written. Many scholarly publishers underpay their indexers, too, because scholarly books rarely have large audiences: they're library-books-to-be, really, put there for students and faculty. Given that a book won't sell well, publishers are often reluctant to put more money into the production process. 
However, for the kind of book that &lt;a href="http://www.domistauberindexing.com/"&gt;Do Mi&lt;/a&gt; indexed -- and even the one I gave to someone else -- the indexer had better be making closer to $6/page (U.S.). In comparison, I think $4/page is reasonable for the average technical book, like a book on mathematics or computer programming. See, a technical book requires expertise in or a strongly sympathetic understanding of the subject, whereas scholarly books require a tremendous amount of time spent synthesizing what's in there. Think of poetry and "Shakespeare," not of prose and "John &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_2"&gt;Grisham&lt;/span&gt;." :-)&lt;br /&gt;&lt;br /&gt;But the H. W. Wilson Award can be given to indexers of other kinds of books, including technical. What makes the award possible is an exemplary show of knowledge and cunning, something that many technical books don't allow for. You also need the kind of working environment in which a publisher won't chop your index down to size, use a lousy design, or force you to complete the job too quickly to produce an exemplary product -- the kinds of things that are more likely to happen in technical fields than scholarly, in fact. But even a coffee table book can win the award, if the index shows that extra something special. :-)&lt;br /&gt;&lt;br /&gt;Given the kinds of things I index -- and the circumstances in which I index them -- I often think the only way I would win the Wilson Award is if I wrote the book myself, specifically for the purpose of making an awesome index. For example, maybe I would write a book that would require me to use &lt;a href="http://taxonomist.tripod.com/indexing/liungman.html"&gt;symbols as entries&lt;/a&gt;. :-) Then again, I'm still trying to write a &lt;a href="http://maislin.blogspot.com/search?q=mysterious"&gt;mystery index&lt;/a&gt;, too. 
I wonder if &lt;em&gt;that &lt;/em&gt;would win a Wilson...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-8963768725914588437?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/8963768725914588437/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=8963768725914588437&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/8963768725914588437'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/8963768725914588437'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/2007/06/exemplary-indexes.html' title='Exemplary indexes'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='06392069473165889148'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22121434.post-5232678195205015342</id><published>2007-05-29T20:10:00.000-05:00</published><updated>2007-05-29T20:18:28.956-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='business of indexing'/><category scheme='http://www.blogger.com/atom/ns#' term='misspellings and other errors'/><title type='text'>Throughput in indexing</title><content type='html'>&lt;p&gt;I gave a presentation last year (at the &lt;a href="http://www.asindexing.org/site/conferences/conf2006/index.shtml"&gt;Toronto conference of the American Society of Indexers&lt;/a&gt;) about money. 
I synthesized some statistics to come up with something I hadn't seen expressed before.&lt;/p&gt;&lt;p&gt;According to the 2004 survey of ASI members (all numbers in U.S. dollars):&lt;/p&gt;&lt;ul&gt;&lt;li&gt;median per-page rate: $3.26 to $3.50 &lt;/li&gt;&lt;li&gt;median hourly rate: $30 to $40 &lt;/li&gt;&lt;li&gt;median per-entry rate: $0.70 to $0.79&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Note that MEDIAN rates do not necessarily match an indexer's lifestyle, workload, or typical projects. For example, some indexers work exclusively on the kinds of projects that earn more (or less) than the median. In other words, these are NOT target numbers; rather, they are reflective of the variety of everything that indexers do.&lt;/p&gt;&lt;p&gt;Synthesizing these numbers:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;typical indexing speed: 10 pages per hour &lt;em&gt;or &lt;/em&gt;45 entries per hour&lt;/li&gt;&lt;li&gt;typical index density: 4.5 entries per page&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;From the survey:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;average annual income: $33,325 &lt;/li&gt;&lt;li&gt;part-timers (&amp;lt;32 h/wk) in survey: ...&lt;/li&gt;&lt;li&gt;full-timers (40+ h/wk) in survey: 12%&lt;/li&gt;&lt;li&gt;median income for full-timers: $45k-$49k [from 2000 survey]&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Synthesis:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;To make $45k/year at $4/page = 11,250 indexable pages per year&lt;br /&gt;= about thirty-seven or thirty-eight 300-page books per year&lt;br /&gt;&lt;/li&gt;&lt;li&gt;At 10 pages/hour, you must index for about 188 six-hour days/year &lt;/li&gt;&lt;li&gt;At 20 pages/hour, you must index for about 94 six-hour days/year&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;If you want to make more money, focus on throughput: projects that are easier for you, more effective indexing tools, improved marketing, stronger client relationships, etc. In fact, the reason that advanced indexers tend to make more money is that they have been given the opportunity to build these skills: speed, marketing, relationships. 
For example, indexing a single book for a single author might be short-term lucrative, but building relationships with the author's institution is more lucrative in the long term. Also, experience clearly counts toward speed, too, while short-cutting quality can seriously damage relationships.&lt;/p&gt;&lt;p&gt;If you think of your career in terms of throughput, you might think about your day-to-day tasks differently. For example, there have been debates among indexers regarding the sharing of book mistakes caught (like misspellings); when thinking about throughput, sending such mistakes (a) slows you down, but (b) improves repeat business. On the other hand, when you've got a client who provides you with only one book a year, it's all loss, and no trade-off, in terms of income.&lt;/p&gt;&lt;p&gt;Finally, when I gave this presentation I made it clear that income isn't the only reason we're doing what we do. After all, there are more lucrative professions out there in the world. If you're earning a ton of money but destroying your health, sacrificing your happiness, hurting your family, or failing yourself in some other important way, then please reconsider your priorities.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-5232678195205015342?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/5232678195205015342/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=5232678195205015342&amp;isPopup=true' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/5232678195205015342'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/5232678195205015342'/><link rel='alternate' 
type='text/html' href='http://maislin.blogspot.com/2007/05/throughput-in-indexing.html' title='Throughput in indexing'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='06392069473165889148'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22121434.post-7945768904002359716</id><published>2007-03-25T18:38:00.000-05:00</published><updated>2007-03-25T19:04:22.867-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='indexing process'/><category scheme='http://www.blogger.com/atom/ns#' term='future of indexing'/><category scheme='http://www.blogger.com/atom/ns#' term='human factors'/><category scheme='http://www.blogger.com/atom/ns#' term='business of indexing'/><category scheme='http://www.blogger.com/atom/ns#' term='search engines'/><title type='text'>Indexers indexing infinitely ... like monkeys</title><content type='html'>Three ideas have merged.&lt;br /&gt;&lt;br /&gt;First, there's the idea I published last December as &lt;a href="http://maislin.blogspot.com/2006_12_01_archive.html#7116072763820819135"&gt;"A needle in a haystack with 100,000,000 blades,"&lt;/a&gt; where I argued how the Web, or an approximation thereof, could be indexed by humans for a reasonable amount of money.&lt;br /&gt;&lt;br /&gt;Second, there's &lt;em&gt;The New York Times&lt;/em&gt; article &lt;a href="http://www.nytimes.com/2007/03/25/business/yourmoney/25Stream.html"&gt;"Artificial Intelligence, With Help From the Humans,"&lt;/a&gt; in which we learn that the Amazon Mechanical Turk service subcontracts human workers to perform tasks that are especially challenging for computers to accomplish, such as matching images to textual descriptions. 
For some jobs, &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_0"&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_0"&gt;Turkworkers&lt;/span&gt;&lt;/span&gt; might make one penny per transaction.&lt;br /&gt;&lt;br /&gt;And finally, there's the &lt;a href="http://en.wikipedia.org/wiki/Infinite_monkey_theorem"&gt;infinite monkey theorem&lt;/a&gt;, which states that a monkey hitting keys at random on a typewriter for an &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_1"&gt;infinite&lt;/span&gt; amount of time will "almost surely" type the complete works of Shakespeare, or something similar. I first heard this idea as "a million monkeys and a million years," but I bet the math's a bit different. After all, "infinite" is much bigger than a million million.&lt;br /&gt;&lt;br /&gt;Putting these ideas together seems to provide a rather obvious solution: third-world indexers. After all, if it costs only a nickel to get someone to write a few keywords for something, we can get a lot of indexing done very cheaply; I say "third world" because no indexer I've ever known is willing to work for a penny per word.&lt;br /&gt;&lt;br /&gt;The indexing industry is facing the very real possibility that our workload will be taken from us and delivered to those in economies that allow lower prices. But what if we went a step further and, instead of looking for less expensive indexers with good qualifications, we decided to look for dirt cheap indexers with no qualification other than time to waste? What if, I ask, we asked monkeys to pound away at their keyboards?&lt;br /&gt;&lt;br /&gt;I find the idea amusing but too close to the truth. After all, the intelligence behind Google is the social intelligence, the uneven and culturally biased workings of millions of Internet users plugging away at their disparate tasks. What Mechanical Turk has going for it, then, is the human decision making at the back end. 
Whereas most search engines look for better and greater stores of &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_2"&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_1"&gt;metadata&lt;/span&gt;&lt;/span&gt; with which to judge content, one man in a back room can make smarter decisions upon command. No, the real problem is that today's human intelligence is worth only pennies per word. Computers do their best, and humans sweep up afterwards. Our natural intelligence isn't worth a whole lot, I guess.&lt;br /&gt;&lt;br /&gt;That's how we know computers are smart. Computers own us monkeys.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-7945768904002359716?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/7945768904002359716/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=7945768904002359716&amp;isPopup=true' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/7945768904002359716'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/7945768904002359716'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/2007/03/indexers-indexing-infinitely-like.html' title='Indexers indexing infinitely ... 
like monkeys'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='06392069473165889148'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22121434.post-3987181895920351972</id><published>2007-03-24T10:29:00.000-05:00</published><updated>2007-03-24T12:34:02.779-05:00</updated><title type='text'>The passive-aggressive bullies of the information world</title><content type='html'>&lt;p&gt;An indexer, while building an index of historical documents for a small township on Cape Cod, Massachusetts, came across an old diary written during the American Civil War. She scanned the pages, filled with small and semi-illegibly handwritten words, and realized that nothing important had been written.&lt;br /&gt;&lt;br /&gt;The diary went &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_0"&gt;unindexed&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;This anecdote, shared by indexer &lt;a href="http://www.marisol.com/rowland.htm"&gt;Marilyn Rowland&lt;/a&gt; at the March 24 (2007) meeting of the &lt;a href="http://www.newenglandindexers.org/"&gt;New England Chapter of the American Society of Indexers&lt;/a&gt;, struck me as surprisingly uncomfortable. Certainly I agree that when something seems unimportant to the indexer, it should not be indexed; in fact, I've claimed many times within this blog that one of the biggest failings of computer-generated lists and search engine algorithms is that they cannot identify the true value (or correctness) of content, even when using social algorithms.&lt;br /&gt;&lt;br /&gt;Still, not indexing &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_1"&gt;someone's&lt;/span&gt; diary? This sounds passive-aggressive. So does this instruction: "Don't index the names of everyone in that photograph. 
Mention these two important people, and don't bother with the rest."&lt;br /&gt;&lt;br /&gt;Just as scientists are often accused of sacrificing ethics and social responsibility in favor of "pure scientific exploration" (the temptation &lt;a href="http://www.puaf.umd.edu/IPPP/Fall97Report/cloning.htm"&gt;to clone human beings&lt;/a&gt; is a fun example), so might indexers be accused of excessive marginalization or trivialization of content. It may be human nature to filter out everything we don't need to survive or enjoy ourselves in our lives, but it is an indexer's nature to impose these filters upon future users. In other words, indexers are responsible -- on a daily basis -- for rewriting history.&lt;br /&gt;&lt;br /&gt;Everything we create in our lives -- email messages to diaries, family snapshots to oil paintings, back-of-the-napkin notations to dissertations -- is subjected not just to the entropy of time but also the red pen of the indexers. We may speak about the value of individuals, but in reality it's just a big game of &lt;em&gt;&lt;a href="http://en.wikipedia.org/wiki/Survivor_(TV_series)"&gt;Survivor&lt;/a&gt;,&lt;/em&gt; where the indexers are the ones to vote our creativity out of existence.&lt;/p&gt;&lt;p&gt;There is no good way to remove indexers from the equation, of course. If nothing were indexed, and no content were ever deemed to be more valuable (worth finding) than something else, content would be lost in the same way a paper cup with a lipstick stain inevitably disappears into a landfill. But who would have believed that &lt;em&gt;indexers &lt;/em&gt;are the ones in control, that &lt;em&gt;indexers &lt;/em&gt;are &lt;a href="http://en.wikipedia.org/wiki/The_Langoliers"&gt;&lt;em&gt;the &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_2"&gt;Langoliers&lt;/span&gt;&lt;/em&gt;&lt;/a&gt;, who like the big kids in school get to decide who gets picked first for the schoolyard team, and who doesn't get picked at all. 
We are, let's face it, the bullies of the information world.&lt;/p&gt;&lt;p&gt;Don't mess with me. I'll erase you.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-3987181895920351972?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/3987181895920351972/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=3987181895920351972&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/3987181895920351972'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/3987181895920351972'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/2007/03/passive-aggressive-bullies-of.html' title='The passive-aggressive bullies of the information world'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='06392069473165889148'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22121434.post-4078694221105907494</id><published>2007-03-17T22:11:00.000-05:00</published><updated>2007-03-17T22:13:02.918-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='indexing process'/><category scheme='http://www.blogger.com/atom/ns#' term='indexing tools'/><title type='text'>Notes on automatic indexing</title><content type='html'>"Automated indexing software" is, according to the common definition, software that analyzes text and produces an index without human involvement. 
I'm a firm believer that the technology doesn't exist, and that a human being is required to write an index. Thus I don't use the software, and I also don't recommend it.&lt;br /&gt;&lt;br /&gt;There are those who advocate it, arguing that it's "not as bad as an indexer would have you think." These people are often coming from the standpoint that automatic software is faster and cheaper, and they're right. Thus the issue is one of quality.&lt;br /&gt;&lt;br /&gt;I believe that good automatic indexes will exist once there's good artificial intelligence, something that presently doesn't exist. In very limited circumstances, however, it does; a machine can easily cull capitalized words from a textbook to create an approximation of an index of names -- although, again, the machine isn't going to differentiate between names like "David Kelley" and places like "San Francisco," since they are both of the same format and used the same way. It also won't know that "Bill Clinton" is also "William Jefferson Clinton." And certainly it can't tell when the name is being mentioned in an unuseful and trivial way, as are the names in this paragraph! So imagine the problems in trying to get a machine to parse full sentences of ideas and recognize the core ideas, the important terms, and the relationships between related concepts throughout the entire text.&lt;br /&gt;&lt;br /&gt;Those who advocate automatic software would argue that the machine gets "close enough" so that a human being can edit the resulting product. However, expert evaluators unanimously agree that the software fails; those who disagree are likely so unfamiliar with indexing in the first place that they are unable to judge the quality differences.&lt;br /&gt;&lt;br /&gt;Oh, I should mention that there are software programs that human indexers use to simplify and speed up the mechanics of the index process. 
For example, it would be silly to disallow a computer to alphabetize the entries, reformat the index, and manipulate page numbers. There are a few software packages that do this exclusively, which are considered top of the line; other applications that have indexing capabilities, such as Microsoft Word and Adobe FrameMaker, have some of these capabilities, with notable limitations.&lt;br /&gt;&lt;br /&gt;For information on the various software available, see &lt;a href="http://www.asindexing.org/site/software.shtml"&gt;http://www.asindexing.org/site/software.shtml&lt;/a&gt;. If you have feedback, especially differing opinions, I'd love to hear them. Write me at &lt;a href="mailto:seth@maislin.com"&gt;seth@maislin.com&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;(This article was originally published in 2002 and 2004 -- and it's still 100% accurate.)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-4078694221105907494?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/4078694221105907494/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=4078694221105907494&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/4078694221105907494'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/4078694221105907494'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/2007/03/notes-on-automatic-indexing.html' title='Notes on automatic indexing'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' 
value='06392069473165889148'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22121434.post-5117309795001232016</id><published>2007-03-05T00:12:00.000-05:00</published><updated>2007-03-05T00:27:04.728-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='indexing process'/><category scheme='http://www.blogger.com/atom/ns#' term='Microsoft Word indexing'/><category scheme='http://www.blogger.com/atom/ns#' term='human factors'/><category scheme='http://www.blogger.com/atom/ns#' term='Google'/><title type='text'>Interpretation, not computation</title><content type='html'>After explaining the limitations of Microsoft Word's auto-indexing feature to one of the many people who write me asking for indexing advice, I got an interesting response. Clearly frustrated by the nonexistence of computer tools to do something as simple as generate a name index, he wrote:&lt;br /&gt;&lt;br /&gt;&gt; I'm amazed at the poor development of the science of indexing for printed matter such as books.&lt;br /&gt;&lt;br /&gt;I wrote back, "You misunderstand!"&lt;br /&gt;&lt;br /&gt;The science of indexing is quite broad, given that it has a history in long-ago library science. What seems undeveloped in this case are the tools, but that's a misunderstanding of what indexing is. Indexing is an editorial field, not an automatic one. You might say it's a lot like writing, in that the writer must decide what their readers want to read, and then the writer must communicate those ideas in an organized and approachable way. 
Indexing is the same: analysis of text to discover what readers might find interesting, and then multiply labeling and organizing those ideas so people can find them.&lt;br /&gt;&lt;br /&gt;Computers will never be able to write indexes because they can't (a) interpret importance of a concept, (b) understand concepts over simple words, and (c) connect ideas in contextually relevant ways. As much as I admire the Google.com search engine for what it can do, once again I will demonstrate what it &lt;em&gt;can't &lt;/em&gt;do. Google finds 10,000,000 things when we really only want 3 (or 10 or 20). It finds what we type, but it doesn't find synonyms. And there's no guarantee that Google is searching everything that's out there, though it appears to come close; in book indexing, however, there's a human to make sure every page was considered.&lt;br /&gt;&lt;br /&gt;How often has Microsoft Word attempted to auto-correct you in a completely inaccurate way? Spell-check? Auto-format? Auto-complete? Half-intelligent humans don't make the kinds of mistakes that these tools do.&lt;br /&gt;&lt;br /&gt;Here's what I wish he had written:&lt;br /&gt;&lt;br /&gt;&gt; I'm amazed that people who know full well that computers could never write newspaper articles still believe computers can write indexes.&lt;br /&gt;&lt;br /&gt;Another problem, of course, is that indexes aren't respected in the industry. The reason Microsoft Word even &lt;em&gt;has &lt;/em&gt;an automatic indexing feature is that the people who wrote that software have no idea of the damage such a tool causes. That Word's {XE} functionality is so miserable is even further proof. There's a nasty cycle: people use inferior tools, quality indexing grows less likely, and inferior tools become the standard.&lt;br /&gt;&lt;br /&gt;Indexing is an editorial process, just like writing and editing. 
Indexing requires interpretation, not computation.&lt;br /&gt;&lt;br /&gt;Computers will not and &lt;em&gt;should not &lt;/em&gt;be used as indexers. If my job ever dies because computer programmers have found a way to make me obsolete, at least I know I'll be in the enlightening company of human writers and artists.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-5117309795001232016?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/5117309795001232016/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=5117309795001232016&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/5117309795001232016'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/5117309795001232016'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/2007/03/interpretation-not-computation.html' title='Interpretation, not computation'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='06392069473165889148'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22121434.post-1299669133903723321</id><published>2007-02-07T22:13:00.000-05:00</published><updated>2007-02-06T11:30:50.672-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='indexing process'/><category scheme='http://www.blogger.com/atom/ns#' term='future of indexing'/><category scheme='http://www.blogger.com/atom/ns#' term='power of information'/><title 
type='text'>Indexes are the speed limit on the information highway</title><content type='html'>The growing demand for indexes that point to online, changing, and custom content is forcing a huge gap into the indexing industry, and that gap is physical.&lt;br /&gt;&lt;br /&gt;With traditional publishing, if I wanted a reader to find content on page 114, I would simply have the number 114 show up in my index: "credit card fraud, 114." To make this work, however, that page number must be immutable across the lifetime of my index. Should the content be republished in a different format, layout, or language, or with significant edits within the first 100-plus pages, my index could be rendered inaccurate. In other words, if I type "114" in my index, that content had better be on page 114.&lt;br /&gt;&lt;br /&gt;The appeal of fluid content, however, is slowly making traditional information delivery obsolete. Not only are books republished for lots of "traditional reasons" (e.g., updated editions, new languages, different book and print sizes), but technology is enabling books to be published without a single physical page. With the possible exception of the Adobe PDF format (which purposefully preserves the overall book-like format in an electronic file), page breaks are optional and subjective. A Web page or HTML document can have a scroll bar, such that there are no pages; an e-book intended for a handheld reader is paged according to the size of the reader; a news or magazine article of any length can be broken in two or three simply to increase ad sales; and some electronic documents can be edited by the readers such that anything goes.&lt;br /&gt;&lt;br /&gt;Ah, how I miss the days when 114 meant 114.&lt;br /&gt;&lt;br /&gt;Indexing content that changes is going to be hard, but the fundamental challenge isn't about keeping up with what was newly published today, or even in the last twenty minutes. It's about content ownership. 
When content is moving around all the time, indexers don't have a good way of tracking where that content is going.&lt;br /&gt;&lt;br /&gt;As an analogy, consider a classroom filled with thirty students, with one student at each desk. If you have a photograph of where every student is sitting, you could leave the room and generate a spreadsheet that lists each person's name and seat location. But what happens when the students are playing musical chairs? Every photograph you take is outdated almost immediately; even staying in the room wouldn't be good enough, because your typing speed will never match the speed of thirty kids jumping around. In fact, the only way you could manage a spreadsheet that shows where each student is sitting at all moments is if that spreadsheet operated in real time, by reference. In other words, if all thirty students carried GPS locator chips in their pockets, you could track the chips -- and thus the students -- by satellite. Your map could be as dynamic as what it is you're mapping.&lt;br /&gt;&lt;br /&gt;Embedded indexing, or indexing by reference, is a rudimentary and imperfect example of this process. With embedded indexing, I can have some kind of information inserted into the content -- like the GPS chip in the student's pocket -- and then I can generate an index based on where that information is at any one time. This blog entry, for example, has keywords attached to it; the website where my blog is published can, at any time, generate a list of all entries with that keyword. This kind of dynamic indexing is not uncommon these days; website content is served according to a number of immediate rules, and the result can be as simple as a website that publishes "Hello Seth Maislin" on my page but no one else's, or as complicated as an online stock trading program that keeps track of millions of private transactions.&lt;br /&gt;&lt;br /&gt;I say this is rudimentary, however, because it's still a snapshot. 
Perhaps it's convenient to have that snapshot taken at the moment I arrive at a website, but if I leave my browser at a website and walk away for 20 minutes, the picture doesn't have to change. The "Hello Seth Maislin" greeting made sense when I was sitting at the computer, but if I walk away and my wife sits down, it's now wrong. The snapshot is old. Google search results can change from one minute to the next. Even stock trading programs sport copious warnings that despite the best efforts of the website, the price you &lt;em&gt;think&lt;/em&gt; you're getting may not be the &lt;em&gt;actual&lt;/em&gt; price when you complete a transaction; the delay between your clicking the mouse and the machines at the other end doing something is a legitimate and unavoidable delay. Some websites attempt to minimize this by taking a snapshot every fraction of a second, as if you were watching what was happening "live." In reality, there's still a delay, and there's still no way to truly synchronize everyone's machine.&lt;br /&gt;&lt;br /&gt;My point is that indexes to changing documentation must live apart from the documentation. If they really lived completely together, the content and the index would be essentially the same thing, just as the GPS chip and the student are really one merged object. But because indexes are &lt;em&gt;interpretations&lt;/em&gt; of content, there is always going to be a gap. The generation of the index must be removed from the content that is being indexed, in order for that interpretation to take place.&lt;br /&gt;&lt;br /&gt;The only way for indexing to survive, I think, is for content to slow down. 
And because I believe indexing -- interpretation -- is critical for learning, the only logical conclusion is that content &lt;em&gt;will&lt;/em&gt; slow down.&lt;br /&gt;&lt;br /&gt;The need for an index is the logical limit of just how fast data can travel.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-1299669133903723321?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/1299669133903723321/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=1299669133903723321&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/1299669133903723321'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/1299669133903723321'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/2007/02/indexes-are-speed-limit-on-information.html' title='Indexes are the speed limit on the information highway'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='06392069473165889148'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22121434.post-6265700651245355259</id><published>2007-02-03T13:12:00.000-05:00</published><updated>2007-02-03T13:24:44.474-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='business of indexing'/><title type='text'>TAA Conference in Buffalo, June 22-23</title><content type='html'>I have been scheduled to present twice at the &lt;a 
href="http://www.taaonline.net/TAAConference/index.html"&gt;2007 conference for the Text and Academic Authors Association (TAA)&lt;/a&gt;. I'm excited about these presentations -- actually, one of them is a roundtable -- because this will be perhaps the first time when my audience is predominantly authors. Although I have taught indexing to numerous technical writers over the years, the nature of their writing is significantly different to that of other authors. &lt;a href="http://www.taaonline.net/"&gt;TAA&lt;/a&gt; members tend to write journal articles and textbooks; they are writing because they &lt;em&gt;want &lt;/em&gt;to share information (whether driven by a simple desire to share knowledge or by more complicated goals like industry prestige, peer respect, or job security), whereas technical writers are obligated to write documentation as part of a larger project.&lt;br /&gt;&lt;br /&gt;In some ways, having this opportunity to reach out to the authoring community represents a longer reach than usual, in that most indexers ply their trade among the publishers themselves, who manage the book production but don't do any of the writing. Although any business benefits I receive from these talks won't be as lucrative as the others -- convincing one author to hire me for the job isn't as valuable as convincing one publisher to hire me for &lt;em&gt;several &lt;/em&gt;jobs -- the advocacy benefits are likely bigger but unknown. I often think that the indexing process is hidden from authors, despite their desire to see quality indexes appended to their work.&lt;br /&gt;&lt;br /&gt;If the indexing industry is going to grow, it won't be because the indexers have advocated for themselves. 
No, indexers will be prominent only when others -- like writers -- advocate for them and their products.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-6265700651245355259?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/6265700651245355259/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=6265700651245355259&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/6265700651245355259'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/6265700651245355259'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/2007/02/taa-conference-in-buffalo-june-22-23.html' title='TAA Conference in Buffalo, June 22-23'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='06392069473165889148'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22121434.post-6711994005113238974</id><published>2007-01-20T11:15:00.000-05:00</published><updated>2007-02-06T11:30:50.713-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='web indexing'/><category scheme='http://www.blogger.com/atom/ns#' term='books'/><title type='text'>Foreword to Heather Hedden's upcoming book</title><content type='html'>&lt;div align="left"&gt;I was asked to write the foreword to Heather Hedden's upcoming &lt;em&gt;Indexing Specialties: Web Sites,&lt;/em&gt; to be published in 2007 by ITI. 
Given the importance of this book in the indexing industry, I am reprinting that foreword here. For more information on the book itself (not yet available), visit either &lt;a href="http://www.asindexing.org/site/asipub.shtml"&gt;ASI's publications page&lt;/a&gt; or a list of &lt;a href="http://books.infotoday.com/books/index.shtml#index"&gt;ITI's indexing publications&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;- - - - - -&lt;br /&gt;&lt;span style="font-family:times new roman;"&gt;&lt;strong&gt;Foreword&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;Indexing is not a popular profession by any stretch of the imagination. Not only is it almost completely unknown in lay circles, but let's be honest: writing indexes sounds about as exciting as cleaning the house, but a hundred times harder. Also, if you were born in any year before 1990, the idea of Web indexing sounds like cleaning a house in outer space. I mean, there are no houses in outer space.&lt;br /&gt;&lt;br /&gt;The Internet and the Web -- this monstrously huge and growing system of sharing data -- desperately need more information sorcerers like Heather Hedden. Not only does Heather have the talent to recognize when knowledge is missing, but she also has the ability to make that knowledge visible. She starts by learning for herself, and then she loves to share.&lt;br /&gt;&lt;br /&gt;Heather and I first crossed paths in my classroom, where I taught a course called "Writing Indexes for Books and Websites." My course was written to explore the questions and theories of indexing, and so couldn't be limited to just books. Heather’s interest went much further, and since then she has explored writing web indexes as a singular discipline. For me, Heather has been a student, an apprentice, and a role model. She's someone I count on to get things done. 
She has vaulted across the lines from library science to book indexing to web indexing, each time with surprising success, and has since become a renowned and respected expert in the web indexing community.&lt;br /&gt;&lt;br /&gt;&lt;em&gt;Indexing Specialties: Web Sites &lt;/em&gt;is a book filled with honest, get-it-done advice. Heather is not afraid to talk about the code and the tools, because she has faith in her readers. In her hands, the complicated stuff looks straightforward. Besides, when the technical lessons are over, Heather shows readers how to think about web indexing as well: as a process and as a business. Until now, if book indexers wanted to graduate to the Internet frontier, they had no unified place of reference, no single source of everything they'd want to know. In fact, some of the tools Heather includes in this book were almost completely unknown to indexers until now.&lt;br /&gt;&lt;br /&gt;I am excited and pleased to see Heather compiling this knowledge in a book. She has put into print an indexer's Rosetta Stone, which will lead book indexers toward other information management topics like taxonomies, information architecture, and search tools. It's not about complicated coding practices and computer programs, but about the guidelines to getting that A-to-Z index published on the Internet, and doing it right.&lt;br /&gt;&lt;br /&gt;She begins by exploring the boundaries of web site indexing, clarifying what kinds of sites need indexing, how they should look, and how they should work. Then she immediately provides the HTML building blocks to making your indexes appear on the Web, the surprisingly simple code you'd need to create index pages, index entries, indentations, hyperlinks, and cross-reference links. 
If you've never programmed on the Web before and are afraid it's over your head, you’ll be kicking yourself once you see how easy Heather makes it.&lt;br /&gt;&lt;br /&gt;Once you're armed with the grammar, you next need the tools to actually write. Heather gives you the detail about the tools (&lt;/span&gt;&lt;a href="http://indexres.com/home.php"&gt;CINDEX&lt;/a&gt;&lt;span style="font-family:times new roman;"&gt;, &lt;/span&gt;&lt;a href="http://www.html-indexer.com/"&gt;&lt;span style="font-family:times new roman;"&gt;HTML Indexer&lt;/span&gt;&lt;/a&gt;&lt;span style="font-family:times new roman;"&gt;, &lt;/span&gt;&lt;a href="http://www.levtechinc.com/ProdServ/LTUtils/HTMLPrep.htm"&gt;&lt;span style="font-family:times new roman;"&gt;HTML/Prep&lt;/span&gt;&lt;/a&gt;&lt;span style="font-family:times new roman;"&gt;, &lt;/span&gt;&lt;a href="http://www.macrex.com/"&gt;Macrex&lt;/a&gt;&lt;span style="font-family:times new roman;"&gt;, &lt;/span&gt;&lt;a href="http://www.sky-software.com/"&gt;&lt;span style="font-family:times new roman;"&gt;SKY Index Professional&lt;/span&gt;&lt;/a&gt;&lt;span style="font-family:times new roman;"&gt;, and &lt;/span&gt;&lt;a href="http://publish.uwo.ca/~craven/xrefhtju.htm"&gt;XRefHT&lt;/a&gt;&lt;span style="font-family:times new roman;"&gt;) to create or generate indexes that are ready for web publication. She takes more time exploring the specialized tools of XRefHT and HTML Indexer, two stand-alone web indexing applications, and shows how you can use their features with agility.&lt;br /&gt;&lt;br /&gt;The last third of the book is dedicated to the "mindspace" of web indexing. There's more to indexing than just the tools, and so Heather writes carefully about how indexers should approach the job. She addresses the challenges of working out of order, adding anchors, indexing periodicals, and knowing which pages and at what level of detail you should index. 
She deals in detail with cross-references, language, subentry structure, and format. Finally, Heather dives into the nitty-gritty of the web indexing marketplace, including how to market yourself as a web site indexer.&lt;br /&gt;&lt;br /&gt;&lt;em&gt;Web Sites &lt;/em&gt;is going to satisfy you immediately and in the long term. On behalf of the American Society of Indexers -- and myself, personally -- I am honored to welcome Heather as an esteemed author in our community.&lt;br /&gt;&lt;br /&gt;Seth Maislin&lt;br /&gt;President of the American Society of Indexers (2006-2007)&lt;/span&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-6711994005113238974?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/6711994005113238974/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=6711994005113238974&amp;isPopup=true' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/6711994005113238974'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/6711994005113238974'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/2007/01/foreword-to-heather-heddens-upcoming.html' title='Foreword to Heather Hedden&apos;s upcoming book'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='06392069473165889148'/></author><thr:total 
xmlns:thr='http://purl.org/syndication/thread/1.0'>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22121434.post-121955941814763775</id><published>2007-01-09T21:26:00.000-05:00</published><updated>2007-06-09T13:57:47.718-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='fun with indexing'/><title type='text'>Constructing a mysterious index</title><content type='html'>[NOTE: Edited on 6-9-07 to fix missing indentations for index entries.  -SM]&lt;br /&gt;&lt;br /&gt;For years I have puzzled over the possibility of writing an index, to an imaginary book, in which a mystery is revealed and potentially solved. The book itself would not have to be a mystery, but there would have to be some kind of secret.&lt;br /&gt;&lt;br /&gt;For example, suppose you had these entries:&lt;br /&gt;&lt;br /&gt;La Traviata, clandestine meeting at, 145&lt;br /&gt;Marters, Francine&lt;br /&gt;meeting at La Traviata, 145&lt;br /&gt;&lt;br /&gt;From these entries you would learn that Francine Marters met someone at La Traviata surreptitiously. An additional entry&lt;br /&gt;&lt;br /&gt;Rapiere, Evan&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;accidental discovery of La Traviata matches, 166&lt;br /&gt;&lt;br /&gt;implies not only that Evan was not the person at the restaurant, but also that Evan might have been the reason the secret was necessary. 
A final entry,&lt;br /&gt;&lt;br /&gt;Pfiser, Victor&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;confronted by Evan Rapiere, 231&lt;br /&gt;&lt;br /&gt;allows for the possibility that Victor was the other person with Francine (on page 145), such that Evan's discovery of the matches led to this confrontation.&lt;br /&gt;&lt;br /&gt;What would make an index like this potentially interesting as a puzzle would be (a) the randomization of information, caused by the alphabetization of entries; (b) the summary-style labels in the index, which must naturally leave out much of the story; (c) the creativity of the labels, which can emphasize or omit interesting facts without destroying the quality of the index itself; and (d) the ability to tell many overlapping and long stories across just a few pages.&lt;br /&gt;&lt;br /&gt;On the other hand, what makes an index puzzle challenging -- and the reason I've had no success so far -- is that the index must articulate all the facts; an index can have no secrets if it's going to work as a puzzle. For example, if this were a murder mystery, wouldn't the murder have to be indexed? If the index is going to be a good one (and that's a requirement for me, because otherwise it would seem too contrived), you'd have to have entries like these:&lt;br /&gt;&lt;br /&gt;Rapiere, Evan&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;murder of, 235&lt;br /&gt;Marters, Francine&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;guilty confession of, 469&lt;br /&gt;&lt;br /&gt;Is there any way that a legitimate index could &lt;em&gt;obfuscate &lt;/em&gt;information sufficiently to leave some mystery? In a way, an index must be too "honest" to allow for secrets.&lt;br /&gt;&lt;br /&gt;The other problem, opposite to the honesty problem described above, is that if an index isn't specific enough, it's impossible to put the facts together in the first place. 
For example, if I changed the &lt;em&gt;matches &lt;/em&gt;entry above to this:&lt;br /&gt;&lt;br /&gt;Rapiere, Evan&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;accidental discovery of match book, 166&lt;br /&gt;&lt;br /&gt;there's no way to connect this to La Traviata without help. Similarly, if I change the murder entry to simply this:&lt;br /&gt;&lt;br /&gt;Rapiere, Evan, 235&lt;br /&gt;&lt;br /&gt;or&lt;br /&gt;&lt;br /&gt;Rapiere, Evan&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;surprised by an intruder, 235&lt;br /&gt;&lt;br /&gt;then there is nothing in the index to clarify that Evan actually died.&lt;br /&gt;&lt;br /&gt;I'm looking for ideas on how to get around these challenges. How much integrity can the index maintain without either giving too much away or leaving too many holes in the story?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-121955941814763775?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/121955941814763775/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=121955941814763775&amp;isPopup=true' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/121955941814763775'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/121955941814763775'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/2007/01/constructing-mysterious-index.html' title='Constructing a mysterious index'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='06392069473165889148'/></author><thr:total 
xmlns:thr='http://purl.org/syndication/thread/1.0'>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22121434.post-6969070531079308532</id><published>2007-01-03T22:46:00.000-05:00</published><updated>2007-01-03T22:53:33.384-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='human factors'/><category scheme='http://www.blogger.com/atom/ns#' term='business of indexing'/><title type='text'>My waitress, my audience</title><content type='html'>For some reason I felt like hugging my waitress this morning.&lt;br /&gt;&lt;br /&gt;There’s a diner in my hometown that I visit up to three times weekly, while my daughter is at school. They know me here. I have my special table, my eminent domain, where I plug in my computer. Kathleen, my regular waitress, brings me coffee even as I sit down and tries to guess what I want for breakfast, with some accuracy. She knows my name, too, which I feel is the best part. So in this new year, after I happily kissed 2006 good-bye, I was inspired to welcome my diner lifestyle in true living style. “Kathleen, can I give you a hug?” I asked, and she said yes.&lt;br /&gt;&lt;br /&gt;Then we get to talking. Her age, my age. Her career path, my career path. The kinds of things people always talk about at the start of a new year. To sum up her half of the conversation in just one sentence, I’d write this: She came to the realization that at the age of 54, if she had taken her father’s advice and gotten a job at the phone company when she was 20, she would be retired instead of working at the diner. (I have to add that she’s never had a vacation in her entire life, except for two weeks when this diner was closed temporarily, and that her job pays no benefits.)&lt;br /&gt;&lt;br /&gt;As a self-employed indexer working voluntarily where she works, what can I tell her?&lt;br /&gt;&lt;br /&gt;First, I tell her that my profession is dominated by women over the age of 45. 
Many of these women raised their children and wanted to do something different with their lives. Many felt oppressed by their imaginings of the traditional workplace, or were afraid of having insufficient skill to reenter the job market. Others no longer had the financial support of a spouse and simply needed to find work to stay comfortable, or even solvent. I see these women at every indexing meeting and in my classrooms. Almost all had never heard of indexing before, and they’re giving it a try. After all, being self-employed and reading books sounds a lot better than being buried in a cubicle.&lt;br /&gt;&lt;br /&gt;Then I tell her that once you’re truly self-employed, making enough money to pay monthly bills and quarterly taxes, you will never experience unemployment again. You can’t be fired, you can’t be laid off, and you can’t be transferred to another division, location, building, team, or employer -- not without first choosing it for yourself. As long as you have your core skills, you’re not much different from a child who can entertain himself with sticks, sand, and even an empty parking lot: the world is your workplace.&lt;br /&gt;&lt;br /&gt;Finally, I tell her that in my profession, everything I have ever learned (and continue to learn) has a direct application to my job performance. My grandfather says “no knowledge is ever wasted,” and in my case he’s literally correct. Every conversation I have gives me better background on people and information, and every experience provides me with a potential story to share with my students.&lt;br /&gt;&lt;br /&gt;It’s not hard for me to be positive about what I do, because I love what I do. There are a lot of perks, too, from being my own boss to sitting at my favorite table in the diner, where I am now. Kathleen, on the other hand, is pretty grumpy about her job. She gripes about a lack of benefits, a dearth of Social Security earnings, a regular 5:30am wake-up call. 
I have no doubt that I have it better than she does, at least in these ways.&lt;br /&gt;&lt;br /&gt;If there’s a professional lesson for me here, it’s that Kathleen is my audience. She is the person who reads the books I index, the person who might sit in my classroom. I have to remember that if I do my job correctly, I am building a bridge from me, the college-educated small business owner who works off his laptop, to people like her. The gap in our lives is the challenge I face every time I sit down to invent a keyword, an index entry, or a label for a hyperlink.&lt;br /&gt;&lt;br /&gt;In the world of indexing, these life differences are surmountable.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-6969070531079308532?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/6969070531079308532/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=6969070531079308532&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/6969070531079308532'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/6969070531079308532'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/2007/01/my-waitress-my-audience.html' title='My waitress, my audience'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='06392069473165889148'/></author><thr:total 
xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22121434.post-2700829619447235045</id><published>2006-12-28T10:49:00.000-05:00</published><updated>2006-12-28T10:58:58.539-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='spamming and similar behaviors'/><category scheme='http://www.blogger.com/atom/ns#' term='search engines'/><category scheme='http://www.blogger.com/atom/ns#' term='misspellings and other errors'/><category scheme='http://www.blogger.com/atom/ns#' term='web indexing'/><category scheme='http://www.blogger.com/atom/ns#' term='Google'/><category scheme='http://www.blogger.com/atom/ns#' term='social algorithms'/><title type='text'>Eighteen million people can't be wrong</title><content type='html'>&lt;p&gt;No matter how much you and I might like Google, the fact is that Google has some very serious problems when it comes to finding content. More specifically, if you're looking for the "right" answer, or if you're attempting to do any serious research, Google is likely to fail you miserably.&lt;/p&gt;&lt;p&gt;The flaw lies in &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_0" onclick="BLOG_clickHandler(this)"&gt;Google's&lt;/span&gt; strength: &lt;em&gt;social algorithms.&lt;/em&gt; Social algorithms are processes in which decisions are made by watching and following the majority of people in a community. If blogger.com tends to be the place people go to create blogs, then a social algorithm will see blogger.com as "better." When a search engine is managed by a social algorithm, a website might appear first in search results not because of the quality of the site, but rather because a larger number of people treated the site as if it were of higher quality. In other words, social algorithms equate "majority" with "best," something that often looks right but actually is patently untrue. 
&lt;/p&gt;&lt;p&gt;When you perform a search at Google.com, your results are sorted based on majority behavior and little else. For simple questions about anything -- as well as complex questions about cultural issues, for which "lots of people" is critical -- frequently the majority opinion is rather close to what you want -- which is why Google is so successful. But the gap between "close to what you want" and "accurate" is an invisible one, and that makes it insidious and dangerous.&lt;/p&gt;&lt;p&gt;For example, search for "Seth &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_1" onclick="BLOG_clickHandler(this)"&gt;Maislin&lt;/span&gt;." The first hit is &lt;a href="http://taxonomist.tripod.com/"&gt;my website&lt;/a&gt;. The second hit is &lt;a href="http://maislin.blogspot.com/"&gt;this blog&lt;/a&gt;. The third hit is &lt;a href="http://www.oreilly.com/news/seth_0799.html"&gt;an interview I did for &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_2" onclick="BLOG_clickHandler(this)"&gt;O'Reilly&lt;/span&gt; &amp; Associates in July 1999&lt;/a&gt;. An investigation of why these are the top three sites is rather interesting. First of all, these are the only results in which my name actually appears in the title; the fourth link and beyond have my name in the document, but not the title. Second, my website appears at the top not because it's the definitive website about "Seth &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_3" onclick="BLOG_clickHandler(this)"&gt;Maislin&lt;/span&gt;," but because Google knows of 24 people linking to it. In comparison, the only person who ever created a link to this blog is me -- a number far less than 24! The same goes for the &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_4" onclick="BLOG_clickHandler(this)"&gt;O'Reilly&lt;/span&gt; interview, except that the single linker isn't even a valid site any more: it's broken. 
The popularity of my home page (in comparison to this blog, for example) is why it's a better hit for my name. But if you folks out there started to actually link to this blog, that would change.&lt;/p&gt;&lt;p&gt;You should look into the search results for the word "Jew." A website known as &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_5" onclick="BLOG_clickHandler(this)"&gt;JewWatch&lt;/span&gt;.com, an offensive and inflammatory collection of antisemitic content, had appeared as the number-one result at Google.com for this one-word query. This happened because a large number of supporters of this site tended to build links to it; then, those who were outraged or amused also linked to it within their protestations. In the end, the social algorithms at Google recognized how popular (i.e., "linked to") this site was, and in response rated it very highly -- in fact, rated it first -- compared to all other websites with the word "Jew" in the title. Eventually, those who were enraged by this content fought back by asking as many people as possible to link somewhere else -- specifically, the &lt;a href="http://en.wikipedia.org/wiki/Jew"&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_6" onclick="BLOG_clickHandler(this)"&gt;Wikipedia&lt;/span&gt; definition of Jew&lt;/a&gt; -- just as I have here. Over time, more people linked to &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_7" onclick="BLOG_clickHandler(this)"&gt;Wikipedia&lt;/span&gt; than to &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_8" onclick="BLOG_clickHandler(this)"&gt;JewWatch&lt;/span&gt;, and so the latter dropped into second place at Google. This process of building networks of links in order to influence &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_9" onclick="BLOG_clickHandler(this)"&gt;Google's&lt;/span&gt; social algorithm is called "Google bombing." 
In other words, when the people who hated the site acted together in a large group, &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_10" onclick="BLOG_clickHandler(this)"&gt;Google's&lt;/span&gt; social algorithms responded.&lt;/p&gt;&lt;p&gt;(By the way, you'll notice that I do not create a link to the offensive site. I see no reason to contribute to its success.)&lt;/p&gt;&lt;p&gt;Do you see the problem? The success of Google bombing is analogous to the squeaky wheel metaphor, that the loudest complainer gets the best service. Social algorithms reward the most popular, regardless of whether they deserve it. &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_11" onclick="BLOG_clickHandler(this)"&gt;JewWatch&lt;/span&gt; made it to the top because it was popular first; &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_12" onclick="BLOG_clickHandler(this)"&gt;Wikipedia's&lt;/span&gt; definition moved to the top because those offended banded together to demonstrate even more loudly. And in the end, there's no reason for me to think either of these links is best.&lt;/p&gt;&lt;p&gt;Whether popularity is a good thing or a bad thing is often subjective. In language, some people lament the existence of the word &lt;em&gt;ain't,&lt;/em&gt; while others applaud its existence as an inevitable sign of change; either way, the word is showing up in our dictionaries because more and more people are using it. But I'm not talking about language; I'm talking about truth. &lt;/p&gt;&lt;p&gt;Do you think vitamin C is good at preventing colds? Well, it isn't; there have been no studies demonstrating its effectiveness, but there have been studies that show it makes no real difference. (It's believed that vitamin C will shorten the length of a cold, but studies are still inconclusive.) 
But after a doctor popularized the idea of vitamin &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_13" onclick="BLOG_clickHandler(this)"&gt;megadosing&lt;/span&gt;, our entire culture suddenly believes taking the vitamin will keep you extra healthy. Untrue.&lt;/p&gt;&lt;p&gt;Do you know why "ham and eggs" is considered a typical American breakfast? Because an advertising executive in the pork industry used Freudian psychology to convince people to eat ham for breakfast. He did it by asking American doctors if they thought hearty breakfasts were a good thing (which they did); the ad-man then asked if ham were a hearty food. Voila: ham, sausage, and bacon are American breakfast staples, and the continental breakfast vanished from our culture.&lt;/p&gt;&lt;p&gt;In both of these examples, majority belief trumps the truth. And look at the arguments about global warming! I won't repeat the arguments laid out by Al Gore in &lt;em&gt;&lt;a href="http://www.climatecrisis.net/"&gt;An Inconvenient Truth&lt;/a&gt;,&lt;/em&gt; but his argument is that as long as enough people insist that global warming isn't true, its dangers will remain unheeded. In fact, I'm not even going to argue here whether global warming is a real thing or not; it doesn't matter what I believe. What matters is that the debate over global warming isn't a fight over the facts. Instead, it's a shouting match, in which the majority wins. Right now, so many influential people have argued that it doesn't exist (or isn't such a big deal) that very little has been done in this country in response to its possible existence. But as more and more people start to believe it's at least possible, it's becoming a reality. Doesn't that just drive you nuts? Why are the facts behind global warming driven by democracy? 
&lt;em&gt;Can't something be true even if no one believes in it?&lt;/em&gt;&lt;/p&gt;&lt;p&gt;One last look at this "majority rules" concept, only this time let's avoid politics and focus on simple word spelling. If you search for the word &lt;em&gt;millennium,&lt;/em&gt; correctly spelled with two &lt;em&gt;L&lt;/em&gt;s and two &lt;em&gt;N&lt;/em&gt;s, you'll get about 54 million hits at Google (English-language pages only). If you search for the word &lt;em&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_14" onclick="BLOG_clickHandler(this)"&gt;millenium&lt;/span&gt;,&lt;/em&gt; misspelled with two &lt;em&gt;L&lt;/em&gt;s and only one &lt;em&gt;N,&lt;/em&gt; you'll get 18 million hits. The misspelling accounts for twenty-five percent of the combined hits! For content that's published, that by its very nature is biased toward having only correct spellings, this error rate is monstrous! But does Google let you know that &lt;em&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_15" onclick="BLOG_clickHandler(this)"&gt;millenium&lt;/span&gt; &lt;/em&gt;is misspelled? Does it ask you if you "meant to type &lt;em&gt;millennium&lt;/em&gt;?" No! 
After all, Google considers the misspelled word correct.&lt;/p&gt;&lt;p&gt;I mean, eighteen million people can't be wrong, right?&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-2700829619447235045?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/2700829619447235045/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=2700829619447235045&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/2700829619447235045'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/2700829619447235045'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/2006/12/eighteen-million-people-cant-be-wrong_28.html' title='Eighteen million people can&apos;t be wrong'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='06392069473165889148'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22121434.post-5936954812765437386</id><published>2006-12-24T15:33:00.000-05:00</published><updated>2008-01-02T20:11:15.962-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='search engines'/><category scheme='http://www.blogger.com/atom/ns#' term='keywording'/><title type='text'>I met a famous indexer the other day</title><content type='html'>In my &lt;a href="http://maislin.blogspot.com/2006/03/frustrated-by-lack-of-meaning.html"&gt;March 20 post ("Frustrated by a lack of meaning")&lt;/a&gt;, I made reference 
to a Microsoft clip art mess that was quite public. The story is that the keyword "monkey bars" caused certain images to appear when someone searched for the word "monkey," and that these results were misinterpreted in a strongly negative way.&lt;br /&gt;&lt;br /&gt;Well, I met the indexer who actually wrote those keywords -- someone I've known for a long time -- and I have to say, there's something really cool about realizing that one of your good colleagues was behind that story. I also find it reassuring that the indexer is someone who really knows what she's doing, because it emphasizes just how far apart good indexing is from good search: smart people, dumb tools.&lt;br /&gt;&lt;br /&gt;For more on this subject, I recommend reading &lt;a href="http://www.amazon.com/Inmates-Are-Running-Asylum-Products/dp/0672326140/sr=8-1/qid=1166420052/ref=pd_bbs_sr_1/102-4093742-0908149?ie=UTF8&amp;amp;s=books"&gt;The Inmates Are Running the Asylum&lt;/a&gt;. The book is about computer programming in general, but the sentiment is dead on.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-5936954812765437386?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/5936954812765437386/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=5936954812765437386&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/5936954812765437386'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/5936954812765437386'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/2006/12/i-met-famous-indexer-other-day.html' title='I met a famous indexer the other 
day'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='06392069473165889148'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22121434.post-7116072763820819135</id><published>2006-12-24T13:58:00.000-05:00</published><updated>2006-12-24T14:04:23.914-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='indexing process'/><category scheme='http://www.blogger.com/atom/ns#' term='spamming and similar behaviors'/><category scheme='http://www.blogger.com/atom/ns#' term='search engines'/><category scheme='http://www.blogger.com/atom/ns#' term='web indexing'/><category scheme='http://www.blogger.com/atom/ns#' term='keywording'/><title type='text'>A needle in a haystack with 100,000,000 blades</title><content type='html'>The Internet has more than 100 million websites, according to the &lt;a href="http://news.netcraft.com/archives/2006/11/01/november_2006_web_server_survey.html"&gt;November Netcraft survey&lt;/a&gt;. If you were standing on top of the growth curve, by now your stomach would have nothing left to vomit up.&lt;br /&gt;&lt;br /&gt;I did some math, and I've figured out a way to make sure that all of these websites are indexed. Here's what I discovered.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Between October and November 2006, approximately 3.5 million sites were created. Assuming that my team would be responsible for inventing a set of keywords for the whole site -- and not for individual pages or parts of pages -- we would have to build 3.5 million keyword sets.&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Let's further assume that on average, every website would have four keywords or key phrases. 
For example, this blog would get the keywords "Seth Maislin," "indexing," "blog," and perhaps my company name, "Focus Information Services." Ideally we'd have the time to invent many more, since it's our goal to help the website perform well at the various search engines, but this team simply can't give everyone special attention. So I'm making the executive decision to limit ourselves to creating 4 terms each for 3.5 million sites.&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Assuming that we can invent and type one keyword every two seconds -- a conservative estimate, given that my company name takes me a minimum of two seconds to type -- we'll need 28 million seconds to get the job done.&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Now remember, we're just talking about the new sites created in October 2006. Consequently, we have only a month to get the job done before we have to start indexing the November 2006 sites. For this reason, I'm going to build a team of several people, with each one putting in eight hours per day, twenty days each month. That's 576,000 seconds per person per month.&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Dividing 28,000,000 seconds per month by 576,000 seconds per person per month gives me 48.6 people, which I'll round to a nice 50 people. That means I need a team of just 50 people to get the job done.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;So there you go: a team of 50 people can index the Internet. That doesn't sound nearly as bad as I thought. Of course, everyone will have to type rather quickly, and we'll need a system in place to prevent us from accidentally indexing any one website more than once, but that shouldn't be too bad. 
And yes, I'm assuming that all of these websites are in English, but most of them are; I'll bring a few translators to work on the few remaining.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;At U.S.$50,000 per year per indexer, which is quite modest for a highly intense round-the-clock job like this, plus $100,000 for me as manager, I could probably put together a bid of about $350,000/year to get the job done. Given how many billions of dollars are spent or exchanged over the Internet today, that seems quite reasonable, too. Heck, I should triple the whole thing, since we'd have to re-index the old sites every once in a while. Maybe I should double it again, too, so we'd be allowed to use eight keywords instead of four.&lt;br /&gt;&lt;br /&gt;So let's see, that brings the total bill to $2.1 million. Gosh, that isn't bad at all, is it? I mean, we all agree that indexing the Internet is at least a two-million-dollar-per-year business, right?&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Except it's not. &lt;/strong&gt;Indexing the Internet is a &lt;strong&gt;zero&lt;/strong&gt;-dollar-per-year business. No one is doing it. Just about no one seems to care about quality keywords. In fact, there are only two industries that exist around keyword creation. One of them is misnamed "search optimization," which is about spamming the heck out of the Web. Optimize, I think not: this is the &lt;em&gt;opposite &lt;/em&gt;of the intelligent product my team would build. The other business is the search business itself, companies springing up around those fancy algorithms that Google, Yahoo, Lycos, Ask Jeeves, and the rest use. The thing is, those algorithms are just word-matching machines. These engines are looking for keywords, but none of them is actually writing any. So you see, no one with indexing training is writing any keywords. 
The inexpensive market for human indexers is being completely overlooked.&lt;br /&gt;&lt;br /&gt;Guess it's not worth the two million.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-7116072763820819135?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/7116072763820819135/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=7116072763820819135&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/7116072763820819135'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/7116072763820819135'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/2006/12/needle-in-haystack-with-100000000.html' title='A needle in a haystack with 100,000,000 blades'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='06392069473165889148'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22121434.post-1101083982270892672</id><published>2006-12-18T15:31:00.000-05:00</published><updated>2006-12-18T15:35:11.841-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='indexing tools'/><title type='text'>"Can I Delete All My ___ Entries in MS Word?"</title><content type='html'>(This is the newest entry in my &lt;a href="http://taxonomist.tripod.com/indexing/wordproblems.html"&gt;MS Word Indexing FAQ&lt;/a&gt;.)&lt;br /&gt;&lt;br /&gt;Every now and then, there's nothing you want to do more than globally delete a 
bunch of entries. The problem is how this is supposed to happen. For example, suppose you have a common main entry for "publicity," when you decide that you're better off with a cross reference like "publicity. &lt;em&gt;See&lt;/em&gt; marketing." In addition to creating this cross reference, you need to &lt;em&gt;remove&lt;/em&gt; all of your original &lt;em&gt;publicity&lt;/em&gt; entries. Although you can search for marker text, you can't search for whole markers. In other words, you can search for the word "publicity" when it's used within index markers (look for hidden text), but you can't search for a whole marker like {XE "publicity"} or {XE "publicity:methods for"}. For this reason you can't simply search globally and delete.&lt;br /&gt;&lt;br /&gt;The easiest approach to deleting all &lt;em&gt;publicity&lt;/em&gt; entries is the manual approach: generate your index, then delete everything that starts with the word &lt;em&gt;publicity.&lt;/em&gt; Unfortunately, manual edits will be undone as soon as you generate the index again; you'll have to remember that you want to make these manual changes every time you create a new version of the index. To help you remember to make these manual changes, I recommend changing the format and/or language for the word &lt;em&gt;publicity&lt;/em&gt; to make sure it jumps out at you. Search for &lt;span style="font-family:courier new;"&gt;XE "publicity&lt;/span&gt;, the unique text for all publicity entries, and replace it with boldface, all caps, and a shocking color like red. I also recommend that you replace the word &lt;em&gt;publicity&lt;/em&gt; with something that will sort at the very beginning of your index, such as &lt;em&gt;aaa DELETE ME.&lt;/em&gt; Now, when you generate your index, you'll see some red, boldface, all-caps reminder at the top of your index file. 
Hopefully this will be enough for you to remember to delete your entries.&lt;br /&gt;&lt;br /&gt;Another approach, and by far the one I prefer, is to replace the marker syntax with something that Word can't interpret. Instead of using the letters XE in your marker, use something like DELETE_ME. In other words, globally replace &lt;span style="font-family:courier new;"&gt;XE "publicity&lt;/span&gt; with &lt;span style="font-family:courier new;"&gt;DELETE_ME "publicity&lt;/span&gt;. Since markers are hidden text, your DELETE_ME markers will remain hidden from publications; further, they'll fail to become index entries since Word won't interpret them as XE markers. The biggest advantage to this method is that it works globally, and you only have to make these changes once. Another advantage is that you aren't actually deleting the entry, just rewriting it; if for any reason you need to reconstruct entries, you can always change DELETE_ME to XE. (This is a kludgy way of creating conditional text, but it might be just what you need.) The disadvantage is that you're not actually deleting anything, potentially cluttering your documentation.&lt;br /&gt;&lt;br /&gt;As a side note, whenever you remove an entry from your index, remember that you have to delete any cross references that target those now-removed entries. For example, if you replace your &lt;em&gt;publicity&lt;/em&gt; entries with "publicity. &lt;em&gt;See&lt;/em&gt; marketing," you'll need to rewrite or delete entries like "public relations. 
&lt;em&gt;See also&lt;/em&gt; publicity."&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-1101083982270892672?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/1101083982270892672/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=1101083982270892672&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/1101083982270892672'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/1101083982270892672'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/2006/12/can-i-delete-all-my-entries-in-ms-word.html' title='&quot;Can I Delete All My ___ Entries in MS Word?&quot;'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='06392069473165889148'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22121434.post-4697293818207474156</id><published>2006-12-16T21:27:00.000-05:00</published><updated>2006-12-16T22:23:27.361-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='future of indexing'/><category scheme='http://www.blogger.com/atom/ns#' term='American Society of Indexing'/><title type='text'>ASI President's Letter (December 2006)</title><content type='html'>Below are the first few paragraphs of my letter as ASI president, published in &lt;em&gt;Key Words.&lt;/em&gt; The full letter is available in the December 2006 issue of the bulletin, available to ASI members at the &lt;a 
href="http://www.asindexing.org"&gt;ASI website&lt;/a&gt;, as well as &lt;a href="mailto:president@asindexing.org"&gt;by request&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:verdana;font-size:85%;"&gt;&lt;strong&gt;ASI: Prospective and Retrospective, a Presidential Perspective&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;At the end of this year, I’ll write a letter to “Seth of 2008.”&lt;br /&gt;&lt;br /&gt;For a number of years I’ve been sending snapshots of my life to future “selves,” capturing a year’s events, achievements, and desires onto a couple of pages. Even though I’m writing to myself, however, I’m trying to communicate with versions of me that don’t yet exist. Who will I be in 2008? Why will I want to know about today’s “me”? What about the Seth of 2014? So, after warming up my pen with details about family, house, job, art, and health, I inevitably get to the tough stuff: ambitions, anxieties, hopes, and disappointments. There’s an irony to the whole thing, knowing I’ll be reading the letter with perfect hindsight. It’s an incentive to improve every year.&lt;br /&gt;&lt;br /&gt;ASI’s strategic plan is just such a letter. With its many strategies and priorities, we’re informing our future society about some critical information. Our members have shared with us a vision in which indexing will be recognized and respected more; to reach this vision we’ll have to look critically at who we are, now and soon. With the hindsight we’ll have in 2008 (and 2010 and 2014), I don’t want us to feel nostalgic when we look back. I want us to feel successful. I want us to be glad that we live in better times.&lt;br /&gt;&lt;br /&gt;The conflict between the needs of the immediate and our goals for the future is real. To function as a society, we need people in charge of what’s happening right now, as well as people in charge of what’s happening in the future. 
Week to week, ASI manages a long stream of important details: chapter name changes, SIG formations, PR construction, training course materials, administrative shifts, the Philadelphia conference, membership drives, and so on. The board gets a few dozen reports from committees, fourteen chapters, SIGs, and task forces. This is the “ASI of 2006,” focused on bylaws and meetings and content development.&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-4697293818207474156?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/4697293818207474156/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=4697293818207474156&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/4697293818207474156'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/4697293818207474156'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/2006/12/asi-presidents-letter-december-2006.html' title='ASI President&apos;s Letter (December 2006)'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='06392069473165889148'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22121434.post-3212611517647692189</id><published>2006-12-16T21:17:00.000-05:00</published><updated>2006-12-18T00:07:18.470-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='information architecture process'/><title type='text'>Seth, a 
*different* enabler</title><content type='html'>Most coincidentally, taxonomist Seth &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_0" onclick="BLOG_clickHandler(this)"&gt;Earley&lt;/span&gt; wrote about the enabling process in his own blog, "Not Otherwise Categorized...." Of course, his entry is less comic, more professional, and thus more meaningful than mine (12 Dec 2006), so it would be a shame for me not to reference it.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://sethearley.wordpress.com/2006/11/07/just-tell-me-the-answer-the-challenge-of-stakeholder-engagement/"&gt;"Just tell me the answer" (blog entry, 7 Nov 2006)&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;There are more enabling &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_1" onclick="BLOG_clickHandler(this)"&gt;Seths&lt;/span&gt; out there than you might have noticed at first.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-3212611517647692189?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/3212611517647692189/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=3212611517647692189&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/3212611517647692189'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/3212611517647692189'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/2006/12/seth-different-enabler.html' title='Seth, a *different* enabler'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' 
value='06392069473165889148'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22121434.post-116598657752435047</id><published>2006-12-12T23:56:00.000-05:00</published><updated>2006-12-18T00:07:45.994-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='information architecture process'/><title type='text'>Seth, the enabler</title><content type='html'>I rediscovered the two roles that a consultant can play in business. He can step forward, propose and perhaps implement a solution all by his lonesome, and then walk away; or he can listen quietly to what everyone has to say, push and prod in strategic ways, and get everyone else to do the work for him.&lt;br /&gt;&lt;br /&gt;Here's an analogy. Suppose a friend who needs a resume approaches you for help. "Would you write me a resume?" she asks. One approach is to say "yes," ask a couple of questions, and then crank out a complete resume. Handing it to her you say, "Go ahead and make some edits, if you want." There are some wonderful advantages to this process: you get to work on your own, on your own terms, and for a very short period of time. On the other hand, what you're really supposed to do is sit down with your friend and say, "Well, what have you got so far?" Then you ask all sorts of clever questions like, "What do you think you do best?" and "What kind of job do you think you want?" She answers these questions, and as you nod wisely, you tell her to write all that stuff down.&lt;br /&gt;&lt;br /&gt;The greatest part about being an enabler is that you never have to make a decision at all. You're a Freudian psychologist asking all sorts of provocative questions, getting paid by the hour to watch someone else do all the work. 
The better they do, the better you look.&lt;br /&gt;&lt;br /&gt;I've discovered that being an enabler is the smartest, most lucrative, and most effective way to be a consultant -- but the fact that I never have to make a decision is very interesting. "What do you think? How would you do this? Do you think this would work? Before tomorrow, see if Joe agrees." I'm amazed at the power these kinds of questions have.&lt;br /&gt;&lt;br /&gt;Ask yourself how much enabling you do in your job. I'm starting to realize that helping people do things on their own is more rewarding than doing it myself. Frankly I'm unnerved by this. This wasn't at all what I learned in engineering school.&lt;br /&gt;&lt;br /&gt;But everything I've read says that this is now the right way to do this. Decisions made by people who don't actually use the system are less likely to succeed. Evidence-based practice is about moving forward not on what you think, but on what you know, such as from testing. So yes, it's about asking the right questions, and not about what you know. In fact, psychologists who &lt;em&gt;do &lt;/em&gt;know the answer have to play dumb if they're to succeed.&lt;br /&gt;&lt;br /&gt;If you had told me years ago that the subject matter experts are far less valuable than the subject matter dunces, I'd have said you were full of... what's the word? 
(I trust your opinion.)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-116598657752435047?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/116598657752435047/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=116598657752435047&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/116598657752435047'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/116598657752435047'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/2006/12/seth-enabler.html' title='Seth, the enabler'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='06392069473165889148'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-22121434.post-116598577231448937</id><published>2006-12-12T23:41:00.000-05:00</published><updated>2006-12-18T00:09:15.000-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='indexing process'/><category scheme='http://www.blogger.com/atom/ns#' term='cataloguing'/><category scheme='http://www.blogger.com/atom/ns#' term='embedded indexing'/><category scheme='http://www.blogger.com/atom/ns#' term='keywording'/><title type='text'>Indexing moving content</title><content type='html'>Has it been three months? Almost!&lt;br /&gt;&lt;br /&gt;Fact is, the world has a way of throwing curve balls on a regular basis. 
For me, those curve balls include a family-wide influenza epidemic, teething babies, travel plans, and the like. Trying to keep a grip on life is like trying to catch fish with your hands.&lt;br /&gt;&lt;br /&gt;Tonight I give a presentation about trying to index moving targets. I was surprised to discover that of all the presentations I've ever given, this was absolutely the hardest to write. In fact, I just finished a few minutes ago. I've taught three-day classes, with eight hours of material on each day, but this 45-minute presentation really stymied me. There are two reasons for this.&lt;br /&gt;&lt;br /&gt;First, trying to index moving content is, no matter what, a mess. The simplest example of a problem is creating an index entry like "software development, 111-121," and then finding out that pages 111 and 121 have moved respectively to pages 113 and 123. With standalone indexing (where you type in the page numbers), the only real way to fix this is manually: go back and rewrite all your page numbers. It's a MESS. So here I am, hoping to provide some tips to indexers and technical writers, something to help them avoid these kinds of corrections -- only to realize that there's no good answer. (A bad answer is to not index at all. :-)&lt;br /&gt;&lt;br /&gt;The second problem is that even if I did have a list of useful tools, they don't make for interesting presentation materials. The first draft of my presentation would have resembled a public reading of the weather report for every American city, in alphabetical order: if you're lucky, you're interested in Albuquerque and Atlanta and can walk out early.&lt;br /&gt;&lt;br /&gt;The fact is, our growing reliance on live and custom information is wreaking havoc on the indexing world. It's becoming harder and harder to collate information in relevant chunks. 
Search will never do it; even if there were human beings out there developing controlled vocabularies, full-text search still retrieves a tremendous amount of flotsam. But creating keywords for something that won't live an hour seems kind of pointless, too. We're all just pounding sand.&lt;br /&gt;&lt;br /&gt;I'm looking forward to what the participants have to say. Must we accept the false imprisonment of uncatalogued real-time information flow, or will writers finally catch on that indexers have an important role on the creation side as well?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/22121434-116598577231448937?l=maislin.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://maislin.blogspot.com/feeds/116598577231448937/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=22121434&amp;postID=116598577231448937&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/116598577231448937'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/22121434/posts/default/116598577231448937'/><link rel='alternate' type='text/html' href='http://maislin.blogspot.com/2006/12/indexing-moving-content.html' title='Indexing moving content'/><author><name>taxonomist</name><uri>http://www.blogger.com/profile/11832913832836400039</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='06392069473165889148'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry></feed>