LibraryThings

If it once took a special type of person to be a library cataloguer — one comfortable in back offices & around heavy rule books, methodical, perhaps quiet — now everyone wants to get in on the action. The rise of self-cataloguing has been one of the more inexorable effects of digital media. The discovery within cataloguing of social connections now seems to be another.

Of course, long before all this web stuff we were being trained to collect content in various forms, and to value assemblages as inherent identifiers of taste. Siva Vaidhyanathan’s recent presentation at Columbia’s Correcting Course forum carries this age-old ritual into my lifetime; he talks about a mass paperback industry that marketed (unread? unreadable?) books as class identification… VHS tapes marooned on shelves, monuments to their owners’ cinematic pleasures… the fine art of mixed tapes, now supplanted of course by playlists….

The fetishistic Mac application Delicious Library wraps a collection database into a pretty package so… so… well, so you can have a virtual representation of all your books — all your video games — all your DVDs, right on the hard drive of your computer. Scan the item’s UPC barcode with a webcam, and presto, metadata from Amazon flies right into your own library database — including cover art. Awesome, right?

Ok it’s actually fairly purposeless. You can assign items ratings, and you can designate their location in actual space, but I doubt many are actually relying on Delicious Library to find stuff. If you lend out an item to a friend, you can track it with DL — but really, if you’re lending out more than you can remember & your friends can’t be trusted to return things, well, maybe a policy change is in order. And DL’s symbiosis with Amazon’s API is worrisome — Amazon-hosted One-Click Shopping recommendations are just a click away.

But describe Delicious Library to someone, and it’s possible that they’ll turn cataloguer right in front of your eyes: huh, my things in a database….

Delicious Library

^ Finding Nemo and other treasures: virtual shelving in Delicious Library

LibraryThing — straight outta Portland Maine, btw — is a web app significantly tastier than its desktop cousin because it networks people’s collections. LibraryThing still invites you to play with representations of your books on virtual shelves for yourself — but now you’re doing your assembling among & amid a myriad of intersecting libraries. Now metadata is up for grabs, unregulated by Amazon or any other detached entity: social tagging comes to the fore. You can hear the 2.0 pitch — it’s del.icio.us for books! — and lo, tagging abounds.

But just around books — LibraryThing valiantly resists the siren call of other media in favor of bibliomania. It links its bibliographic records to OCLC’s Find a Library as well as Amazon and library OPACs via the good old Z39.50 client-server protocol, and hosts discussion of titles among those who share them in their libraries.

In short, if you love books, LibraryThing seems an unrigged communal playpen, as well as a self-inventory tool. It provides branching recommendations based on mutual ownership, not Amazonian purchases. It presents clouds of a book’s common tags unseeded by commerce. It offers RSS subscriptions for any given tag, so you can track books, as collections rather than products, as they come in.
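For the curious, here is a rough sketch of what "recommendations based on mutual ownership" could look like. It is my own guess at the idea, not LibraryThing's actual method: the function name, the sample libraries, and the scoring rule (weight each unowned book by how many titles you share with its owner) are all assumptions made up for illustration.

```python
from collections import Counter


def recommend(libraries, member, top_n=3):
    """Rank books the member lacks by overlap with the libraries that hold them.

    `libraries` maps a member name to the set of titles they own.
    The scoring rule is an illustrative assumption, not a known algorithm.
    """
    own = libraries[member]
    scores = Counter()
    for other, books in libraries.items():
        if other == member:
            continue
        shared = len(own & books)  # titles both members own
        if shared == 0:
            continue  # no common ground, so no vote
        for book in books - own:
            scores[book] += shared
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, _ in ranked][:top_n]


# Hypothetical sample data.
libraries = {
    "ann": {"Tristram Shandy", "Rasselas", "Pale Fire"},
    "bob": {"Tristram Shandy", "Pale Fire", "Ulysses"},
    "cat": {"Rasselas", "Middlemarch"},
}
print(recommend(libraries, "ann"))
```

Nothing in the loop knows what anyone bought or where; the only signal is which shelves overlap.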

LibraryThing Screenshot

^Adding to my library in LibraryThing: I enter in a title, and LT checks it against a bibliographic database of my choosing. And I choose LC! No snappy webcam scan, alas, though barcodes are acceptable identifiers.

LibraryThing screenshot

^Now that I’ve added my book to LibraryThing, I can see how others have tagged and rated it. Looks like some people don’t care for literary theory, and yet they own the book. Go figure. This title hasn’t been reviewed yet in LibraryThing, but many have.

LibraryThing screenshot

^My so-far small library (the books on my desk right now).

Most intriguing of all, LibraryThing has recently added Library of Congress subjects into the mix. The premise is that user-created tags can coexist with library-tended subject headings, that folksonomy can play off of controlled hierarchy. At times, tags and subject headers coincide. In other instances, they hardly ever do. LibraryThing has only just embarked on this odd tango, and who knows where it will lead — but at the very least it should generate some intriguing friction.

LibraryThing screenshot

^Exploring the tag “literary theory” on LibraryThing. I see heavy users of this tag, works most often tagged by the term, and the latest books into the system so tagged (and I can subscribe to the tag via RSS). I also see related LC Subject Headings, in case I feel like faceted browsing.

Already user-tags are sitting up a little straighter and paying more attention to themselves. Discussion on LibraryThing’s metablog, Thingology, has been spurred by subject headings to characterize — dare I say categorize — tags. Discussants find tags to fall into recognizable camps: personal location notes (“living room,” “office”), personal use tags (“read,” “damaged,” “study”), broadcast opinion tags (“excellent,” “lame”), and personal subject tags (anything in the uncontrolled descriptive universe). The haphazard felicities of user-tag surfing are getting measured right up against the precision of subject headings.

All this driven by Tim Spalding, a web developer, not a librarian. Or is he? Should we settle for patron?

The means of conception

Nothing odd will do long. ‘Tristram Shandy’ did not last.
- Samuel Johnson

Wrong! — I gleefully thought, way back when I was slogging through an eighteenth-century literature class in college — bored silly by Johnson’s lumbering, moralizing, pseudo-Oriental Rasselas, and, in contrast, completely delighted by Laurence Sterne’s goofy carnival of the mind, Tristram Shandy. Wrong, you fat old authoritative Dr. Johnson, because here I am 220 years later savoring every Rabelaisian joke, every self-conscious pratfall, every typographic stunt of Tristram Shandy.

I had to admire the concision of the put-down, though. A quick slam of the sprawling, irresolute Shandy.

With the wisdom of age, I now am ready to concede that Johnson was half-right: nothing odd does “do” for long. Especially online. I’ll circle back to that emphasis in a moment — but first, let me submit that Tristram Shandy is far from odd, considered rightly. Part of the thrill of reading it in 1980-something *cough* was seeing evidence of postmodern friskiness that actually pre-dated the United States. Tristram’s obsessions stretched reflexivity back into exotically distant realms of bygone minutiae (unlike the broad cardboard exoticism of Johnson’s Happy Valley). It seems that then, as well as now(-ish), conceptions were improbable, resolutions impossible; the world teemed with distraction, neurosis, and disordered influence; and authors invited readers to play games.

In fact, if we glance back at a couple of Tristram’s more infamous tricks, we might feel that Sterne’s techniques are getting less odd by the day. When our author despairs at describing the concupiscible Widow Wadman, and throws open his pages to the reader (here’s paper ready to your hand. — Sit down, Sir, paint her to your own mind—as like your mistress as you can—and unlike your wife as your conscience will let you…) — is this not collaborative authoring space?

Tristram Shandy blank page

And when the narrator, picking up momentum by way of a vegitable [sic] diet, sits down and charts out the loopy plot lines of the novel as it’s progressed so far, even dropping in anchor points so we can check his graph against designated passages — is this not, however tongue-in-cheek, metadata visualization, or a mapping of information flow?

Tristram Shandy plotlines

L–d! said my mother, what is all this story about? —-
A COCK and a BULL , said Yorick —- And one of the best of its kind, I ever heard.

Indeed, and though I haven’t read it (which is to hear it) for, well, many years, Tristram sticks with me–probably because I prefer open concoction to moralistic bullying, especially when it comes to narration. And this preference has had currency for a long time; Tristram Shandy has lasted just fine.

Yet Johnson’s other snap judgment — nothing odd will do long — seems to me all the more true in the virtual places we increasingly come crowding for intelligence. Which is not to say that there aren’t odd things online — far from it — surf randomly, and the web seems a veritable cacophony of twaddle diddle, tweddle diddle, –twiddle diddle, —- twoddle diddle, –twuddle diddle, —- prut-trut — krish –krash — krush. Not to mention diddle diddle, diddle diddle, diddle diddle — hum — dum — drum.

But nothing odd does much online: you can park the most esoteric idiosyncratic wonderfully strange material on the web, but if you want it to get discovered, if you want it to work, if you want it to have an effect — if you want others to conceive of it (a favorite Shandyword) — then you must enter into common language and assumptions. This is so obvious it’s practically a truism — and yet see how many times we learn the lesson, how difficult it is to get out of our own heads.

Two quick, fairly pedestrian examples: John Kupersmith’s wonderful Library Terms that Users Understand shows how befuddled users can be by the simplest failure of librarians to realize that words like “Index” or “Database” or “Serial” can mean next to nothing to my Uncle Toby, just wanting to know where to find that Popular Mechanics article. Or let’s say you’ve given an OPAC a cute acronym and now you invite my Uncle Toby to “search EUNICE!” My poor uncle Toby blush’d.

Or have a look at Dan Cohen’s equally simple but solid advice about climbing up in Google ranks. Search engine optimization has its share of murk to it, but the basic path to visibility is: don’t be odd. Use a domain name that describes your resource (“chinook” or “aeoleus” sound great — but what are you airing?), use keywords in file names (with mod_rewrites, if necessary), get linked by highly linked sites (meaning, be understandable, and get understood by a widely understood site).

If this all sounds like it leads to a world as flat and predictable as, well, Johnson’s Rasselas, that’s not what I meant, not at all. It’s just that you can’t be *merely* odd or unique if you want to *do*: you need the sophistication to hook into conventional terms, general assumptions, broadly shared expectations. This involves a double-motion that might as well be called self-consciousness. Tristram’s greatness is showing us how fun such contrivance can be. Sterne earns his pleasure (and ours too, he’s brought us jolting right along with him) when he sits back to marvel at himself, his magnificently clashing agendas: By this contrivance the machinery of my work is of a species by itself; two contrary motions are introduced into it, and reconciled, which were thought to be at variance with each other. In a word, my work is digressive, and it is progressive too, — and at the same time.

If it were all digression, Johnson would have been completely right about Tristram Shandy. But it is progressive too, which means that it sobers up just enough to realize, despite its irrepressible uniqueness, that above all things in the world, ’tis one of the silliest things in one of them, to darken your hypothesis by placing a number of tall, opake words, one before another, in a right line, betwixt your own and your readers conception.

Hogarth's frontispiece to Tristram Shandy

Mining the machines

Last year at the ARL symposium called Managing Digital Assets, I smiled inwardly to think of the grumbling likely to be kicked off by observations such as this by Donald Waters of the Mellon Foundation:

…what unites our interest in digitization and open access in a digital world is that the material becomes ‘processable,’ or subject to computational processing. That is, the growth in the market of readers is not among groups of humans, but of machines, which are programmed to index, manipulate, mine, aggregate, decompose, and build up scholarly and other forms of content by algorithm. It is this machine ‘processability’ that makes digitized objects and open access materials most valuable to scholars.

Protest, fume, rail against the subjection of your most exquisitely developed thought to the dumb imperatives of ones and zeros — Waters is absolutely right. You want influence? Or, more to the point, you want to avoid obliteration in the vast digital swamp? You’d better know how to demarcate, classify, and optimize your work for machine crunching — or find someone who does. And pray that the stewards of such crunching, the information managers you never thought about, have your best interests in mind.

All this occurred to me while reading a new D-Lib piece by Daniel Cohen, director of research projects at the very creative Center for History and New Media at George Mason University. Cohen also spoke at that ARL session, and at the time he sold me on Firefox Scholar. His new article, “From Babel to Knowledge: Data Mining Large Digital Collections”, offers two nice examples of humanist-friendly manipulation of machine “processability.”

First: Syllabus Finder. Where was this godsend when I was inefficiently wandering around the chaff of the web, trying to crib ideas for my own syllabi? It’s a very sensible, very needed genre-based search tool. First, it defines “document classification” through a very simple dictionary of keywords endemic to syllabi (“assignment,” “office hours,” etc.). This classification is fed into Google through its API service, along with the search query, for optimized searches. The results can then be further refined through more automated analysis or combined with other search results.
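To make that recipe concrete, here is a minimal sketch of the keyword-dictionary idea, written by a non-programmer's proxy and not taken from Cohen's code. Only "assignment" and "office hours" come from his description; the other keywords, the function names, and the 0.5 threshold are my assumptions. It does not call any search API: it only builds the augmented query string and scores text you already have in hand.

```python
# Keyword dictionary for the "syllabus" genre. The first two entries are
# mentioned in the article; the rest are illustrative assumptions.
SYLLABUS_KEYWORDS = ("assignment", "office hours", "syllabus", "readings", "grading")


def build_query(topic, keywords=SYLLABUS_KEYWORDS, max_terms=3):
    """Append a few genre keywords to the topic to bias a search toward syllabi."""
    # Multi-word keywords are quoted so a search engine treats them as phrases.
    extras = " ".join(f'"{k}"' if " " in k else k for k in keywords[:max_terms])
    return f"{topic} {extras}"


def syllabus_score(text, keywords=SYLLABUS_KEYWORDS):
    """Return the fraction of dictionary keywords found in the text."""
    lowered = text.lower()
    hits = sum(1 for k in keywords if k in lowered)
    return hits / len(keywords)


def looks_like_syllabus(text, threshold=0.5):
    """Classify a document as a syllabus if enough keywords appear (assumed cutoff)."""
    return syllabus_score(text) >= threshold


page = "English 180: Romantic Poetry. Office hours Tues 2-4. Assignment 1 due week 3. Grading policy follows."
print(build_query("Byron"))
print(looks_like_syllabus(page))
```

A real tool would presumably weight keywords and look at more than substring matches, but even this crude counting shows why a genre with predictable vocabulary is easy prey for a machine.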

I gave it a spin, using canonical writers from the Romantic era as search terms. To my happy surprise, good old Ashes Sparks & Hypertext, a six year old syllabus for a seminar I taught back in the day at UC Berkeley, kept showing up — and at or near the top of results. #1 for Coleridge, #2 for Byron, #1 for Wordsworth, #2 for Blake, #4 for Hemans. Yeah, baby! But we drop down to #14 for Keats, alas, and as for Shelley, he just kept coming up as a “fatal error,” an “Uncaught SoapFault exception.” So Syllabus Finder is a little buggy — but, dare we say it, a little poetic too. Maybe we’re just overly pleased by taking the silver for Byron:

Ashes Sparks is the second syllabus listed for Byron

I don’t know what to make of the way this tool seems to like the Ashes Sparks syllabus — certainly I indulged in no optimization — no thought about how the thing would be retrieved. The only distinguishing feature of that document, really, is that it’s been online steadily for six years. It’s just one of those Google-blessed mysteries. Perhaps cannier post-processing could promote syllabi more deserving of prominence. But Syllabus Finder works pretty well–I’d recommend it to a fledgling (and not-so-fledgling) instructor. As Cohen puts it, it does a surprisingly good job at achieving its modest goal – on most topics for every ten documents it retrieves, about nine are syllabi – and it has thus far found and catalogued over 600,000 syllabi, synthesizing a collection of course materials considerably larger than any created or maintained by a professional organization, educational institution, or library, or by any other effort on the web to aggregate syllabi.

A second and more complex treat today from the George Mason wizards: H-Bot. This is an automated historical fact finder that can field natural language queries. (Or at least ones that begin with ‘what’ or ‘when’ or ‘who’; it’s not ready to handle where, which, how, or why). The algorithm here is “question answering” — which involves the identification of relevant documents, some natural language processing (to interpret queries), and statistical/linguistic analysis of retrieved documents. (In addition to the D-Lib article, there’s more on H-bot here)
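Here is a toy version of that pipeline, offered as a sketch under loud assumptions rather than a description of H-Bot's internals. The only details drawn from the description above are the three supported question words. The "statistical analysis" step is stood in for by a simple vote (the year mentioned most often in the retrieved text wins), and the snippets are invented sample data.

```python
import re
from collections import Counter

# Question words the tool is said to handle.
SUPPORTED = ("what", "when", "who")

# Four-digit years from 1000 to 2099; an assumed, deliberately crude pattern.
YEAR = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")


def question_type(question):
    """Return the leading question word if supported, else None."""
    words = question.strip().lower().split()
    if not words:
        return None
    first = words[0].strip("?,.!")
    return first if first in SUPPORTED else None


def answer_when(snippets):
    """Return the year mentioned most often across the snippets, or None."""
    counts = Counter(year for s in snippets for year in YEAR.findall(s))
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical retrieved snippets for "When did Hitler die?"
snippets = [
    "Hitler died on April 30, 1945 in Berlin.",
    "Born in 1889, he died in 1945.",
    "The war in Europe ended in May 1945.",
]
print(question_type("When did Hitler die?"), answer_when(snippets))
```

A vote like this can only be as good as the pile of text it counts, which is worth keeping in mind when two different Gandhis turn up below.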

Playing with H-Bot is fun. When did Hitler die? The answer in an eyeblink, as the Germans say: April 30, 1945. When did Gandhi die? Here’s a quirk:

Fun with H-Bot

Well sure, but that wasn’t the Gandhi I meant. Interestingly, here’s what happens when I ask the same question but tell H-Bot not to “check trusted websites first”:

Fun with H-Bot

Here’s a case when the unfiltered swamp actually answered my question — or read my mind — better than “trusted websites.” Quantity over quality? Very sensibly, H-Bot demurs when I ask “Is God dead?” or “When did God die?” (“I’m sorry. I cannot provide any answer on that.”) But ask it “Who is God?” and H-Bot serves up a perky little answer:

Fun with H-Bot

Simple-minded? Sure. But viable. Arguments will rage, hairs will split, blood will spill, but our dumb machines have given us an efficient pulse of information in the midst of the cacophony, delivered by strategic sifting of great gobs of data.

Which brings us to a final point that Cohen makes about machine data-mining: “Quantity may make up for a lack of quality.” Even the most ardent humanist can’t deny: when it comes to information, we’ve got a whole lot of quantity these days. It’s how we draw from such quantity that counts.

Clipboards go social

Social bookmarking is swell, but suddenly it seems so limited, so 2005. Or so it seems to me after watching Dan Chudnov’s screencast unAPI and the Gates of the Dawn of Social Clipboards a couple of times. I can attest that it’ll get you thinking — even if, like me, your programming skills extend not much beyond the coffee maker.

You know about gates, you know about dawn, and you should know that APIs are blending web services in dynamic ways. unAPI (‘un’ pronounced as in “universal,” not as in ”poor Syd Barrett, he’s un’appy”) is, as the term might suggest, a simple website API convention that allows a broad array of services to be syndicated and harvested. This is a lightweight, generic tool, unlike an API tailor-made to a service (like, say, the GoogleMaps API). More on unAPI here. Now, for some hurried idea of how unAPI enables social clipboarding, get comfortable and spend some quality minutes with the dchud screencast:

D’ja get that? Social bookmarking = a straightjacketed social clipboard, in which we share only urls and tags. With something like unAPI, the straightjacket comes off, the information we share gets richer and more varied. Click, drag, and toss into the communal pot objects that are linked to full bibliographic metadata — toss even whole images in. Once, in order to share information on the web, you had to code in HTML and FTP your creation up to a server. Then, blogs, wikis, and various administration tools let you publish content through a web interface. Soon, it seems, you’ll be clicking and dragging web objects around directly. It’s a weird feeling: try it at a demo for Microsoft’s similar new experiment, Live Clipboard.

Chudnov’s emphasis on the new social possibilities of clipboards seems typical of 2.0 library services. My professional mission as a librarian is this: (he’s written) Help people build their own libraries. That’s it. That’s all I care about. Note the plural ‘people.’ If web objects can be readily swapped, studied, shared — if their harvesting and dissemination is conducted, from beginning to end, in networked spaces — it’s easier than ever to see that ‘collection’ is molting ever more into a publicly driven and defined activity.

Librarians once spent time carefully assembling web links for their patrons, and what an onerous job — one plagued by link rot, bedeviled by the fluidity of the web. Social bookmarking is a welcome alternative to the professedly authoritative link collection because it leverages a vast range of expertise, instinct, and attention, while allowing for discovery and customization. A 2.0 librarian (for lack of a better term) will do everything he can to promote this kind of activity.

Similarly, digital collections were once mounted in standalone boxes, and left gathered in a corner of a library website. Social clipboarding is 2.0 collection because, once again, it drags assets out into the pale sunshine of use and interchange. The 2.0 librarian will do everything she can to ensure that a digital collection is easily discovered, harvested, tagged, swapped around, recontextualized, re-collected, and (whenever legal) re-published.

Such decentralized, user-driven, unpredictable shuffling of digital assets might seem to diminish the role of your library. You need not go there, you need not apply there for access, you need not be cognizant of the dimensions of its actual collection. But look at what’s going on behind the scenes, in terms of programming, standardization of conventions, preservation and exposure of assets. And in front of the scenes, you can bet that librarians will evolve ever more into consultants, offering strategies for the successful customization and manipulation of information. If APIs start scattering assets of all sorts onto communally shared clipboards, ‘collection’ takes another step towards the need-based, on-the-fly assemblage of information transforming our world (dare we say) into one big library.

MySpace invaders

Music promoters, child molesters, and now this. Rupert Murdoch’s social networking colonization, MySpace, is starting to be infiltrated by yet another band of predators. They tend to be around ninety years old, and most of them claim to be female. That ‘friend’ your sullen teen is busily adding to her MySpace collection may be none other than… a library?

Now this is a little embarrassing. Like the PG-13 cheap laugh, when the spunky granny grabs the mic and roks da house. Or like Helen Gurley Brown. Hey, Westmont Public Library is with it! Who I’d like to meet: You :) Westmont Public Library’s Interests: Books, Graphic Novels, Magazines, Music, Movies, Video Games. Status: Single. Zodiac sign: Capricorn. (Why are many of these MyFriendly libraries Capricorns? As in Tropic of? Isn’t that a Graphic Novel?) MyFriendly libraries tend to have other libraries in their friendspace. So with one click, here we are at the Thomas Ford Memorial Library. Interests: General — helping people. instant messaging. RESEARCH yo! Books — the ones inside me. You go, Tom Ford! And Brooklyn College Library is in the house–or, as ‘she’ puts it, BC Library — Here on Your Space!

And check out that sassy 100-year-old, the Tonganoxie PL:

As an Xer happily removed from the MySpace generation (though my friends in bands almost dutifully keep pages there), I don’t really understand the appeal. The pages are ugly and ungainly; text can be impossible to pick out against garish image backgrounds, tinny sound files unspool the moment a page opens — it’s all reminiscent of wayback web hideousness, which all too often isn’t so wayback. Still, for better or worse, this is space that teens of all ages build. I guess it’s easy to share music, real-time flirtation, self-branding, endless LOLs. Mostly MySpace seems like high school online — full of chatter, hormones, and the pursuit of popularity. “It’s an unphysical way of hanging out.” Sure kid, great, but someday you’ll want better unphysical spaces. Tonight, at least, MySpace times out constantly. Hey Fox, buy some servers!

As they say, the kids love it; 46 million members just can’t be wrong, can they? Isn’t this democracy? And aren’t libraries at the core of democracy? At least these libraries are trying–but in MySpace they have little to offer, aside from a campy Hello! Nothing to build here, nothing to interact with or collect. To be fair: some libraries link to ‘blog’ entries, like, say, the one posted by Angela at the New Castle-Henry County Public Library listing teen movies, pizza taste-offs, and – spa night? Hmmm… Or the Tonganoxie Public Library’s list of their New Music CD Collection (topped off by Kelly Clarkson! Breakaway! LOL!!) But it’s very unidirectional. Information emanates from the ancient single female Capricorns to all you undifferentiated kids. The full extent of the idea is to show up In Your Extended Network.

This is piggybacking, really, on the idea of social software–just showing up when you should be interacting. Of course, just showing up to the party is a hoot when you’re 90. Links back to the OPAC, indexes of holdings, announcements of teen-centered activity: that’s fine, but how about the actual music? Can I bring library images or videos into MySpace? Can I build immediate links to cool passages of my favorite favorite favorite books? Can I make a montage out of those awesome graphic novels? How can I collect anything other than a thumbnail picture of the library–a cute little building facade to add to my friends collection? When libraries stop billboarding and start actually transforming themselves into MySpaces–then we’ll have something.

Well, it’s a first step, and Rome wasn’t built in a day–even MySpace wasn’t built in a day, though it might seem otherwise. Here’s the 100 year old Topeka and Shawnee County Public Library: Who I’d like to meet: Anyone! Really! Well, put it like that, & you might be irresistible. A bit pathetic, but, whatever, popular. Topeka and Shawnee County Public Library has, at this writing, 163 friends.

Mmashamashsmashh

Oh to have been a fly on the wall at the just-wrapped Mashup Camp – a fly safely high up on the wall, because a) I’m no programmer and would likely be in the way, and b) its ‘geek dating’ program – a frenetic dance of speed demos and the “law of two feet” – sounds downright dangerous.

But I would have loved to buzz with the buzz, because it’s clear that the proliferation of web applications and reusable APIs is causing an explosion of tinkering, playing, discovering. As Web 2.0 guru Dion Hinchcliffe puts it, The theory is that you can be much more valuable to the rest of the world if your software can be reused in unintended ways. In other words, don’t just provide a fully created end-product for one pre-intended use. Encourage others to use the good pieces of what you provide in new and innovative ways. And thus the torrent of new services cobbled together with bits of preexisting web services — some of which is tracked by Mashup Feed.

What can nontechnical endusers expect from all this mashing? More customized information and the power that goes with that, as data feeds get mixed for real-time information on weather, parking, airfare, restaurants, skiing, and general calamity.

A glance at David Schorr’s Weather Bonk confirms, at once, that the Mission is the only somewhat warm place in SF, and the GG Bridge is flowing pretty well at the moment:


Looking for more monetizable information? Flyspy is planning to bring to you a 30-day overview of airfares:

But no matter how clever or useful the mashup, it’s only as good as its datafeeds. Another mashup service, Cheap Gas, looks great until you notice that the gas prices you’re being quoted, contributed by ‘anonymous’ (maybe Eddy from Texaco down the street), date from last summer:

Such flashy inaccuracy is bound to make people who are in the business of reliable information — for example, librarians — nervous. Many mashups are anarchic sandboxes, and who knows what use your data will be put to or what company it will be keeping or to what ends it will be mashed (that’s the point).

As Tom Owad demonstrated a little while ago, pinpointing ‘subversive’ (yet acquisitive) persons is as easy as mashing up Amazon’s Wishlists with Yahoo People Search with Google Maps. Here’s a map of readers hoping someone buys them a shiny new copy of Orwell’s 1984:

And that’s all *legal* — just imagine what our government is up to.

Nevertheless, the rise of APIs may save libraries from the rusty chains of closed-box ILS packages, and allow them to dream up a range of new community-oriented services. Certainly we should be glad that programmers plugged into the potential of libraries, such as the Superpatron, were doing the monster mashup this week.

Scanning mashupfeed‘s indexes… here are some mashups that strike me as library-intriguing, with pasted descriptive blurbs (ie, I didn’t write ‘em, because I didn’t try ‘em all):

Using GoogleMaps API

  • Blosh Blosh finds blogs mentioning locations and displays them on a map.
  • Boston RSS Alley This map displays the locations of some of the companies and bloggers actively working with RSS in the Boston area.
  • Find the Landmark Test your knowledge of US landmarks with interactive, timer-based Google Maps game.
  • Flyr Search Flickr for geotagged photos and then plot them on a Google Map. Nice nested map-within-a-map.
  • GeoWorldNews The latest worldwide stories from the Washington Post plotted on a Google Maps satellite image.
  • Healthia Use the Healthia doctor search to find doctors in the United States. 800,000 doctors listed.
  • History Timeline Wiki A history plus geography wiki that allows readers to contribute items of historical interest and plot their locations. Initial dataset is US battles.
  • Libraries411 Find public libraries in the US and Canada. Data for more than 20,000 libraries available.
  • Maplandia Comprehensive searchable gazetteer based on Google Maps. Reference guide has full world coverage.
  • Placeopedia Geographically place Wikipedia articles on top of Google maps:

Amazon API

  • Albumart.org Uses the Amazon API and an Ajax-style UI to retrieve CD/DVD covers from the Amazon catalog.
  • O’Reilly Book Page Mashup of Backpack and Amazon.com APIs to generate Backpack pages with Amazon.com book data.

Flickr API

  • flickr graph Social network visualization using Flickr API:

  • Flickr Related Tag Browser Search and visualization tool that lets you surf Flickr’s tag space. Flickr tags are keywords used to classify images. Related tags shown based on clustered usage analysis.
  • Flickrscape Enter a word and watch the flickr photo stream. Click to interrupt stream and try another word.
  • geobloggers Google Maps + Flickr photos. It also consumes del.icio.us for geotagged bookmarks and the Upcoming.org for US events, which it then geocodes.

del.icio.us API

  • Delancey This nice del.icio.us enhancement allows you to see which of your del.icio.us bookmarks are used most frequently.
  • thumblicious Use thumblicious to quickly preview the most popular sites bookmarked on del.icio.us via thumbnail screenshots.

Google API

  • Copyscape A website plagiarism search tool that uses the Google Search API.
  • DoubleTrust Shows the best search results from both Google and Yahoo in a new way. Also allows user to alter his trust in either engine to bias combined rankings.
  • QTSaver Uses Google and Yahoo APIs to extract microcontent from multiple sites and allows you to rearrange the excerpts.
  • SpellWeb Compares relative popularity of spellings or concepts based on web frequency. An experiment in sidesifting the Web for useful patterns of information:

You get the idea… you probably get a thousand ideas. That’s the problem with mashups — too many ideas, too many variously commercial or incomplete datastreams, too much sheer buzz. But quickly, perhaps within a fly’s lifespan, your library may truly catch on.

Sticking around

Check out what’s new at that flagship of Library 2.0-ness — the plugged-in to plug-ins, blessed by superpatrons, interactively inventive Ann Arbor District Library: card catalogs!

Remember card catalogs? If you do, you’ll remember that uniquely tactile experience: the sliding out, the flipping through, the red-ink-mandated cross referencing, the peering & copying & replacing. You remember the yellowing card musk, the little codes and numbers, the misaligned typing of some librarian in some back office on some rainy afternoon in 1943.

There the cards were, so vulnerable in their long drawers, just waiting to be smudged by indifferent sticky fingers, scribbled across by any lunatic with an agenda, ripped out by any patron too lazy to copy down call numbers. Card catalog maintenance must have been a heck of a job, Brownie–and good riddance.

Yet cards are where the public touched the library, and maybe that’s why (shaking ourselves out of pre-OPAC reverie) we see the inventive John Blyberg, AADL’s lead developer, reviving catalog cards in a virtual setting. None of the fuss, none of the muss — and now you don’t have to feel bad about writing on the cards, or grabbing them for yourself.

Here’s a look — the AADL OPAC listing for a book on marginalia offers a link to a “Card catalog image” (near the top of the record):

Click the link, and here’s the generated card — bottom perforation and everything. Someone has already scrawled a message on the card: Defacement is subjective. You, or anyone, could add another scrawl by entering text in one of the three position fields and clicking on that very 2.0 button, Add your marginalia!:

For patrons with accounts, cards can be gathered into personal collections which can, in turn, be shared with other patrons:

Blyberg writes in his description of the project that it was “black-ops” — no committee, no proposal, no approval, no testing, no advertising, no muss no fuss — so it remains a bit murky and provisional. Marginalia on a given card seems limited to three entries. A book can have several cards associated with it, and it’s not immediately clear how to look through all those cards. Also, I’m not sure whether or how cards gathered into one’s own collection can be inscribed by others.

If virtual card catalogs are merely proof-of-concept at this point, the concept reminds me a bit of a project that the Alchemical Muser and others were working on at Columbia’s CCNMTL called Plone Stickies. These Stickies initially allowed students to attach short notes to digital objects — but the fuller vision for them, I believe, involves client-side keyword tagging and community sharing.

What do virtual catalog cards and these stickies have in common, besides a general yellowness? They both draw on the desire to physically connect to thought-objects. As such objects recede into an intangible, fungible environment, it’s notable that old means of tracking them — those flopping and curling and awkward apparatuses of identification — persist in collective memory, and expand into markers of collectivity.

Beware of the blog

Anyone looking for a snapshot of the way digital communication is accepted (or not) as a viable part of the traditional scholarly process should hie, forthwith, to Ulises Ali Mejias’s discussion on his Ideant blog: “The Blog as Dissertation Literature Review?” and a followup post.

Mejias is a doctoral candidate specializing in education and technology, so it’s quite understandable that he chooses to ponder the academic value of social software on his blog. And the payoff is vivid: he draws two critical comments from the authors of the article he most engages, “Scholars Before Researchers: On the Centrality of the Dissertation Literature Review in Research Preparation” (2005).

The argument is lively enough — mercifully light on eduspeak — and I won’t spill cyberink retracing it completely. Mejias thinks about the function of that fine old ground-clearing of dissertations, the literature review, and argues for the efficacy of doing it within the framework of a blog. Why? Blogs are dynamic, flexibly tended and amended, self-categorizable, widely accessible, and open to (please can’t this word die?) feedback. Moreover, bibliographic lists can be interlinked with critical assessments of their worth (as Mejias demonstrates).

The responses of the two authors of the study that Mejias cites throughout the post, a librarian and a professor, are fascinating.

The librarian deplores the slippage of standards — she seems most exercised that she was not properly cited in Mejias’s post, but she also airs concerns that a digital environment is too unfixed — “To fulfill the role and purpose of a dissertation, the literature review by nature is temporally bound and must reflect the work of an author at some point in time” — and too open to comment and reaction from beyond the walls of academe — “Who is his audience? Do they have the requisite authority to vet his work? By definition, a doctoral student’s peers are his or her fellow doctoral students, yet a doctoral candidate is writing for academicians to gain acceptance into their community. The heart of scholarly publication is review of the work by recognized authorities in the field.”

What stands out for me here is this respondent’s treatment of a blog as uniquely uncontrollable — as if parameters of audience, commenting permissions, and posting timeframes were beyond anyone’s control. Sure, many a prof will resist spending the time it takes to learn about a new communication technology and how it can be adapted for traditional ends (that’s not yet what rewards professors), but this resistance to digital communication should not be confused with the defense of standards. Here is what seems like a promising recipe for dissertation literature evaluation to me: a blog bundled with citational management software, with levels of access and commentary defined, and — we’re dreaming here — integration with next-gen citation indexes and visualization tools. Who would argue that a broad discussion with a thesis advisor about core texts, pertinent categorization, and the scope and value of outside “feedback” would not be a fine way to kick off a dissertation project?

The professor respondent engages in some higher-level handwringing: he rues that Mejias seems to be writing off the ‘social’ reach of traditional scholarship. “As I think about my own graduate education and beyond, I see much of the same activity you claim to be novel on your blog – I drafted and circulated manuscripts for classes and colloquia, I presented papers at conferences large and small, I sent my papers to experts in my fields, and I submitted them to journals for review. Along the way I developed my ideas and, if I was lucky, got critical feedback on them.” (Technologies come and go, but it seems we’re forever stuck with feedback.) It’s a shame, this professor suggests, that grad students imagine themselves as writing just for a dissertation committee, rather than contributing to broader endeavors, and that they squander whatever faith they may have in social dynamics on blogs: “I accept the possibility that blogging may help novice scholars and researchers as they seek to become socialized in their field. But I will assert that blogging, by itself, is nowhere near sufficient for this purpose.”

Of course, Microsoft Word (or, to frame this in parallel, word processing) is nowhere near sufficient for that purpose either — yet I suspect many poor grad students use this tool to assemble elements of their dissertation. I fail to understand how an advance in organization and dissemination — in content management — turns into a true threat to scholarly standards. I’m under 40 (not by much, but still), yet I can remember typing college papers (now mouldering in some box) by hand, and researching my dissertation by writing reams of notes (now mouldering in some box) by hand. I can also remember the long lines outside a superstar professor’s office — the hurried and sometimes random consultations — the way one’s fate is held hostage by overloaded advisors.

Who would seriously begrudge a better way to store, retrieve, and air ideas? Is the process of writing a dissertation not bolstered by reaction from other scholars online, from peers at one’s stage of development, from Aunt Tillie in Florida who is the world’s last opponent of the dangling participle? Do advisors really believe that their hold on students is so tenuous that mere statistics — page views, machine-counted citations — and outside exposure will debilitate their control of a project? Is the portability of a student’s research into future assemblages of material for teaching and beyond-the-diss projects not worth consideration? Distributed learning and evaluation are barreling down the pike (see, for example, Biology Direct’s interesting peer review process – the subject of a future post). Do we really want to discourage students from acclimating to such an environment?

I’ll climb off today’s soapbox with a nod to that workhorse library term, the “crosswalk.” Just as efforts like METS try to usher MARC bibliographic standards into a more digital-friendly metadata scheme like DC, educational technologists, professors, and librarians need to define certified crosswalks between the traditional apparatus of scholarship and the blessings of digital publication.
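
For readers who have not met one, a crosswalk is at bottom a lookup table from one scheme's fields to another's. Here is a minimal sketch in Python, assuming a toy record format (a plain dict of MARC tags to lists of values, with a subfield letter tacked onto the tag) made up purely for illustration. The names `MARC_TO_DC` and `crosswalk` are hypothetical, and the handful of mappings is only a small slice of what a full MARC-to-Dublin-Core crosswalk covers.

```python
# Illustrative subset of a MARC-to-Dublin-Core crosswalk.
# Keys are MARC tags (optionally with a subfield letter appended,
# a convention invented for this sketch); values are DC elements.
MARC_TO_DC = {
    "100": "creator",     # main entry, personal name
    "245": "title",       # title statement
    "260b": "publisher",  # publication info, name of publisher
    "260c": "date",       # publication info, date
    "650": "subject",     # topical subject heading
    "020": "identifier",  # ISBN
}


def crosswalk(marc_record):
    """Translate a toy MARC record (tag -> list of values) into DC elements."""
    dc = {}
    for tag, values in marc_record.items():
        element = MARC_TO_DC.get(tag)
        if element is None:
            continue  # no mapping defined: the field is silently dropped
        dc.setdefault(element, []).extend(values)
    return dc


record = {
    "100": ["Shelley, Mary"],
    "245": ["Frankenstein"],
    "650": ["Monsters", "Scientists"],
    "999": ["local note"],  # local field with no Dublin Core counterpart
}
dublin_core = crosswalk(record)
```

The dropped `999` field is the interesting part: every crosswalk loses something in translation, which is presumably why the certified kind would have to be negotiated rather than assumed.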

Will Mejias get credit for sparking a dialogue so intrinsic to scholarship? Only if the credit-givers look at blogs — and accept the possibility.

2 Library 2.0 lists

Small pieces, loosely joined: is it any wonder that 2.0talk clumps into lists? I won’t embark on a whole metalist, but here, at least, are a couple of Library 2.0 itemizations I enjoyed today, garnished with a few glib comments.

1 Taking advantage of Web and Library 2.0, by John Blyberg. Smoothly written and illustrated — takes a list generated by Dion Hinchcliffe and applies it to libraries.

  • Encourage Social Contributions With Individual Benefit – all for me, me for all
  • Make Content Editable Whenever Possible – and yes, not just in some little playpen
  • Encourage Unintended Uses - even for books?
  • Provide Continuous, Interactive User Experiences -
  • Make Sure Your Site Offers Its Content as Feeds and/or Web services -
  • Let Users Establish and Build On Their Reputations – hello, superpatron
  • Allow Low-Friction Enrichment of Your Information – but as for high-friction enrichment… build your own %^#* library
  • Give Users the Right To Remix – yes but beware ‘truthiness’
  • Reuse Other Services Aggressively – including lists…
  • Build Small Pieces, Loosely Joined – see Casey Bisson on a 2.0 Opac (or 2.Opac? toepack?)

2 Ten Techie Things for Librarians 2006, by Michael Stephens. Looser, sprawling entries — many emphasizing interactivity. Good intermixture of links.

  • User Centered Planning & User Perceptions – again, really they’re patrons
  • Building Resources & Comments Enabled – yes, right, why not try?
  • Open Source Software & Shared Development - but freedom ain’t free, of course
  • The Future of the ILS – tips to another Blyberg list, his ILS Customer Bill of Rights
  • Devices – you know, those little electronic communication thingies that users patrons obligingly tote
  • Electronic Resource Management & DRM – you know, those little electronic dis-communication thingies that thugs content providers relentlessly embed
  • Mash Ups & Playlists – more fun with APIs
  • Content & Experience – teen creators, Generation C, whatever you call them, when they make they wake
  • Web 2.0 – uh, right, see above
  • Librarians & the Heart - I think this has to do with personality, actually
  • Bonus: Balance, Breathing and Being Zen – hey, breathing’s interactive too

CiteI’dLike

If you were to invent del.icio.us for academics, how would it work? It would allow for bookmarking, tagging, and sharing. It would pull metadata from academic resource databases. It would allow me (the layprof) to organize collected essays and citations with a minimum of clickage. And it would do all these things in a browser, from on or off campus, independent of platform. In short, it would be quite like CiteULike.

This is a little story about my first pass into CiteULike, and if it’s not entirely a happy story, we should still bear in mind the possibilities, the promise, the 2.0ness of it all.

I abjectly learned about CiteULike just recently (designed by Richard Cameron over a year ago). Sitting through some screencasts made by Tannis Morgan at UBC, I saw how this social bookmarking tool could be useful not only as a way to track journal contents, specifically tagged articles, and other academics’ bookmarks — through RSS — but also as a means to build a library of collected resources — available anywhere and to all.

Holy digital hotness! said I. I’ll try it for myself! And here’s where minor chords start to well up in the background.

Creating an account on CiteULike was child’s play; in ten seconds I was ready to bookmark and collect. Stunned a bit by the possibilities, and revived a bit by narcissism, I decided to start a collection with articles I’ve written. Tough luck, bucko. Though CiteULike offers to browse through some 6500 journals, this roundup doesn’t include the ones that have sponsored my thoughts. In fact, many of the journals seem to be science-related. As ever, the humanist is the redheaded stepchild of resource sharing ventures.

That’s ok, said I. I’ll find some article that’s at least in my field. I saw that Nineteenth-Century Contexts was one of the proffered journals, and scanning a recent edition I saw listed an article about Mary Shelley by Diane Long Hoeveler. Very good, said I. I’ll collect that:

Two links offered to let me ‘view the article online’. Excellent idea! But these links led me to publisher sites, one of which offered a “free sample,” the other demanded $33.67 plus tax. Much disturbing mention of shopping carts. This will never do, said I. Since I am off campus, what I seem to need is a way for CiteULike to create paths into Bowdoin’s collections.

So I added the citation to the mysterious Hoeveler article to my own collection, tagging it in the process. Only one-word tagging, please.

A couple of cool features to notice here: I (or anyone) can track my collection through RSS. And metadata from this collection can be gussied up for EndNote with just one click (note how my tags turned into keywords in this EndNote record):

But the problem remained: how to actually connect to the article? I dug around in CiteULike’s FAQs and felt more assured that off-campus proxy access to articles would make those shopping carts disappear. For this functionality, CiteULike pointed me to a COinS Browser Extension written by Dan Chudnov at Yale.

In order to install this little extension, I had to first install Greasemonkey in my Firefox browser — not too difficult, but, trust me, we’ve lost the layprofs by now. The COinS extension allowed me to designate my own institution’s OpenURL resolver, and plug that resolver into OpenURL links now ‘discovered’ in my browser. That way, theoretically, one could click on a resource link on any site and actually access that resource through one’s own institution. You can see this in action here: note the new link that invites me to “Check availability @ Bowdoin”.

But, alas, here’s what happened to me when I clicked that invitation to check availability@Bowdoin:

Note that none of the metadata for the article has been passed through except for the article’s date. At this point I had neither the time nor the skill nor the patience to figure out where the glitch was; I only knew that I was off campus and out of luck accessing an article I found on CiteULike.
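
For anyone with more patience than I had, here is roughly what such a link is supposed to carry. This is a minimal sketch in Python, assuming a made-up resolver address (`resolver.example.edu`) and a hypothetical helper named `build_openurl`; it is not the code of the COinS extension, only an illustration of the OpenURL key/value style (`ctx_ver`, `rft_val_fmt`, and `rft.`-prefixed fields) that COinS embeds in a page. The article title is left out because I never gave it above.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical resolver address; a real one comes from your own library.
RESOLVER_BASE = "https://resolver.example.edu/openurl"


def build_openurl(citation, base=RESOLVER_BASE):
    """Encode citation fields as an OpenURL (Z39.88-2004) key/value query."""
    params = {
        "ctx_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
    }
    for key, value in citation.items():
        if value:  # skip empty fields rather than send blank keys
            params["rft." + key] = value
    return base + "?" + urlencode(params)


citation = {
    "jtitle": "Nineteenth-Century Contexts",
    "aulast": "Hoeveler",
    "aufirst": "Diane Long",
}
link = build_openurl(citation)

# Decode the query string again to see what a resolver would receive.
received = parse_qs(link.split("?", 1)[1])
```

If a resolver page shows only a date, one place to look would be whether fields like `rft.jtitle` and `rft.aulast` are still present in the link that actually gets clicked, or whether they went missing somewhere between the page and the resolver.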

Never give up, I told myself. With one last bit of inspiration, I decided to see whether the little bookmarklet that CiteULike distributes (“Post to CiteULike”, rather like del.icio.us’s “Remember this” bookmarklet) would work going the other way. That is, suppose I’m signed into Bowdoin’s databases, and I run across an article I’d like to post onto CiteULike. That’s just a click of the button, right?

The FAQs warn me that automatic metadata export into CiteULike would only occur with supported databases, which are: AIP Scitation, Amazon, American Geophysical Union, American Meteorological Society, Anthrosource, Association for Computing Machinery (ACM) portal, BMJ, Blackwell Synergy, CiteSeer, HighWire, IEEE Xplore, IngentaConnect, IoP Electronic Journals, JSTOR, MathSciNet, MetaPress, NASA Astrophysics Data System, Nature, PLoS Biology, PubMed, PubMed Central, Science, ScienceDirect, SpringerLink, Usenix, Wiley InterScience, arXiv.org e-Print archive. (See what I mean about the humanities?) Well, JSTOR seemed my best bet, so I rooted around in Bowdoin’s library site until I found an article on Mary Shelley in JSTOR. Here was one from ELH: “Narratives of Seduction and the Seductions of Narrative: The Frame Structure of Frankenstein” (Ok I see what you mean about the humanities).

When I clicked my bookmarklet to Post to CiteULike, here’s what happened:

Hmm…. that really didn’t take the drudge out of drudgery, did it? I mean, yes, some barebones metadata is passed through, but all to the title field; I have a fair amount of tending, cutting, and pasting to do if I want this to be a real citation. If I feel like more work, I can download a PDF version of the article to my computer, then upload it into CiteULike so I can privately retrieve the article wherever I am. I can’t share the full text with other Mary Shelley aficionados, though: they have to try their own luck tunneling into their own publisher-paying institutions. Otherwise, you know, that’d be stealing.

I believe wholeheartedly that around the world, from within and without institutional walls, academics are happily collecting and sharing resources with CiteULike. I can see this happening minute by minute on the home page:

But at least right here & right now, I can’t fully play. And I feel swamped by “everyone”. How many of “everyone’s” tags link to articles I can understand, much less evaluate and collect?

Once the mechanics were ironed out, this would be my next wish for CiteULike: the creation of discipline-based communities, so I could track the tags of colleagues pondering British literature — and feel less intimidated by clustering geophysicists.