'Why, I'm Posterity -- and so are you.'

‘O little cloud the Virgin said, I charge thee to tell me…’

Posted: June 27th, 2008 | Author: Mark Phillipson | Filed under: Metawriting, Reading, Tagging | No Comments »

Every once in a while Clayfox drifts into the tag clouds. And yet its heart has never quite followed. Maybe that’s because most often those clouds don’t prove to be so very informative after all.

Let’s review: tag clouds are a way to visualize the frequency of application of (usually uncontrolled) keywords to a corpus of stuff by a number of people. In many — even most — cases I wouldn’t call these taggers a ‘community’, unless we water down the definition of ‘community’ to a collection of people who have signed up for an online service. Even within the context of one academic tagging experiment, that can be thin or lumpy tea….
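For anyone who wants the mechanics spelled out, here is a minimal sketch in Python of what a tag cloud computes. The taggers, their tags, and the point-size range are all invented for illustration, and the linear scaling is only one plausible choice; no particular service is known to work exactly this way.

```python
from collections import Counter

# Hypothetical data: each tagger's free-form tags for a shared pile of items.
taggings = {
    "alice": ["beach", "family", "vacation", "beach"],
    "bob": ["wedding", "family", "beach"],
    "carol": ["japan", "travel", "beach", "family"],
}

def tag_counts(taggings):
    """Count how often each tag was applied, across all taggers."""
    counts = Counter()
    for tags in taggings.values():
        counts.update(tag.lower() for tag in tags)
    return counts

def cloud_sizes(counts, min_pt=10, max_pt=36):
    """Map each tag's count linearly onto a font size between min_pt and max_pt."""
    lo, hi = min(counts.values()), max(counts.values())
    span = hi - lo or 1  # avoid dividing by zero when every count is equal
    return {
        tag: round(min_pt + (n - lo) * (max_pt - min_pt) / span)
        for tag, n in counts.items()
    }

counts = tag_counts(taggings)
sizes = cloud_sizes(counts)
for tag in sorted(sizes):  # clouds are conventionally listed alphabetically
    print(f"{tag}: applied {counts[tag]} times, shown at {sizes[tag]}pt")
```

Nothing in the sketch knows who the taggers are or how large the tagged corpus is, which is part of why the resulting cloud can say so little.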

Even populous and richly tagged environments like Flickr can puff up clouds that seem, well, rather vaporous. Look at the cloud of “all time most popular tags,” and what is revealed?

[Image: Flickr’s cloud of all-time most popular tags]

It seems that when taking digital pictures with NIKONS and CANONS Flickrites gravitate to WEDDINGS and PARTIES, they focus on FRIENDS and FAMILY, they like to TRAVEL on VACATION to the BEACH or to places like CALIFORNIA and FRANCE and JAPAN. Well, well, blow me over with a feather.

Even as a means of self-portrayal, tag clouds come up short, at least for an unstrategic tagger like myself. I use and love del.icio.us, but the cloud that it serves up of my tagging activity has never been of more interest than, say, an alphabetical list of my tags. And I’ve never really discovered much about anyone else by scanning a cloud of their del.icio.us tags. Have you?

I’m willing to be convinced that appending tag clouds can be a smart search engine strategy. Perhaps this is their real utility: providing another way for the machines to read us.

***

But I’m not anti-cloud, far from it. I just happen to think that clouds are a lot more interesting to human beings when they are of words in a text, rather than of tags applied to objects. Tag clouds open up all kinds of blurry mysteries: who’s doing the tagging? how canny or consistent are the taggers? what is the extent of the corpus being tagged? But a word cloud of a given text can be as revelatory as word mining — a re-mapping of a document to bring out its frequencies, its quirks, its long tails.

And word clouds, at least those generated on the addictive new Wordle, can be quite beautiful as well. I can imagine students really learning from them, or at least investigating the vocabulary field of, say, a poem from new angles.

As an example, I’ve created word clouds of two poems by William Blake: the introduction to Songs of Innocence, and the introduction to Songs of Experience. Compare them below, and you’ll quickly see that the Innocence poem is more repetitious, aural, interactive, while the world of the Experience poem is more dispersed, visual, occupied by distances. You could get all that by reading the poems themselves, without any scrambling of their words and plumping up of their frequencies. But word clouds are a way of remapping a fixed world of meaning and visually exploring it: an engaging thing to do even if they drive you back, in the end, into fresh appreciation for syntax and line structure and the very contexts they explode. Enjoy!

Innocence
William Blake word cloud - innocence

Experience
William Blake word cloud - experience
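The counting behind clouds like these can be sketched in a few lines of Python. This is not Wordle’s own method, just an illustration under stated assumptions: the stop list is hand-picked for the example, and the text is only the opening two stanzas of the Innocence introduction.

```python
import re
from collections import Counter

# Opening stanzas of Blake's "Introduction" to Songs of Innocence.
text = """
Piping down the valleys wild,
Piping songs of pleasant glee,
On a cloud I saw a child,
And he laughing said to me:

"Pipe a song about a Lamb!"
So I piped with merry cheer.
"Piper, pipe that song again;"
So I piped: he wept to hear.
"""

# A tiny, hand-picked stop list (an assumption; real tools use longer ones).
STOP_WORDS = {"a", "and", "the", "of", "to", "on", "i", "he", "so", "with", "that", "me"}

def word_frequencies(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into words, and count the non-stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in stop_words)

freqs = word_frequencies(text)

# The words a cloud would plump up, and the long tail it would shrink.
repeated = sorted(w for w, n in freqs.items() if n > 1)
long_tail = sorted(w for w, n in freqs.items() if n == 1)
print("repeated:", repeated)
print("used once:", long_tail)
```

Even on eight lines the repetition shows: the root “pipe” surfaces in four different forms, which a cloud built on surface words counts separately.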


Changing the subject

Posted: May 22nd, 2008 | Author: Mark Phillipson | Filed under: Library musings, Tagging | 1 Comment »

Who is this woman, and why is she crying?

Mrs. Belmont at gunmen’s trial (LOC)

This photo, from a collection of early news photos housed at the Library of Congress, is part of an experiment that has that venerable institution dipping a toe into the Web 2.0 waters. Compare the photo as it appears on LC’s own website with the same photo on Flickr.

By publishing some of its holdings into Flickr, where items can be annotated by anyone, LC is taking seriously what you often hear now but rarely see yet: in a digital environment, libraries have to move beyond providing access and into facilitating use.

Libraries have traditionally provided access by applying pre-determined, hierarchical subjects; that’s what allows physical objects to be sorted and found. It’s a system that puts the onus on one cataloger to master a relatively fixed universe of related subjects and to apply it to an object, so that the object can be shelved and later found in its correct place.

On the web, of course, objects are easily replicated, dispersed, recontextualized. They can be represented in any number of places, found through any number of pathways and connections. They travel unpredictably across an increasingly read-write landscape, wherein someone just might improve and embellish the guess of that lonely cataloger about what an object is ‘about,’ making it thereby more discoverable. Accommodation to an endless amount of comment and annotation seems a nascent effect of the dynamically networked use of objects.

But back to the photo: how has being Flick’d out of LC’s precincts improved our sense of its subject? Somebody had scrawled a title, “Mrs. Belmont at gunmen’s trial,” and the LC record left it at that. Just a few days after it appeared in Web 2.0-land, commenters had connected the photo to a Wikipedia entry about Alva Erskine Belmont (a rather remarkable socialite and promoter of the women’s suffrage movement) as well as to another photo in the same LC collection documenting the sensational Rosenthal murder of 1912.

Wikipedia, blog postings, tags, and comments are bringing this photo to life on Flickr, giving us a better sense of its context and content. But lest we get carried away with the wisdom of crowds, we should also acknowledge a misogynistic annotation on the photo in Flickr: “dr_ass2001” has taken it upon himself to draw a square around Ms. Belmont’s head and write, “Stop crying, you moron.”

***

So will LC be modifying its records based on the annotations these digitized photos catch in Flickr? Their FAQs about the project demur:

The Library will decide what to do with data added through Flickr once the pilot is over. Because resources to update catalog records are limited, the Library cannot promise to incorporate contributed data into its own records.

Still, on Flickr pages such as that housing Ms. Belmont, an LC librarian has promised to alter records based on contributed information; and as of this writing, a search for ‘flickr’ in LC’s Prints and Photographs online catalog calls up 127 instances of metadata being added or altered as a result of the “Flickr community project, 2008.”

So what are the criteria for bringing information contributed through this “community project” into LC’s more authoritative catalog? How much time and effort are LC librarians putting into that crosswalk? It will be interesting to learn answers. As a member of RLG Programs observed three months into this experiment:

Social tagging in this framework doesn’t mean letting others catalog your collections for you – it really means offering up materials for a conversation which you have to follow closely to extract the bits worth bringing back.

“Conversation” seems to be the operative word here — but until LC makes its activities in this experiment a little more transparent, it’s rather like a conversation held in a confessional booth. In any event, the move towards opening up cataloging into a conversation with the public over the web is certainly a paradigm shift. Web 2.0 endeavors like LibraryThing have for years now facilitated the interplay of LC Subject Headings and free-form annotation. But now here’s LC itself, the very mortar of brick and mortar libraries, striking up conversation.

***

This has implications that range into epistemology. A recent article by David Pimentel traces the implications of treating knowledge-making as conversational: “the nature of knowledge is increasingly viewed as an iterative process, with each individual attempting to make sense of the world s/he encounters.” We live in a world increasingly impatient with indexing done by professionals, “inevitably limited to one individual’s perceptions of an information object at one particular moment in time.”

A conversational world, growing out of Gordon Pask’s Conversation Theory, Pimentel reminds us, is one of “participants communicating and seeking a shared agreement, or mutual understanding.” What is correct is formulated by participants in this communication, not some “external absolute.”

As Pimentel suggests in passing, an iterative and unfixed arena of exchange is of increasing importance in a world so often formulated as heterogeneous or interdisciplinary: the only way, perhaps, to “unif[y] theories and concepts across disciplines.” To be sure, most any uncontrolled conversation contains trivial or inane or erroneous noise, and crowd-tagging experiments seem especially full of that. It may be the price to pay for being able to talk at all in an environment that is still often known for the big stern Shushhhhh.

A post on Flickr that accompanied the launch of this LC experiment last January was cheerfully titled “Many hands make light work.” I doubt the LC librarians trolling the comments on the two photo collections so far released onto Flickr would agree–but assuredly, many hands make different work, and perhaps more interesting work all around.

Librarians get to come into a closer and more collaborative relationship with users of the objects they collect. Those ‘users’ (or patrons?) are able to participate in the detective work that is so often at the heart of subject identification, perhaps gaining a stake in culture as a result. The collection gets marked with new pathways through it, becoming less of a sterile pile and more of an ongoing seeding of discourse.

***

The very first aim of the pilot though, as outlined in the “Many hands” post, has less to do with rethinking cataloging or conversational theory or anything like that, and more to do with publicity: “to increase exposure to the amazing content currently held in the public collections of civic institutions around the world.” Indeed, if you look through the LC collection on Flickr, a goodly number of comments are, shall we say, merely appreciative:

Comments on an LC photo in Flickr

Like so much else about this pilot, this mere enthusiasm expressed for objects that have been online for many years, as if they had just now been made accessible, is striking. If LC had simply switched on annotation tools on their own site, I doubt that so much enthusiasm and activity would have arisen around these photographs.

The trick seems to have been to bring these objects to Flickr, a “major gravitational hub” that is “driven by network effects,” to borrow terms from Lorcan Dempsey. The willingness of LC, no slouch itself when it comes to gravitational hubs, to open up a dialog with a very different kind of hub is heartening, less for the new exposure it can bring to the vast collections of august institutions (though that’s always valuable) than for the dynamic friction that is bound to arise from the commingling of authority and the crowd.

Though the immediate impulse is to breathe a vast sigh of relief that Mrs. Belmont has been released from the gloomy dungeon of LC’s sterile, unchanging gallery and is now facing a new public on Flickr, I suspect the ultimate value of such liberation will be renewed appreciation for the thin skein of metadata so laboriously pieced together by specialists over the years that can now be embroidered, tested, interrogated. From what little I now know of Alva, I think she would value the old standards, even while pushing for new ways of living.


Life in the taggregate

Posted: November 23rd, 2007 | Author: Mark Phillipson | Filed under: Libraryworld, Tagging | 1 Comment »

From its earliest days, the promise of the Semantic Web has been to bring networked computers closer to the forms and priorities of human inquiry. This promise depends on mark-up language that gives data some structure, and frameworks that bring such structure into recognizable relationships. As a May 2001 Scientific American piece by Tim Berners-Lee and colleagues put it, “for the semantic web to function, computers must have access to structured collections of information and sets of inference rules that they can use to conduct automated reasoning.”

Automated reasoning! This dream may be coming to life in e-science, with its highly structured and interoperable datasets, but in many other contexts the idea of a Semantic Web sits uneasily with the younger and more popular kid on the block, the Participatory Web. Web 2.0 environments amass a lot of data and, more importantly, a lot of information about this data: tags generated by humans downright impervious to the need of machines for identifiable and consistent structure. Such tags are generally free-form, non-hierarchical, not expressing relationships in a predictable and consistent way; they dance to “folksonomy” not “taxonomy”; they are blithely untethered to “ontologies,” to any URI-based language standards.

Nevertheless there is intriguing thought out there about the potential interplay of the Semantic Web and Web 2.0. The Tagcommons site lays out Use Cases that envision sharing tags across databases, and sketches out some functional requirements to make that interoperability happen. Tom Gruber, in particular, has argued energetically for “collective intelligence systems” built from syntheses of structured data and social software; his travel-review site RealTravel uses a “snap-to-grid” model to disambiguate and structure user-supplied tags.

And now in Yahoo! Research Berkeley labs, algorithms are starting to take into account aggregate patterns in order to sift out meaning from vast oceans of community-generated tags despite all their unstructured messiness or, as computer scientists like to say, despite all their “noise.” It’s a matter of inference and cluster analysis. Case in point: the photo-sharing site Flickr’s new experiments in extracting “practical information about the world” from the snapshots and tags poured into it by the great unwashed. The report “How flickr helps us make sense of the world: context and content in community-contributed media collections” describes a layered process of tag and image analysis, one that can be conducted entirely by machines, that identifies representational tags as well as place and event semantics.

What does all this do for us? For one thing, it can improve a search through piles of community-contributed materials; my search for “Harlem” stands a better chance of coming up with the most representative picture of the neighborhood, or a set of iteratively varied views of the neighborhood, or even a conglomeration of views for a composite view. I could determine the most visited place in the neighborhood, or the scenes of important events. Yahoo!’s researchers are even thinking about automatic tagging of photos, or suggestions for tags, that are generated by visual content abetted by contextual and geographical cues.

Here are a couple of spins of Yahoo! Labs’ TagMaps:

Flickr World Browser Harlem

^ TagMap’s World Browser analyzes Flickr tags to locate “Harlem” on a map and offer a set of representative photos (on the right). Harlem seems pushed to the west, and the chicken picture is a little odd, but this machine-generated guess seems viable enough.

TagMap World Browser Paris

^ A search for ‘Paris’ in TagMap’s World Browser whisks us to a city in the middle of France, not Texas, and avoids any pictures of over-photographed heiresses. See: machines have taste too.

Teasing meaning out of cacophony, evaluating ‘where what & when’ through dumb processing of inconsistent human traces: it’s not hard to sense an artificial intelligence awakening here with its own priorities, despite the human decision (conscious or not) to ignore machine-oriented information conventions. What is the ultimate effect of algorithms trained to crunch through the idiosyncratic and identify the representational? Could such aggregate processing of unstructured data fuel a general regression to the mean, as alchemist Jonah Bossewitch muses? As a Trekkie (or is it Trekker?) might say, streaming into yet another convention, resistance is futile.

The fear of human conglomeration coming into sudden sentience is nothing new, of course. I just re-read Frankenstein with a set of fresh young readers, and alarmist correlations of that good old story to an improbably persistent, flexible, and collective-mashed form of AI doubtless come too easily to me now. But I do sometimes wonder whether we too will wake up from our most logocentric tagging idylls to sense senseless and unblinking eyes, watching us in the dark and hungry for more.


Archiving a tragedy

Posted: May 3rd, 2007 | Author: Mark Phillipson | Filed under: Library musings, Tagging | 2 Comments »

Virginia Tech’s Center for Digital Discourse and Culture recently debuted The April 16 Archive, with some help from the prolific Center for History and New Media at George Mason,

…in order to support ongoing efforts of historians and archivists to preserve the record of this event by collecting first-hand accounts, on-scene images, blog postings, and podcasts.

It’s worth keeping an eye on this project as a model of user contributions, clustered around a contemporary and tragic event. How do we use new media to process such things? What does it enable us to capture and collect and learn?

So far the April 16 Archive is fairly bare-bones; it only accepts ‘images,’ ‘stories,’ and the vaguely termed ‘other files’. And as of now it’s impossible to search, hard to browse. There is some tagging, but the lumped-up organization makes you wish for some other ways into the content, perhaps a map interface along the lines of the CHNM’s last tragedy-archive, the Hurricane Digital Memory Bank. A simple uploading interface provides a cut-and-paste field for Virginia Tech stories, or an upload field for files (maximum 5 MB). You can choose to just contribute to the archive, or to have your contribution appear on the website (with or without your name). Submitters are told that they retain copyrights to anything they contribute, which broadly bans use for any public purpose without the permission of the April 16 Archive and the original contributor. No CC options here.

The April 16 Archive FAQs take on the question of veracity: How do I know that the content of the April 16 Archive is factual? The answer here:

Every submission to the April 16 Archive–even those that are erroneous, misleading, or dubious–contributes in some way to the historical record. A misleading individual account, for example, could reveal certain personal and emotional aspects of the event that would otherwise be lost in a strict authentication and appraisal process.

Besides, this FAQ rather blithely continues,

…the April 16 Archive harvests metadata from every contributor–including name, email address, location, zip code, gender, age, occupation, date received–and suggests that these metadata be examined in relation to one another, in relation to the content of the submission, and in relation to other authenticated records. Sound research technique is the basis of sound scholarship.

After picking my way around the Archive for a little while, I’m struck by the number of images of Second Life memorials. I just don’t know what to think of such screen grabs. Collective therapy, sure — but an historical record of this tragedy? You tell me.

Second life mourners


Taking notes

Posted: September 19th, 2006 | Author: Mark Phillipson | Filed under: Academia, Metawriting, Tagging | No Comments »

Yo, can I borrow your notes?

Harkening back to the salad days of college, I seem to remember a free-floating faith in the power of someone else’s notes to fill in cracks of attendance & attention. I doubt that much significant learning took place in power-cramming sessions entirely reliant on someone else’s diligently indented transcription of wisdom. But I’m struck now, thinking back, by the instinct to herd together in such situations.

A study tool named stu.dicio.us has recently made its debut, promising del.icio.us-like value through aggregation of communal effort. Now maybe some stranger from West Virginia Tech will save you from the consequences of having slept through Chemistry. Or maybe that concept your prof seems so fond of has been dropped in another class somewhere, in a context just different enough to fuel your next paper. Or maybe you can meet that hottie on the far side of the lecture hall because you’ve done a search limited to your school and this class and lo & behold here you both are, believing in the power of networking your notes.

Sharing notes is not cheating, insists stu.dicio.us. Everyone should have every advantage possible in increasing individual knowledge. The site rather mysteriously claims to be created for students, by students, and is rather predictably in beta.

There are bugs, and slender participation makes any 2.0 service like this awkward at first, but give it time. After a little tour, I think that stu.dicio.us is actually more useful for its lightweight organizational tools. There’s a sortable todo function, handy even if you aren’t interested in checking peers’ todos. The basic Textile formatting for notes encourages precision (see this testimony), and auto-save is built in. You can use simple brackets for auto-links to Wikipedia, Google, or Google Scholar. You can upload files and access them whenever you want, as long as the service remains online. For those times when you can’t get online, stu.dicio.us offers an offline mode.

Here are a couple of screenshots. First, my fake schedule, with grades, notes, files, todos, and (sadly) no friends. This would be useful, I’d say, especially if it were within a course management environment:

stu.dicio.us

… and someone’s notes, which I found by doing a search for history and columbia:

stu.dicio.us

Enlightening? I doubt it – but misery does love company – and if you’re casting around randomly for any mention of history in anyone’s notes, chances are that you’re feeling a bit miserable.


Dear PennTags

Posted: June 14th, 2006 | Author: Mark Phillipson | Filed under: Academia, Libraryworld, Tagging | 6 Comments »

Please don’t take this the wrong way. It’s not you, it’s me. It’s just that I was so excited to meet you — I had so many preconceptions, I had heard so much about you. And then when I actually met you, you seemed kind of standoff-ish and, I admit, sort of different from what I thought you’d be. But I still like you — don’t get me wrong.

When I first heard about you I thought: finally! A way for scholars to tag up an OPAC as well as electronic journals — a tool enabling social discovery by a defined community swimming through carefully selected resources. In short, I thought you’d be more sophisticated and more focused than del.icio.us. I thought: finally, it will be easy for a specific class or a set group of scholars to sift together through premium resources: collaborative discovery centered on the information source most unique to Penn, the Penn library.

But when we actually met you were so confusing (and I’m not alone in thinking so). Your home page hit me right off the bat with pictures of birds and a big tagcloud, a cloud that seemed more random than representative:

PennTags

What does it mean that Lauder_Institute_Area_Studies dwarfs united_states? I think it means that you haven’t gotten around enough to render a representative or even very interesting snapshot of the Penn community — so until you do, I suggest you don’t wear this raw data on your sleeve.

I know your type — you’re enamored of presenting data as it comes into your system — makes you seem extra dynamic. But until you get more play, you’re not delivering useful information with your overall clouds and ‘latest tagged’ lists. In fact, I doubt such look-ma-it’s-web2.0 features will ever be that useful to anyone, however big you get.

I guess my point is, first impressions are important — so you should use your home page to introduce yourself, rather than show off. I finally found my way to the “About” page (tiny button, my friend! why so shy?), a page that finally addresses the question, “What is PennTags”? And here you got kind of weird. You started pretending that del.icio.us doesn’t even exist. Or, to put it another way, you said almost nothing about yourself that couldn’t be said about del.icio.us. You bragged:

Have you ever bookmarked a web page and then can’t find it again in your mass of bookmarks? The beauty of PennTags is that it allows you to organize your bookmarks/resources exactly the way you want and it lets you share them with others. It’s both personal and portable.

Well ok, but I thought your beauty, PennTags, would be that you would be different from del.icio.us — that instead of letting anyone tag anything just ‘out there’ on the open web, you’d let a defined community — namely, Penn and sub-communities within Penn — tag things that are available by virtue of being at Penn. Otherwise, why reinvent the wheel? Ignoring the popular kid & just pretending to be him won’t impress many who are likely to be drawn to you in the first place.

Jumping into some of your posts, though, I found that your users are in fact using you as I thought they might — they are tagging your library’s catalog records, and they are tagging articles available in your library’s database, as well as outside websites. Following these links put me on quite different adventures.

When the item tagged is in the OPAC

OPAC tagging is pretty darn sweet — and you pulled this off with Voyager, no less. When I clicked on a post referring to a book on Godard, I didn’t get to access the book (obviously), but I was routed to its catalog record, and I found that the user-contributed tag and summary had made the trip with me, and appeared in a yellow box right in the OPAC:

PennTags

After seeing this trick, PennTags, I started to warm to you. People who know nothing about you or about tagging or even about bookmarking are bound to wonder what these yellow notes are that keep showing up at the bottom of OPAC records; maybe you’ll recruit more users this way, and get smarter. At the very least, you’re giving library records a sense of life; any way to enliven the OPAC with user contributions is a-ok with me.

But I wonder how you’ll manage any significant success — imagine ten such yellow PennTag records clinging onto a record in the catalog. You’ll have to be careful to keep a balance between authoritative metadata and folksonomy, between succinct official catalog records and long contributed summations.

When the item tagged is in a journal database

What about when someone posts and tags a journal article in you? I clicked on such a record, and, not to my surprise, got dumped at a Penn database log-in screen, which means that if I were affiliated with Penn, I’d go right to the article. Since I’m not, I see nothing: no user summations, no fun yellow boxes. This raises the question again of who is using PennTags, and for what purpose. Frankly, I felt ignored by you here. If you are of, by, & for people behind Penn’s walls, then perhaps you should live behind that wall too; it’s not particularly interesting, for someone who can’t get at resources, to see how they’re being tagged.

That said, clicking on the title of another posted article, a JSTOR title, took me (much to my surprise) right into the article; I was ushered straight in thanks to my own institution. That experience started me dreaming again, PennTags, about an OpenURL world, filled with cross-institutional tagging of academic assets. At the very least it renewed my hope that I might find you of use while waiting for my own library to get tagging off the ground.

When the item tagged is an outside website

Then there are the outside websites that are being posted and tagged in you, just as they’re tagged in del.icio.us. As you know, I think it’s redundant and a little silly to use you just for this purpose, but I’m also warming to the idea of tagging websites right alongside OPAC records and journal articles. You see, PennTags, I’m open to persuasion; you just haven’t taken the time to articulate the benefits of this mix. You’re actually allowing your users to bring resources into your library, in a way. Rather than reinventing a wheel, you’re melting a wall. That’s a big step, and it’s one to think about — not take for granted.

Yeah, inside/outside tagging has plenty of potential, no doubt about it, but here again I’m a little let down. Here’s the deal, PennTags: I think you could be a little more proactive about what academic tagging could or even should be. Could it be hierarchical? Might it be user-faceted? Are there ways to enforce best practices? By offering little firm guidance, you’re once again playing pseudo-del.icio.us, leaving everything up to an undifferentiated swamp.

But look around, PennTags: you operate in a world full of productive distinctions. You even list some, shyly — they get buried in a section called “More Tagging Tips”:

PennTags

How hard would it be to invite your users to think along these lines, gently, somewhere in the tagging process? Can tagging evolve to something beyond a single ‘fill in whatever you want’ open field? I know you don’t want to come across as bossy or prescriptive or (god forbid) librarian-like, but I wonder if just a couple of criteria particularly useful to your academic community (say Topic and Relevance) could be quietly promoted, just as del.icio.us already subtly promotes tagging uniformity through ‘recommended tags.’

The thing to keep your eye on is use: how these tags are used by actual populations, in actual classes or other sub-groupings, for actual purposes. I find it pretty weird that you’re asking people to think about tagging with an uncle in mind — unless this is an uncle at Penn. Relevance is a subjective and fairly meaningless call against a wide-open horizon (where many uncles live), but within the context of english242 students working collectively on a presentation about Keats’s illness, say, “Relevance” becomes a powerful way of characterizing a resource.

Imagine, too, if you allowed any kind of distinction among users: how interestingly instructors and students, say, could interact within a classroom framework as what they are (in the institution’s eye) through you. Or professors and research assistants. Or members of a class and those outside the class. Or librarians. Or alumni. These distinctions shape the day-to-day life of your campus, and though I suspect you imagine yourself to be leveling the playing field in exciting new ways, you don’t have to dumb the field down that much. Nor do user distinctions need to control the way people use you. Building them in would only help when it becomes desirable to browse or subscribe to the tagging work of a certain subset of the campus community. Here’s your advantage over del.icio.us: you operate in a circumscribed world organized around definable purposes, roles, means, events.

I think you’d be even cooler if you presented yourself as not just another collective knowledge base, but as the way that only Penn could make the knowledge of the world work for definable ends. That’s why I think your most promising feature is ‘Projects’. Right now you only allow one owner to post to a given project, but maybe in the future you’ll loosen up and let many users work on a given project, and maybe even specified classes of users. Then, I suspect, the RSS functionality you’ve already built in would start to be useful not merely to the curious, but to a much more involved user-base: the tasked.

Well, PennTags, you can guess by the way I’ve gone on here that I actually am pretty attracted to you, and I look forward to seeing how you mature. You’re raising awareness of tagging in academic settings — and you’re not just sitting around wondering about what that might mean — you’re actually putting tags into motion. That’s the only way any of us is really going to learn how this 2.0 phenom might work for us. So — way to be, & keep in touch.

Your PennPal,
Mark


LibraryThings

Posted: May 18th, 2006 | Author: Mark Phillipson | Filed under: Libraryworld, Tagging | 1 Comment »

If it once took a special type of person to be a library cataloguer — one comfortable in back offices & around heavy rule books, methodical, perhaps quiet — now everyone wants to get in on the action. The rise of self-cataloguing has been one of the more inexorable effects of digital media. The discovery within cataloguing of social connections now seems to be another.

Of course long before all this web stuff we were being trained to collect content in various forms, and value assemblages as inherent identifiers of taste. Siva Vaidhyanathan’s recent presentation at Columbia’s Correcting Course forum carries this age-old ritual into my lifetime; he talks about a mass paperback industry that marketed (unread? unreadable?) books as class identification… VHS tapes marooned on shelves, monuments of their owners’ cinematic pleasures …. the fine art of mix tapes, now supplanted of course by playlists….

The fetishistic Mac application Delicious Library wraps a collection database into a pretty package so… so… well, so you can have a virtual representation of all your books — all your video games — all your DVDs, right on the hard drive of your computer. Scan the item’s UPC barcode with a webcam, and presto, metadata from Amazon flies right into your own library database — including cover art. Awesome, right?

Ok it’s actually fairly purposeless. You can assign items ratings, and you can designate their location in actual space, but I doubt many are actually relying on Delicious Library to find stuff. If you lend out an item to a friend, you can track it with DL — but really, if you’re lending out more than you can remember & your friends can’t be trusted to return things, well, maybe a policy change is in order. And DL’s symbiosis with Amazon’s API is worrisome — Amazon-hosted One-Click Shopping recommendations are just a click away.

But describe Delicious Library to someone, and it’s possible that they’ll turn cataloguer right in front of your eyes: huh, my things in a database….

Delicious Library

^ Finding Nemo and other treasures: virtual shelving in Delicious Library

LibraryThing — straight outta Portland, Maine, btw — is a web app significantly tastier than its desktop cousin because it networks people’s collections. LibraryThing still invites you to play with representations of your books on virtual shelves for yourself — but now you’re doing your assembling among & amid a myriad of intersecting libraries. Now metadata is up for grabs, unregulated by Amazon or any other detached entity: social tagging comes to the fore. You can hear the 2.0 pitch — it’s del.icio.us for books! — and lo, tagging abounds.

But just around books — LibraryThing valiantly resists the siren call of other media in favor of bibliomania. It links its bibliographic records to OCLC’s Find a Library as well as Amazon and library OPACs via the good old Z39.50 client-server protocol, and hosts discussion of titles among those who share them in their libraries.

In short, if you love books, LibraryThing seems an unrigged communal playpen, as well as a self-inventory tool. It provides branching recommendations based on mutual ownership, not Amazonian purchases. It presents clouds of a book’s common tags unseeded by commerce. It offers RSS subscriptions for any given tag, so you can track books as they come in, as collections rather than products.

LibraryThing Screenshot

^Adding to my library in LibraryThing: I enter in a title, and LT checks it against a bibliographic database of my choosing. And I choose LC! No snappy webcam scan, alas, though barcodes are acceptable identifiers.

LibraryThing screenshot

^Now that I’ve added my book to LibraryThing, I can see how others have tagged and rated it. Looks like some people don’t care for literary theory, and yet they own the book. Go figure. This title hasn’t been reviewed yet in LibraryThing, but many have.

LibraryThing screenshot

^My so-far small library (the books on my desk right now).

Most intriguing of all, LibraryThing has recently added Library of Congress subjects into the mix. The premise is that user-created tags can coexist with library-tended subject headings, that folksonomy can play off of controlled hierarchy. At times, tags and subject headings coincide. In other instances, they hardly overlap at all. LibraryThing has only just embarked on this odd tango, and who knows where it will lead — but at the very least it should generate some intriguing friction.

LibraryThing screenshot

^Exploring the tag “literary theory” on LibraryThing. I see heavy users of this tag, works most often tagged by the term, and the latest books into the system so tagged (and I can subscribe to the tag via RSS). I also see related LC Subject Headings, in case I feel like faceted browsing.

Already user-tags are sitting up a little straighter and paying more attention to themselves. Discussion on LibraryThing’s metablog, Thingology, has been spurred by subject headings to characterize — dare I say categorize — tags. Discussants find tags to fall into recognizable camps: personal location notes (“living room,” “office”), personal use tags (“read,” “damaged,” “study”), broadcast opinion tags (“excellent,” “lame”), and personal subject tags (anything in the uncontrolled descriptive universe). The haphazard felicities of user-tag surfing are getting measured right up against the precision of subject headings.

All this driven by Tim Spalding, a web developer, not a librarian. Or is he? Should we settle for patron?


Clipboards go social

Posted: March 13th, 2006 | Author: Mark Phillipson | Filed under: Libraryworld, Tagging | No Comments »

Social bookmarking is swell, but suddenly it seems so limited, so 2005. Or so it seems to me after watching Dan Chudnov’s screencast unAPI and the Gates of the Dawn of Social Clipboards a couple of times. I can attest that it’ll get you thinking — even if, like me, your programming skills extend not much beyond the coffee maker.

You know about gates, you know about dawn, and you should know that APIs are blending web services in dynamic ways. unAPI (‘un’ pronounced as in “universal,” not as in “poor Syd Barrett, he’s un’appy”) is, as the term might suggest, a simple website API convention that allows a broad array of services to be syndicated and harvested. This is a lightweight, generic tool, unlike an API tailor-made to a service (like, say, the GoogleMaps API). More on unAPI here. Now, for some hurried idea of how unAPI enables social clipboarding, get comfortable and spend some quality minutes with the dchud screencast:

D’ja get that? Social bookmarking = a straightjacketed social clipboard, in which we share only urls and tags. With something like unAPI, the straightjacket comes off, the information we share gets richer and more varied. Click, drag, and toss into the communal pot objects that are linked to full bibliographic metadata — toss even whole images in. Once, in order to share information on the web, you had to code in HTML and FTP your creation up to a server. Then blogs, wikis, and various administration tools let you publish content through a web interface. Soon, it seems, you’ll be clicking and dragging web objects around directly. It’s a weird feeling: try it at a demo for Microsoft’s similar new experiment, Live Clipboard.

Chudnov’s emphasis on the new social possibilities of clipboards seems typical of 2.0 library services. “My professional mission as a librarian is this:” (he’s written) “Help people build their own libraries. That’s it. That’s all I care about.” Note the plural ‘people.’ If web objects can be readily swapped, studied, shared — if their harvesting and dissemination are conducted, from beginning to end, in networked spaces — it’s easier than ever to see that ‘collection’ is molting ever more into a publicly driven and defined activity.

Librarians once spent time carefully assembling web links for their patrons, and what an onerous job — one plagued by link rot, bedeviled by the fluidity of the web. Social bookmarking is a welcome alternative to the professedly authoritative link collection because it leverages a vast range of expertise, instinct, and attention, while allowing for discovery and customization. A 2.0 librarian (for lack of a better term) will do everything he can to promote this kind of activity.

Similarly, digital collections were once mounted in standalone boxes, and left gathered in a corner of a library website. Social clipboarding is 2.0 collection because, once again, it drags assets out into the pale sunshine of use and interchange. The 2.0 librarian will do everything she can to ensure that a digital collection is easily discovered, harvested, tagged, swapped around, recontextualized, re-collected, and (whenever legal) re-published.

Such decentralized, user-driven, unpredictable shuffling of digital assets might seem to diminish the role of your library. You need not go there, you need not apply there for access, you need not be cognizant of the dimensions of its actual collection. But look at what’s going on behind the scenes, in terms of programming, standardization of conventions, preservation and exposure of assets. And in front of the scenes, you can bet that librarians will evolve ever more into consultants, offering strategies for the successful customization and manipulation of information. If APIs start scattering assets of all sorts onto communally shared clipboards, ‘collection’ takes another step towards the need-based, on-the-fly assemblage of information transforming our world (dare we say) into one big library.


Mmashamashsmashh

Posted: February 22nd, 2006 | Author: Mark Phillipson | Filed under: Libraryworld, Tagging, Wikiwatch | 4 Comments »

Oh to have been a fly on the wall at the just-wrapped Mashup Camp – a fly safely high up on the wall, because a) I’m no programmer and would likely be in the way, and b) its ‘geek dating’ program – a frenetic dance of speed demos and the “law of two feet” – sounds downright dangerous.

But I would have loved to buzz with the buzz, because it’s clear that the proliferation of web applications and reusable APIs is causing an explosion of tinkering, playing, discovering. As Web 2.0 guru Dion Hinchcliffe puts it, “The theory is that you can be much more valuable to the rest of the world if your software can be reused in unintended ways. In other words, don’t just provide a fully created end-product for one pre-intended use. Encourage others to use the good pieces of what you provide in new and innovative ways.” And thus the torrent of new services cobbled together with bits of preexisting web services — some of which are tracked by Mashup Feed.

What can nontechnical endusers expect from all this mashing? More customized information and the power that goes with that, as data feeds get mixed for real-time information on weather, parking, airfare, restaurants, skiing, and general calamity.

A glance at David Schorr’s Weather Bonk confirms, at once, that the Mission is the only somewhat warm place in SF, and the GG Bridge is flowing pretty well at the moment:


Looking for more monetizable information? Flyspy is planning to bring to you a 30-day overview of airfares:

But no matter how clever or useful the mashup, it’s only as good as its datafeeds. Another mashup service, Cheap Gas, looks great until you notice that the gas prices you’re being quoted, contributed by ‘anonymous’ (maybe Eddy from Texaco down the street), date from last summer:

Such flashy inaccuracy is bound to make people who are in the business of reliable information — for example, librarians — nervous. Many mashups are anarchic sandboxes, and who knows what use your data will be put to or what company it will be keeping or to what ends it will be mashed (that’s the point).

As Tom Owad demonstrated a little while ago, pinpointing ‘subversive’ (yet acquisitive) persons is as easy as mashing up Amazon’s Wishlists with Yahoo People Search with Google Maps. Here’s a map of readers hoping someone buys them a shiny new copy of Orwell’s 1984:

And that’s all *legal* — just imagine what our government is up to.

Nevertheless, the rise of APIs may save libraries from the rusty chains of closed-box ILS packages, and allow them to dream up a range of new community-oriented services. Certainly we should be glad that programmers plugged into the potential of libraries, such as the Superpatron, were doing the monster mashup this week.

Scanning mashupfeed’s indexes… here are some mashups that strike me as library-intriguing, with pasted descriptive blurbs (ie, I didn’t write ’em, because I didn’t try ’em all):

Using GoogleMaps API

  • Blosh Blosh finds blogs mentioning locations and displays them on a map.
  • Boston RSS Alley This map displays the locations of some of the companies and bloggers actively working with RSS in the Boston area.
  • Find the Landmark Test your knowledge of US landmarks with interactive, timer-based Google Maps game.
  • Flyr Search Flickr for geotagged photos and then plot them on a Google Map. Nice nested map-within-a-map.
  • GeoWorldNews The latest worldwide stories from the Washington Post plotted on a Google Maps satellite image.
  • Healthia Use the Healthia doctor search to find doctors in the United States. 800,000 doctors listed.
  • History Timeline Wiki A history plus geography wiki that allows readers to contribute items of historical interest and plot their locations. Initial dataset is US battles.
  • Libraries411 Find public libraries in the US and Canada. Data for more than 20,000 libraries available.
  • Maplandia Comprehensive searchable gazetteer based on Google Maps. Reference guide has full world coverage.
  • Placeopedia Geographically place Wikipedia articles on top of Google maps:

Amazon API

  • Albumart.org Uses the Amazon API and an Ajax-style UI to retrieve CD/DVD covers from the Amazon catalog.
  • O’Reilly Book Page Mashup of Backpack and Amazon.com APIs to generate Backpack pages with Amazon.com book data.

Flickr API

  • flickr graph Social network visualization using Flickr API:

  • Flickr Related Tag Browser Search and visualization tool that lets you surf Flickr’s tag space. Flickr tags are keywords used to classify images. Related tags shown based on clustered usage analysis.
  • Flickrscape Enter a word and watch the flickr photo stream. Click to interrupt stream and try another word.
  • geobloggers Google Maps + Flickr photos. It also consumes del.icio.us for geotagged bookmarks and the Upcoming.org for US events, which it then geocodes.

del.icio.us API

  • Delancey This nice del.icio.us enhancement allows you to see which of your del.icio.us bookmarks are used most frequently.
  • thumblicious Use thumblicious to quickly preview the most popular sites bookmarked on del.icio.us via thumbnail screenshots.

Google API

  • Copyscape A website plagiarism search tool that uses the Google Search API.
  • DoubleTrust Shows the best search results from both Google and Yahoo in a new way. Also allows user to alter his trust in either engine to bias combined rankings.
  • QTSaver Uses Google and Yahoo APIs to extract microcontent from multiple sites and allows you to rearrange the excerpts.
  • SpellWeb Compares relative popularity of spellings or concepts based on web frequency. An experiment in sidesifting the Web for useful patterns of information:

You get the idea… you probably get a thousand ideas. That’s the problem with mashups — too many ideas, too many variously commercial or incomplete datastreams, too much sheer buzz. But quickly, perhaps within a fly’s lifespan, your library may truly catch on.


Sticking around

Posted: February 16th, 2006 | Author: Mark Phillipson | Filed under: Libraryworld, Tagging | 1 Comment »

Check out what’s new at that flagship of Library 2.0-ness — the plugged-in to plug-ins, blessed by superpatrons, interactively inventive Ann Arbor District Library: card catalogs!

Remember card catalogs? If you do, you’ll remember that uniquely tactile experience: the sliding out, the flipping through, the red-ink-mandated cross referencing, the peering & copying & replacing. You remember the yellowing card musk, the little codes and numbers, the misaligned typing of some librarian in some back office on some rainy afternoon in 1943.

There the cards were, so vulnerable in their long drawers, just waiting to be smudged by indifferent sticky fingers, scribbled across by any lunatic with an agenda, ripped out by any patron too lazy to copy down call numbers. Card catalog maintenance must have been a heck of a job, Brownie–and good riddance.

Yet cards are where the public touched the library, and maybe that’s why (shaking ourselves out of pre-OPAC reverie) we see the inventive John Blyberg, AADL’s lead developer, reviving catalog cards in a virtual setting. None of the fuss, none of the muss — and now you don’t have to feel bad about writing on the cards, or grabbing them for yourself.

Here’s a look — the AADL OPAC listing for a book on marginalia offers a link to a “Card catalog image” (near the top of the record):

Click the link, and here’s the generated card — bottom perforation and everything. Someone has already scrawled a message on the card: Defacement is subjective. You, or anyone, could add another scrawl by entering text in one of the three position fields and clicking on that very 2.0 button, Add your marginalia!:

For patrons with accounts, cards can be gathered into personal collections which can, in turn, be shared with other patrons:

Blyberg writes in his description of the project that it was “black-ops” — no committee, no proposal, no approval, no testing, no advertising, no muss no fuss — so it remains a bit murky and provisional. Marginalia on a given card seem limited to three entries. A book can have several cards associated with it, and it’s not immediately clear how to look through all those cards. Also, I’m not sure whether or how cards gathered into one’s own collection can be inscribed by others.

If virtual card catalogs are merely proof-of-concept at this point, the concept reminds me a bit of a project that the Alchemical Muser and others were working on at Columbia’s CCNMTL called Plone Stickies. These Stickies initially allowed students to attach short notes to digital objects — but the fuller vision for them, I believe, involves client-side keyword tagging and community sharing.

What do virtual catalog cards and these stickies have in common, besides a general yellowness? They both draw on the desire to physically connect to thought-objects. As such objects recede into an intangible, fungible environment, it’s notable that old means of tracking them — those flopping and curling and awkward apparatuses of identification — persist in collective memory, and expand into markers of collectivity.