To Wiki or not to Wiki

Wikipedia is one of the poster children of the New Web. I use it every day. It's a constructive cure for boredom. However, I'm convinced it's an exception. Wikis manage knowledge in the same way that my desk manages paper. I can throw whatever I want on it and given enough time I can usually find something specific if I set out to search for it. Occasionally, I'll find something I didn't know was there. Sometimes, I can't find what I thought was there. If I find there's something I keep having to look for, I'll set it somewhere prominent. That works for a while, until the real estate of that prominent place becomes needed to make way for more crap. Eventually, there's just physically no way to fit anything else on my desk, so I spend a few hours going through everything and throwing out about 99% of what I, at one time, deemed "important." Repeat the process.

Is there another way? A better way that doesn't require a constant maintenance effort on my part? I don't know. I haven't found it after 30 years.

Knowledge Management in a Nutshell

  • Do you want a space where lots of people can easily put random crap on the web? Apache and SSH.
  • Do you want a space where lots of people can easily edit other people's random crap? Wiki.
  • Do you want a space where lots of people can easily edit other people's useful crap and be able to find it again? Hire someone to manage your content for you.

The bottom line is that systems don't manage content, people do.

One More Than Hrair

As an opening salvo in my bid to recreate the categorization being used in the KB, I recently finished up a content inventory of our current documentation. I've got a decent idea conceptually of where I want to go, but I'm sort of flying by the seat of my pants as far as the specific steps involved in getting there. I'm sure my methods fly in the face of certain best practices (of which I am unaware) but, as they say for Hooked on Phonics, it worked for me.

I started out with a snapshot of the KB in the form of a Lucene index. According to Lucene there were 1948 documents. This number exceeds my hrair limit, so I needed some way of breaking it up into logical chunks. Since every document has exactly one owner, and since people generally tend to write about the same things, I figured document owner was as good a way as any to slice things up.
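Slicing by owner boils down to a group-by on one field. Here's a minimal sketch in plain Python, assuming the documents have already been pulled out of the index as dicts. The `id` and `owner` keys and the sample records are made up for illustration; the real index fields may well be named differently.

```python
from collections import defaultdict


def group_by_owner(docs):
    """Bucket document ids by owner (each document has exactly one)."""
    piles = defaultdict(list)
    for doc in docs:
        piles[doc["owner"]].append(doc["id"])
    return dict(piles)


# Hypothetical sample records, not real KB data.
sample = [
    {"id": "kb-001", "owner": "alice"},
    {"id": "kb-002", "owner": "bob"},
    {"id": "kb-003", "owner": "alice"},
]

piles = group_by_owner(sample)

# Print each owner with the size of their pile.
for owner, ids in sorted(piles.items()):
    print(owner, len(ids))
```

Each pile is then small enough to go through by hand, which is the whole point of the exercise.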

Just dividing documents up by owner doesn't constitute a content inventory, though. I still needed to go through each document by hand and drop it into a meaningful box that could be counted later. This process involved three different steps, each using a tool that required very little to no effort on my part to adapt it to my use.

Step 1

Open up the index using Luke. Luke is a really fantastic tool for interactively working with the document set. By first breaking things up by owner, I could refine the query to shuffle documents around to where I needed them. Essentially, it allowed me to say, "I want these 30 documents in this pile, these 6 in this pile, and these 15 over here in this pile."

Step 2

I don't really want to print 1948 documents. I don't think my coworkers want me to print 1948 documents either. So I didn't want to actually put documents into piles; I just wanted to do that virtually. I could use a spreadsheet, or a Word doc, or just a text file for that matter, but why use the old method when there's new h0tness at my fingertips? I went with teh new h0tness. I'm speaking of RDF, of course. A quick and dirty Python script using PyLucene took my refined Lucene queries and transformed the results into RDF, to which I added some triples about the content of each document. In other words, I just added a single skos:subject to each document. My goal was to make an initial pass at breaking the KB up into manageable content areas. I have roughly 40 subjects, each with somewhere around 40 documents.
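For the curious, the "one skos:subject per document" idea looks roughly like this. It's a sketch, not the actual script: it skips PyLucene entirely and assumes the query results are already in hand as a list of document URIs. The `example.org` URIs and the `to_ntriples` helper are hypothetical, and the output is plain N-Triples rather than whatever RDF serialization the real script produced.

```python
# URI of the skos:subject property in the SKOS Core namespace.
SKOS_SUBJECT = "http://www.w3.org/2004/02/skos/core#subject"


def to_ntriples(doc_uris, subject_uri):
    """Return one skos:subject triple per document, in N-Triples syntax."""
    return "\n".join(
        f"<{doc}> <{SKOS_SUBJECT}> <{subject_uri}> ."
        for doc in doc_uris
    )


# Hypothetical URIs standing in for one pile of documents and its subject.
pile = [
    "http://example.org/kb/doc/1",
    "http://example.org/kb/doc/2",
]
subject = "http://example.org/kb/subject/email"

print(to_ntriples(pile, subject))
```

Run that once per pile and you end up with a virtual stack of documents for every subject, no printer required.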

Step 3

This step wasn't strictly necessary. I just used it for my own sanity. I loaded the RDF I generated into Piggybank and tagged everything with a "KB" tag so I could easily access the document set and chart my progress. Doing this really gave me some cool ideas about how to integrate the KB with other info management efforts across campus, which I'll hopefully be able to revisit at a later date.