Things being as they are around here, I haven't had a chance to give the recent discussion about vCard and FOAF more than a cursory glance. One thing that did catch my attention, though, was the SquirrelRDF LDAP mapper. Cool stuff, and probably useful for our purposes.
November 9th, 2006|
Posted in RDF
After a quick run through Redland to add an rdf:type to each KB document (Piggy Bank apparently doesn't like untyped subjects?) I dumped the list of unpublished, unarchived docs that Adam sent into Piggy Bank. I'm continually impressed by how great an interface it is for managing the KB (screenshot). It would be cool if there were a way to tie the two together more closely. As it stands now, the process for getting RDF out of the KB is a bit cumbersome.
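For anyone wondering what "add an rdf:type to each document" amounts to, here is a minimal sketch. It is not the Redland run described above: triples are plain Python tuples, and the `KB_DOCUMENT` class URI and the `add_missing_types` helper are made up for illustration.

```python
# The rdf:type predicate URI is standard; KB_DOCUMENT is a hypothetical class URI.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
KB_DOCUMENT = "http://example.org/kb#Document"


def add_missing_types(triples, default_type=KB_DOCUMENT):
    """Return the triples plus one rdf:type triple for every untyped subject.

    `triples` is a list of (subject, predicate, object) string tuples.
    Subjects that already carry an rdf:type are left alone.
    """
    typed = {s for (s, p, o) in triples if p == RDF_TYPE}
    result = list(triples)
    added = set()
    for s, p, o in triples:
        if s not in typed and s not in added:
            result.append((s, RDF_TYPE, default_type))
            added.add(s)
    return result
```

Each untyped subject gets exactly one new triple, no matter how many statements it already appears in.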
September 29th, 2006|
Posted in RDF
As an opening salvo in my bid to recreate the categorization being used in the KB, I recently finished up a content inventory of our current documentation. I've got a decent idea conceptually of where I want to go, but I'm sort of flying by the seat of my pants as far as the specific steps involved in getting there. I'm sure my methods fly in the face of certain best practices (of which I am unaware) but, as they say for Hooked on Phonics, it worked for me.
I started out with a snapshot of the KB in the form of a Lucene index. According to Lucene there were 1948 documents. This number exceeds my hrair limit, so I needed some way of breaking it up into logical chunks. Since every document has exactly one owner, and since people generally tend to write about the same things, I figured document owner was as good a way as any to slice things up.
Just dividing documents up by owner doesn't constitute a content inventory, though. I still needed to go through each document by hand and drop it into a meaningful box that could be counted later. This process involved three different steps, each using a tool that required very little to no effort on my part to adapt it to my use.
Step 1
Open up the index using Luke. Luke is a really fantastic tool for interactively working with the document set. By first breaking things up by owner, I could refine the query to shuffle documents around to where I needed them. Essentially, it allowed me to say: I want these 30 documents in this pile, these 6 in that pile, and these 15 over here in this other pile.
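The slicing by owner was done interactively in Luke, but the idea can be sketched in a few lines of Python. This assumes the search hits have already been pulled out of the index as plain dictionaries; the `owner` and `id` field names and both helper functions are illustrative, not taken from the actual index schema.

```python
from collections import defaultdict


def group_by_owner(hits):
    """Sort document ids into one pile per owner.

    `hits` is a list of dicts standing in for Lucene search results;
    the "owner" and "id" keys are assumed field names.
    """
    piles = defaultdict(list)
    for hit in hits:
        piles[hit["owner"]].append(hit["id"])
    return dict(piles)


def pile_sizes(piles):
    """Return (owner, count) pairs, biggest pile first, ties by owner name."""
    return sorted(
        ((owner, len(ids)) for owner, ids in piles.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )
```

Because every document has exactly one owner, each id lands in exactly one pile, and the counts from `pile_sizes` add up to the size of the whole set.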
Step 2
I don't really want to print 1948 documents. I don't think my coworkers want me to print 1948 documents either. So I didn't want to actually put documents into piles, I just wanted to do that virtually. I could use a spreadsheet, or a Word doc or just a text file for that matter, but why use the old method when there's new h0tness at my fingertips? I went with teh new h0tness. I'm speaking of RDF, of course. A quick and dirty Python script using PyLucene took my refined Lucene queries and transformed the results into RDF, to which I added some triples about the content of each document. In other words, I just added a single skos:subject to each document. My goal was to make an initial pass at breaking the KB up into manageable content areas. I have roughly 40 subjects, each with somewhere around 40 documents.
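A rough sketch of the results-to-RDF step, under some loud assumptions: the PyLucene query is left out entirely, each result is a plain dict with made-up `uri` and `title` keys, and the output is N-Triples written by hand rather than through an RDF library. The `skos:subject` and `dc:title` URIs are the real ones; the function names are hypothetical.

```python
# Real predicate URIs from SKOS Core and Dublin Core.
SKOS_SUBJECT = "http://www.w3.org/2004/02/skos/core#subject"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"


def escape_literal(text):
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def docs_to_ntriples(docs, subject_uri):
    """Emit a title and a single skos:subject triple for each document.

    `docs` is a list of dicts with assumed "uri" and "title" keys, standing
    in for the results of one refined query; `subject_uri` names the pile.
    """
    lines = []
    for doc in docs:
        lines.append('<%s> <%s> "%s" .' % (doc["uri"], DC_TITLE, escape_literal(doc["title"])))
        lines.append("<%s> <%s> <%s> ." % (doc["uri"], SKOS_SUBJECT, subject_uri))
    return "\n".join(lines)
```

Running it once per refined query, with a different `subject_uri` each time, gives every document exactly one subject, which is all the inventory needs in order to be counted later.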
Step 3
This step wasn't strictly necessary. I just used it for my own sanity. I loaded the RDF I generated into Piggy Bank and tagged everything with a "KB" tag so I could easily access the document set and chart my progress. Doing this really gave me some cool ideas about how to integrate the KB with other info management efforts across campus, which I'll hopefully be able to revisit at a later date.
July 25th, 2006|
Posted in RDF