Summary of June 15, 1998, Web-Walkers meeting

The Campus Search Engine

Jim Murrell, Academic Technology and Networks
Elizabeth A. Evans, Carolina Population Center

Jim Murrell talked from http://www2.unc.edu/~murrell/search.html

UNC-CH started with Harvest and Glimpse for searching. It could take a week to create an index, and the index would often get corrupted. We put up Verity last March. We pay about $25,000 for Verity.

On page http://www2.unc.edu/~murrell/sites.html, servers marked with an asterisk are indexed every day. The others are indexed on weekends. Currently, we do not index the School of Medicine (they asked us not to).

For each collection, many options can be specified to the spider, which does the indexing. The existence of an index.txt file prevents Verity from indexing.

Libby talked from http://www.unc.edu/~uevans/verity_searching.html. Carolina Population Center (CPC) uses htdig on its own server in addition to being indexed by Verity. http://www.cpc.unc.edu/searching/cpcsearch.html is a modified version of a Web page Jim Murrell gave Libby. CPC has several subcollections, including newsletters, conference papers, and a list of research projects. The search interface includes a pre-embedded part of the search query, which is based on standards in titles. Other organizations can do something similar if they have standard titles or similar elements.

Libby provided some examples under Help. Libby and Jim both said that NCSU has good instructions and tips at http://search.ncsu.edu/tips/. NCSU also has some tips for authors at http://search.ncsu.edu/tips/authoring.html. Hallman will look at the NCSU files and probably link to them in the UNC-CH help section.

Searches are case sensitive. Searching CPC information for Elder (a person) produced 36 hits, while searching for elder got 43.
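
The difference in counts would be consistent with a rule in which a lowercase term matches any capitalization, while a term containing capitals matches only that exact form. The Python sketch below illustrates that rule on made-up text. The rule itself is an assumption drawn from the counts above, not a description of Verity's code, and term_matches and count_hits are hypothetical names.

```python
def term_matches(query: str, word: str) -> bool:
    """Assumed rule: an all-lowercase query ignores case; otherwise match exactly."""
    if query == query.lower():
        return word.lower() == query
    return word == query


def count_hits(query: str, documents: list[str]) -> int:
    """Count documents containing at least one word that matches the query."""
    return sum(
        any(term_matches(query, w.strip(".,;:()")) for w in doc.split())
        for doc in documents
    )


# Made-up sample text, not CPC data.
docs = [
    "Dr. Elder studies the life course.",
    "Care of elder relatives.",
    "Elder and colleagues on elder care.",
]
print(count_hits("Elder", docs))  # documents containing the capitalized form
print(count_hits("elder", docs))  # documents containing any capitalization
```

Under this rule the lowercase search can never return fewer documents than the capitalized one, which matches the 43 versus 36 hits seen at the meeting.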

Stemming is automatic (elder also gets elders). Put the term in quotes to prevent stemming.

NCSU has found metatags are useful in ranking results. Someone said that they used metatags and it brought up the ranking of their page to sixth, but not to the top.

Verity has an "accrue" operator that causes higher ranking for documents with more of the terms.
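
As a rough picture of what "more of the terms ranks higher" means, the Python sketch below orders made-up documents by how many distinct query terms each contains. Verity's actual scoring was not described at the meeting; accrue_score and rank are hypothetical names, and the counting rule is an assumption for illustration only.

```python
import re


def accrue_score(text: str, terms: list[str]) -> int:
    """Number of distinct query terms that appear in the text."""
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    return sum(1 for t in set(terms) if t.lower() in words)


def rank(documents: dict[str, str], terms: list[str]) -> list[str]:
    """Names of matching documents, most matching terms first (ties by name)."""
    scored = [(accrue_score(body, terms), name) for name, body in documents.items()]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in ordered if score > 0]


# Made-up pages, not real campus documents.
pages = {
    "a.html": "Student loan and grant information",
    "b.html": "Scholarship deadlines",
    "c.html": "Campus map",
}
print(rank(pages, ["loan", "scholarship", "grant"]))
```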

Verity has a <near> operator. For example:

    (loan or scholarship or grant) <near> student 
This search, as well as many others, produces a lot of hits on the ILS server. We tried
    (loan or scholarship or grant) <near> student <not> ils 
but that did not eliminate hits on the ILS server.
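
A proximity operator can be pictured as a word-distance test. The Python sketch below checks whether any of several terms falls within a fixed number of words of another term. The window of 5 words is an arbitrary assumption, near is a hypothetical function name, and the real operator's window and ranking were not covered at the meeting.

```python
import re


def near(text: str, left_terms: set[str], right_term: str, window: int = 5) -> bool:
    """True if any left term occurs within `window` words of right_term."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    left_positions = [i for i, w in enumerate(words) if w in left_terms]
    right_positions = [i for i, w in enumerate(words) if w == right_term]
    return any(abs(i - j) <= window for i in left_positions for j in right_positions)


funding = {"loan", "scholarship", "grant"}
# Made-up sentences for illustration.
print(near("Each student may apply for a scholarship.", funding, "student"))
print(near("The grant funded a new roof.", funding, "student"))
```

The first sentence matches because "scholarship" is five words after "student"; the second does not match because "student" never appears.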

It might be nice to have a search page that has all the collections selected but lets you click on the ones you don't want included.

Libby demonstrated wild card searches by searching for

    namvar z*ri
which brought up hits on Namvar Zohoori. Wild cards can be at the end or embedded.
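
The Python sketch below shows the same kind of pattern matching using the standard fnmatch module, where * stands for any run of characters. It is only an analogy: wildcard_hits is a hypothetical name, the word list is made up, and fnmatch also gives special meaning to ? and [ ], which may not correspond to Verity's syntax.

```python
from fnmatch import fnmatchcase


def wildcard_hits(pattern: str, words: list[str]) -> list[str]:
    """Return the words that match a *-style pattern, ignoring case."""
    return [w for w in words if fnmatchcase(w.lower(), pattern.lower())]


names = ["Zohoori", "Zuri", "Zimmer", "Ari"]  # made-up word list
print(wildcard_hits("z*ri", names))  # embedded wildcard
print(wildcard_hits("zoh*", names))  # wildcard at the end
```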

Libby said that if you search on teach, stemming will also cause Verity to search for taught.

You can also restrict searching based on zones, like <title> and headings (for example, <h1>).
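
To show what searching within a zone amounts to, the Python sketch below pulls the text out of the <title> and <h1> elements of a page and tests a term against one zone only. It uses the standard html.parser module; ZoneCollector and zone_contains are hypothetical names, the sample page is made up, and nothing here reflects how Verity stores or queries zones.

```python
from html.parser import HTMLParser


class ZoneCollector(HTMLParser):
    """Collect the text found inside selected tags (the "zones")."""

    def __init__(self, zones=("title", "h1")):
        super().__init__()
        self.zones = set(zones)
        self.current = None  # zone currently open, if any
        self.text = {z: [] for z in self.zones}

    def handle_starttag(self, tag, attrs):
        if tag in self.zones:
            self.current = tag

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None

    def handle_data(self, data):
        if self.current is not None:
            self.text[self.current].append(data)


def zone_contains(html: str, zone: str, term: str) -> bool:
    """True if `term` appears, ignoring case, inside the given zone (title or h1)."""
    collector = ZoneCollector()
    collector.feed(html)
    collector.close()
    return term.lower() in " ".join(collector.text[zone]).lower()


# Made-up page for illustration.
page = (
    "<html><head><title>CPC Newsletter</title></head>"
    "<body><h1>Spring issue</h1><p>newsletter archive</p></body></html>"
)
print(zone_contains(page, "title", "newsletter"))  # term is in the title
print(zone_contains(page, "h1", "newsletter"))     # term is not in the heading
```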

Someone mentioned that the results page may have entries for pages not at UNC-CH, because Verity indexes every reference, including URLs. Someone asked if we could turn that off.
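
Whether Verity can be configured to skip such references was not settled at the meeting. As one possible workaround, a results list could be filtered by host name after the search. The Python sketch below keeps only URLs whose host is unc.edu or a subdomain of it; on_campus is a hypothetical name and the filtering step is an assumption, not an existing feature of the campus search page.

```python
from urllib.parse import urlparse


def on_campus(url: str, domain: str = "unc.edu") -> bool:
    """True if the URL's host is the domain itself or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return host == domain or host.endswith("." + domain)


results = [
    "http://www.unc.edu/~hallman/",
    "http://search.ncsu.edu/tips/",
    "http://www.cpc.unc.edu/searching/cpcsearch.html",
]
print([u for u in results if on_campus(u)])
```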

There was interest in learning more about how to search part of a collection.

Libby handed out a list of 20 questions for us to see if we could find answers. For example:

  1. When are staff Christmas holidays this year?
  2. I'm interested in finding out what jobs are available at UNC-CH.
  3. What do I need to do to apply for admission?
  4. Who is teaching Psychology 125 this semester?
  5. I need to find a speaker on long-term implications of the stock market drop. Does UNC-CH have anyone who can help?
  6. I want to be in the pep band.
  7. How do I buy tickets to a football game?
  8. How do I set up a first-last alias for my electronic mail?
  9. How do I contact the Employee Forum/Faculty Council?
  10. I'm a student, and I'd like to find a part-time job on campus.
  11. What are Davis Library's hours during spring break?
  12. What is <name a person>'s electronic mail address?
  13. What is the phone number for the Carolina Population Center?
  14. Where is the text of the speech made at the last University Day?
  15. Where is today's Daily Tar Heel crossword puzzle?
  16. I understand UNC-CH has a sports medicine clinic. I'm interested in making an appointment to see someone there. How do I do that?
  17. How much are bus passes?
  18. How do I get to the Smith Center? I'm coming from Greensboro.
  19. Does UNC-CH offer a master's in public health administration?
  20. I'd like to create a Web page for my class. Someone mentioned the Simple Start program to me. Where is the Simple Start page so I can get more information?

Attendees

Francesca Allegri, Health Sciences Library, allegri.hsl@mhs.unc.edu
Elizabeth A. Evans, Carolina Population Ctr, evans@unc.edu
Judy Hallman, ATN, judy_hallman@unc.edu
Barbara Levergood, Davis Library, leverg.davis@mhs.unc.edu
Gary Lloyd, Registrar's Office, grl.our@mhs.unc.edu
Sarah Madry, A & S Foundation, sarahmadry@unc.edu
Ruth Marinshaw, Carolina Population Center, rlm@unc.edu
Sherry Morrison, Student Affairs, sherry_morrison@unc.edu
Jim Murrell, ATN, jim_murrell@unc.edu
Leslie Quattlebaum, Division of Student Affairs, ucc@email.unc.edu
Jim Rosinia, Office of Info. & Comm, jim_rosinia@unc.edu
Loren Watterson, CPC, loren_watterson@unc.edu
Landon Whitt, Psychology, landon_whitt@unc.edu
Philip Young, Div. Cont. Educ., pyoung@imap.unc.edu

Judy Hallman (judy_hallman@unc.edu, http://www.unc.edu/~hallman/)
Campus Webmaster, UNC-Chapel Hill