Jim Murrell talked from http://www2.unc.edu/~murrell/search.html
UNC-CH started with Harvest and Glimpse for searching. It could take a week to create an index, and the index would often get corrupted. We put up Verity last March. We pay about $25,000 for Verity.
On page http://www2.unc.edu/~murrell/sites.html, servers marked with an asterisk are indexed every day. The others are indexed on weekends. Currently, we do not index the School of Medicine (they asked us not to).
For each collection, many options can be specified to the spider, which does the indexing. The existence of an index.txt file prevents Verity from indexing.
Libby talked from http://www.unc.edu/~uevans/verity_searching.html. Carolina Population Center (CPC) uses htdig on its own server in addition to being indexed by Verity. http://www.cpc.unc.edu/searching/cpcsearch.html is a modified version of a Web page Jim Murrell gave Libby. CPC has several subcollections, including newsletters, conference papers, and a list of research projects. The search interface includes a pre-imbedded part of the search query, which is based on standards in titles. Other organizations can do something similar, if they have standard titles or similar elements.
Libby provided some examples under Help. Libby and Jim both said that NCSU has good instructions and tips at http://search.ncsu.edu/tips/. NCSU also has some tips for authors at http://search.ncsu.edu/tips/authoring.html. Hallman will look at the NCSU files and probably link to them in the UNC-CH help section.
Searches are case sensitive. Searching CPC information for Elder (a person) produced 36 hits, while searching for elder got 43.
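One reading consistent with these counts is that a query containing capital letters is matched exactly, while an all-lowercase query matches any capitalization. That rule is an assumption here, not something confirmed in the session, and the function name and word list below are made up for illustration. A minimal Python sketch of the idea:

```python
def matches(query, word):
    """Hypothetical rule: capitals in the query force an exact-case match."""
    if query != query.lower():
        # The query contains at least one capital letter: match case exactly.
        return query == word
    # All-lowercase query: ignore the case of the indexed word.
    return query == word.lower()

# Made-up sample of indexed words.
words = ["Elder", "elder", "Elder", "elderly"]

capital_hits = sum(matches("Elder", w) for w in words)
lowercase_hits = sum(matches("elder", w) for w in words)
```

Under this rule the capitalized query can never return more hits than the lowercase one, which fits the 36 versus 43 result.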
Stemming is automatic (elder also gets elders). Put the term in quotes to prevent stemming.
NCSU has found that metatags are useful in ranking results. Someone said that adding metatags raised their page to sixth in the results, but not to the top.
Verity has an "accrue" operator that causes higher ranking for documents with more of the terms.
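The idea of ranking by how many of the terms a document contains can be sketched in a few lines of Python. This is only an illustration of the concept, not Verity's actual scoring; the function names and the sample documents are hypothetical.

```python
def accrue_score(terms, text):
    """Count how many distinct query terms appear in the text."""
    words = set(text.lower().split())
    return sum(1 for t in terms if t.lower() in words)

def rank(terms, docs):
    """Return names of matching documents, most matched terms first."""
    scored = [(accrue_score(terms, body), name) for name, body in docs.items()]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in ordered if score > 0]

# Made-up documents for illustration.
docs = {
    "a.html": "student loan information",
    "b.html": "student loan and grant and scholarship programs",
    "c.html": "campus parking map",
}
ranked = rank(["loan", "scholarship", "grant"], docs)
```

Here b.html contains all three terms and so ranks ahead of a.html, which contains only one; c.html contains none and is dropped.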
Verity has a <near> operator. For example:
(loan or scholarship or grant) <near> student
This search, as well as many others, produces a lot of hits on the ILS server. We tried
(loan or scholarship or grant) <near> student <not> ils
but that did not eliminate hits on the ils server.
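As a rough picture of what a proximity operator like <near> does, here is a minimal Python sketch. It is not Verity's implementation: the function name, the sample sentences, and the fixed five-word window are all assumptions made for illustration.

```python
import re

def near(text, left_terms, right_term, window=5):
    """Return True if any word in left_terms is within `window` words of right_term."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    left = [i for i, w in enumerate(words) if w in left_terms]
    right = [i for i, w in enumerate(words) if w == right_term]
    return any(abs(i - j) <= window for i in left for j in right)

# Stands in for (loan or scholarship or grant).
aid_terms = {"loan", "scholarship", "grant"}

close = near("Every student may apply for a scholarship.", aid_terms, "student")
far = near(
    "Grant deadlines are posted online. Parking, dining, and housing "
    "forms are mailed to each student.",
    aid_terms,
    "student",
)
```

The first sentence matches because "scholarship" is five words from "student". The second does not, even though it contains both "grant" and "student", because they are too far apart.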
It might be nice to have a search page that has all the collections selected but lets you click on the ones you don't want included.
Libby demonstrated wild card searches by searching for
namvar z*ri
which brought up hits on Namvar Zohoori. Wild cards can be at the end or imbedded.
Libby said that if you search on teach, stemming will also cause Verity to search for taught.
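The behavior of an imbedded wild card such as z*ri can be tried out with Python's standard fnmatch module, which uses the same * convention. This is only an analogy to show which words such a pattern accepts; the helper name and word list are hypothetical, and Verity's own matching may differ in details.

```python
from fnmatch import fnmatchcase

def wildcard_hits(pattern, words):
    """Return the words matching a * wild card pattern, ignoring case."""
    return [w for w in words if fnmatchcase(w.lower(), pattern.lower())]

# Made-up word list for illustration.
words = ["Zohoori", "Zimmerman", "Zuri", "Namvar"]
hits = wildcard_hits("z*ri", words)
```

The pattern accepts any word that starts with z and ends with ri, so it matches Zohoori and Zuri but not Zimmerman.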
You can also restrict searching based on zones, like <title> and headings (for example, <h1>).
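To show what restricting a search to a zone means, here is a minimal Python sketch that collects the text inside a chosen tag and looks for a term only there. It uses the standard html.parser module; the class name, function name, and sample page are hypothetical, and this is not how Verity itself stores zones.

```python
from html.parser import HTMLParser

class ZoneText(HTMLParser):
    """Collect the text that appears inside the given tags (zones)."""

    def __init__(self, zones):
        super().__init__()
        self.zones = set(zones)
        self.open_zones = []
        self.text = {z: [] for z in self.zones}

    def handle_starttag(self, tag, attrs):
        if tag in self.zones:
            self.open_zones.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_zones:
            self.open_zones.remove(tag)

    def handle_data(self, data):
        # Credit the text to every zone that is currently open.
        for zone in self.open_zones:
            self.text[zone].append(data)

def zone_contains(html, zone, term):
    """Return True if term occurs in the text of the given zone."""
    parser = ZoneText([zone])
    parser.feed(html)
    parser.close()
    return term.lower() in " ".join(parser.text[zone]).lower()

# Made-up page for illustration.
page = (
    "<html><head><title>CPC Newsletter</title></head>"
    "<body><h1>Research Projects</h1>"
    "<p>Newsletter archive and conference papers.</p></body></html>"
)
in_title = zone_contains(page, "title", "newsletter")
in_h1 = zone_contains(page, "h1", "newsletter")
```

The term is found in the title zone but not in the h1 zone, even though it also appears in the body text. This is the same idea behind the CPC search page, which relies on standard wording in titles.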
Someone mentioned that the results page may have entries for pages not at UNC-CH, because Verity indexes every reference, including URLs. Someone asked if we could turn that off.
There was interest in learning more about how to search part of a collection.
Libby handed out a list of 20 questions for us to see if we could find answers. For example:
Judy Hallman (judy_hallman@unc.edu, http://www.unc.edu/~hallman/)
Campus Webmaster, UNC-Chapel Hill