Library 2.0 September 21st Program Notes
From Metro Collaborate
Library 2.0 SIG Business
- Next Meeting, Friday November 12
- Participate or Present in the Future
- Take Our Interest Survey
- Join the Mailing List
Current State of Catalogs
- Clunky Interfaces
- Difficult Data Migrations
- Software that does some things okay, but nothing well
- Can we take our data back from this outdated model?
- Representation of Electronic Resources is Generally Unsatisfactory
- Sharing data is becoming more important; is the current typical catalog implementation sustainable in this environment?
- Are catalogs accessible to search engines?
- Are search results available in anything but poorly formed HTML or Z39.50?
- Vendors help us manage complexity - how can the systems we've invested a lot of money in become more...
- More Responsive - Rapid Development Environment
- Integrated with Electronic Resource Data
- With Digital Libraries/Repositories?
- With Metasearch tools, etc.
Major Issues
- Interoperability
- Support for Web Services like SRU/W, RSS, OpenSearch
- Management Functions are Tightly Coupled with the User Interface
- How do you get a better interface but still...
- Get holdings data into the display?
- Provide a search tool powerful enough for staff power users
- The need to index large data sets
- MARC friendly
- Character set issues
- Data normalization - a new interface is likely to show a lot of warts in your data
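As a rough illustration of the SRU item above, the sketch below builds a searchRetrieve request URL. The endpoint address and the sample CQL query are made up for the example; the parameter names (operation, version, query, startRecord, maximumRecords) are the standard SRU request parameters. Nothing is sent over the network.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical SRU endpoint; substitute the base URL your catalog exposes.
SRU_BASE_URL = "http://catalog.example.org/sru"


def build_sru_url(cql_query, max_records=10, start_record=1):
    """Return a searchRetrieve URL for the given CQL query."""
    params = {
        "operation": "searchRetrieve",  # the SRU search operation
        "version": "1.1",               # protocol version being requested
        "query": cql_query,             # query expressed in CQL
        "startRecord": start_record,    # 1-based position of the first record
        "maximumRecords": max_records,  # page size
    }
    return SRU_BASE_URL + "?" + urlencode(params)


url = build_sru_url('dc.title = "library 2.0"', max_records=5)
print(url)

# Round-trip the URL to confirm the query survived URL encoding.
parsed = parse_qs(urlsplit(url).query)
print(parsed["query"][0])
```

Because the whole request is a plain URL, any overlay, script, or feed reader that can fetch a URL can query a catalog that supports SRU.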
Library Catalog Reclamation Approaches
- ILS Replacements: see http://liblime.com/
- Solutions for libraries of all sizes?
- Are these ready for primetime?
- Koha: a number of public implementations at small libraries
- and the Evergreen Project, a consortial open source ILS application for the State of Georgia Library System
- Overlays
- Add Software to improve search and display functionality
- Generally integrate holdings information after the point of discovery - this will need to be formally addressed for these systems to really take off
- Most feature *faceted* browsing
Some Current Overlay Projects
- http://www.plymouth.edu/library/ powered by http://about.scriblio.net/
- http://blacklight.betech.virginia.edu/ powered by Solr/Lucene
- http://aqua.queenslibrary.org/ powered by Aquabrowser
- http://uwashington.worldcat.org/ powered by http://worldcat.org/
- http://www.lib.ncsu.edu/catalog/ powered by Endeca
- Endeca in 250 Lines or Less demo http://catalog.spl.org/catalog/
Traditional ILS Model
- Slow Development Cycle
- Long wait for new features
- Web 2.0 Technologies enable semi-skilled technicians to rapidly add features and experiment
- We need to move to more open, standards-based systems
- Open systems generally work with Web 2.0 style technologies
- Standard 2.0 Demonstration - how long would you need to wait for a vendor to deliver something like this?
Solr/Lucene
- Examples taken from or based upon http://code4lib.org/node/139 - Erik Hatcher's Pre-Conference Workshop at Code4lib 2007
- Solr Home
- Open Source
- Java Based Information Retrieval Platform
- Solr is the configuration, results, and management layer
- Lucene is the index tool
- Lucene has been around for a while; Solr is new. Solr makes Lucene usable for people in this room
- Requirements: JDK 1.5 or later and a Java application server, e.g. Tomcat
- Comes bundled with the Jetty Java web application server to support easy experimentation
- Commit/Update using XML, receive search results
- Maybe open source is responsive: incremental commits/deletes and the ability to run multiple indexes at the same time weren't there before...
- Very fast; does one thing very well: SEARCH
- Enterprise level performance
- Some ILS vendors are starting to incorporate Solr and/or Lucene into commercial products
- A number of User Interfaces have been built, some we've already seen
- Most production services are currently for special collections
- Interface Layer is completely separate from the index/data layers
- Easy to have multiple interfaces for different user groups
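To make the separation between interface and index a little more concrete, here is a minimal sketch of the "receive search results" half: it builds a query URL and parses a response. It assumes a Solr instance under the bundled Jetty at localhost:8983 with the /solr/select handler. The sample response is hand-written in the general shape of Solr's XML output, and the field names (id, title, subject) are hypothetical. No request is actually made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumption: Solr running locally under the bundled Jetty.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_select_url(query, rows=10, start=0):
    """Return a Solr select URL for a query string such as 'title:whale'."""
    return SOLR_SELECT_URL + "?" + urlencode(
        {"q": query, "start": start, "rows": rows}
    )


# Hand-written sample response; field names are hypothetical.
SAMPLE_RESPONSE = """<response>
  <lst name="responseHeader">
    <int name="status">0</int>
  </lst>
  <result name="response" numFound="2" start="0">
    <doc>
      <str name="id">rec-001</str>
      <str name="title">Moby Dick</str>
      <arr name="subject"><str>Whaling</str><str>Sea stories</str></arr>
    </doc>
    <doc>
      <str name="id">rec-002</str>
      <str name="title">Typee</str>
    </doc>
  </result>
</response>"""


def parse_docs(xml_text):
    """Return (total hit count, list of documents as dicts)."""
    result = ET.fromstring(xml_text).find("result")
    docs = []
    for doc in result.findall("doc"):
        record = {}
        for field in doc:
            if field.tag == "arr":
                # Multi-valued fields arrive as an <arr> of child values.
                record[field.get("name")] = [child.text for child in field]
            else:
                record[field.get("name")] = field.text
        docs.append(record)
    return int(result.get("numFound")), docs


print(build_select_url("title:whale", rows=5))
total, docs = parse_docs(SAMPLE_RESPONSE)
print(total, [d["title"] for d in docs])
```

Since the interface only ever sees a URL going out and structured results coming back, a staff view and a public view can share one index and differ only in how they render the parsed documents.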
Solr/Lucene in the Catalog Environment
- The update layer is currently the most underdeveloped; this is what vendors are starting to sell: http://www.exlibrisgroup.com/primo.htm
- Data standards make this kind of integration possible
Important Solr Indexing Features
- Two config files
- schema.xml
- solrconfig.xml
- Ability to Combine Index Fields on the Fly
- Wildcard matching in index
- Strong support for dates
- Highly Customizable - Write your own custom field type, etc.
- Multi-valued Fields
- Copy Field Values to other Fields (Faceted Display in a number of the systems we've talked about is executed through this type of configuration)
- Run multiple indexes on the same Solr instance
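The multi-valued and copy-field items above might look roughly like the schema.xml entries generated below. This is only a sketch: the field names (subject, subject_facet) are invented for the example, and it assumes field types called "text" and "string" are defined elsewhere in the schema. The script only prints the fragments; it does not touch a Solr installation.

```python
import xml.etree.ElementTree as ET


def schema_fragment(source, dest):
    """Return schema.xml-style elements: two fields plus a copyField."""
    # Searchable, tokenized field that may hold several values per record.
    source_field = ET.Element("field", {
        "name": source, "type": "text",
        "indexed": "true", "stored": "true", "multiValued": "true",
    })
    # Untokenized copy, so whole headings can be counted as facet values.
    dest_field = ET.Element("field", {
        "name": dest, "type": "string",
        "indexed": "true", "stored": "false", "multiValued": "true",
    })
    # Tells Solr to copy every incoming source value into dest at index time.
    copy_rule = ET.Element("copyField", {"source": source, "dest": dest})
    return [source_field, dest_field, copy_rule]


elements = schema_fragment("subject", "subject_facet")
for element in elements:
    print(ET.tostring(element, encoding="unicode"))
```

The idea is that one incoming value feeds two fields: a tokenized one for keyword searching and an exact-string one whose distinct values can be listed as facets.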
Solr - Library Workflow
- Select Data You want to Index
- From Catalog
- Special Collections/Digital Projects
- Other Data Sources
- Process the Data
- Data needs to be processed into the Solr XML format
- Simple Key=>Value
- See http://wiki.apache.org/solr/UpdateXmlMessages
- Configure the Solr config files http://wiki.apache.org/solr/SchemaXml
- Index the data; this can be done from the command line using the post.jar file included in the Solr distribution
- Try out one of the Solr interfaces described at http://wiki.apache.org/solr/IntegratingSolr
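The "process the data" step above can be sketched as follows: simple key/value records are turned into a Solr add message, with list values repeated as multiple fields. The records and field names here are invented for illustration; real field names have to match whatever is declared in schema.xml.

```python
import xml.etree.ElementTree as ET


def records_to_add_xml(records):
    """Serialize a list of dicts as a Solr <add> update message."""
    add = ET.Element("add")
    for record in records:
        doc = ET.SubElement(add, "doc")
        for name, value in record.items():
            # A list value becomes one <field> element per entry.
            values = value if isinstance(value, list) else [value]
            for item in values:
                field = ET.SubElement(doc, "field", {"name": name})
                field.text = str(item)
    return ET.tostring(add, encoding="unicode")


# Hypothetical records, e.g. extracted from a catalog export.
records = [
    {"id": "rec-001", "title": "Cats & Dogs",
     "subject": ["Pets", "Animals"]},
    {"id": "rec-002", "title": "Library 2.0"},
]

xml_text = records_to_add_xml(records)
print(xml_text)
```

The resulting text can be saved to a file and sent to Solr with the post.jar tool mentioned above. Building the message with an XML library rather than string concatenation means characters such as the ampersand in the first title are escaped automatically.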
Testing Solr
- Download the software from the Solr website http://lucene.apache.org/solr/
- Follow the instructions at http://lucene.apache.org/solr/tutorial.html
- Potential pitfall: make sure you have a full JDK installed, not just the JRE; Solr needs a full JDK to run properly
Solr Library Projects
- http://wiki.apache.org/solr/Solr4Lib
- http://code.google.com/p/fac-back-opac/
- Smithsonian Cross Search http://siris-collections.si.edu/search/
- Columbia Special Collections http://www.columbia.edu/cu/lweb/archival/
- http://www.vufind.org/
- 19th Century Scholarship - Collex http://www.nines.org/collex
Web Services
- Catalog Web Service Example: holdings, item control, etc. http://www.slideshare.net/eby/free-the-data-creating-a-web-services-interface-to-the-online-catalog/
- Amazon Web Services: book covers, reviews
- OCLC xISBN: other editions, etc.
Open Source Software for Libraries
- http://code4lib.org/
- http://code4lib.org/planet
- http://code4lib.org/2007/schedule
- Code4lib Conference in Feb. in Portland, Ore http://code4lib.org/conference/2008/
- Access http://access2007.uvic.ca/

