Below is my report from the Eastern CONTENTdm Users Group Conference. The presentations and discussions turned out to be very worthwhile for our situation here at FDU.
CONTENTdm Users Conference
Towson University, Towson, MD
August 2-3, 2011
Using CONTENTdm as a Tool for Crowdsourcing Museum Artifact Descriptions
This session turned out to be incredibly useful for a number of reasons. The presenters were from the National Institute of Standards and Technology (NIST), which has a large number of scientific artifacts that it cannot identify. They decided to use “crowdsourcing” to identify these objects. The NIST collection can be found at: http://nistdigitalarchives.contentdm.oclc.org
They listed six ways that crowdsourcing has been used by institutions, and gave examples:
Transcription (getting users to transcribe handwritten materials)
Contextualization (getting users to describe or identify objects)
Examples: Google Maps matchups with Yelp, Flickr Commons
Collecting (getting users to add their own material to a collection)
Classification (getting users to add tags and subjects to materials)
Example: Beyond Brown Paper
Co-curation (getting users to decide what goes into an exhibit)
Example: the Brooklyn Museum’s Click! exhibition
Fundraising (getting users to make financial contributions to your project)
NIST reached out to alumni to get the project started. They posted the artifacts in a CONTENTdm collection and added text to the page asking for help identifying the objects, along with a contact e-mail address. Their public relations department included the project in its newsletter, and the story was picked up by Wired.com. After that, the project had worldwide coverage and input, with over 22,000 pieces of input from commenters. Many comments came by e-mail, but the project coordinator also searched the Web for articles about the project and read the comments posted on those articles.
On a technical note, she was able to show multiple views of one object by creating a compound object for each item, and she added a “crowdsource” metadata field that allowed them to tag objects as needing more information. They used Webalizer for statistics, which apparently now comes with CONTENTdm.
This presentation was of great use to me because (1) I am working on cataloging artifacts at the Heritage Center; (2) besides artifacts, we have a large number of photos that need identifying; and (3) in the absence of other funding, a platform like Kickstarter might be a way to solicit donations, if the university approves using that method. There is great potential for a collaboration with the Public Relations Office, the Alumni Office, and University Advancement to get some of our currently unfunded projects off the ground.
Electronic Theses and Dissertations Without a Programmer: Manipulating ProQuest XML Metadata into CONTENTdm
This presentation was given by the University of Maryland, Baltimore County (UMBC) library. Since 2007, UMBC has received PDF copies of its theses and dissertations, each with a separate XML metadata file. When they had a programmer on staff, they used a Perl script to harvest the metadata into spreadsheets. After the programmer left and the script broke, they had to use a more manual process to prepare the data for import into CONTENTdm.
The first thing they did was delete the CVs (the first two pages) from the ProQuest copies, as they didn’t want private information in the archive. Then they imported the XML into Excel and used the Developer tab’s XML import feature to load it into their specially created CONTENTdm metadata worksheet. They explained how they used macros to make repetitive changes, and once all the data was correct, they used the “add multiple items” feature in CONTENTdm to import the metadata and pair it up with the documents. They shared the specifics of their process.
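Their pipeline essentially flattens each XML metadata file into one spreadsheet row. A minimal sketch of that kind of transformation in Python follows; note that the DISS_* element names only approximate ProQuest’s schema, and the three-field mapping is illustrative rather than their actual worksheet layout:

```python
# Sketch: flatten a ProQuest-style ETD metadata XML file into a
# tab-delimited row for a CONTENTdm metadata spreadsheet.
# Element names and field mapping are illustrative assumptions.
import csv
import io
import xml.etree.ElementTree as ET

SAMPLE_XML = """<DISS_submission>
  <DISS_description>
    <DISS_title>A Study of Metadata Workflows</DISS_title>
    <DISS_dates><DISS_comp_date>2011</DISS_comp_date></DISS_dates>
  </DISS_description>
  <DISS_authorship>
    <DISS_name>
      <DISS_surname>Smith</DISS_surname>
      <DISS_fname>Jane</DISS_fname>
    </DISS_name>
  </DISS_authorship>
</DISS_submission>"""

def xml_to_row(xml_text):
    """Pull a few fields out of one ETD metadata file into a dict."""
    root = ET.fromstring(xml_text)
    title = root.findtext(".//DISS_title", default="")
    surname = root.findtext(".//DISS_surname", default="")
    fname = root.findtext(".//DISS_fname", default="")
    date = root.findtext(".//DISS_comp_date", default="")
    return {"Title": title, "Creator": f"{surname}, {fname}", "Date": date}

def rows_to_tsv(rows):
    """Write rows as tab-delimited text, the general shape a
    spreadsheet-based bulk import expects."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["Title", "Creator", "Date"],
                            delimiter="\t")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

print(rows_to_tsv([xml_to_row(SAMPLE_XML)]))
```

In practice one would loop over a directory of XML files and append a row per dissertation; the point is simply that the Perl script they lost was doing a mechanical XML-to-columns mapping that can be rebuilt with modest effort.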
This was an interesting session on using Excel and its features to simplify the creation and importing of CONTENTdm metadata. After the session, a few of us wondered why adding the dissertations to CONTENTdm would be necessary when their campus already had access to ProQuest UMI. They mentioned embargo and copyright issues, which made it even stranger to us that they bothered to add them. However, I did learn something else: CONTENTdm collections CAN be restricted to internal use only if there is a need to do that.
New York Heritage and Consortial Collaboration
This session was about New York’s city and state library councils merging their CONTENTdm collections onto one server. Each had its own pages with its own identity; three libraries in the consortium preferred to keep their separate identities, while the other nine councils opted to share one server. This eliminated federated-searching issues for those nine sites. The problem with federated searching across multiple sites is that once an item is selected from a search, it opens in that particular council’s CONTENTdm instance, which then won’t let you return to your federated search results. This is something they are working on for sites with multiple servers. While interesting, the issue is not specific to FDU.
Beyond the Library: Managing Metadata for a University Publications Portal
This presentation was made by a director at the National Defense University, who deals with several different libraries and research centers as well as four different university presses. The portal she showed us is still a work in progress. Since she is also merging different kinds of description into one CONTENTdm system, she worked with the other sites to develop a core spreadsheet of required metadata, which she shared with the group. The spreadsheet is very useful and can easily be tweaked into a standard template for our own metadata.
CONTENTdm Software Update
Geri Ingram from OCLC gave the highlights of the new version of CONTENTdm. OCLC recently rolled out version 6 (which we have at FDU), and her update covered what is coming this fall in version 6.1. FDU will get the upgrade automatically, as will all OCLC-hosted sites.
The features of 6.1 include:
• Ability to have social tagging and comments (these will be in a separate SQL database, not with the metadata)
• Ability to download and print photos (no “.exe” files showing up when you right click on the photos)
• The return of the “My Favorites” feature
• Newspaper article viewer, and plug-in for PowerPoint presentations
• New features in admin for customizing the home page
• If you use METS/ALTO metadata (we currently don’t), you can have article-level metadata for archived newspapers and magazines
• Image rights can be more easily edited
• New FIND search engine included (the one used by OCLC WorldCat)
In addition to the new features listed here, I learned about several other current capabilities:
• FlexLoader is an application that allows us to map and import XML metadata ourselves (without retyping)
• CONTENTdm has a Flowplayer video player that allows you to use video formats. I also learned that you can create compound objects that combine items such as photos and videos.
• All 55 Dublin Core elements are now supported.
• 6 of the 10 authority files used in CONTENTdm have been updated and refreshed
• New web configuration tools have been included
• The Project Client now has EXIF/IPTC metadata extraction
• There is a “Catcher” service that allows you to batch-edit metadata for single items (not compound objects, like the Twombly ledgers).
• Content dm training is always available for free at http://training.oclc.org
• We can advertise our collections at the User Support Center.
• The old CONTENTdm listserv was replaced by forums on the User Support site.
• New APIs and customization documentation are available; however, OCLC advises trying the new configuration tools before using these.
Geri also announced something called the Digital Collections Gateway, which allows you to upload metadata and also to harvest metadata from OCLC for your collections. Those of us with hosted CONTENTdm just have to change our server settings to “WorldCat sync” to take advantage of this feature.