ArchiveGrid includes over 7 million records describing archival materials, bringing together information about
historical documents, personal papers, family histories, and more. With over 1,400 archival institutions represented,
ArchiveGrid helps researchers looking for primary source materials held in archives,
libraries, museums and historical societies.
If you'd like to see your collections included
in ArchiveGrid or have questions about the ArchiveGrid project,
please get in touch with us.
Frequently Asked Questions
How do we get our collections included in ArchiveGrid?
One of the best ways to be represented is to include your collection descriptions in OCLC's WorldCat database.
ArchiveGrid is largely made up of MARC records from WorldCat; if you are already contributing your MARC cataloging to
WorldCat and don't see your records in ArchiveGrid, please
let us know.
Otherwise, if you have finding aids in EAD, HTML or PDF format and aren't currently contributing MARC records to WorldCat,
please complete the form to
include your collections. Your role in getting your finding aids into ArchiveGrid is simply to give us permission to harvest and use them.
OCLC membership is not required to contribute your finding aids to ArchiveGrid, and there are no costs involved, beyond the time and effort on your part to provide us with a way to harvest your finding aid documents.
We harvest your finding aids from a webpage you provide to us, and index them in ArchiveGrid. Any changes you make, such as adding, editing, or removing finding aids,
will be reflected in ArchiveGrid as your contributed data is updated.
Finding aids and MARC records are updated as part of ArchiveGrid's index maintenance, though this work is not currently carried out on a regular schedule.
If you find that your contributed data needs to be updated in ArchiveGrid, please let us know.
What's the connection between ArchiveGrid and OCLC Research?
As a discovery system focused on archival materials, and as a means of making these collections easier to find in search engines and elsewhere,
ArchiveGrid illustrates
OCLC Research's interest in advancing
issues of importance to the archival community. It also serves as a basis for text mining and data
analysis projects, and for experimentation and evaluations of discovery system features, carried out by OCLC Research staff.
Where do you get your collection descriptions?
We index finding aids harvested directly from contributors, and we also include MARC bibliographic records from WorldCat that we have identified as archival material.
WorldCat records describing archival materials constitute more than 90% of the collection descriptions in ArchiveGrid.
How do you select WorldCat records for inclusion in ArchiveGrid?
There isn't a simple way to identify a MARC record that describes the types of materials held in archives, manuscript collections,
and special collections.
In order for a WorldCat record to be extracted into ArchiveGrid, it needs to meet the following criteria:
- Has only one library holding symbol attached (though we relax this rule for NUCMC records)
- Has a value of f, g, i, j, k, p, r, or t in MARC Leader byte 6, or the value "a" (language material) in Leader byte 6 and the value "c" (collection) in Leader byte 7, or the value "a" (archival) in Leader byte 8
- Has no value of any kind in MARC 260 or 264 subfield "a" or "b" (to filter out published works), though 264 fields with a second indicator of 0 (indicating Production, not Publication) are accepted.
- Does not have "Bibliography" in the beginning string of a MARC subject heading subfield "a" or "v"
- Does not have a MARC 502 field (Theses or dissertation note)
- Does not have a MARC 015 field (National bibliography number)
- If the record has the material type "book" or "serial", has no value in the MARC 008 or 006 "Nature of Contents" bytes (to eliminate theses, reference works, and other non-archival materials)
This filter isn't always successful. Especially for minimally cataloged materials, we sometimes see descriptions of unpublished manuscripts of various kinds filter through. But we continue to evaluate and improve the filter as best we can.
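The criteria listed above can be read as a sequence of checks. What follows is only a minimal sketch in Python, not the code ArchiveGrid actually runs. The `Field` and `Record` classes, the `is_nucmc` flag, and the pre-extracted `material_type` and `nature_of_contents` values are all hypothetical stand-ins for whatever a real MARC parser would supply. The sketch also assumes that "subject heading" means any 6XX field and that blanks or fill characters count as "no value".

```python
from dataclasses import dataclass, field

@dataclass
class Field:
    """Hypothetical, simplified MARC field: tag, second indicator, subfields."""
    tag: str
    ind2: str = " "
    subfields: list = field(default_factory=list)  # list of (code, value) pairs

@dataclass
class Record:
    """Hypothetical, simplified WorldCat record used only for this sketch."""
    leader: str
    fields: list
    holdings: list                 # library holding symbols attached to the record
    material_type: str = ""        # assumed to be derived elsewhere, e.g. "book"
    nature_of_contents: str = ""   # assumed to be pre-extracted from 008/006
    is_nucmc: bool = False

ARCHIVAL_RECORD_TYPES = set("fgijkprt")

def leader_matches(leader):
    """Leader byte 6 in the listed set, or 'a' + collection, or archival control."""
    rec_type, bib_level, control = leader[6], leader[7], leader[8]
    return (rec_type in ARCHIVAL_RECORD_TYPES
            or (rec_type == "a" and bib_level == "c")
            or control == "a")

def has_publication_data(fields):
    """True if 260, or 264 other than Production (ind2 = 0), has $a or $b."""
    for fld in fields:
        if fld.tag == "260" or (fld.tag == "264" and fld.ind2 != "0"):
            if any(code in ("a", "b") and value.strip()
                   for code, value in fld.subfields):
                return True
    return False

def has_bibliography_heading(fields):
    """True if a subject (assumed 6XX) $a or $v starts with 'Bibliography'."""
    for fld in fields:
        if fld.tag.startswith("6"):
            for code, value in fld.subfields:
                if code in ("a", "v") and value.lstrip().startswith("Bibliography"):
                    return True
    return False

def is_archival(rec):
    """Apply every criterion in turn; any failure rejects the record."""
    tags = {fld.tag for fld in rec.fields}
    if len(rec.holdings) != 1 and not rec.is_nucmc:
        return False
    if not leader_matches(rec.leader):
        return False
    if has_publication_data(rec.fields):
        return False
    if has_bibliography_heading(rec.fields):
        return False
    if tags & {"502", "015"}:
        return False
    if (rec.material_type in ("book", "serial")
            and rec.nature_of_contents.strip(" |")):
        return False
    return True

# Example: a mixed-materials collection record held by a single library.
papers = Record(
    leader="00000npc a2200000 a 4500",
    fields=[Field("245", subfields=[("a", "Jane Doe papers")])],
    holdings=["XYZ"],
)
print(is_archival(papers))
```

Writing each criterion as its own early return keeps the rules independent, so one check can be loosened or tightened without touching the rest.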
Our finding aids are only available as HTML or PDF files. Will those formats work well in ArchiveGrid?
With MARC records and EAD XML finding aids, we can take advantage of the document markup to identify "facets" of information in the description, including the
names of people, groups, places, topics, and events. HTML pages and PDF files do not provide us that same level of detailed markup for their descriptive data, but there are ways to optimize
those formats for searching and display in ArchiveGrid. For HTML files, it's important to include a specific title for the finding aid in the HTML Head Title element.
PDF files include internal document properties for the title, author, subject and other fields. If you supply values in these fields, the ArchiveGrid system
can use those values for the document title and for an abstract in brief search results.
Elizabeth Post and her colleagues at Boston College Libraries have put together an excellent summary of their experiences in enriching PDF documents, for improved
interoperability not just for ArchiveGrid but for any search engine that indexes their finding aids. Read more about it in their paper
"Embedding metadata in PDF finding aids to enhance discoverability".
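For the HTML case described above, a contributor might want to check that each finding aid page carries a specific title in its HTML Head Title element before offering it for harvesting. The sketch below uses only Python's standard `html.parser` module. The function names and the `GENERIC_TITLES` list are hypothetical, chosen for illustration, and say nothing about how ArchiveGrid itself reads titles.

```python
from html.parser import HTMLParser

class TitleParser(HTMLParser):
    """Collects the text of the first <title> element in a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._parts = []
        self.title = None  # stays None when the page has no <title>

    def handle_starttag(self, tag, attrs):
        if tag == "title" and self.title is None:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            # Collapse line breaks and runs of spaces into single spaces.
            self.title = " ".join("".join(self._parts).split())

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

# Hypothetical examples of titles too generic to identify one collection.
GENERIC_TITLES = {"", "finding aid", "untitled", "untitled document", "home"}

def extract_title(html):
    """Return the page title as a string, or None if there is none."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title

def has_specific_title(html):
    """True when the page has a title that is not in the generic list."""
    title = extract_title(html)
    return title is not None and title.lower() not in GENERIC_TITLES

page = """<html><head><title>Jane Doe papers,
  1900-1950</title></head><body><h1>Finding aid</h1></body></html>"""
print(extract_title(page))
print(has_specific_title(page))
```

A check like this could be run over a folder of exported HTML finding aids to flag pages whose titles would make poor search results.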
Why does the same collection sometimes have two different entries in the ArchiveGrid index?
Duplicate entries result from harvesting finding aids and extracting WorldCat MARC records for the same collection.
While we continue to work on a way to effectively cluster or de-duplicate these two forms of the collection description from the same contributor,
we have hesitated to favor one over the other, as each includes access points not found in the other.
We are trying to maximize access and discovery, so in this situation we've decided to favor recall over precision.
What about copyright?
OCLC does not claim copyright ownership of individual collection descriptions contributed to ArchiveGrid.
More information on rights and responsibilities and a legal statement are available on the web page where
you can
request to include your collections.
How many institutions contribute?
Collection descriptions from around 1,500 institutions - libraries, museums, historical societies, etc. - are included in ArchiveGrid.
Why is ArchiveGrid a free service?
We transitioned from a subscription-based service to a freely available system in 2012, in order to make the index
available to everyone - faculty, scholars, family history researchers, students, and others.
ArchiveGrid remains an OCLC Research project, allowing us to improve archives and special collections research through studies of researchers, description, and discovery.
What do you know about your users?
When ArchiveGrid started in the late 1990s as a way to test if EAD finding aids from different sources could work well together in one search system,
we mostly had faculty, college students, and genealogists in mind as our researchers. We later gathered data about these user groups from studies,
and we continued to make system design improvements based on their needs.
In a study conducted in 2012, we surveyed archives and special collections users and learned that primary source research is still an important focus for
faculty, students and genealogists, but it also plays a role for people motivated to search for archival materials by something in their personal and professional lives.
This important segment of archives and special collections users includes filmmakers, writers, designers, hobbyists, and many more.
What are your site visit statistics like?
We use Google Analytics to track site visits and
show selected current statistics.
We recognize that discovery frequently begins with search engines, so we put special effort into promoting ArchiveGrid's collection descriptions in Google, Yahoo, Bing, and other
popular services.
Does ArchiveGrid make use of cookies?
Yes, it does. We create temporary "functional" cookies that last only for the length of your session, to help remember your preferred location on the map shown on the ArchiveGrid home page.
ArchiveGrid also uses Google Analytics to evaluate how the system is behaving, and temporary performance tracking cookies are set for that purpose.
We're not using cookies to personally identify users or for behavioral advertising purposes. You're not required to have cookies enabled in your browser to use ArchiveGrid.
For more information, consult the
OCLC Cookie Policy document describing the use of cookies in OCLC services.
Statistics
ArchiveGrid Index Growth
Interpreting These Statistics
The growth of MARC records in 2014 represents identification and inclusion of additional contributors of data to WorldCat, and the ongoing
growth of WorldCat in general. While there has also been a concerted effort to increase the number of finding aid contributors, that form of
archival description is a relatively small percentage of the entire aggregation. And some MARC records may be removed from the current set, as we improve our
selection algorithms to identify archival collection descriptions.
ArchiveGrid Monthly Sessions
| Count | Type |
| --- | --- |
| 4,034,989 | Sessions |
| 3,317,395 | Users |
| 11,480,193 | Pageviews |
| 59% | Visits referred by search engines |
| 20% | Visits via links from other websites |
| 20% | Visits via direct links |
| 1% | Visits via other starting points |
ArchiveGrid Contributors
By Country
By State in the Lower 48 United States
Interpreting These Statistics
ArchiveGrid contributing institutions are somewhat difficult to precisely count. In some cases, what might be considered as one institution may be represented in ArchiveGrid by several
different contributors, one for each archive or special collections department affiliated with the larger institution. But in other instances, one contributing institution
provides descriptions for a great many more individual archives (for example, the NUCMC records contributed by the Library of Congress). Until we can more accurately
identify and count these contributors, the current count of contributors will be much lower than the actual number of institutions represented.