Date: Mon, 15 Sep 97 13:30:19 PDT
From: "Katharine Martinez"
To: jtrant@archimuse.com
Subject: REACH data export instructions and element set

Dear Jennifer,

Here are the export instructions for REACH participants. I hope they are helpful to the process of launching AMICO.

Regards,
Katharine

* * * * * * * * * * * * * * * * * * *

[13013] THU 07/17/97 16:46 FROM BL.KCM "Katharine Martinez": REACH Project data export instructions; 533 LINES (FILED)
Keywords: k reach

TO: REACH Project participants
    Nancy Allen, Museum of Fine Arts, Boston
    Rachel Allen, National Museum of American Art
    Patricia Barnett, The Frick Collection
    Lynn Black, National Park Service
    Gwen Bitz, Walker Art Center
    Steve Dietz, Walker Art Center
    Michael Fox, Minnesota Historical Society
    Ann Hitchcock, National Park Service
    Leslie Johnston, Stanford Art Museum
    Deirdre Lawrence, Brooklyn Museum
    Elizabeth O'Keefe, Pierpont Morgan Library
    Richard Rinehart, Berkeley Art Museum
    Rodi York, Mystic Seaport
    David Edwards, Re:discovery Software
    Marcia Finkelstein, Gallery Systems
    Jay Hoffman, Gallery Systems
    Lenore Sarasan, Willoughby Associates
    Ilene Slavick, Cuadra Associates

FROM: Katharine Martinez, RLG

Thank you for agreeing to participate in the REACH Project. This is a long message, in two parts. The first part contains text about REACH that will soon appear at the RLG website. The second part contains the instructions regarding the format of the data to be exported to RLG, together with the element set for the project.

I will be serving as project manager. A listserv has been set up for the participants, the RLG project team, and Getty staff with the following address:

    REACHPART@LISTS.RLG.ORG

If you wish to communicate with other participants and the RLG project team, please send your message to the above address. The REACH-L listserv is still in existence, but please remember that it includes names of individuals who are not participants.
Please contact me if you have any questions about the project, the instructions, or the element set -- particularly if you have concerns about the required elements.

Timeframe: In earlier communications to REACH-L we proposed a timeline that is now out of date and should be discarded. After you have had time to study this message and think about workflow at your end, I will pose a question to the listserv proposing a timeframe for the project.

Question: Can you let me know *as soon as possible* if you are using any standard vocabularies, especially for subject terms, geographic terms, nationality terms, and language terms?

cc: RLG Project Team: Altimus, Carles, Chapman, Cromwell-Kessler, Jones, Martinez
    Getty staff: Busch, Baca

*********************************************************

PART 1: BACKGROUND, PURPOSE, AND GOALS OF THE REACH PROJECT

The REACH Project is an effort to create a testbed database of museum object records. The goal is to test re-purposing of collection management system data as public access tools. The project will involve the export of existing machine-readable data from heterogeneous museum collection management systems and analysis of the research value of the resulting database when researchers use a single interface to search the testbed database in conjunction with RLG's other resources, including bibliographic and archival records in RLIN, auction catalog records in the SCIPIO database, finding aids, plus abstracting and indexing tools such as the Bibliography of the History of Art, and Anthropological Literature.

The REACH testbed database will comprise at least 10,000 records from art and cultural heritage institutions. The focus of the project is on core data that identifies museum objects, including a broad range of both art objects and material culture artifacts. Some participants may additionally contribute digital images with the data.
RLG hopes that its focus on data-related issues in the REACH Project will be applicable to other projects in the museum community that address digitizing museum collections, including the Association of Art Museum Directors' AMICO Project and the American Association of Museums' Museum Licensing Collective.

The REACH Project originated as a follow-on to the Getty Information Institute's Museum Educational Site Licensing (MESL) Project. The MESL Project tested academic pedagogic use of digitized images and accompanying data from museum collections. For the REACH Project RLG has chosen to focus on data-related issues that were highlighted in the outcome of the MESL Project: the challenges involved when pooling data from heterogeneous museum collection management systems, and the need to continue analysis of the value and use of museum information, particularly when a single search engine is available and when the data can be used within a networked environment of related resources.

RLG is a member of the Consortium for the Interchange of Museum Information (CIMI), whose projects, such as CHIO, are premised on the model of a distributed database environment. The REACH Project contributes to CIMI in that it explores issues of data mapping between heterogeneous platforms. As the museum community increasingly pursues collaborative projects, sharing data and interoperability become greater challenges, especially for smaller institutions that may not have the staff and infrastructure to maintain clients and servers and so to participate fully in the arena of international distributed databases.

RLG recognizes that there is no single standard for museum object records comparable to the MARC format for library materials, and that for the REACH Project several existing museum standards needed to be taken into account to identify common data elements.
The REACH elements are based on the elements in the following museum and cultural heritage standards:

* Categories for the Description of Works of Art (CDWA)
* Visual Resources Association's Core Categories
* Dublin Core
* Canadian Heritage Information Network (CHIN) Data Dictionary
* MESL Data Dictionary
* Consortium for the Interchange of Museum Information (CIMI) Access Points
* Museum Documentation Association Standards
* The International Committee for Documentation of the International Council of Museums (ICOM/CIDOC) Information Categories

RLG recognizes that museums have limited staff time available for outside projects. Participants in the REACH Project are not expected to create new records or alter existing records in order to participate in the project; in fact, an important aim of the project is to test whether existing collection management records can be re-purposed.

There are several questions that will be explored during the REACH Project, associated with the current and likely future use of museum data by researchers. Chief among them is the question whether museum object records are useful without digital images. Art historians have traditionally relied heavily on catalogues raisonnés and inventory projects where visual surrogates may not accompany every object record, such as the Inventories of American Painting and Sculpture of the National Museum of American Art, and the Provenance Index of the Getty Information Institute. Are research practices and scholars' attitudes similar in the case of material culture object records?

A second set of questions arises when considering the value and use of individual object records vs. collection records. These questions suggest that the information placed in the Notes field of a record could be enormously valuable.

The REACH Project will facilitate analysis of search results when certain naming standards may not be universally applied.
What kinds of search results will be available if participants do not use standard vocabularies for geographic names, subject terms, and personal names, for example? Can a search engine compensate for the lack of uniformity?

REACH was initiated by RLG and Getty Information Institute staff in discussions with officers of the Museum Computer Network, the MESL participants, vendors of museum collection management systems, and museum data experts. The project was launched at a meeting held to coincide with the "Museums and the Web" conference in Los Angeles in April 1997. A listserv was established after that meeting, other interested museums were brought into the conversation, an element set was proposed and discussed, and an invitation to participate in the project was subsequently distributed.

Recently the project manager outlined the project and solicited comments from members and officers of the Art Libraries Society of North America, the Art Libraries Society of the U.K. and Ireland, the Visual Resources Association, and the Museum Documentation Association. Within RLG, the Art and Architecture Group will be participating in the evaluation phase of the project.
As of July 1997 participants in the REACH Project include

* Berkeley Art Museum
* Brooklyn Museum
* The Frick Collection
* Minnesota Historical Society
* Museum of Fine Arts, Boston
* Mystic Seaport
* National Museum of American Art
* National Park Service
* Pierpont Morgan Library
* Stanford Art Museum
* Walker Art Center

plus the following collection management system vendors:

* Cuadra Associates
* Gallery Systems
* Re:discovery
* Willoughby Associates

The evaluation phase will draw on the expertise of many individuals, including staff and members of

* Canadian Heritage Information Network
* Chicago Historical Society
* CIMI
* Getty Information Institute
* MESL participants
* Museum Computer Network
* Museum Documentation Association
* Smithsonian Institution
* RLG's Art & Architecture Group
* Visual Resources Association Data Standards Committee

*********************************************************

PART 2: INSTRUCTIONS REGARDING FORMAT OF DATA EXPORTED TO RLG FOR THE REACH PROJECT

Museums participating in the REACH Project by contributing records to the testbed database will provide RLG with structured data (object records) in the following way:

1) Data will be transmitted to RLG via FTP. When you are ready to transmit your first file of data, please contact Dave Grolle (BA.DFG@RLG.ORG or (415) 691-2248) for instructions.

2) The data file must be in ASCII text format.

3) The character set must be ISO 8859-1 (Latin 1).

4) The record should conform to the REACH element set, including the order of the fields (see below).

5) The record delimiter (|) should occur at the end of a record.

6) The field delimiter (}~) should occur at the end of a field.

7) The repeatable field delimiter (;=) should occur after the first, and each subsequent, occurrence of a particular field, except the last repeat (which will be followed by the field delimiter). The repeatable field delimiter is necessary because there is a fixed number of fields allowed in a record, so repeats of a field must be contained within a single field (i.e., a single field contains the first, and all other, occurrences of that field). A repeatable field delimiter will only be used if a particular field is defined as repeatable.

8) If a field is not required, and if you have no data for that particular field, a blank space should appear, followed by the field delimiter.

************************************************

REACH Project Element Set

PLEASE NOTE: This element set is *not* put forward as a standard or as an alternative to any other element set: Dublin Core, CIDOC, Spectra, CDWA, CHIN, Common Agenda, VRA, CIMI, or MESL. This element set represents RLG's best effort to 1) identify fields common to all the above element sets, and 2) identify fields appropriate for this project, i.e., not fields for proprietary information, such as the value of an object.

Field #1: Type of Object
Type: Repeatable
Required? Yes
Definition: The classification of the object by type.
Preferred Use: This field is for the term(s) that indicate the classification of the object. For material culture collections, this will tend to be the object name (for example, chair, canoe, etc.); fine art institutions should use this field to specify object genre or format (for example, painting, engraving, etc.).

Field #2: Date of Creation/Date Range
Type: Repeatable
Required? No
Definition: The year in which the object was created; if the specific year is not known, or if the object was executed over several years, give a date range.
Preferred Use: Dates should be in the format YYYY, or YYYY-YYYY to indicate a date range. Where day and month are available, format should be MM/DD/YY. Where the date represents a BC date, enter as a negative integer, if possible.
Note that this field is defined for searching purposes only: no attribution or qualifying information such as circa should be recorded here. Where uncertainty exists, information can be given in the "Notes" field, or apply the following models: use 19xx for a twentieth century object or 187x for an object dated in the 1870s.

Field #3: Place of Origin/Discovery
Type: Repeatable
Required? No
Definition: The geographical location in which an object was created.
Preferred Use: This field is for the name of the place where the object was created. Creation place may be a landmass/continent, country, region or city. Levels of hierarchy may be placed in repeating fields (if possible) or incorporated in text (when not stored separately in source database). Separate multiple places with a semicolon followed by an equal sign (;=).

Field #4: Object Name/Title
Type: Repeatable
Required? Yes
Definition: The name or title given to the object by the creator/maker, curator, or owner, or the text of a caption that appears with the image as in prints, cartoons, and photographs.
Preferred Use: The field for a title or name of the object. Descriptive titles or names based on classification terms or object type should be provided for objects that do not have formal titles.

Field #5: Techniques/Process
Type: Repeatable
Required? No
Definition: A term describing how the object was created.
Preferred Use: This field is for the term(s) that describe how the object was created. Terms used here should preferably be in the AAT. Separate multiple terms with a semicolon followed by an equal sign (;=).

Field #6: Medium/Materials
Type: Repeatable
Required? No
Definition: The substance(s) of which the object is made.
Preferred Use: This field is for the term(s) that describe the media or material of which the object is made. Terms used here should preferably be in the AAT. Separate multiple terms with a semicolon followed by an equal sign (;=).

Field #7: Dimensions
Type: Repeatable
Required? No
Definition: Measurements associated with any particular dimension of the object.
Preferred Use: This field is for object measurements, preferably in metric or U.S. units. The structure of this field is measurement extent (e.g., height, width, depth, etc.), number, and unit of measure without internal punctuation. Use a semicolon followed by an equal sign (;=) to separate multiple measurements.

Field #8: Subject Matter
Type: Repeatable
Required? No
Definition: The content or subject matter of the object.
Preferred Use: This field is for the word or string of words that describes the subject content of the object. Use a semicolon followed by an equal sign (;=) as the break character between multiple terms.

Field #9: Style/Period/Group/Movement/School
Type: Repeatable
Required? No
Definition: A term identifying a style or period in the history of art.
Preferred Use: This field is for the term(s) identifying a style or period whose characteristics are represented by the object. These terms should preferably be in the AAT, except where the AAT is too centered on Western art. Use a semicolon followed by an equal sign (;=) as the break character between multiple terms.

Field #10: Creator/Maker
Type: Repeatable
Required? Yes
Definition: The name of a person or corporate entity responsible for the design or creation of the object. Where an individual artist is unknown, this field should contain a designation by school and period or the name of the culture group responsible for the creation of the work. The name should represent the attribution currently accepted by the holding institution. Birth and death dates, if known, should go in this field, after the name.
Preferred Use: This field is for the creator/maker name, preferably in inverted order (surname, first name(s)). Corporate names should be given as the full legal name. For multiple artists, enter their names separated by a semicolon followed by an equal sign (;=).
The birth and death dates should preferably be in the format YYYY-YYYY.

Field #11: Nationality/Culture of Creator/Maker
Type: Repeatable
Required? No
Definition: The name of the culture group responsible for creation of a work that is not attributed to an individual, or the nationality of the individual creator/maker.
Preferred Use: The person's nationality should preferably be expressed as the adjectival form of an existing nation or historic geographic entity. Multiple nationalities for multiple artists should be order-keyed to the creator/maker name field and separated by a semicolon followed by an equal sign (;=).

Field #12: Current Owner
Type: Non-Repeatable
Required? Yes
Definition: The name of the current owner of the object.
Preferred Use: The full name of the owner is preferred.

Field #13: Current Repository Name
Type: Non-Repeatable
Required? Yes
Definition: The full name of the current repository of the object.
Preferred Use: The full name of the current repository of the object is preferred.

Field #14: Current Repository Place
Type: Non-Repeatable
Required? Yes
Definition: The location of the current repository of the object.
Preferred Use: Data should preferably be in the following order: city or place, followed by country.

Field #15: Current Object ID Number
Type: Non-Repeatable
Required? Yes
Definition: The inventory number currently assigned to the object by the current repository.
Preferred Use: This field is for the object's accession number, ID number, current inventory number, or any unique identifying number as assigned by the current repository. Inventory numbers or other identifiers that may have been assigned to the object by former owners should be reported in the Notes field.

Field #16: Provenance
Type: Repeatable
Required? No
Definition: The name of a previous owner of the object.
Preferred Use: Enter the name of a person, institution, or organization that formerly owned the object.

Field #17: Language
Type: Non-Repeatable
Required? No
Definition: The language in which the data is recorded.
Preferred Use: If there is language or text associated with the object, this field is where that language should be indicated.

Field #18: Electronic Location & Access
Type: Repeatable
Required? No
Definition: The URL linking the object record to a digital image of the object or the filename for that digital image.
Preferred Use: Give full URL or unique file name.

Field #19: Related Objects
Type: Repeatable
Required? No
Definition: Object(s) related to the object. For example, when the object is part of a collection, set, suite, or ensemble, or is a panel that is part of an altarpiece.
Preferred Use: This field is for information identifying the record(s) of the related object(s). When applicable, include in this field the object ID number from the record of the related object.

Field #20: Notes
Type: Repeatable
Required? No
Definition: Textual description of object; object history: associated people, organizations, places, and events in the object's history; distinguishing features; inscriptions/marks; condition; edition/state. Any descriptive text, remarks and comments documenting the object or commenting on it from an interpretive/curatorial perspective.
Preferred Use: This field is for any text or comments describing the object from an interpretive/curatorial perspective. This could be the text of a wall label, a full entry from a published catalog, or a multiple page essay. Please consider adding to this field any words or descriptions that would be useful for retrieval.

------------------------------------------
Katharine Martinez
Research Libraries Group, Inc.
1200 Villa Street
Mountain View, CA 94041-1100
Voice: (415) 691-2231
FAX: (415) 964-0943
BL.KCM@RLG.ORG
http://www.rlg.org
------------------------------------------

To: JTRANT@ARCHIMUSE.COM
cc: BL.KCM
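
Illustrative sketch (not part of the original message): the delimiter rules in Part 2, applied to the 20-field element set above, might be expressed in Python roughly as follows. This is not RLG's export software. The names FIELDS, serialize_record, and parse_record are hypothetical, and the sample values used with it are invented. The sketch assumes that the record delimiter directly follows the last field delimiter, that nothing else separates records, and that field values never contain the delimiter sequences themselves; the instructions do not address those points.

```python
"""Sketch of the REACH export layout (an illustration, not RLG's software)."""

FIELD_DELIM = "}~"    # rule 6: ends every field
REPEAT_DELIM = ";="   # rule 7: separates repeats inside one field
RECORD_DELIM = "|"    # rule 5: ends every record

# (element name, repeatable, required), in element-set order (rule 4).
FIELDS = [
    ("Type of Object", True, True),
    ("Date of Creation/Date Range", True, False),
    ("Place of Origin/Discovery", True, False),
    ("Object Name/Title", True, True),
    ("Techniques/Process", True, False),
    ("Medium/Materials", True, False),
    ("Dimensions", True, False),
    ("Subject Matter", True, False),
    ("Style/Period/Group/Movement/School", True, False),
    ("Creator/Maker", True, True),
    ("Nationality/Culture of Creator/Maker", True, False),
    ("Current Owner", False, True),
    ("Current Repository Name", False, True),
    ("Current Repository Place", False, True),
    ("Current Object ID Number", False, True),
    ("Provenance", True, False),
    ("Language", False, False),
    ("Electronic Location & Access", True, False),
    ("Related Objects", True, False),
    ("Notes", True, False),
]


def serialize_record(record):
    """Turn {element name: [values]} into one delimited record string."""
    out = []
    for name, repeatable, required in FIELDS:
        values = [v.strip() for v in record.get(name, []) if v.strip()]
        for value in values:
            # Assumption: no escaping rule is given, so delimiters in data are rejected.
            if any(d in value for d in (FIELD_DELIM, REPEAT_DELIM, RECORD_DELIM)):
                raise ValueError(f"delimiter found inside {name!r} value")
        if not values:
            if required:
                raise ValueError(f"required field {name!r} is empty")
            out.append(" ")  # rule 8: blank space for an empty optional field
        elif len(values) > 1 and not repeatable:
            raise ValueError(f"field {name!r} is not repeatable")
        else:
            out.append(REPEAT_DELIM.join(values))
    return "".join(field + FIELD_DELIM for field in out) + RECORD_DELIM


def parse_record(text):
    """Inverse of serialize_record: split one record back into value lists."""
    if not text.endswith(RECORD_DELIM):
        raise ValueError("record delimiter missing")
    chunks = text[: -len(RECORD_DELIM)].split(FIELD_DELIM)
    # Every field ends with the field delimiter, so the final chunk is empty.
    if chunks[-1] != "" or len(chunks) - 1 != len(FIELDS):
        raise ValueError("unexpected number of fields")
    record = {}
    for (name, repeatable, _required), chunk in zip(FIELDS, chunks):
        if not chunk.strip():
            record[name] = []
        elif repeatable:
            record[name] = chunk.split(REPEAT_DELIM)
        else:
            record[name] = [chunk]
    return record
```

A record built this way could then be converted for transmission with serialize_record(record).encode("latin-1"), which raises an error for any character outside ISO 8859-1 (rule 3). Treating an empty required field as an error is a choice made for this sketch; the instructions only describe what to write for an empty optional field.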