Description of Text Encoding Initiative (TEI)
Header Elements and Corresponding USMARC Fields
Appendix to TEI/MARC Best Practices

 

The following is a quick reference to the TEI header element set. The description conforms to the document on the University of Virginia Electronic Text Center site, http://etext.virginia.edu/TEI.html. For a complete description of TEI encoding, please refer to the TEI home site, http://www.uic.edu:80/orgs/tei/.

Unlike HTML, the coding in TEI is case-sensitive.

    The version of the TEI header that UVa uses (teilite.dtd, based on TEI P3) comprises four major sections:

    <teiHeader>

     I.   <fileDesc>...</fileDesc>

    II.   <encodingDesc>...</encodingDesc>

    III.   <profileDesc>...</profileDesc>

    IV.   <revisionDesc>...</revisionDesc>

    </teiHeader>
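As a rough illustration of the four-section layout, the Python sketch below reports which of the four sections are absent from a header. It assumes the header is available as well-formed XML (an SGML header would need converting first); the function name missing_sections and the sample header string are hypothetical. Tag comparison is case-sensitive, in line with the note above.

```python
import xml.etree.ElementTree as ET

# The four major sections of the header, in the order listed above.
SECTIONS = ["fileDesc", "encodingDesc", "profileDesc", "revisionDesc"]

def missing_sections(header_xml):
    """Return the section names that are not direct children of <teiHeader>."""
    header = ET.fromstring(header_xml)
    present = {child.tag for child in header}
    return [name for name in SECTIONS if name not in present]

# Hypothetical header with only two of the four sections.
missing = missing_sections("<teiHeader><fileDesc/><profileDesc/></teiHeader>")
print(missing)
```

The check only reports absences; it does not decide which sections a given project must supply.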

 

Based on the recent OCLC code changes (March 15, 1998), all TEI headers fall under the new practice: MARC records for text-based electronic resources are coded Type: "a" and File: "d" or "e", and include a 006 field. Field 007 is also required. (See: Jay Weitz. OCLC Guidelines on the Choice of Type and BLvl for Electronic Resources. Dublin, OH : OCLC, [cited 23 March 1998]. Available at URL: http://www.oclc.org/oclc/cataloging/type.htm)


I.   The File Description -- <fileDesc> -- contains a full bibliographical description of the online version of the text: title, author, creator of the electronic version, size of the file, date of creation, publisher of the electronic version, and information about the print source from which the electronic version was created.

     
    <fileDesc> 
      <titleStmt> 
        <title type="main, uniform, etc.">...</title> 
        <author> ... </author> 
        <editor> ... </editor> 
        <respStmt> 
          <resp>Compiler ; Illustrator</resp> <name>....</name> 
        </respStmt> 
        <respStmt> 
          <resp>Conversion to TEI.2-conformant markup: </resp> <name>University of Virginia Library Electronic Text Center. </name> 
        </respStmt>
      </titleStmt> 
      <editionStmt> 
                   <p>...</p> 
      </editionStmt> 
      <extent> .... </extent> 
      <publicationStmt> 
        <publisher>University of Virginia Library</publisher> 
        <pubPlace>Charlottesville, Virginia</pubPlace> 
        <idno>...</idno> 
        <availability> 
               <p>Publicly accessible</p> 
               <p n="public">URL: http://</p> 
        </availability> 
        <date>...</date>
      </publicationStmt> 
      <seriesStmt> 
                   <p> ... </p> 
      </seriesStmt> 
      <notesStmt> <note>... </note></notesStmt> 

      <sourceDesc>   <biblFull> 

        <titleStmt> 
          <title>...</title> 
          <author>... </author> 
          <respStmt> 
            <resp>...</resp> <name>....</name> 
          </respStmt>
        </titleStmt> 
        <editionStmt> 
                     <p>...</p> 
        </editionStmt> 
        <extent> .... </extent> 
        <publicationStmt> 
          <publisher>...</publisher> 
          <pubPlace>...</pubPlace> 
          <idno type="callNo">Copy consulted:...</idno> 
          <date>...</date>
        </publicationStmt> 
        <seriesStmt> <p>...</p></seriesStmt> 
        <notesStmt> <note>... </note></notesStmt>
      </biblFull> </sourceDesc>
    </fileDesc>
     
     
    TEI Tags | Recommendations | USMARC Tags
    <teiHeader type="***"> Standards to which the header applies, e.g., <teiHeader type="ISBD(ER)">, <teiHeader type="AACR2">  
    <fileDesc>
    <titleStmt>
        <title type="****">
    Only the uniform title and the main title should be entered here, e.g., <title type="uniform"> or <title type="main">. See <sourceDesc> for other title forms, for documents that a user might seek under titles other than those assigned. Where a title is provided by the header creator rather than the document creator, the title should be enclosed in square brackets using standard English language conventions for editorial insertion.
    130
    240
    245
    246's
    <titleStmt>
        <author>
    Author of original source (electronic or print) should be entered into the <author> tag before the <respStmt>. Use discrete tags within <author> tag for "last name", "first name", "middle name", "date", "position title" to allow future flexibility in display, indexing, and in transferring to MARC. Whenever possible, establish or use nationally established forms of names. The name should be inverted and entered in the established form. 1XX   1st author
    534   |a   1st author
    7XX   2nd and 3rd authors
    <titleStmt>
        <editor>
    Editor of original source (electronic or print) should be entered into the <editor> tag before the <respStmt>. 7XX   all editors
    <respStmt>
        <resp>editor, etc.
             <name>
    The editor (also compiler, illustrator, etc.) of an electronic version should be entered into the appropriate tag in the <respStmt>. The name should be inverted and entered in the established form. 500 
    7XX   |e
    <editionStmt> <p> Caution:
    Remind users that the edition statement refers to the electronic piece, not the original item. This field should be used sparingly as there are currently no standards as to when versions become editions. Users should refer to the instructions in the TEI manual.
    250
    <extent> Use the standard text:
    "ca. **** kilobytes".
    256
    <publicationStmt> Caution: This statement describes the electronic file.  
        <publisher> The publisher is whoever has collected the electronic text and has made decisions concerning it. 260  |b
        <pubPlace>   260  |a
       <distributor> The distributor is whoever makes the electronic text available. 260  |b
      <idno> Any unique identification number determined by the publisher 090,099
    <availability><p>

        <p n="restriction, public, etc.">

    Use specialized elements when you anticipate sharing the header, or free text if only local usage is expected.
    Caution:
    Know your audience.
    500

    590    local restriction note
    856  4X |3  |z  |u

    <date> Refers to the publication date of the electronic document. Fixed field: Date1
    260  |c
    <seriesStmt> <p> Whenever possible, establish the authorized form of the locally created electronic series in the national authority file. 440
    <notesStmt> <note> Optional, depending on display decisions. Should be used for indicating questionable attributions for title, author, etc. 500
    <sourceDesc> In order to effectively represent the source(s) when many documents are represented by the TEI header, we see the need for structured elements that minimally allow us to identify parent-child and component relationships. In the absence of these structures, we suggest that multiple source descriptions be employed with relationships described in free text. Relationships also could be useful in other portions of the TEI header. The cataloger may need to do research to establish the original source.
    <bibl>
    <biblStruct>
    <biblFull>
    Prefer <biblFull> to allow searching on parts of the description.  
    <biblFull>
        <titleStmt>
        <title> 
    It is possible to have multiple <title> elements in <biblFull>. Alternative titles (cover, running, spine titles) should be entered in separate <title> elements within the <biblFull> in the <sourceDesc>, where they are searchable. 246
    700 X2
         |a   |t
    730 
    740 
    534   |t
    <biblFull>
        <titleStmt>
        <author>
    If the name of the author(s) in the originating source differs from the established form, include here the form from the source tagged <author type="alternate">.   
    <biblFull>
        <editionStmt> <p>
    Enter edition statement as found on the original source. 534   |b
    <biblFull>
        <extent>
    Enter physical description for the original source. 534  |e
    <biblFull>
        <publicationStmt>
        <publisher>
    Don't repeat field. Enter multiple publishers divided by semicolons. 534  |c_2
    <biblFull>
        <publicationStmt>
        <pubPlace>
    Don't repeat field. Enter multiple places divided by semicolons. 534  |c_1
    <biblFull>
        <publicationStmt>
        <date>
    Imprint date for the original source. 534  |c_3
    <biblFull>
       <publicationStmt>
      <idno type="***">
    In this location, <idno> refers to identification numbers for the source document. They can be used to indicate the source's location in an individual institution's collection. If a formal standard location system is being used, indicate the nature of the system, e.g., <idno type="LC call number">. 534 |n
    <biblFull>
        <seriesStmt>
        <p>
    Establish via national authority file the series statement of original document. 534  |f
    <biblFull>
        <notesStmt>
        <note>
    Caution:
    Notes made here should refer to the original source.
    534  |n
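The 534 subfield assignments in the table above can be sketched in Python as follows. This is a minimal illustration, not a cataloging tool: it assumes the <biblFull> is available as well-formed XML, and the order of the subfields, the comma punctuation inside |c, the helper names text_of and build_534, and the sample record are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

def text_of(parent, path):
    """Return the stripped text of the first element at path, or '' if absent."""
    node = parent.find(path)
    return (node.text or "").strip() if node is not None else ""

def build_534(bibl_full):
    """Assemble a display string for field 534 from a <biblFull> element."""
    # |c combines place (c_1), publisher (c_2) and date (c_3), in that order.
    imprint_parts = [text_of(bibl_full, "publicationStmt/" + tag)
                     for tag in ("pubPlace", "publisher", "date")]
    imprint = ", ".join(part for part in imprint_parts if part)
    subfields = [
        ("a", text_of(bibl_full, "titleStmt/author")),
        ("t", text_of(bibl_full, "titleStmt/title")),
        ("b", text_of(bibl_full, "editionStmt/p")),
        ("c", imprint),
        ("e", text_of(bibl_full, "extent")),
        ("f", text_of(bibl_full, "seriesStmt/p")),
        ("n", text_of(bibl_full, "publicationStmt/idno")),
        ("n", text_of(bibl_full, "notesStmt/note")),
    ]
    # Empty subfields are skipped rather than emitted as blanks.
    return "534 " + " ".join(f"|{code} {value}" for code, value in subfields if value)

# Hypothetical source description, invented for this example.
SAMPLE = """
<biblFull>
  <titleStmt>
    <title>Sample title</title>
    <author>Doe, Jane</author>
  </titleStmt>
  <extent>120 p.</extent>
  <publicationStmt>
    <publisher>Example Press</publisher>
    <pubPlace>Sampletown</pubPlace>
    <date>1900</date>
  </publicationStmt>
</biblFull>
"""

field_534 = build_534(ET.fromstring(SAMPLE))
print(field_534)
```

Keeping the subfield list in one place makes it easy to adjust if local practice orders or punctuates the 534 differently.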
     

II.  The Encoding Description -- <encodingDesc> -- describes how the text was normalized during transcription, the encoder's decisions about resolving ambiguities in the source, the levels of encoding or analysis applied, etc.
 
 

     
    <encodingDesc>
      <projectDesc>
        <p> ... </p>
      </projectDesc>
      <editorialDecl>
        <p> ... </p>
        <p id="ETC"> ... </p>
      </editorialDecl>
      <refsDecl>
        <p> ... </p>
      </refsDecl>
      <classDecl>
        <taxonomy id="LCSH">
          <bibl>
            <title>Library of Congress Subject Headings</title>
          </bibl>
        </taxonomy>
      </classDecl>
    </encodingDesc>
 
     
    TEI Tags | Recommendations | USMARC Tags
    <encodingDesc>
    <projectDesc>
        <p>
    Enter a description of the purpose for which the electronic file was encoded. 500
    550
    <editorialDecl>
        <p>
    Enter general and specific statements about how the electronic file has been treated. Record here editorial decisions made during encoding. 516
    500
    <refsDecl>
        <p>
    <refsDecl> may be suitable for administrative metadata, e.g., pagination and page sequencing. 500
    516
    <classDecl>
        <taxonomy id="***">
    If used, identify the appropriate taxonomy definitions or descriptive sources in the id attribute of the <taxonomy> tag, e.g.,
    <taxonomy id="LCSH">,
    <taxonomy id="AAT">.
     
     

III.  The Profile Description -- <profileDesc> -- describes the non-bibliographic aspects of the text, the languages used in the text, the situation in which it was produced, the participants, and their setting.

The <date> field in the <creation> section is vital. OpenText uses this to construct its "Centuries" document structures.

     
    <profileDesc>
      <creation>
        <date> ... </date>
      </creation>
      <langUsage>
        <language id="eng">English</language>
      </langUsage>
      <textClass>
        <keywords>
          <term>non-fiction; prose</term>
        </keywords>
        <keywords scheme="LCSH">
          <term type="***">Crane, Stephen, 1871-1900|xCriticism and interpretation.</term>
        </keywords>
      </textClass>
    </profileDesc>
     
 
     
    TEI Tags | Recommendations | USMARC Tags
    <profileDesc>
    <creation>
       <date>
    Use the date as it comes from the creator.  
    <langUsage>
      <language id="***">
    Language usage is specified by document creators. Use standard language names. Use the ISO 639-2 standard (which is the same as the MARC language codes). 041
    546
    <textClass>
        <keywords>
         <term>


    <keywords scheme="LCSH"> 
    <term type="****">

    True classification numbers as opposed to call numbers can be entered here.

    Use for controlled vocabulary as specified in <encodingDesc> taxonomy id.

    653,
    654 (for AAT),
    655, 690, 691

    600, 610, 611, 630, 650, 651
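The sample <term> shown earlier embeds a MARC-style subfield delimiter ("|x") in the heading. The Python sketch below shows one way such a string might be split into subfield code and value pairs when the header is transferred to a 6XX field. It assumes that an undelimited leading segment belongs in subfield a; the function name split_term is hypothetical, and choosing the actual 6XX tag is left to the cataloger.

```python
def split_term(term, first_code="a"):
    """Split 'Heading|xSubdivision' into a list of (subfield code, value) pairs."""
    head, *rest = term.split("|")
    pairs = []
    if head.strip():
        # Assumption: text before the first delimiter goes in subfield a.
        pairs.append((first_code, head.strip()))
    for chunk in rest:
        if chunk:
            # The first character after the delimiter is the subfield code.
            pairs.append((chunk[0], chunk[1:].strip()))
    return pairs

pairs = split_term("Crane, Stephen, 1871-1900|xCriticism and interpretation.")
for code, value in pairs:
    print(f"|{code} {value}")
```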

     
 

IV.  The Revision Description -- <revisionDesc> -- provides a history of changes made during the development of the electronic text.
 

<revisionDesc>
  <change>
    <date> ... </date>
    <respStmt>
      <resp>corrector/cataloger</resp>
      <name> ... </name>
    </respStmt>
    <item> ... </item>
  </change>
</revisionDesc>
 
 
TEI Tags | Recommendations | USMARC Tags
<revisionDesc>
   <change>
       <date>
   <resp>
       <name>
       <item>
Use specific codes rather than prose descriptions to note revisions. Include the full date. 9XX local processing note
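A <change> entry carrying a full date could be generated along these lines. This is only a sketch: the ISO year-month-day date format, the function name make_change, and the sample values are assumptions, since the recommendation asks for the entire date without prescribing a format.

```python
import datetime
import xml.etree.ElementTree as ET

def make_change(when, resp, name, item):
    """Build a <change> element with a full (year-month-day) date."""
    change = ET.Element("change")
    ET.SubElement(change, "date").text = when.isoformat()
    resp_stmt = ET.SubElement(change, "respStmt")
    ET.SubElement(resp_stmt, "resp").text = resp
    ET.SubElement(resp_stmt, "name").text = name
    ET.SubElement(change, "item").text = item
    return change

# Hypothetical revision entry, invented for this example.
change = make_change(datetime.date(1998, 10, 12), "cataloger",
                     "Hypothetical Name", "Header created")
xml_text = ET.tostring(change, encoding="unicode")
print(xml_text)
```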
 

 



Created on Oct-12-98.

 Jackie Shieh
Original Cataloger for Electronic Resources
University of Virginia Library
Charlottesville, VA 22903
Email: shieh@virginia.edu
Voice: 434.924.3206
Fax: 434.924.1431