Guidelines for SGML Text Mark-up at the Electronic Text Center

David Seaman, Electronic Text Center, University of Virginia


ETEXT Center Guidelines for the Creation of Archivable Illustrations

Etext staff: if in doubt, see David Seaman for guidance on the settings at which to scan an image.

We have eight years of experience in creating digital copies of book illustrations, typescripts, and manuscripts, so don't try to "go it alone". The .tiff files may be sizeable -- don't be put off, and especially don't be tempted to scan at too low a resolution (or, God forbid, at 8-bit colour) just because a TIFF is a big file. We don't want to have to re-scan at a later date. The TIFFs go onto a CD as soon as we have made JPEG and GIF versions for current everyday use.

Rules and Regulations of Image Scanning and Encoding

The following list explains the items we typically scan, their specifications for scanning, and how to name them for our electronic text database at the Etext Center:

Illustration and Image Tags

The following tags are used to mark up illustrations and the information that accompanies them.

Other Simple Examples

<figure entity="EliMid10">
<figDesc>An engraved portrait of Dorothea posed thoughtfully at a writing table. Three stacked books stand in the right foreground. Dorothea's right hand holds a quill pen.</figDesc>

<figure entity="EliMid50">
<head>Mr. Casaubon and Dorothea</head>
<figDesc>An engraving by W.L. Taylor showing Mr. Casaubon and Dorothea, presumably in their "hour's <hi>tête-à-tête</hi>." Casaubon sits in an upholstered wooden chair in the left background corner, facing the viewer, with Dorothea's right hand in his own. Dorothea sits on a footstool at center-right, turned towards Casaubon. The left quarter of her face is visible to the viewer. The setting is a sunny room with one curtained window and one uncurtained, open window behind the figures. </figDesc>

SGML Text Embedded in Image Files

A growing number of our electronic texts have book illustrations and other book-related images along with the tagged ASCII text. To include an attribution record in these book illustrations, we bury a version of the TEI header in the binary code of the image. The user who saves an image from a text on our etext server now gets -- in Trojan Horse fashion -- a tagged full-text record of the creation of that image as part of the single image file they save. The image header and related <figDesc> information give us a searchable SGML text database for our images.

For a description of an early implementation of "text in images", see David Seaman, "Campus Publishing in Standardized Electronic Formats: HTML and TEI," in Scholarly Publishing on the Electronic Networks, 1994.
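
To make the idea of text buried in an image file concrete, here is a minimal Python sketch. It assumes the header travels as a JPEG comment (COM) segment, which fits the fact that the text can be read through a viewer's "comments" window, but it is not the Center's own tooling. The function names and the stub byte stream are hypothetical and exist only for illustration.

```python
import struct

SOI = b"\xff\xd8"                  # start-of-image marker
COM, SOS, EOI = 0xFE, 0xDA, 0xD9   # comment, start-of-scan, end-of-image


def iter_segments(jpeg):
    """Yield (marker, payload, end_offset) for each segment before the scan data."""
    if jpeg[:2] != SOI:
        raise ValueError("not a JPEG stream")
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = jpeg[pos + 1]
        if marker in (SOS, EOI):
            return
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        end = pos + 2 + length
        yield marker, jpeg[pos + 4:end], end
        pos = end


def add_comment(jpeg, text):
    """Return a copy of the stream with one comment segment added."""
    payload = text.encode("utf-8")
    if len(payload) > 0xFFFF - 2:
        raise ValueError("comment too long for one segment")
    segment = b"\xff\xfe" + struct.pack(">H", len(payload) + 2) + payload
    insert_at = 2
    # Keep any leading APPn segments (JFIF, Exif) ahead of the comment.
    for marker, _payload, end in iter_segments(jpeg):
        if not 0xE0 <= marker <= 0xEF:
            break
        insert_at = end
    return jpeg[:insert_at] + segment + jpeg[insert_at:]


def read_comments(jpeg):
    """Return the text of every comment segment found before the scan data."""
    return [p.decode("utf-8") for m, p, _end in iter_segments(jpeg) if m == COM]


# Stand-in stream (SOI, one JFIF APP0 segment, EOI): enough structure to
# demonstrate the segment layout, not a viewable picture.
STUB = SOI + b"\xff\xe0\x00\x10JFIF\x00\x01\x01\x00\x00\x01\x00\x01\x00\x00" + b"\xff\xd9"
tagged = add_comment(STUB, "# <head>Mr. Casaubon and Dorothea</head>")
print(read_comments(tagged))
```

Because the comment is an ordinary segment inside the file, it stays with the image when a user saves it, and a small reader like `read_comments` is enough to pull the tagged text back out for indexing. The sketch does not handle padding bytes or comments longer than one segment.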

Specific Procedures for Adding Image Headers

Image Processing on Unix: ImageMagick

The mogrify part of this impressive Unix tool allows us to perform batch image conversions from one format to another (e.g., TIFF to JPEG) and to add tagged text headers into the images as we convert.

ImageMagick is available from
and is on the UVa etext machines. See the ImageMagick README file for more information.

For an interactive on-line implementation of ImageMagick, see the Image Machine at:



Step by Step Instructions for UVa Etext processors

1. Use the new TEI header template in etext/Done; it has several new fields:

To add a header to an image:

You can now simultaneously convert your tifs to jpgs and add in the header information above to those jpgs.

If the header text file is called AutWork.header, and your various tiff files are image1.tif, image2.tif, image3.tif, and image4.tif, then this is what you do:

You have now converted all the image*.tif files into image*.jpg files, and those .jpg files have the textual information from the header embedded within them; the .tif files have remained unchanged. (You can view the text in the images by viewing the .jpg files in xv, calling up the control window, and choosing the "comments" button.)
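
As an illustration of the kind of batch conversion described above, the following Python sketch assembles a mogrify command line without running it. It is not the Center's exact command. The `-format` and `-comment` options are standard ImageMagick options; reading the comment text from a file with the `@` prefix is an assumption about the installed version, and some installations restrict it by security policy. The function name `mogrify_command` is hypothetical.

```python
def mogrify_command(header_file, tiff_files, out_format="jpg"):
    """Build (but do not run) a mogrify command line as an argument list."""
    if not tiff_files:
        raise ValueError("no input files given")
    # -format writes new files with the new extension, leaving the inputs alone.
    # "@file" asks ImageMagick to read the comment text from that file
    # (assumed to be permitted on the machine in question).
    return ["mogrify", "-format", out_format,
            "-comment", "@" + header_file] + list(tiff_files)


cmd = mogrify_command(
    "AutWork.header",
    ["image1.tif", "image2.tif", "image3.tif", "image4.tif"],
)
print(" ".join(cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)` on a machine where ImageMagick is installed. Building the arguments as a list rather than a single string avoids shell quoting problems with unusual file names.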

If you want textual information that's specific to one particular image, you need only do the following:

Fill in the fields with the information appropriate to the individual image. (These tags will also need the hash mark and space before them.)

Repeat step 3 above.

Image Processing on the Mac: ADDJFIFcomment
Alternative, and much less preferable, methods used before ImageMagick

NOTE: The text header must have a pound symbol and a space at the beginning of every line:

# text of header goes here
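
The prefixing rule in the note above is easy to automate. This is a minimal Python sketch, assuming the header is available as a plain text string; the function name `prefix_header` and the sample lines are hypothetical.

```python
def prefix_header(text, prefix="# "):
    """Put the pound symbol and a space at the start of every line.

    Lines that already carry the prefix are left alone, so running the
    function twice gives the same result as running it once.
    """
    return "\n".join(
        line if line.startswith(prefix) else prefix + line
        for line in text.splitlines()
    )


sample = "<head>Mr. Casaubon and Dorothea</head>\n# already marked line"
print(prefix_header(sample))
```

Note that `splitlines()` drops a trailing newline, so add one back before writing the result to a file if the tool that reads the header expects it.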
