
Text Analysis Software Available at the Electronic Text Center

Word Processors

The easiest text-searching system is the one that is probably most familiar; most word processing programs (WordPerfect, Microsoft Word) allow one to search for text in a document, even one as large as several megabytes.

OpenText

OpenText, our primary text analysis software, is the searching tool for all our on-line texts, used through its Web interfaces. OpenText handles SGML tagging well, and it is fast enough that searching slows very little even on files as large as one gigabyte (1000 megabytes).

For more information, see the OpenText homepage.
http://www.opentext.com

Collate

A Macintosh program, Collate enables a user to compare the text of up to one hundred different versions of a single work. Users can switch base texts, alter the process of collation by telling Collate to ignore certain idiosyncrasies if necessary, and produce many different output formats, all with the point-and-click simplicity of the mouse.

Collate accepts SGML mark-up in its texts, and tacitly encourages its use by requiring some form of tagging within the document.
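As a rough illustration of what collating a version against a base text involves, here is a minimal Python sketch. It is not Collate's own method: the normalize and collate functions, the two rules for ignoring case and punctuation, and the sample sentences are all assumptions made for the example, and the comparison relies on Python's standard difflib module.

```python
import difflib
import re


def normalize(word, ignore_case=True, ignore_punct=True):
    """Reduce a word to the form used for comparison (hypothetical rules)."""
    if ignore_punct:
        word = re.sub(r"[^\w]", "", word)
    if ignore_case:
        word = word.lower()
    return word


def collate(base, witness, **rules):
    """Return (base reading, witness reading) pairs where the texts differ."""
    base_words = base.split()
    wit_words = witness.split()
    matcher = difflib.SequenceMatcher(
        a=[normalize(w, **rules) for w in base_words],
        b=[normalize(w, **rules) for w in wit_words],
        autojunk=False,
    )
    variants = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Report the original spellings, not the normalized forms.
            variants.append(
                (" ".join(base_words[i1:i2]), " ".join(wit_words[j1:j2]))
            )
    return variants


# Invented sample texts, for illustration only.
base = "The quick brown fox jumps over the lazy dog."
witness = "The quick, brown fox leaps over the lazy Dog."

# With case and punctuation ignored, only the substantive variant remains.
print(collate(base, witness))
# With both rules switched off, the punctuation and case differences appear too.
print(collate(base, witness, ignore_case=False, ignore_punct=False))
```

Switching the base text amounts to swapping the two arguments.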

TACT 2.1.4

UVa TACT helpsheet

TACT 2.1.4, a text analysis program that operates under MS-DOS, will generate not only a keyword-in-context display for any word in a given text, but also bar graph displays that track the frequency of a given word through the course of a text, and an index of one-line KWIC citations with their appearance. Users can search for strings of words based on similarity, or construct a thematic search; additionally, they can create hierarchical structures for their own purposes, and save all their queries to an ASCII file for another day.
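For readers unfamiliar with these displays, the following Python sketch shows the general idea of a keyword-in-context listing and of tracking a word's frequency through the course of a text. It does not reproduce TACT's output or options: the function names kwic and distribution, the context width, the number of segments, and the sample sentence are all assumptions made for the example.

```python
import re


def kwic(text, keyword, width=30):
    """Return one line per occurrence of keyword, with context on each side."""
    flat = " ".join(text.split())  # collapse line breaks and runs of spaces
    pattern = re.compile(r"\b%s\b" % re.escape(keyword), re.IGNORECASE)
    lines = []
    for m in pattern.finditer(flat):
        left = flat[max(0, m.start() - width):m.start()]
        right = flat[m.end():m.end() + width]
        # Pad the left context so that the keywords line up in a column.
        lines.append("%s[%s]%s" % (left.rjust(width), m.group(0), right))
    return lines


def distribution(text, keyword, bins=5):
    """Count occurrences of keyword in each of `bins` equal segments."""
    words = re.findall(r"\w+", text.lower())
    counts = [0] * bins
    for i, word in enumerate(words):
        if word == keyword.lower():
            counts[i * bins // len(words)] += 1
    return counts


# Invented sample text, for illustration only.
sample = "the cat sat on the mat and the dog sat on the log"

for line in kwic(sample, "the", width=15):
    print(line)

# A crude bar graph: one row per segment of the text.
for segment, count in enumerate(distribution(sample, "the", bins=2), start=1):
    print("segment %d: %s" % (segment, "#" * count))
```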

See also TACTWEB
http://tactweb.mcmaster.ca/tactweb/doc/tact.htm

MonoConc for Windows

http://www.athel.com/mono.html

WordSmith

http://www4.oup.co.uk/isbn/0-19-459286-3

WordCruncher

Recently resurrected, WordCruncher is a text reader/analysis program that does not read ASCII files directly, but rather generates an index from the file and reads that index instead. Recognizing this subtle distinction makes it easier to understand WordCruncher's "15,000 word limit": it is not simply words that WordCruncher counts, but unique words. The entire corpus of an author, for instance, can be read under WordCruncher, provided the author used fewer than 15,000 different words in all his or her works.
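The difference between counting words and counting unique words can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: build_index and fits_limit are hypothetical names, and the way words are split and lower-cased here is a guess, not WordCruncher's actual indexing rule.

```python
import re
from collections import defaultdict

UNIQUE_WORD_LIMIT = 15000  # the figure quoted in the text above


def build_index(text):
    """Map each distinct word form to the positions where it occurs."""
    index = defaultdict(list)
    for position, word in enumerate(re.findall(r"[a-z']+", text.lower())):
        index[word].append(position)
    return index


def fits_limit(index, limit=UNIQUE_WORD_LIMIT):
    """True if the text uses fewer distinct words than the limit."""
    return len(index) < limit


# Invented sample text, for illustration only.
sample = "To be, or not to be, that is the question"
index = build_index(sample)

total_words = sum(len(positions) for positions in index.values())
print("running words:", total_words)
print("unique words:", len(index))
print("fits the limit:", fits_limit(index))
```

A long text with a small vocabulary passes this check, while a shorter one with a very varied vocabulary might not.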

Once texts are indexed, WordCruncher enables the user to search large texts easily, as well as generate keyword-in-context (KWIC) concordances, word frequency lists, and what WordCruncher calls "book-style" indexes.
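A word frequency list of the kind mentioned here can be sketched in a few lines of Python. This is a generic illustration, not WordCruncher's implementation; the frequency_list function, its tokenizing rule, and the sample text are assumptions made for the example.

```python
import re
from collections import Counter


def frequency_list(text, top=None):
    """Return (word, count) pairs, most frequent first."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top)


# Invented sample text, for illustration only.
sample = "the cat and the hat and the bat"

for word, count in frequency_list(sample):
    print("%-6s %d" % (word, count))
```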