While its principal function is to produce concordance output from the
corpus - that is, display each use of a given word, phrase or other
pattern in its surrounding context - the program in fact provides a range of
functions that go well beyond simple concordancing. Some of these functions
are inherited from the SARA server, while others are unique to BNCweb. The list of features includes:
(1) detailed specification of the categories of text, and/or
sections of a text to constrain a search to
(2) detailed specification about what to search for
(3) a user-friendly interface for navigating through concordance output
(4) facility to subsequently manipulate the concordance
(5) facility to access metatextual information and frequency data
These functions can perhaps best be understood by example:
Illustrating (1), the user can search for the word lovely
- within written or spoken texts only,
- within fiction texts of high circulation, written in the 1980s by female authors
- within the headlines of newspaper texts
- within spoken texts (i.e. transcriptions)
- within spoken live sports commentaries
- within spoken texts, among the words uttered by male speakers, aged 35-60, from the
South of the UK.
Illustrating (2), it is possible to find the word clean
- in proximity to other items, e.g. clean and tidy within
the same sentence, or within the same document
- belonging to a particular part-of-speech category, e.g. clean as a
verb, or clean as an adjective
- lemmatized, finding e.g. positive, comparative and superlative forms of the
adjective (clean, cleaner, cleanest), or the
variant forms of the verb (clean, cleans,
cleaned, cleaning)
- as part of more advanced search patterns, e.g. words beginning or ending
with clean (cleanliness, unclean), or other words
containing only vowels between cl and n (clan, clone,
decline)
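The two pattern searches in the last item can be approximated outside BNCweb with ordinary regular expressions. The sketch below uses Python's re module rather than BNCweb's own query syntax, and the word list is a made-up stand-in for corpus tokens.

```python
import re

# Hypothetical word list standing in for tokens retrieved from a corpus.
words = ["clean", "cleanliness", "unclean", "clan",
         "clone", "decline", "clown", "tidy"]

# Words beginning or ending with "clean".
starts_or_ends = re.compile(r"clean\w*|\w*clean")

# Words in which only vowels (one or more) occur between "cl" and "n".
vowels_between = re.compile(r"cl[aeiou]+n")

affixed = [w for w in words if starts_or_ends.fullmatch(w)]
vowel_hits = [w for w in words if vowels_between.search(w)]

print(affixed)
print(vowel_hits)
```

Note that the second pattern also matches clean itself, since ea consists only of vowels, while clown is rejected because of the w.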
Illustrating (3), BNCweb's interface offers
- the choice of viewing concordance lines with sentence context, or in
Keyword-in-Context mode
- display of structural markup in the BNC, e.g. pauses and overlaps in spoken
texts, paragraphs and headings in written texts
- instant access from any concordance line to
- the larger context of the citation
- a bibliographic record describing the source of the citation
- part-of-speech analysis of the words in the citation
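For readers unfamiliar with the Keyword-in-Context layout, the following is a minimal sketch of how such lines can be built from a tokenised text. It is not BNCweb's implementation; the function name kwic, the width parameter and the sample sentence are assumptions made for illustration.

```python
def kwic(tokens, node, width=3):
    """Return (left, node, right) tuples for every occurrence of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == node:
            # Up to `width` tokens of context on either side of the hit.
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines

# Made-up sample text, not taken from the BNC.
text = "What a lovely day it was and what a lovely garden they had"
hits = kwic(text.split(), "lovely")

# Right-align the left context so that the search item lines up in a column.
for left, node, right in hits:
    print(f"{left:>15}  {node}  {right}")
```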
Illustrating (4), you can manipulate your concordance by
- alphabetically sorting on any position within 10 words of the search item
- 'thinning' a large concordance to a more manageable size
- deleting examples by hand
- displaying frequency distributions for the item searched for, in different
parts of the corpus
- displaying collocations with a choice of measures of collocational strength
- specifying simple and complex patterns of grammatical tags to be
retrieved in the neighbourhood of the search item
- saving the query result, i.e. the concordance in its last state after all
manipulations, for re-use at a later date.
- exporting the concordance to other software, with fine-tuned control of
categories to be downloaded with the concordance
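Sorting on a position relative to the search item, and thinning, can be pictured with a small sketch. The data structure (left context, search item, right context), the helper sort_key and the use of random sampling for thinning are all assumptions made for illustration, not a description of how BNCweb works internally.

```python
import random

# Hypothetical concordance lines: (left context, search item, right context).
concordance = [
    (["a", "nice"], "clean", ["and", "tidy", "room"]),
    (["keep", "it"], "clean", ["at", "all", "times"]),
    (["to"], "clean", ["the", "windows"]),
    (["a"], "clean", ["break"]),
]

def sort_key(line, position):
    """Word at `position` relative to the search item.

    +1 is the first word to the right, -1 the first word to the left.
    Lines with no word at that position sort first (empty string).
    """
    left, _, right = line
    if position > 0:
        side, idx = right, position - 1
    else:
        side, idx = left, len(left) + position
    return side[idx].lower() if 0 <= idx < len(side) else ""

# Alphabetical sort on the first word to the right of the search item.
sorted_right = sorted(concordance, key=lambda line: sort_key(line, 1))

# 'Thinning': keep a random sample of two lines (seeded for repeatability).
thinned = random.Random(0).sample(concordance, 2)

for left, node, right in sorted_right:
    print(" ".join(left), node, " ".join(right))
```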
Illustrating (5), you can
- compile frequency lists of lexical items (based on user-definable criteria such as POS-tags,
lemma types or character patterns)
- create lists of texts based on:
- keyword or title information
- David Lee's genre classification
These lists can be used to define subcorpora.
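As a rough picture of what a frequency list restricted by POS-tag involves, the sketch below counts word forms whose tag begins with a given prefix. The function frequency_list and the tagged sample are hypothetical; the tags are merely modelled on the kind of part-of-speech codes used in the BNC.

```python
from collections import Counter

# Hypothetical (word, tag) pairs standing in for tagged corpus tokens.
tagged = [
    ("clean", "AJ0"), ("room", "NN1"), ("clean", "VVI"),
    ("cleaner", "AJC"), ("clean", "AJ0"), ("rooms", "NN2"),
]

def frequency_list(tokens, tag_prefix=""):
    """Count word forms whose tag starts with `tag_prefix`, most frequent first."""
    counts = Counter(word for word, tag in tokens if tag.startswith(tag_prefix))
    return counts.most_common()

# Frequency list restricted to adjective tags.
adjectives = frequency_list(tagged, "AJ")
for word, freq in adjectives:
    print(word, freq)
```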