
BNCweb Manual

   Last updated: 8.5.2002

Limitations of BNCweb

Inherent limitations

BNCweb inherits many of its strengths from the SARA server software. However, it also inherits some of SARA's limitations.

Perhaps the most noticeable limitation is that a BNC query always has to start out as a lexical query. In other words, it is not possible to search for grammatical patterns (using POS-tags) from the outset. The following searches therefore cannot be carried out directly:

  • Find all nouns in the BNC
    (But you can compile a Frequency list of all nouns and then perform queries individually.)
  • Find any instance where pretty is followed by an adjective
    (But you can first search for pretty and then use the Sort feature to retrieve this data.)
  • Find any instance where pretty is followed within five words by a noun
    (But you can first search for pretty and then use the Tag sequence search feature to retrieve this data.)
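The two-step workaround in the list above (a lexical query first, then a filter on POS-tags) can be illustrated with a minimal Python sketch. It is not BNCweb's or SARA's implementation: the hits, the `followed_by` function and its parameters are hypothetical, and the tags are assumed to follow the C5 tagset of the BNC (AJ0 for adjectives, NN1 for singular nouns).

```python
from typing import List, Tuple

Token = Tuple[str, str]        # (word, POS-tag)
Hit = Tuple[List[Token], int]  # (tagged context, index of the matched word)

# Hypothetical results of the lexical query "pretty" (invented examples,
# not taken from the BNC).
HITS: List[Hit] = [
    ([("a", "AT0"), ("pretty", "AV0"), ("good", "AJ0"), ("idea", "NN1")], 1),
    ([("she", "PNP"), ("looked", "VVD"), ("pretty", "AJ0"), ("today", "AV0")], 2),
    ([("a", "AT0"), ("pretty", "AJ0"), ("little", "AJ0"), ("village", "NN1")], 1),
]


def followed_by(hits: List[Hit], tag_prefix: str, window: int = 1) -> List[Hit]:
    """Keep hits in which a word whose tag starts with tag_prefix
    occurs within `window` words after the matched word."""
    kept = []
    for tokens, i in hits:
        following = tokens[i + 1 : i + 1 + window]
        if any(tag.startswith(tag_prefix) for _, tag in following):
            kept.append((tokens, i))
    return kept


# Step 2a: "pretty" immediately followed by an adjective.
adjective_hits = followed_by(HITS, "AJ", window=1)

# Step 2b: "pretty" followed within five words by a noun.
noun_hits = followed_by(HITS, "NN", window=5)
```

The point of the sketch is only the order of operations: the grammatical condition is applied to the result of a lexical search, never to the corpus as a whole.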

In its current version (2.0), BNCweb cannot restrict searches to user-defined subcorpora from the outset. Instead, a query must first be performed over the whole BNC and then restricted to the subcorpus of your choice. It is, however, possible to restrict queries by a whole range of metatextual categories from the outset. Please consult the Standard query and Create/edit subcorpora manual pages for clarification.
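As a rough illustration of this query-then-restrict order, the following Python sketch filters the hits of a whole-corpus query down to a user-defined subcorpus. Everything in it is an assumption made for the example: the hit records, the `text_id` field, the text identifiers and the `restrict_to_subcorpus` function are hypothetical and do not reflect BNCweb's internal data structures.

```python
from typing import Dict, List, Set

# Hypothetical hits from a query over the whole corpus; each hit records
# the identifier of the text it was found in (identifiers are invented).
all_hits: List[Dict[str, str]] = [
    {"text_id": "T01", "match": "pretty"},
    {"text_id": "T02", "match": "pretty"},
    {"text_id": "T03", "match": "pretty"},
    {"text_id": "T01", "match": "pretty"},
]

# Hypothetical user-defined subcorpus, represented as a set of text identifiers.
my_subcorpus: Set[str] = {"T01", "T03"}


def restrict_to_subcorpus(hits: List[Dict[str, str]],
                          text_ids: Set[str]) -> List[Dict[str, str]]:
    """Keep only the hits that occur in one of the given texts."""
    return [hit for hit in hits if hit["text_id"] in text_ids]


subcorpus_hits = restrict_to_subcorpus(all_hits, my_subcorpus)
```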

BNCweb does not offer a user-friendly interface to all functions supported by the SARA server software. Some minor features may be added in the future. It is, however, possible to enter any query that conforms to CQL query syntax into the search box of the main page in BNCweb.

System-dependent limitations

BNCweb offers some features which are highly CPU-intensive and require a lot of disk space. Default limits are therefore imposed on the number of hits to which the following four features can be applied:

  • Collocations
  • Distribution analysis
  • Sort
  • Tag sequence search

A warning message will be displayed when your query result has more hits than allowed. If you need to use any of these features with a larger number of hits, contact your system administrator, who can increase the limit globally or for individual users.
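The interplay of global defaults and per-user increases might look roughly like the Python sketch below. It is purely illustrative: the numeric limits are placeholders rather than BNCweb's actual defaults, and the names `DEFAULT_LIMITS`, `USER_OVERRIDES`, `hit_limit` and `check_hits` are hypothetical.

```python
from typing import Dict, Optional

# Placeholder values only; the real defaults are set by the administrator.
DEFAULT_LIMITS: Dict[str, int] = {
    "collocations": 10000,
    "distribution": 10000,
    "sort": 10000,
    "tag_sequence_search": 10000,
}

# Hypothetical per-user increases granted by the administrator.
USER_OVERRIDES: Dict[str, Dict[str, int]] = {
    "alice": {"sort": 50000},
}


def hit_limit(feature: str, user: Optional[str] = None) -> int:
    """Return the user's own limit if one is set, else the global default."""
    return USER_OVERRIDES.get(user, {}).get(feature, DEFAULT_LIMITS[feature])


def check_hits(feature: str, n_hits: int, user: Optional[str] = None) -> Optional[str]:
    """Return a warning message if the result has too many hits, else None."""
    limit = hit_limit(feature, user)
    if n_hits > limit:
        return f"{feature}: {n_hits} hits exceed the limit of {limit}"
    return None
```

In this sketch, `check_hits("sort", 20000)` yields a warning, whereas `check_hits("sort", 20000, user="alice")` returns `None` because of the per-user increase.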

Methodological limitations

BNCweb has been designed to offer user-friendly access to a whole wealth of data in the BNC. It produces descriptive statistics on the fly that would require considerable "manual" work on the part of the researcher with other corpus linguistics tools. While this is certainly one of its advantages, it also has a drawback: in our experience, users have sometimes exhibited too much enthusiasm at being able to compile endless lists and tables. It is important to stress that BNCweb produces only raw data; a meaningful interpretation of this data remains the task of the researcher. BNCweb cannot replace human intuition, but it can relieve the careful scholar of a lot of tedious work. (See also the note on the Distribution feature.)