[Standard query | Lemma query | Browse a file | Word lookup | Scan keywords/titles | Explore genre labels | Frequency lists | User settings | Query history | Saved queries | Create/Edit subcorpora | Post-query options ]

Explore genre labels

The Explore genre labels feature can be used to retrieve a list of BNC text files according to genre classification criteria. (The genre classification scheme provided for the corpus is by David Lee.) The list of files you obtain can be used to create a subcorpus.

There are 3 basic steps to this function:

  1. Specify genre categories to scan
  2. Choose the files you require
  3. Choose whether to make a new subcorpus, or add to a pre-existing one

Each of these steps is explained in more detail below.


Step 1: The screenshot below shows you the options available at the first step:



To choose just ONE genre/sub-genre, simply use the drop-down menu to select your category, then press "Submit".

To choose more than one genre/sub-genre, or to have finer control over which pre-categorised texts to include in your subcorpus, use one, two or all three of the columns shown above. (A key to the abbreviations is given here.)

As the instructions on the screen indicate, you can select more than one category from each column:

In the above case, we have chosen Fiction texts ('fict') from the first column and "either poetry or prose" from the second, which means you end up with a corpus of just poetry or prose fiction. Choosing just 'fict' from the first column would have included fictional drama texts as well.

HINT: All three columns above contain the same categories. Many of these categories are mutually exclusive, so there is some redundancy. For example, it is not strictly necessary to select 'fict' in the above example, since 'poetry' and 'prose' are labels that are applied only to fictional texts anyway. Also, none of the labels for written and spoken genres/sub-genres overlap, so it is never necessary to select "W" or "S" in any column. For example, 'sports' is used only for written texts (W_newsp_brdsht_nat_sports and W_newsp_other_sports), whereas the corresponding label in the spoken corpus is 'sportslive', which stands for 'live' sports broadcast commentaries.
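The selection logic described above can be sketched in a few lines of Python. This is only an illustration and not BNCweb's actual code: the function names `matches` and `select` are hypothetical, the label list is a small hand-picked sample, and the fiction and spoken labels (`W_fict_drama`, `W_fict_poetry`, `W_fict_prose`, `S_sportslive`) are assumed spellings based on the categories named on this page. The sketch assumes that choices within one column are alternatives (OR) and that the columns are combined (AND), which is what the fiction example suggests.

```python
from typing import Iterable, List, Set

# Small illustrative sample of genre labels (assumed spellings, not the full scheme).
GENRE_LABELS = [
    "W_fict_drama",
    "W_fict_poetry",
    "W_fict_prose",
    "W_newsp_brdsht_nat_sports",
    "W_newsp_other_sports",
    "S_sportslive",
]


def matches(label: str, columns: Iterable[Set[str]]) -> bool:
    """Return True if the label satisfies every non-empty column selection."""
    parts = set(label.split("_"))
    # An empty column places no constraint; otherwise at least one of the
    # categories chosen in that column must occur in the label.
    return all(not chosen or bool(parts & chosen) for chosen in columns)


def select(labels: Iterable[str], *columns: Set[str]) -> List[str]:
    """Keep the labels that match all column selections (hypothetical helper)."""
    return [label for label in labels if matches(label, columns)]


if __name__ == "__main__":
    # 'fict' in column 1, 'poetry' or 'prose' in column 2.
    print(select(GENRE_LABELS, {"fict"}, {"poetry", "prose"}))
    # 'fict' alone also picks up fictional drama.
    print(select(GENRE_LABELS, {"fict"}))
    # 'sports' matches only the written labels, never 'sportslive'.
    print(select(GENRE_LABELS, {"sports"}))
```

Under these assumptions, selecting 'poetry' and 'prose' without 'fict' returns the same labels as the two-column selection, which is the redundancy the hint describes.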

Step 2: Select individual files to include by clicking the relevant box on the right, or simply choose "include all files" at the bottom. 




Step 3: You can now "Add" the chosen files either to a "New subcorpus" (the default) or to a pre-existing subcorpus (i.e. any subcorpus you created previously). The screenshot below shows the resulting page when the user chooses to add three files to a pre-existing subcorpus called "test2":

You are now ready to perform a Subcorpus query.

KNOWN BUG in Internet Explorer: Some versions of Internet Explorer (running under Windows) appear unable to handle very long file lists. If a genre search result (or a subcorpus you want to define using manually entered filenames) contains too many files, either the 'Submit' button refuses to work, or IE fails to submit all the filenames back to the BNCweb server and an error message is generated. There is no advance warning: the only sign that your list contains too many file IDs is an error message telling you that your subcorpus cannot be created, or a 'Submit' button that does not function. Two other popular browsers, Netscape and Opera, do not seem to have this problem, so use one of them instead if you find that you cannot create a subcorpus.

 

 

Notes

  1. An argument for using genres in corpus-based work is given in David Lee's article in Language Learning and Technology (available on-line here). This article also outlines the basis on which the texts were classified under the various genre categories in the BNC Index. To compose subcorpora off-line, go here to retrieve a copy of the BNC Index.