Identifying Technical Terms

dc.contributor.author Chung, Teresa Mihwa
dc.date.accessioned 2008-07-30T02:22:24Z
dc.date.accessioned 2022-10-26T00:23:56Z
dc.date.available 2008-07-30T02:22:24Z
dc.date.available 2022-10-26T00:23:56Z
dc.date.copyright 2003
dc.date.issued 2003
dc.identifier.uri https://ir.wgtn.ac.nz/handle/123456789/23779
dc.description.abstract This thesis examines four possible approaches to identifying technical terms: (1) using the meaning of a word (hereafter the rating scale approach), (2) using clues provided in the text (the clue-based approach), (3) using a technical dictionary (the dictionary-based approach), and (4) using the range and frequency of word forms (the corpus comparison approach). As a pilot test, the four approaches were applied to a 5500 token anatomy corpus to decide which was the most effective. To identify terms using the meaning of a word, a four-point scale was designed according to how specific the meaning of each word is to the subject of anatomy. An inter-rater reliability check was then carried out to show the extent to which different raters agreed on the ratings. The accuracy score was 0.95. The rating scale approach was used as the basis for comparing and further evaluating the other approaches to finding technical terms. In this comparison, the clue-based approach and the dictionary-based approach were found to be unsatisfactory, with around a 48% or 56% overlap with the items identified by the rating scale approach. The clue-based approach was limited firstly by the nature of the clues and secondly by the fact that the writer selected items (and used clues to signal them) for his or her own purposes, which had little or nothing to do with identifying terms. Furthermore, it was not a practical approach because it required a great deal of extra decision making. The dictionary-based approach was weak as a method because the basis for including words in a dictionary is not clear and consistent, and the main goal of a medical dictionary is not to mark off technical terms but to help people comprehend unknown words in medical discourse. Dictionary makers therefore include many words that are not terms.
The fourth way of identifying terms, the corpus comparison approach, used a ratio based on the range and frequency of word forms in a technical base corpus and a general comparison corpus. The pilot study revealed that the corpus comparison approach is quite effective in identifying technical terms and their common collocates, and is reasonably simple and practical because (1) extra judgment is not required to the same extent as in the clue-based approach, (2) checking the meaning of each word by looking at the context or consulting a technical dictionary is not required, and (3) sorting items and calculating formulas can be done by computer. To check these advantages, the corpus comparison approach (using a ratio based on range and frequency) was applied to a 452,192 token anatomy corpus and a 93,445 token applied linguistics corpus. The approach worked reasonably consistently on the two quite different kinds of technical text, and there was around 85% overlap between the items identified by the corpus comparison approach and those identified by the rating scale approach. These results suggest that if the aim is to make a rough estimate of the number of terms and their coverage of a technical text, the corpus comparison approach is satisfactory. However, it is not adequate if the aim is to obtain a complete, definitive list of terms. The more valid and reliable approach (but unfortunately the least practical and most labour intensive) is the rating scale approach. en_NZ
dc.format pdf en_NZ
dc.language en_NZ
dc.language.iso en_NZ
dc.publisher Te Herenga Waka—Victoria University of Wellington en_NZ
dc.subject Computational linguistics en_NZ
dc.subject Technical English en_NZ
dc.subject English language en_NZ
dc.subject Word frequency en_NZ
dc.subject Terms and phrases en_NZ
dc.subject Vocabulary en_NZ
dc.title Identifying Technical Terms en_NZ
dc.type Text en_NZ
vuwschema.type.vuw Awarded Doctoral Thesis en_NZ
thesis.degree.discipline Applied Linguistics en_NZ
thesis.degree.grantor Te Herenga Waka—Victoria University of Wellington en_NZ
thesis.degree.level Doctoral en_NZ
thesis.degree.name Doctor of Philosophy en_NZ
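
The corpus comparison idea described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's actual formula: the abstract does not give the ratio, the cut-off, or how range is combined with frequency, so the sketch below assumes a plain ratio of normalised frequencies (frequency only, range is not modelled), a hypothetical threshold of 50, and a hypothetical hand-rated reference set standing in for the rating scale results. All function names and the toy sentences are invented for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word forms (a crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def frequency_ratios(technical_tokens, general_tokens):
    """Map each word form in the technical corpus to the ratio of its
    normalised frequency there to its normalised frequency in the general
    corpus. Word forms absent from the general corpus get an infinite ratio."""
    tech_counts = Counter(technical_tokens)
    gen_counts = Counter(general_tokens)
    n_tech = len(technical_tokens)
    n_gen = len(general_tokens)
    ratios = {}
    for word, count in tech_counts.items():
        gen_count = gen_counts.get(word, 0)
        if gen_count == 0:
            ratios[word] = float("inf")
        else:
            ratios[word] = (count / n_tech) / (gen_count / n_gen)
    return ratios


def candidate_terms(ratios, threshold):
    """Word forms whose ratio meets the (assumed) threshold."""
    return {word for word, ratio in ratios.items() if ratio >= threshold}


def overlap_percent(candidates, reference):
    """Percentage of reference items that also appear among the candidates
    (one possible definition of overlap, assumed here)."""
    if not reference:
        return 0.0
    return 100.0 * len(candidates & reference) / len(reference)


# Toy data, invented for illustration only.
technical = tokenize("The femur articulates with the tibia and the patella at the knee")
general = tokenize("The cat sat on the mat and the dog sat at the door with the bone")

ratios = frequency_ratios(technical, general)
candidates = candidate_terms(ratios, threshold=50)  # hypothetical cut-off
hand_rated = {"femur", "tibia", "patella", "knee"}  # hypothetical rating scale output
overlap = overlap_percent(candidates, hand_rated)
```

On this toy input, function words such as "the" and "with" have ratios close to 1 and fall below the cut-off, while word forms found only in the technical text are kept as candidates. Note that "articulates" is also kept even though it is not in the hand-rated set, which mirrors the abstract's point that the approach gives a rough estimate, not a definitive list of terms.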

