Using lexical patterns in the Google Web 1T corpus to deduce semantic relations between nouns
This paper investigates methods for using lexical patterns in a corpus to deduce the semantic relation that holds between the two nouns in a noun-noun compound such as "flu virus" or "morning exercise". Much of the previous work in this area has used automated queries to commercial web search engines. In our experiments we use the Google Web 1T corpus instead. This corpus contains every 2-, 3-, 4- and 5-gram occurring more than 40 times in Google's index of the web, and has the advantage of being available to researchers directly rather than through a web interface. This paper evaluates the performance of the Web 1T corpus on this task compared with similar systems in the literature, and also investigates which kinds of lexical patterns are most informative when trying to identify the semantic relation between two nouns.
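As an illustration of what a lexical pattern means here, the following minimal sketch collects the words that join two nouns inside n-gram records and sums their counts. It is not the paper's implementation: the function name `extract_patterns`, the `N1 ... N2` pattern notation, and the sample counts are all hypothetical, and the input is assumed to follow a tab-separated "n-gram, then count" layout.

```python
from collections import Counter


def extract_patterns(lines, noun1, noun2):
    """Sum counts of the word sequences that join noun1 and noun2.

    Each input line is assumed to look like "w1 w2 ... wn<TAB>count".
    Patterns are keyed with the placeholders N1 (noun1) and N2 (noun2),
    an illustrative notation rather than the paper's own.
    """
    patterns = Counter()
    for line in lines:
        ngram, _, count = line.rstrip("\n").rpartition("\t")
        if not ngram:
            continue  # skip lines without a tab-separated count
        tokens = ngram.split(" ")
        # Try both orders: noun1 ... noun2 and noun2 ... noun1.
        for first, second, left, right in (
            (noun1, noun2, "N1", "N2"),
            (noun2, noun1, "N2", "N1"),
        ):
            if first in tokens and second in tokens:
                i, j = tokens.index(first), tokens.index(second)
                if j - i > 1:  # require at least one joining word
                    middle = " ".join(tokens[i + 1:j])
                    patterns[f"{left} {middle} {right}"] += int(count)
    return patterns


# Invented sample records, for illustration only (not real corpus counts).
sample = [
    "virus that causes flu\t120",
    "flu caused by the virus\t75",
    "the flu virus is\t300",
]

for pattern, total in extract_patterns(sample, "flu", "virus").most_common():
    print(pattern, total)
```

The third record is skipped because the two nouns are adjacent, so no joining words exist to form a pattern. Pattern counts of this kind could then serve as features for a relation classifier, although the choice of features and classifier is left open here.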
| Item Type | Conference or Workshop Item (Paper) |
|---|---|
| Copyright holders | © 2009 Association for Computational Linguistics |
| Departments | Methodology |
| Date Deposited | 08 Jul 2014 16:08 |
| URI | https://researchonline.lse.ac.uk/id/eprint/57582 |
Explore Further
- http://dl.acm.org/citation.cfm?id=1621980 (Publisher)
- http://www.aclweb.org/policies (Official URL)