We report on the SUM project, which applies automatic summarisation techniques to the legal domain. We describe our methodology, in which sentences from the text are classified according to their rhetorical role so that particular types of sentence can be extracted to form a summary. We describe some experiments with judgments of the House of Lords: we have performed automatic linguistic annotation of a small sample set and then hand-annotated the sentences in the set in order to explore the relationship between linguistic features and argumentative roles. We use state-of-the-art NLP techniques to perform the linguistic annotation, using XML-based tools and a combination of rule-based and statistical methods. We focus here on the predictive capacity of tense and aspect features for a classifier.

1. INTRODUCTION

Law reports form the most important part of a lawyer's or law student's reading matter. These reports are records of the proceedings of a court, and their importance derives from the role that precedents play in English law. They are used as evidence for or against a particular line of legal reasoning. In order to make judgments accessible and to enable rapid scrutiny of their relevance, they are usually summarised by legal experts. These summaries vary according to target audience (e.g. students, solicitors).

Manual summarisation can be considered a form of information selection using an unconstrained vocabulary with no artificial linguistic limitations. Automatic summarisation, on the other hand, has postponed the goal of text generation de novo and currently focuses largely on the retrieval of relevant sections of the original text. The retrieved sections can then be used as the basis of summaries with the aid of suitable smoothing phrases.
|Title of host publication||ICAIL|
|Number of pages||9|
|Publication status||Published - 2003|