Abstract / Description of output
We describe research carried out as part of a text summarisation project for the legal domain for which we use a new XML corpus of judgments of the UK House of Lords. These judgments represent a particularly important part of public discourse due to the role that precedents play in English law. We present experimental results using a range of features and machine learning techniques for the task of predicting the rhetorical status of sentences and for the task of selecting the most summary-worthy sentences from a document. Results for these components are encouraging as they achieve state-of-the-art accuracy using robust, automatically generated cue phrase information. Sample output from the system illustrates the potential of summarisation technology for legal information management systems and highlights the utility of our rhetorical annotation scheme as a model of legal discourse, which provides a clear means for structuring summaries and tailoring them to different types of users.