Abstract
We describe experiments in building a classifier that determines the rhetorical status of sentences. The research is part of a text summarisation project for the legal domain, and we use a newly compiled and annotated corpus of judgments of the UK House of Lords. Rhetorical role classification is an initial step which provides input to the sentence selection component of the system. We report results from experiments with four classifiers from the Weka package (C4.5, naive Bayes, Winnow and SVMs). We also report results using maximum entropy models, both in a standard classification framework and in a sequence labelling framework. The SVM classifier and the maximum entropy sequence tagger yield the most promising results.
Original language | English |
---|---|
Title of host publication | In Proceedings of the 17th Annual Conference on Legal Knowledge and Information Systems (Jurix |
Number of pages | 10 |
Publication status | Published - 2004 |