Abstract / Description of output
This paper describes a method for linear text segmentation that is more accurate than, or at least as accurate as, state-of-the-art methods (Utiyama and Isahara, 2001; Choi, 2000a). Inter-sentence similarity is estimated by latent semantic analysis (LSA). Boundary locations are discovered by divisive clustering. Test results show that LSA is a more accurate similarity measure than the cosine metric (van Rijsbergen, 1979).
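To illustrate the contrast between the two similarity measures named in the abstract, the following is a minimal sketch, not the paper's implementation. It assumes whitespace tokenization, raw term counts, and an LSA space built from the toy sentences themselves; the function names (`term_sentence_matrix`, `cosine_matrix`, `lsa_similarity`), the example sentences, and the choice `k=2` are all hypothetical. The divisive clustering step is not shown.

```python
import numpy as np


def term_sentence_matrix(sentences):
    """Build a (terms x sentences) matrix of raw term counts."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(sentences)))
    for j, s in enumerate(sentences):
        for w in s.lower().split():
            counts[index[w], j] += 1.0
    return counts


def cosine_matrix(vectors):
    """Pairwise cosine similarity between the columns of `vectors`."""
    norms = np.linalg.norm(vectors, axis=0)
    norms[norms == 0.0] = 1.0  # leave all-zero columns as zero vectors
    unit = vectors / norms
    return unit.T @ unit


def lsa_similarity(counts, k):
    """Cosine similarity between sentences in a rank-k latent space."""
    # Truncated SVD: counts ~= U_k diag(s_k) Vt_k
    _, s, vt = np.linalg.svd(counts, full_matrices=False)
    k = min(k, len(s))
    # Coordinates of each sentence in the k-dimensional latent space.
    latent = s[:k, None] * vt[:k, :]
    return cosine_matrix(latent)


# Toy data (assumed for illustration only).
sentences = [
    "the cat sat on the mat",
    "a cat chased the mouse",
    "stock markets fell sharply today",
    "investors sold stock as markets fell",
]

counts = term_sentence_matrix(sentences)
raw_sim = cosine_matrix(counts)         # plain cosine metric on term counts
lsa_sim = lsa_similarity(counts, k=2)   # cosine in the reduced LSA space

print(np.round(raw_sim, 2))
print(np.round(lsa_sim, 2))
```

In this sketch, keeping every singular value reproduces the plain cosine matrix exactly, so any difference between the two measures comes from the rank reduction. A segmenter would then search this similarity matrix for boundary positions.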
Original language | English |
---|---|
Title of host publication | Proceedings of the Conference on Empirical Methods in Natural Language Processing |
Pages | 109-117 |
Number of pages | 9 |
Publication status | Published - 2001 |