OpenLID-v2 (180325)

  • Laurie Burchell (Creator)

Dataset

Description

OpenLID-v2 is a high-coverage, high-performance language identification model covering 200 language classes. Its labels are compatible with the FLORES+ test set.

Data Citation

Burchell, L. (2025). OpenLID-v2 (180325) (2.0.1). Zenodo. https://doi.org/10.5281/zenodo.15056559
Date made available: 18 Mar 2025
Publisher: Zenodo
  • An open dataset and model for language identification

    Burchell, L., Birch, A., Bogoychev, N. & Heafield, K., 9 Jul 2023, Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Rogers, A., Boyd-Graber, J. & Okazaki, N. (eds.). Toronto, Canada: Association for Computational Linguistics, p. 865-879 (15 p.). (Proceedings of the ACL Conference).

    Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

    Open Access