New version of Glossa
The Text Laboratory is currently developing a new version of the search and post-processing tool Glossa.
The new Glossa version offers a modern, simple and functional search interface with advanced post-processing possibilities for written corpora, multilingual corpora and speech corpora. The new version of Glossa is so easy to install that institutions and researchers can create their own corpora and host them on their own server or even on a laptop.
The new version can also be used to search corpora located on servers other than the one where Glossa itself is installed. This is made possible by CLARIN Federated Content Search.
Glossa also has a modern interface that can easily be themed for individual Glossa installations. The new version offers several variants of the search interface: alongside one similar to the interface of the old Glossa version, there is a simpler (Google-like) interface and a more advanced interface that allows the use of regular expressions.
Work on the new version of Glossa is funded by CLARINO, and the entire system is open source software that can be freely downloaded. The system is still under development, and some of the functionality of the old version has not yet been transferred to the new one. The new version offers login through Feide and eduGAIN and has a system for authorization of different corpus licenses.
More and more corpora from the Text Laboratory will use the new version of Glossa.
Read a user guide for the new version of Glossa: User guide for the NORINT Corpus (in Norwegian)
The new Glossa was selected as a showcase during the annual CLARIN meeting 2013 in Prague. See the presentation on the CLARIN web page.
Nøklestad, Anders, Kristin Hagen, Janne Bondi Johannessen, Michal Kosek and Joel Priestley. 2017. A modernised version of the Glossa corpus search system. In Jörg Tiedemann (ed.): Proceedings of the 21st Nordic Conference on Computational Linguistics (NoDaLiDa), 251–254. Read the paper.
Kosek, Michal, Anders Nøklestad, Joel Priestley, Kristin Hagen and Janne Bondi Johannessen. 2015. Visualisation in speech corpora: maps and waves in the Glossa system. In Gintarė Grigonytė, Simon Clematide, Andrius Utka and Martin Volk (eds.): Proceedings of the Workshop on Innovative Corpus Query and Visualization Tools at NODALIDA 2015, May 11-13, 2015, Vilnius, Lithuania. NEALT Proceedings Series 25, 23–31. Read the paper.