Glossa

New version of the search and post-processing tool Glossa.

Search interface for Nordic Dialect Corpus

The new version of the Glossa system was developed with funding from the infrastructure project CLARINO+. The main focus of this work has been on developing different solutions for searching and for presenting results.

NEWS

See how to use the new version of Glossa in two instructional videos (in Norwegian):

Video 1: Simple and advanced searches, selecting metadata, and results shown as concordances
Video 2: Results shown as maps, frequency lists and metadata distributions; selecting metadata using a map; and exploring a corpus with Voyant, a web-based application for text analysis

See available corpora in the new version of Glossa today

Download the new version of Glossa from GitHub

More about Glossa

Glossa offers a modern, simple and functional search interface with advanced post-processing possibilities for written corpora, multilingual corpora and speech corpora.

Glossa is easy to install, so institutions and researchers can create their own corpora and host them on their own server or even on a laptop.

Glossa can also be used to search corpora located on different servers from the one where Glossa itself is installed. This is possible using CLARIN federated content search.
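As a rough illustration of what a federated search request can look like: CLARIN federated content search builds on the SRU protocol, where a client sends a `searchRetrieve` request to a remote endpoint. The sketch below only assembles such a request URL. The endpoint address and the helper name `build_fcs_url` are hypothetical, and this is not taken from Glossa's own code.

```python
from urllib.parse import urlencode


def build_fcs_url(endpoint: str, query: str, max_records: int = 10) -> str:
    """Assemble an SRU searchRetrieve URL for a remote search endpoint.

    The endpoint is an assumption supplied by the caller; the parameter
    names (operation, version, query, maximumRecords) come from SRU 1.2.
    """
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        "query": query,
        "maximumRecords": max_records,
    }
    # urlencode percent-encodes the query so quotes and spaces are URL-safe.
    return f"{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical endpoint, used only for illustration.
    print(build_fcs_url("https://example.org/fcs", '"hus"', max_records=5))
```

The remote server answers with an XML document containing the matching records, which the client then parses and displays alongside results from other endpoints.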

In addition, Glossa has a modern interface that can easily be themed for individual Glossa installations and metadata menus. Glossa offers several versions of the search interface: a simple (Google-like) interface; a more advanced interface with menus, check boxes, etc., for lemma or POS searches, for example; and an interface that allows the use of regular expressions.
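To make the difference between the three interfaces concrete, the sketch below shows how a simple phrase, an attribute search and a regular-expression search could each be turned into a token-level query string. It assumes a CQP-style query syntax and the attribute names `word`, `lemma` and `pos`; both are assumptions made for illustration, and the function names are hypothetical rather than part of Glossa.

```python
import re


def escape_token(token: str) -> str:
    """Backslash-escape regex metacharacters and double quotes so the
    token is matched literally inside a quoted query value."""
    return re.sub(r'([\\"^$.|?*+()\[\]{}])', r'\\\1', token)


def simple_to_query(phrase: str, ignore_case: bool = True) -> str:
    """Simple interface: one literal word condition per whitespace-separated token."""
    flag = " %c" if ignore_case else ""
    return " ".join(f'[word="{escape_token(t)}"{flag}]' for t in phrase.split())


def attribute_query(regex: bool = False, **attrs: str) -> str:
    """Advanced interface: combine attribute conditions on a single token.

    With regex=True the values are passed through unescaped, which mirrors
    the interface that allows regular expressions.
    """
    conditions = " & ".join(
        f'{name}="{value if regex else escape_token(value)}"'
        for name, value in attrs.items()
    )
    return f"[{conditions}]"


if __name__ == "__main__":
    print(simple_to_query("the house"))
    print(attribute_query(lemma="go", pos="verb"))
    print(attribute_query(regex=True, word="hus.*"))
```

Escaping by default keeps the simple and menu-based searches literal, so that a user typing a full stop or a parenthesis does not accidentally write a pattern; only the regular-expression path leaves the input untouched.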

Glossa offers login through Feide and CLARIN/eduGAIN and has a system for authorization of different corpus licenses.

Work on Glossa has been funded by CLARINO+, CLARINO and the LIA project, and the entire system is open source software that can be freely downloaded.

Download the previous version of Glossa from GitHub

Most corpora have user guides in English or Norwegian. Read, for example, the user guide for LIA Sápmi or the Nordic Dialect Corpus.

References:

Nøklestad, Anders, Kristin Hagen, Janne Bondi Johannessen, Michal Kosek and Joel Priestley. 2017. A modernised version of the Glossa corpus search system. In Jörg Tiedemann (ed.): Proceedings of the 21st Nordic Conference on Computational Linguistics (NoDaLiDa), 251–254. Read the paper.

Kosek, Michal, Anders Nøklestad, Joel Priestley, Kristin Hagen and Janne Bondi Johannessen. 2015. Visualisation in speech corpora: maps and waves in the Glossa system. In Gintarė Grigonytė, Simon Clematide, Andrius Utka and Martin Volk (eds.): Proceedings of the Workshop on Innovative Corpus Query and Visualization Tools at NODALIDA 2015, May 11-13, 2015, Vilnius, Lithuania, NEALT Proceedings Series 25, 23–31. Read the paper.

Glossa was selected as a showcase at the annual CLARIN meeting in Prague in 2013. See the presentation on the CLARIN web page.

Published Oct. 7, 2016 3:47 PM - Last modified Nov. 6, 2023 2:12 PM