Automatic genre identification: a survey Taja Kuzman, Nikola Ljubešić Survey Open access 16 November 2023
Correction: The DELAD initiative for sharing language resources on speech disorders Alice Lee, Nicola Bessell, Satu Saalasti Correction Open access 06 November 2023
LoNLI: An Extensible Framework for Testing Diverse Logical Reasoning Capabilities for NLI Ishan Tarunesh, Somak Aditya, Monojit Choudhury Original Paper 04 November 2023 Pages: 427 - 458
Building the VisSE Corpus of Spanish SignWriting Antonio F. G. Sevilla, Alberto Díaz Esteban, José María Lahoz-Bengoechea Original Paper 26 October 2023 Pages: 585 - 607
Beyond plain toxic: building datasets for detection of flammable topics and inappropriate statements Nikolay Babakov, Varvara Logacheva, Alexander Panchenko Original Paper 21 October 2023 Pages: 459 - 504
Text augmentation for semantic frame induction and parsing Saba Anwar, Artem Shelmanov, Chris Biemann Original Paper Open access 21 October 2023 Pages: 363 - 408
A new corpus of geolocated ASR transcripts from Germany Steven Coats Project Notes Open access 21 October 2023
NILC-Metrix: assessing the complexity of written and spoken language in Brazilian Portuguese Sidney Evaldo Leal, Magali Sanches Duran, Sandra Maria Aluísio Survey 17 October 2023 Pages: 73 - 110
A semi-supervised method to generate a Persian dataset for suggestion classification Leila Safari, Zanyar Mohammady Original Paper 29 September 2023 Pages: 839 - 858
NEREL: a Russian information extraction dataset with rich annotation for nested entities, relations, and wikidata entity links Natalia Loukachevitch, Ekaterina Artemova, Alexey Yandutov Original Paper 21 September 2023 Pages: 547 - 583
A survey and study impact of tweet sentiment analysis via transfer learning in low resource scenarios Manoel Veríssimo dos Santos Neto, Nádia Félix F. da Silva, Anderson da Silva Soares Original Paper 14 September 2023 Pages: 133 - 174
An eye-tracking-with-EEG coregistration corpus of narrative sentences Stefan L. Frank, Anna Aumeistere Original Paper Open access 29 August 2023 Pages: 641 - 657
Data augmentation strategies to improve text classification: a use case in smart cities Luciana Bencke, Viviane Pereira Moreira Original Paper 23 August 2023 Pages: 659 - 694
The development of a labelled te reo Māori–English bilingual database for language technology Jesin James, Isabella Shields, Keoni Mahelona Original Paper 20 August 2023
Comparative performance of ensemble machine learning for Arabic cyberbullying and offensive language detection Marwa Khairy, Tarek M. Mahmoud, Tarek Abd El-Hafeez Original Paper Open access 13 August 2023 Pages: 695 - 712
RUN-AS: a novel approach to annotate news reliability for disinformation detection Alba Bonet-Jover, Robiert Sepúlveda-Torres, Mario Nieto-Pérez Original Paper Open access 06 August 2023 Pages: 609 - 639
Fine-tuning language models to recognize semantic relations Dmitri Roussinov, Serge Sharoff, Nadezhda Puchnina Original Paper Open access 23 July 2023 Pages: 1463 - 1486
Assessment of pragmatic abilities and cognitive substrates (APACS) brief remote: a novel tool for the rapid and tele-evaluation of pragmatic skills in Italian Luca Bischetti, Chiara Pompei, Valentina Bambini Original Paper 23 July 2023
The limitations of irony detection in Dutch social media Aaron Maladry, Els Lefever, Véronique Hoste Original Paper Open access 23 July 2023
MarIA and BETO are sexist: evaluating gender bias in large language models for Spanish Ismael Garrido-Muñoz, Fernando Martínez-Santiago, Arturo Montejo-Ráez Original Paper Open access 23 July 2023
The Visual Language Research Corpus (VLRC): an annotated corpus of comics from Asia, Europe, and the United States Neil Cohn, Bruno Cardoso, Irmak Hacımusaoğlu Original Paper Open access 14 July 2023 Pages: 1729 - 1744
adaptNMT: an open-source, language-agnostic development environment for neural machine translation Séamus Lankford, Haithem Afli, Andy Way Original Paper Open access 14 July 2023 Pages: 1671 - 1696
FullStop: punctuation and segmentation prediction for Dutch with transformers Vincent Vandeghinste, Oliver Guhr Original Paper 14 July 2023
Evaluation of a rule-based approach to automatic factual question generation using syntactic and semantic analysis Angelina Gašpar, Ani Grubišić, Ines Šarić-Grgić Original Paper 10 July 2023 Pages: 1431 - 1461
Sentiment analysis in Portuguese tweets: an evaluation of diverse word representation models Daniela Vianna, Fernando Carneiro, Aline Paes Original Paper 28 June 2023 Pages: 223 - 272
The C-ORAL-ESQ project: a corpus for the study of spontaneous speech of individuals with schizophrenia Tommaso Raso, Bruno Neves Rati de Melo Rocha, Heliana Mello Original Paper 27 June 2023
CachacaNER: a dataset for named entity recognition in texts about the cachaça beverage Priscilla Silva, Arthur Franco, Denilson Pereira Original Paper 17 June 2023
The robotic-surgery propositional bank Marco Bombieri, Marco Rospocher, Paolo Fiorini Original Paper Open access 13 June 2023
The CLARIN infrastructure as an interoperable language technology platform for SSH and beyond A. Branco, M. Eskevich, C. Zinn Original Paper Open access 12 June 2023
A benchmark dataset and evaluation methodology for Chinese zero pronoun translation Mingzhou Xu, Longyue Wang, Zhaopeng Tu Original Paper 10 June 2023 Pages: 1263 - 1293
Using BERT models for breast cancer diagnosis from Turkish radiology reports Pınar Uskaner Hepsağ, Selma Ayşe Özel, Adnan Yazıcı Original Paper 10 June 2023
Content-free speech activity records: interviews with people with schizophrenia Francesco Cangemi, Martine Grice, Kai Vogeley Original Paper Open access 07 June 2023
OMCD: Offensive Moroccan Comments Dataset Kabil Essefar, Hassan Ait Baha, Ismail Berrada Original Paper 05 June 2023 Pages: 1745 - 1765
Assessing linguistic generalisation in language models: a dataset for Brazilian Portuguese Rodrigo Wilkens, Leonardo Zilio, Aline Villavicencio Original Paper 02 June 2023 Pages: 175 - 201
The DELAD initiative for sharing language resources on speech disorders Alice Lee, Nicola Bessell, Satu Saalasti Project Notes Open access 14 May 2023
PolitePEER: does peer review hurt? A dataset to gauge politeness intensity in the peer reviews Prabhat Kumar Bharti, Meith Navlakha, Asif Ekbal Original Paper 14 May 2023
Automatic language identification: a case study of Pahari languages Rachana Gusain, Satya Ranjan Dash, Girish Nath Jha Special Focus: Applications of established methods to new language 12 May 2023 Pages: 1361 - 1387
Evaluation of the Brazilian Portuguese version of linguistic inquiry and word count 2015 (BP-LIWC2015) Flavio Carvalho, Fabio Paschoal Junior, Gustavo Guedes Original Paper 03 May 2023 Pages: 203 - 222
CsFEVER and CTKFacts: acquiring Czech data for fact verification Herbert Ullrich, Jan Drchal, Václav Moravec Original Paper Open access 03 May 2023 Pages: 1571 - 1605
A study on methods for revising dependency treebanks: in search of gold Cláudia Freitas, Elvis de Souza Original Paper 03 May 2023 Pages: 111 - 131
Human-inspired computational models for European Portuguese: a review António Teixeira, Samuel Silva Survey Open access 03 May 2023 Pages: 43 - 72
Automatic generation of creative text in Portuguese: an overview Hugo Gonçalo Oliveira Survey Open access 03 May 2023 Pages: 7 - 41
Lexical modeling for the development of Amharic automatic speech recognition systems Martha Yifiru Tachbelie, Solomon Teferra Abate Original Paper 03 May 2023 Pages: 963 - 984
OLID-BR: offensive language identification dataset for Brazilian Portuguese Douglas Trajano, Rafael H. Bordini, Renata Vieira Original Paper 03 May 2023
Rant or rave: variation over time in the language of online reviews Yftah Ziser, Bonnie Webber, Shay B. Cohen Original Paper Open access 31 March 2023 Pages: 1329 - 1359
Blackfoot Words: a database of Blackfoot lexical forms Natalie Weber, Tyler Brown, Lena Venkatraman Original Paper Open access 31 March 2023 Pages: 1207 - 1262
Finnish parliament ASR corpus Anja Virkkunen, Aku Rouhe, Mikko Kurimo Original Paper Open access 27 March 2023 Pages: 1645 - 1670
Design and construction of Guayaquil radio speech corpus (CHARG) Brygida Sawicka-Stępińska Project Notes Open access 25 March 2023 Pages: 1405 - 1422
Hope speech detection in Spanish Daniel García-Baena, Miguel Ángel García-Cumbreras, Rafael Valencia-García Original Paper Open access 17 March 2023 Pages: 1487 - 1514