Projects

COST KGELL

Knowledge Graphs in the Era of Large Language Models (KGELL): Knowledge Graphs (KGs) have gained attention due to their ability to represent structured and interlinked information. KGs represent knowledge in the form of relations between entities, referred to as facts, typically grounded in formal ontological models. Such machine-readable formats enable AI systems to make decisions using clear and verifiable data. Consequently, KGs have become essential elements in web search engines, recommendation systems, etc. Large Language Models (LLMs) have revolutionized the landscape of AI and are widely utilized in various NLP tasks such as natural language understanding, question answering, etc. Despite their remarkable performance, LLMs suffer from some significant drawbacks. First, they are trained on general-purpose data and have lower performance in domain-specific tasks and low-resource languages. Second, they often reflect societal biases present in training data, which can result in biased outcomes. Third, LLMs sometimes produce inaccurate or made-up information, termed “hallucinations”. Finally, understanding the decision-making process of LLMs is challenging and their outputs may lack consistency. A potential solution to all these problems is to integrate LLMs with KGs, since KGs can provide factual information and the ability to perform reasoning. This would boost the LLM’s domain-specific reasoning and interpretability, and mitigate biases and hallucinations. A notable challenge with KGs is their requirement for frequent updates, usually performed by processing and integrating information from vast textual datasets; here, LLMs can aid in generating and refining KGs. Therefore, combining LLMs and KGs offers a promising opportunity to advance both technologies and represents a pivotal challenge in the contemporary research landscape.
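As a rough illustration of what "facts as relations between entities" can look like in practice, the sketch below stores facts as (subject, predicate, object) triples and checks whether a claimed fact is supported by the graph. It is a minimal sketch, not the KGELL approach: the class name `TripleStore`, its methods, and the sample facts are all hypothetical and chosen only for illustration.

```python
class TripleStore:
    """Tiny in-memory knowledge graph of (subject, predicate, object) facts.

    Hypothetical helper for illustration only; real KGs typically use RDF
    stores or graph databases grounded in an ontology.
    """

    def __init__(self):
        # Maps each subject to the set of (predicate, object) pairs about it.
        self._facts = {}

    def add(self, subject, predicate, obj):
        """Record one fact in the graph."""
        self._facts.setdefault(subject, set()).add((predicate, obj))

    def objects(self, subject, predicate):
        """Return all objects linked to `subject` by `predicate`, sorted."""
        pairs = self._facts.get(subject, set())
        return sorted(o for p, o in pairs if p == predicate)

    def supports(self, subject, predicate, obj):
        """Return True only if this exact fact is stored in the graph."""
        return (predicate, obj) in self._facts.get(subject, set())


# Sample facts (assumed for the example).
kg = TripleStore()
kg.add("Rijeka", "located_in", "Croatia")
kg.add("Croatia", "member_of", "European Union")

# A generated claim can be checked against the stored facts.
claim_ok = kg.supports("Rijeka", "located_in", "Croatia")
claim_bad = kg.supports("Rijeka", "located_in", "Italy")
```

In this toy setting, a statement produced by a language model would be accepted only when `supports` returns `True`, which is one simple way a KG could act as a verifiable source of facts.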

LangNet-KG

LangNet-KG: Knowledge Graph for Climate Change Research (LangNet-KG: Graf znanja o klimatskim promjenama): Global warming and climate change pose an increasing challenge to the planet, ecosystems, and society, causing extreme weather events. At the same time, digital resources for studying climate change-related texts are fragmented, incomplete, or nonexistent. Therefore, developing a climate change knowledge graph that integrates scientific insights contributes to the advancement of new methods in information sciences, provides valuable resources for climate research, and opens possibilities for applications in education, public policy, and sustainable development. The goal of the LangNet-KG project is to create a knowledge graph that connects key entities and relations, enabling the representation of climate change knowledge. The research is based on a corpus of scientific texts, but existing NLP resources and methods are not tailored to this domain. The project will develop digital resources (corpus, named entities, relations, and a knowledge graph) for automated information extraction from scientific papers related to climate change research, and will create machine learning and deep learning methods for their analysis. A corpus of high-ranked open-access scientific papers will be used to train deep language models (BERT, DistilRoBERTa). The methods will include named entity recognition, relation extraction, and knowledge graph construction, which will be tested on challenges related to reducing misinformation and hallucinations in text generation using large language models (LLMs). LangNet-KG is "Funded by European Union - NextGenerationEU".

DIGITAL4Security

DIGITAL4Security – European Masters Programme in Cybersecurity Management & Data Sovereignty: DIGITAL4Security is an innovative and market-driven European Masters Programme dedicated to Cybersecurity Management & Data Sovereignty. This program is strategically designed to empower European SMEs and companies across diverse sectors with the necessary cybersecurity management, regulatory, and technical skills. By doing so, it aims to proactively address and counteract existing and emerging cybersecurity threats, thereby fortifying European industries against potential cyber-attacks. The Digital4Security Programme is inherently market and industry demand-driven, continually adapting to address current and future cybersecurity risks. It is geared towards supporting European companies, particularly SMEs, in minimizing security risks, constructing robust defenses, and adeptly managing any cyber incidents that may arise. By aligning with the objectives of the DIGITAL Europe Programme, DIGITAL4Security seeks to expedite the entry of a significant number of graduates into high-demand roles outlined by the European Cybersecurity Skills Framework (ECSF). This strategic approach ensures that critical occupational profiles, vital for the ongoing security and success of European businesses, are filled. The aim of Digital4Security is to reskill and upskill graduates, professionals, managers, and business leaders, fostering a state of 'Cyber Confidence.' Through comprehensive cybersecurity management expertise, we aim to empower individuals to strengthen their cybersecurity infrastructure and implement resilient incident prevention and management procedures. At a global level, this project aspires to design and implement a highly innovative, effective, and sustainable European Cybersecurity Masters Programme.
The program's primary goal is to consistently produce qualified cybersecurity management experts, actively contributing to closing the widening cybersecurity skills gap that poses a threat to the stability of numerous European industries and public sector institutions.

EDIH Adria

The European Digital Innovation Hub EDIH Adria is based on two key technologies for which there is a need in the Republic of Croatia: artificial intelligence (AI) and high-performance computing (HPC). The proposed consortium has significant existing competencies, resources, knowledge and capacities for knowledge transfer towards target users. EDIH Adria is an agile consortium created for user-centric performance, as a partnership consisting of two scientific research institutions (University of Rijeka and University of Pula), companies that are global technology leaders (Ericsson Nikola Tesla and Infobip, the first Croatian unicorn), a business support institution (Step Ri, a leading Croatian business support institution) and a technology & innovation cluster (Smart RI, the national competence centre for smart cities). EDIH Adria will implement AI and HPC in the following sectors: health and quality of life (including health tourism), transport and mobility (with emphasis on the maritime sector), and energy and sustainable development. Covering both the private and public sectors in Adriatic Croatia, the foreseen activities include an elaborate set of digitalization services and training for users, as well as networking and cooperation throughout Croatia and with other EDIHs in the European Union. These activities encourage the development of an ecosystem in which entrepreneurs, employers' associations and support institutions on one side, the academic and education sector on the second, and the public sector and policy makers on the third, with financial support, form the basis for creating the Hub and the innovation ecosystem around it.

CEEPUS Development of Computational Thinking

The network is focused on the development of computational thinking. Each partner institution supports the development of specific components of computational thinking through a wide range of study programs and technical backgrounds, and thus contributes to comprehensive development not only for students but also for teachers. Among the offered study areas are, for example, algorithmization and programming, automation and robotics, virtual and augmented reality, artificial intelligence, etc. Within the network, it is possible to exchange not only students at all levels of higher education but also teachers. A key aspect is the cooperation of individual institutions to exchange examples of good practice, experience, views on the issues addressed, etc., in order to achieve the most effective combinations of approaches to the development of computational thinking. The network is also fully prepared for virtual and hybrid mobility thanks to its own e-learning platform.

HRZZ InfoCoV

Multilayer Framework for the Information Spreading Characterization in Social Media during the COVID-19 Crisis: InfoCoV is a research project funded by the Croatian Science Foundation. Communication through social media has been gaining importance in responses to major crises, such as COVID-19. In emergency situations, there is an urgent need to rely on trustworthy information. On the other hand, we are all witnessing a huge amount of misinformation (fake news, conspiracy theories) also spreading on social media, especially during a crisis. In this light, understanding and recognising the information spreading patterns in social media plays an important role and opens various possibilities for alleviating fear, stereotyping and uncertainty, and for strengthening responsible individual and group behaviour and trust in public authorities in social media communications. The automatic recognition of information spreading patterns may improve various aspects of crisis communication, such as: the classification of positive and negative public attitudes to certain policies and restrictions; choosing the best communication patterns to promote important information in social media; detecting, predicting and preventing the spread of fake news; and many more. Hence, it is important to recognise how different types of information are transmitted and dispersed through social media during crisis communication. The first step toward understanding the information spreading patterns is to perform a quantitative and qualitative analysis of textual information in social media and to identify which characteristics of information spreading can differentiate between various information spreading patterns. The main objective of the proposed research is to study and characterise information spreading patterns in social media during the COVID-19 pandemic.

H2020 MESOC

H2020 MESOC is a Research and Innovation Action designed to propose, test and validate an innovative and original approach to measuring the societal value and impacts of culture and cultural policies and practices, related to three crossover themes of the new European Agenda for Culture: 1) Health and Wellbeing, 2) Urban and Territorial Renovation and 3) People’s Engagement and Participation. The global aim is to respond to the challenge posed by the H2020 Call (“To develop new perspectives and improved methodologies for capturing the wider societal value of culture, including but also beyond its economic impact”). To do so, MESOC adapts and further develops a method for “transition based” impact assessment derived from a previous UNESCO Chair publication, building a structural model of the Societal Dimension of Culture, as defined by one of the strategic objectives of the European Agenda. The model will be tested within 10 European city pilots: Athens, Barcelona, Cluj, Gent, Issy-les-Moulineaux, Milano, Rijeka, Turku, Valencia and Warsaw.

COST Multi3Generation

Multi3Generation: Multi-task, Multilingual, Multi-modal Language Generation: Language generation (LG) is a crucial technology if machines are to communicate with humans seamlessly using human natural language. A great number of different tasks within Natural Language Processing (NLP) are language generation tasks, and being able to effectively perform these tasks implies (1) that machines are equipped with world knowledge that can require multi-modal processing and reasoning (e.g. textual, visual and auditory inputs, or sensory data streams), and (2) the study of strong, novel Machine Learning (ML) methods (e.g. structured prediction, generative models), since virtually all state-of-the-art NLP models are learned from data. Moreover, human languages can differ wildly in their surface realisation (i.e. scripts) as well as their internal structure (i.e. grammar), which suggests that multilinguality is a central goal if machines are to perform seamless language generation. Language generation technologies would greatly benefit both public and private services offered to EU citizens in a multilingual Europe, and have strong economic and societal impacts.

Keyword Extraction and Summarization Based on Language Networks - LangNet

We live in a time of exponential growth of data, a big data era in which texts represent a significant portion of unstructured data sources that require advanced computing techniques for processing, fast indexing, retrieval, classification, and even fast reading enabled by text summarization. The design and development of new techniques for keyword extraction and document summarization are at the very core of natural language processing of large quantities of texts. Recently, besides complex networks, deep learning methods and multi-layered neural networks have been gaining traction in the research community, leading to success in various applications of artificial intelligence. In the proposed research, we are planning to investigate various representations of texts based on new and innovative combinations of complex networks and deep neural networks for solving the problems of keyword extraction and document summarization. Besides English, the reference language in the field, the research will be conducted for the Croatian language as well. The goal of this research is to develop new methods for keyword extraction and extractive summarization enabled by the formalisms of complex networks and neural networks. The comparative analysis of the performance of the proposed methods in terms of their effectiveness and correctness will be performed on English and Croatian texts.

COST COSTNET

COST COSTNET: A major challenge in many modern economic, epidemiological, ecological and biological questions is to understand the randomness in the network structure of the entities they study: for example, the SARS epidemic showed how preventing epidemics relies on a keen understanding of random interactions in social networks, whereas progress in curing complex diseases is aided by a robust data-driven network approach to biology. Although analysis of data on networks goes back to at least the 1930s, the importance of statistical network modelling for many areas of substantive science has only been recognised in the past decade. The USA is at the forefront of institutionalizing this field of science through various interdisciplinary projects and networks. Europe also has excellent statistical network scientists, but until now cross-disciplinary collaboration has been slow. This Action aims to facilitate interaction and collaboration between diverse groups of statistical network modellers, establishing a large, vibrant, interconnected and inclusive community of network scientists. The aim of this interdisciplinary Action is two-fold. On the scientific level, the aim is to critically assess commonalities and opportunities for cross-fertilization of statistical network models in various applications, with particular attention to scalability in the face of Big Data. On a meta-level, the aim is to create a broad community which includes researchers across the whole of Europe and at every stage in their scientific career and to facilitate contact with stakeholders.
COSTNET 1 – Causation
COSTNET 2 – Monte Carlo
COSTNET 3 – Radicalisation
COSTNET 4 – Evolution
COSTNET 5 – COVID
COSTNET 6 – SOAM

Language Networks - LangNet, University of Rijeka grant

Written as well as spoken language can be modeled via complex networks where the lingual units (words) are represented by vertices and their linguistic interactions by links. Language networks are a powerful formalism for the quantitative study of language structure at various language sublevels: phonological, morphological, syntactic or semantic. Language network analysis enables: the examination of structural complexities of each language sublevel and their mutual interactions; the systematic investigation of language evolution; the modeling of language acquisition; the modeling of mental lexicons; the assessment of text quality; authorship attribution; and the disambiguation of a word’s meaning in a semantic context. The aim of the LangNet project is to design the methodology for complexity evaluation across language levels using complex networks by establishing an information science bridge between linguistics, complex networks and natural language processing. The word-level and subword-level language networks will be constructed from various Croatian texts, lexicons and dictionaries. So far, there have been no systematic efforts to model the phenomena of various Croatian language subsystems and examine their functions through complex networks. Obtaining such findings is critical for deepening our understanding of conceptual similarities, differences and universalities in natural languages. The proposed methodology will reveal the currently unavailable structural properties of the Croatian language at the subword and word levels: phonological, phonetic, syllabic; co-occurrences and syntax. Language network analysis can be further extended in the direction of intelligent applications in the field of natural language processing.
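The idea of words as vertices and linguistic interactions as links can be sketched with a simple word co-occurrence network. This is a minimal sketch under stated assumptions, not the LangNet methodology: the function names `cooccurrence_network` and `vertex_degrees`, the `window` parameter, the naive whitespace tokenization, and the sample sentence are all hypothetical choices made for illustration.

```python
from collections import Counter


def cooccurrence_network(text, window=1):
    """Build an undirected, weighted word co-occurrence network.

    Vertices are lowercased words. An edge links two different words when
    one appears within the next `window` positions after the other; the
    edge weight counts how often that happens.
    """
    # Naive tokenization (assumption): split on whitespace, trim punctuation.
    words = [w.strip(".,;:!?()\"'").lower() for w in text.split()]
    words = [w for w in words if w]

    edges = Counter()
    for i, word in enumerate(words):
        for other in words[i + 1 : i + 1 + window]:
            if other != word:
                # Sort the pair so (a, b) and (b, a) are the same edge.
                edges[tuple(sorted((word, other)))] += 1
    return edges


def vertex_degrees(edges):
    """Count the number of distinct neighbours of each vertex."""
    degree = Counter()
    for first, second in edges:
        degree[first] += 1
        degree[second] += 1
    return degree


# Sample sentence (assumed for the example).
network = cooccurrence_network("Language networks model language structure.")
degree = vertex_degrees(network)
```

Here the word "language" is adjacent to three different words, so it has the highest degree in the network; structural measures of this kind are the sort of quantities a language network analysis would examine.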

COST Keystone

COST Keystone is a cooperative network of researchers, practitioners, and application domain specialists working in fields related to semantic data management, the Semantic Web, information retrieval, artificial intelligence, machine learning and natural language processing. It coordinates collaboration among them to enable research activity and technology transfer in the area of keyword-based search over structured data sources. The coordination effort will promote the development of a new paradigm that provides users with keyword-based search capabilities for structured data sources, as they currently have for documents. Furthermore, it will exploit the structured nature of data sources in defining complex query execution plans by combining partial contributions from different sources.

COST iV&L Net

The explosive growth of visual and textual data (both on the World Wide Web and held in private repositories by diverse institutions and companies) has led to urgent requirements in terms of search, processing and management of digital content. Solutions for providing access to or mining such data depend on the semantic gap between vision and language being bridged, which in turn calls for expertise from two so far unconnected fields: Computer Vision (CV) and Natural Language Processing (NLP). The central goal of iV&L Net is to build a European CV/NLP research community, targeting 4 focus themes: (i) Integrated Modelling of Vision and Language for CV and NLP Tasks; (ii) Applications of Integrated Models; (iii) Automatic Generation of Image & Video Descriptions; and (iv) Semantic Image & Video Search. iV&L Net will organise annual conferences, technical meetings, partner visits, data/task benchmarking, and industry/end-user liaison. Europe has many of the world's leading CV and NLP researchers. Tapping into this expertise, and bringing the collaboration, networking and community building enabled by COST Actions to bear, iV&L Net will have substantial impact, in terms of advances in both theory/methodology and real-world technologies.
