Multi3Generation: Multi-task, Multilingual, Multi-modal Language Generation
Language generation (LG) is a crucial technology if machines are to communicate with humans seamlessly using human natural language. A great number of different tasks within Natural Language Processing (NLP) are language generation tasks, and performing these tasks effectively requires (1) that machines be equipped with world knowledge, which can demand multi-modal processing and reasoning (e.g. textual, visual and auditory inputs, or sensory data streams), and (2) the study of strong, novel Machine Learning (ML) methods (e.g. structured prediction, generative models), since virtually all state-of-the-art NLP models are learned from data. Moreover, human languages can differ wildly in their surface realisation (i.e. scripts) as well as their internal structure (i.e. grammar), which suggests that multilinguality is a central goal if machines are to perform seamless language generation. Language generation technologies would greatly benefit both public and private services offered to EU citizens in a multilingual Europe, and would have strong economic and societal impacts.
Keyword Extraction and Summarization Based on Language Networks - LangNet
We live in a time of exponential growth of data – in a big data era,
where texts represent a significant portion of unstructured data
sources, which require advanced computing techniques for processing,
fast indexing, retrieval, classification, even fast reading enabled by
text summarization. The design and development of new techniques for
keyword extraction and document summarization are at the very core of
natural language processing of large quantities of text. Recently,
besides complex networks, deep learning methods and multi-layered neural
networks have been gaining traction in the research community, leading to
success in various applications of artificial intelligence. In the
proposed research, we are planning to investigate various
representations of texts based on new and innovative combinations of
complex networks and deep neural networks for solving the problems of
keyword extraction and document summarization. Besides English, the
reference language in the field, the research will also be conducted for
the Croatian language. The goal of this research is to develop new
methods for keyword extraction and extractive summarization enabled by
the formalisms of complex networks and neural networks. The comparative
analysis of the performance of the proposed methods in terms of their
effectiveness and correctness will be performed on English and Croatian
COST COSTNET
A major challenge in many modern economic, epidemiological, ecological and biological questions is to understand the randomness in the network structure of the entities they study: for example, the SARS epidemic showed how preventing epidemics relies on a keen understanding of random interactions in social networks, whereas progress in curing complex diseases is aided by a robust data-driven network approach to biology.
Although analysis of data on networks goes back to at least the 1930s, the importance of statistical network modelling for many areas of substantial science has only been recognised in the past decade. The USA is at the forefront of institutionalizing this field of science through various interdisciplinary projects and networks. Also in Europe there are excellent statistical network scientists, but until now cross-disciplinary collaboration has been slow.
This Action aims to facilitate interaction and collaboration between diverse groups of statistical network modellers, establishing a large, vibrant, interconnected and inclusive community of network scientists. The aim of this interdisciplinary Action is two-fold. On the scientific level, the aim is to critically assess commonalities and opportunities for cross-fertilization of statistical network models in various applications, with particular attention to scalability in the face of Big Data. On a meta-level, the aim is to create a broad community that includes researchers across the whole of Europe and at every stage of their scientific careers, and to facilitate contact with stakeholders.
Language Networks - LangNet, University of Rijeka grant
Written as well as spoken language can be modeled via complex networks where the linguistic units (words) are represented by vertices and their
linguistic interactions by links. Language networks are a powerful formalism for the quantitative study of language structure at various
language sublevels: phonological, morphological, syntactic or semantic. Language network analysis enables: the examination of structural
complexities of each language sublevel and their mutual interactions; the systematic investigation of language evolution; the modeling of language
acquisition; the modeling of mental lexicons; the assessment of text quality; authorship attribution; and the disambiguation of a word’s meaning in a
semantic context. The aim of the LangNet project is to design the methodology for complexity evaluation across language levels using complex
networks by establishing an information science bridge between linguistics, complex networks and natural language processing. The
word-level and subword-level language networks will be constructed from various Croatian texts, lexicons and dictionaries. So far, there have been
no systematic efforts to model the phenomena of various Croatian language subsystems and examine their functions through complex networks. Obtaining
such findings is critical for deepening our understanding of conceptual similarities, differences and universalities in natural languages. The
proposed methodology will reveal currently unavailable structural properties of the Croatian language at the subword level (phonological,
phonetic, syllabic) and the word level (co-occurrence and syntax). Language network analysis can be further extended in the direction of intelligent applications in the field of natural language processing.
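As a rough illustration of the word-level case described above, the following minimal sketch builds a co-occurrence network in which words are vertices and adjacency in the text creates weighted links. It is not the project's actual methodology: the function names, the `window` parameter, the tokenization rule and the sample sentence are all assumptions made for this example.

```python
import re
from collections import defaultdict


def build_cooccurrence_network(text, window=1):
    """Build an undirected, weighted word co-occurrence network.

    Hypothetical helper: vertices are lowercased word tokens, and a link
    joins two different words that appear within `window` positions of
    each other. The link weight counts how often that happens.
    """
    words = re.findall(r"\w+", text.lower())  # simple tokenization (assumption)
    network = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for neighbour in words[i + 1 : i + 1 + window]:
            if neighbour != word:  # skip self-loops
                network[word][neighbour] += 1
                network[neighbour][word] += 1
    return network


def degree(network, word):
    """Number of distinct words linked to `word`."""
    return len(network[word])


def strength(network, word):
    """Sum of the weights of all links incident to `word`."""
    return sum(network[word].values())


# Invented sample sentence, used only to exercise the functions.
sample = "the cat sat on the mat and the dog sat on the rug"
net = build_cooccurrence_network(sample)

# Rank words by degree, a simple structural measure of the network.
for word in sorted(net, key=lambda w: degree(net, w), reverse=True)[:3]:
    print(word, degree(net, word), strength(net, word))
```

Degree and strength are only two of many possible network measures. The same dictionary-of-dictionaries structure could hold subword units such as syllables instead of words, provided the tokenization step is replaced accordingly.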
COST Keystone is a cooperative network of researchers, practitioners, and application domain specialists working in fields related to semantic data management,
the Semantic Web, information retrieval, artificial intelligence, machine learning and natural language processing, that coordinates collaboration among them to enable
research activity and technology transfer in the area of keyword-based search over structured data sources. The coordination effort will promote the development of a
revolutionary new paradigm that gives users keyword-based search capabilities over structured data sources, as they currently have for documents. Furthermore, it will exploit
the structured nature of data sources in defining complex query execution plans by combining partial contributions from different sources.
COST iV&L Net
The explosive growth of visual and textual data (both on the World Wide Web and held in private repositories by diverse institutions and companies)
has led to urgent requirements in terms of search, processing and management of digital content. Solutions for providing access to or mining such data depend on
the semantic gap between vision and language being bridged, which in turn calls for expertise from two so far unconnected fields: Computer Vision (CV) and Natural Language
Processing (NLP). The central goal of iV&L Net is to build a European CV/NLP research community, targeting 4 focus themes: (i) Integrated Modelling of Vision and Language for
CV and NLP Tasks; (ii) Applications of Integrated Models; (iii) Automatic Generation of Image & Video Descriptions; and (iv) Semantic Image & Video Search. iV&L Net will organise
annual conferences, technical meetings, partner visits, data/task benchmarking, and industry/end-user liaison. Europe has many of the world's leading CV and NLP researchers.
Tapping into this expertise, and bringing the collaboration, networking and community building enabled by COST Actions to bear, iV&L Net will have substantial impact, in terms of
advances in both theory/methodology and real-world technologies.