    Byron Georgantopoulos

    The OpenMinTeD platform aims to bring full text Open Access scholarly content from a wide range of providers together with Text and Data Mining (TDM) tools from various Natural Language Processing frameworks and TDM developers in an integrated environment. In this way, it supports users who want to mine scientific literature with easy access to relevant content and allows running scalable TDM workflows in the cloud.
    In this paper we present a method for (semi-)automating the elicitation of domain-specific terminological resources. The method linguistically processes machine-readable text corpora and extracts lists of candidate multi-word domain terms, which are then validated by domain experts. It proceeds in three pipelined stages: a) morphosyntactic annotation of the domain corpus, b) corpus parsing based on a pattern grammar endowed with regular expressions and feature-structure unification, and c) statistical evaluation of the candidate terms, in order to retain valid domain terms and lessen the overgeneration effect caused by the pattern grammar. This hybrid methodology was tested on a Greek software manual corpus, achieving 63% recall. Out of 10 different statistical filters applied to two-word terms, the best-performing one further confirmed 30% of the index two-word terms and also reduced the size of the proposed list to 1/15.
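The three-stage pipeline described in this abstract can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration, not the authors' implementation: the toy English corpus and its tagset (`TAGGED`, `TAG_CODE`), the single pattern "(adjective or noun)+ noun", and the choice of pointwise mutual information (PMI) as the statistical filter. The abstract does not name the ten filters that were tested, and feature-structure unification is not modelled here.

```python
import math
import re
from collections import Counter

# Stage (a), assumed already done: a hypothetical POS-tagged toy corpus.
TAGGED = [
    ("open", "VB"), ("the", "DT"), ("control", "NN"), ("panel", "NN"),
    ("and", "CC"), ("select", "VB"), ("the", "DT"), ("control", "NN"),
    ("panel", "NN"), ("icon", "NN"), ("from", "IN"), ("the", "DT"),
    ("main", "JJ"), ("menu", "NN"),
]

# Map each tag to one character so that regex offsets equal token offsets.
TAG_CODE = {"JJ": "J", "NN": "N"}  # every other tag becomes "O"


def extract_candidates(tagged):
    """Stage (b): match an assumed pattern, (adjective|noun)+ noun, over the tags."""
    code = "".join(TAG_CODE.get(tag, "O") for _, tag in tagged)
    return [
        " ".join(tok for tok, _ in tagged[m.start():m.end()])
        for m in re.finditer(r"[JN]+N", code)
    ]


def pmi_scores(tagged, candidates):
    """Stage (c): score two-word candidates with PMI (an assumed filter)."""
    tokens = [tok for tok, _ in tagged]
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni, n_bi = len(tokens), len(tokens) - 1
    scores = {}
    for cand in candidates:
        words = cand.split()
        if len(words) != 2:
            continue  # only two-word terms are filtered in this sketch
        w1, w2 = words
        p_joint = bigrams[(w1, w2)] / n_bi
        p_indep = (unigrams[w1] / n_uni) * (unigrams[w2] / n_uni)
        scores[cand] = math.log2(p_joint / p_indep)
    return scores


candidates = extract_candidates(TAGGED)
scores = pmi_scores(TAGGED, candidates)
for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{term}\t{score:.2f}")
```

In a setting like the one the abstract describes, candidates scoring below some threshold would be dropped before the list is handed to domain experts; where to place that threshold is a tuning decision the sketch leaves open.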
    This paper addresses the problem of creating a summary by extracting a set of sentences that are likely to represent the content of a document. A small-scale experiment is conducted, leading to the compilation of an evaluation corpus for the Greek language. Three models of sentence extraction are then described, along the lines of shallow linguistic analysis, feature combination and machine learning.
    This paper addresses the problem of creating a summary by extracting a set of sentences that are likely to represent the content of a document. A small scale experiment is conducted leading to the compilation of an evaluation corpus for the Greek language. Two models of ...
    In this paper, we present a method for the automatic extraction of terms from machine-readable text corpora. The method is based on a pattern grammar endowed with regular expressions and feature-structure unification capacity. The text corpus we have used consisted of a ...
    Cross-media summarization in a retrieval setting Byron Georgantopoulos 1, 2, Toon Goedeme 3, Stavros Lounis 1, Harris Papageorgiou 1, Tinne Tuytelaars 3, Luc Van Gool 3 1 Institute for Language and Speech Processing/IRIS 6 Artemidos & Epidavrou, Athens, Greece {byron, ...
    ABSTRACT The huge volume of textual data available today in electronic form calls for robust natural language processing (NLP) technologies. The chain of NLP modules developed by the Institute for Language and Speech Processing is ...
    The present dissertation and project describe a system for automatic summarisation of texts. Instead of generating abstracts, a hard NLP task of questionable effectiveness, the system tries to identify the most important sentences of the original text, thus producing an extract. The ...
    18 Eliciting Terminological Knowledge for Information Extraction Applications. Byron Georgantopoulos and Stelios Piperidis. (*) Institute for Language and Speech Processing-ILSP, Artemidos & Epidavrou, Athens, GREECE. Tel: +30 1 6800959, fax: +30 1 6854270. (@) University of ...