APs Project (2021 - 2024)

  • Augmented Proxemics services
  • Keywords: Social Media Analytics, Information Extraction, Multilingual Text Mining, Natural Language Processing, Text Mining, Visualization
  • Funding: E2S UPPA (Ph.D. Scholarship) and Urban Community of Pau Béarn Pyrénées (10,000 euros)
  • Institutions involved: LIUPPA Laboratory (Pau, France) and HiTZ Center (San Sebastian, Spain)
  • Objectives: propose a multilingual framework for processing and analyzing social media data within semantically defined application domains (e.g., defined through ontologies or thesauri), focusing on the multilingual border regions of the Basque Country and Béarn. The main objective of this project is to help decision-makers and local stakeholders across various application domains (such as tourism) obtain insights and indicators from social media that address domain-specific requirements.
  • Selected publications: Social Network Analysis And Mining, WISE 2022, IDA 2023, EACL 2024, INFORSID 2024, Geodata Award @ GeoDataDays 2023
  • My contributions to this project:
    1. I proposed a formal methodology to build multidimensional datasets from social media. Building accurate and exhaustive datasets is a recurrent challenge in the Web and Social Media Search research field. However, most approaches currently used are ad hoc and, therefore, difficult to reuse. This methodology addresses this issue by proposing an iterative and incremental pipeline applied to several data feeds (e.g., posts, metadata, media), incorporating both human feedback and automatic mechanisms to improve quality.
    2. I conducted a comparative study of rule-based, fine-tuning, and few-shot learning techniques alongside various recent large language models (LLMs) for extracting knowledge from unstructured, multilingual, and noisy social media texts in the tourism domain. Social media posts are challenging for Natural Language Processing (NLP) due to their multilingualism, brevity, informal language, and frequent grammatical errors, among other factors. Additionally, I investigated a recurrent challenge faced by researchers: determining the minimum number of training annotations required to achieve competitive results in a specific domain. Manual annotations are both time-consuming and costly; thus, researchers aim to annotate as few samples as possible while still maintaining high-quality results.
    3. I proposed modular, domain-adaptive indicators by reinterpreting the theory of proxemics (Hall, 1966) for social media. The challenge is that most indicators for social media are domain-specific, meaning they are effective within a specific domain of application but difficult to adapt to others. My indicators, expressed as similarity measures, stand out due to their modularity based on proxemic dimensions, and their applicability across heterogeneous entities, such as users, demographics, themes, time periods or places.
    4. I proposed TextBI, an interactive dashboard designed to visualize multidimensional indicators on social media across various dimensions (e.g., spatial, temporal, thematic, personal, sentimental). It addresses the challenge of presenting complex information in a way that is adaptable to various domains and easily interpretable by non-computer scientists, such as local stakeholders (e.g., tourism offices, municipal councils). Unlike existing Business Intelligence (BI) tools, TextBI offers interactive visuals specifically designed for social media, featuring sentiment and engagement overlays, multilevel timelines, thematic maps, proxemic crosshairs, and interaction graphs.
  • Demonstration video of the TextBI dashboard

DA3T Project (2021 - 2022)

  • Digital Trace Analysis Device for the Valorization of Touristic Territories
  • Keywords: Geomatics, Geospatial Analysis, ETL, GPS Tracks, Mobility, Semantic Trajectories, Data Integration, Similarity Measures
  • Funding: French Nouvelle-Aquitaine Region, Berger-Levrault Company, and Charente Tourism
  • Institutions involved: LIUPPA Laboratory (Pau, France), L3i/LIENSs Laboratory (La Rochelle, France), and UMR CNRS Passage (Bordeaux, France)
  • Objective: The project aims to develop a system for analyzing multidimensional mobility tracks, both outdoors (in cities) and indoors (e.g., in museums), to assist local planners and decision‑makers in managing and promoting touristic areas in Nouvelle‑Aquitaine. It is a multidisciplinary project involving both computer scientists and geographers, focused on creating tools and methods for extracting, processing, and analyzing mobility tracks. A mobile application named Geoluciole was developed to collect tracks from tourists in various touristic cities of the region.
  • Selected publications: ACM SAC 2022, ISPRS - IJGI, INFORSID 2022, GAST Workshop @ EGC 2022
  • My contributions to this project:
    1. I worked on a multi-level and multi-aspect model for analyzing semantic trajectories, addressing several challenges in the geomatics field. Specifically, the model focuses on: modeling semantic trajectories with data enrichment associated with positions or segments; defining enrichment generically to integrate various dimensions; and structuring this enrichment according to a hierarchical organization.
    2. I proposed a novel ETL (Extract, Transform, Load) platform dedicated to processing mobility tracks. It represents the first mobility-specific ETL system and addresses the challenge of seamlessly analyzing heterogeneous mobility tracks coming from various sources. More precisely, it allows geographers to automatically integrate (e.g., process, enrich, visualize) many mobility tracks through modular and reactive pipelines accessible to users who are not necessarily computer scientists.
    3. I designed 3D visualization modules, including a customizable space-time cube, as well as semantic trajectory enrichment modules that draw on open data sources, including OpenStreetMap, Google Maps Places, and the DataTourisme ontology, and on semi-structured interviews.
    4. I participated in the design of composite and interpretable semantic trajectory similarity measures that assist geographers in assessing the similarity of touristic trajectories at various granularities (e.g., macro, meso and micro).
  • Access the space-time cube here.
  • Demonstration video of the mobility-specific ETL