Leveraging intra-modal and inter-modal interaction for multi-modal entity alignment

Zhiwei Hu, Víctor Gutiérrez-Basulto, Zhiliang Xiang, Ru Li*, Jeff Z. Pan*

*Corresponding author for this work

Research output: Working paper › Preprint

Abstract / Description of output

Multi-modal entity alignment (MMEA) aims to identify equivalent entity pairs across different multi-modal knowledge graphs (MMKGs). Existing approaches focus on how to better encode and aggregate information from different modalities. However, leveraging multi-modal knowledge in entity alignment is not trivial because of modal heterogeneity. In this paper, we propose a Multi-Grained Interaction framework for Multi-Modal Entity Alignment (MIMEA), which realizes multi-granular interaction both within the same modality and between different modalities. MIMEA is composed of four modules: i) a Multi-modal Knowledge Embedding module, which extracts modality-specific representations with multiple individual encoders; ii) a Probability-guided Modal Fusion module, which employs a probability-guided approach to integrate uni-modal representations into joint-modal embeddings while considering the interaction between uni-modal representations; iii) an Optimal Transport Modal Alignment module, which introduces an optimal transport mechanism to encourage interaction between uni-modal and joint-modal embeddings; iv) a Modal-adaptive Contrastive Learning module, which, for each modality, distinguishes the embeddings of equivalent entities from those of non-equivalent ones. Extensive experiments conducted on two real-world datasets demonstrate the strong performance of MIMEA compared to state-of-the-art methods. Datasets and code have been submitted as supplementary materials.
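To make the contrastive objective in module iv more concrete, here is a minimal sketch of a generic in-batch contrastive (InfoNCE-style) loss for one modality. It is an illustration under stated assumptions, not the authors' implementation: the function names, the temperature value, the toy data, and the choice of InfoNCE itself are all hypothetical, and the modal-adaptive weighting described in the abstract is not reproduced.

```python
import math
import random


def normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def info_nce_loss(src, tgt, temperature=0.1):
    """In-batch contrastive loss (hypothetical helper).

    Assumes src[i] and tgt[i] embed equivalent entities from two graphs;
    every other tgt[j] in the batch is treated as a non-equivalent negative.
    """
    src = [normalize(v) for v in src]
    tgt = [normalize(v) for v in tgt]
    total = 0.0
    for i, s in enumerate(src):
        # Temperature-scaled cosine similarity to every candidate entity.
        logits = [sum(a * b for a, b in zip(s, t)) / temperature for t in tgt]
        # Log-sum-exp, shifted by the maximum for numerical stability.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(z - peak) for z in logits))
        # Negative log-probability assigned to the true counterpart.
        total += log_denominator - logits[i]
    return total / len(src)


# Toy data (an assumption for illustration): 8 entities with 16-dimensional embeddings.
random.seed(0)
anchors = [[random.gauss(0, 1) for _ in range(16)] for _ in range(8)]
aligned = [[x + 0.01 * random.gauss(0, 1) for x in v] for v in anchors]
shuffled = aligned[::-1]  # every entity is now paired with a wrong counterpart

loss_aligned = info_nce_loss(anchors, aligned)
loss_shuffled = info_nce_loss(anchors, shuffled)
```

In this sketch the loss is low when each entity is most similar to its true counterpart and high when the pairing is scrambled. In a multi-modal setting, a loss of this kind could be computed separately on each modality's embeddings and on the joint embeddings, then combined.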
Original language: Undefined/Unknown
Number of pages: 10
Publication status: Published - 19 Apr 2024

Keywords / Materials (for Non-textual outputs)

  • multi-modal knowledge graph
  • entity alignment
  • knowledge graph
