Tutorial Description
Graph machine learning has witnessed rapid progress across both academia and industry. However, most existing methods are developed under the in-distribution (I.D.) hypothesis, which assumes that training and testing graph data are drawn from the same distribution. In real-world applications—ranging from dynamic knowledge graphs to evolving biomedical networks—this assumption is frequently violated, resulting in severe performance degradation under distribution shifts. Addressing this challenge has become a key focus in recent years, leading to the development of novel paradigms that move beyond the I.D. setting. This tutorial presents a comprehensive overview of three emerging and synergistic directions for tackling distribution shifts in graph learning. First, we highlight Graph LLMs, which combine the representational power of large language models with graph structures to enable flexible, in-context, and few-shot learning on graphs. Second, we introduce adaptation techniques for both GNNs and Graph LLMs, including graph neural architecture search and continual learning strategies for evolving data. Third, we cover generalization methods that incorporate causality and invariance principles to build robust graph models under unseen distributions. These advances are reshaping the future of graph machine learning and are of central interest to the IJCAI community across machine learning, data mining, and real-world AI deployment.
Tutorial Outline
To the best of our knowledge, this tutorial is the first to systematically and comprehensively discuss graph machine learning and graph LLMs under distribution shifts, and it has great potential to draw broad interest from the community. The tutorial is planned for a quarter day (1/4 day) and organized into four sections.
Target Audience and Prerequisites
This tutorial will be highly accessible to the whole machine learning and data mining community, including researchers, students, and practitioners interested in graph adaptation, graph generalization, graph LLMs, and their applications in graph-related tasks. The tutorial will be self-contained and designed for introductory and intermediate audiences. No special prerequisite knowledge is required to attend.
Motivation, Relevance and Rationale
Graph machine learning under distribution shifts is an increasingly critical research topic in both academia and industry, as traditional methods often struggle when distributional assumptions are violated in real-world scenarios. Addressing distribution shifts is crucial for developing robust graph models that maintain performance across diverse and evolving data distributions. This tutorial aims to disseminate recent advances from three complementary perspectives—Graph LLMs, adaptation, and generalization—each addressing distribution shifts from a unique angle. Graph LLMs introduce a new paradigm by leveraging large language models for enhanced in-context and few-shot learning on graphs. Adaptation techniques, including graph neural architecture search and continual learning, equip models to dynamically adjust to changing environments. Generalization methods, inspired by causality and invariance principles, ensure model stability across unseen data distributions. These three directions collectively offer a comprehensive approach to building resilient and trustworthy graph learning systems, making the topic highly relevant to the IJCAI community and beyond.
Tutorial Overview
We characterize graph machine learning and graph LLMs under distribution shifts as approaches that not only demonstrate strong task performance but also integrate fundamental elements for robustness, namely graph adaptation and graph generalization, for both GNNs and graph LLMs. We posit that embedding these characteristics in the design phase of graph ML models will greatly enhance their trustworthiness, thereby fostering their industrial application and widespread use.
In this tutorial, we discuss promising solutions to graph machine learning under distribution shifts from three aspects:
Graph LLM for distribution shifts
Large language models (LLMs) have demonstrated remarkable in-context and few-shot learning capabilities, generalizing across diverse tasks without extensive training or fine-tuning. This inherent adaptability has drawn the interest of researchers in graph-related fields. Numerous studies have sought to integrate LLMs and Graph Neural Networks (GNNs) to develop versatile graph models capable of handling a wide array of data distributions. Some approaches leverage LLMs to enrich the feature representations used by GNNs, enhancing their ability to capture nuanced information from graph data. Conversely, other methodologies use GNNs to endow LLMs with structural knowledge, enabling them to tackle graph-related tasks by encoding graph structures into the learning process. This interplay between LLMs and GNNs holds promise for advancing the state of the art in graph representation learning and opens avenues for more robust and adaptable graph-based applications under distribution shifts. We will introduce methods that harness the abilities of LLMs to enhance out-of-distribution generalization on graphs, including LLM-enhanced GNNs and GNN-enhanced LLMs, enabling better in-context and few-shot learning across graph distributions in real-world scenarios.
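To make the LLM-enhanced-GNN direction concrete, below is a minimal sketch (assuming the sentence-transformers and torch_geometric packages; the encoder name and class names are illustrative choices, not from any specific paper) in which a pretrained text encoder turns raw node descriptions into node features that a standard GCN then propagates:

```python
import torch
from sentence_transformers import SentenceTransformer
from torch_geometric.nn import GCNConv

# Illustrative choice of text encoder; any pretrained LM embedding would do.
encoder = SentenceTransformer("all-MiniLM-L6-v2")

def llm_node_features(node_texts):
    """Embed each node's raw text description into a dense feature vector."""
    return torch.tensor(encoder.encode(node_texts), dtype=torch.float)

class TextGNN(torch.nn.Module):
    """A plain two-layer GCN that consumes the LLM-derived node features."""
    def __init__(self, in_dim, hidden_dim, num_classes):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, num_classes)

    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index).relu()
        return self.conv2(x, edge_index)
```

Because the text encoder is shared across graphs, features for nodes from a previously unseen distribution can be produced with the same pipeline, which is one reason this design helps under distribution shifts.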
Adaptation of GNNs and graph LLMs under distribution shifts
As graphs evolve over time and their underlying distributions shift, traditional graph ML models struggle to maintain performance and adaptability. In response, recent research has shifted towards developing adaptive techniques that can seamlessly adjust to these distribution shifts while preserving model effectiveness. One class of methods is graph neural architecture search, which empowers algorithms to autonomously explore and discover effective graph architectures tailored to specific tasks. By automating the design process, graph neural architecture search enables the creation of models optimized for adaptability under distribution shifts, fostering more resilient systems. Another class of methods is graph continual learning, which complements these efforts by focusing on a model's ability to learn and adapt over time. Rather than treating data as static, continual learning frameworks facilitate ongoing model updates and refinement, enabling adaptation to evolving distributions without catastrophic forgetting and improving the performance of graph models in dynamic real-world scenarios. We will discuss works that propose new approaches to adapt graph ML models under distribution shifts, covering two types of representative methods: graph neural architecture search and graph continual learning.
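The two sketches below illustrate the core mechanics of each family under stated assumptions (PyTorch with torch_geometric; all names are illustrative, and neither sketch reproduces a specific published method). The first shows the differentiable-search idea behind graph NAS: a set of candidate GNN operators mixed by learnable architecture weights. The second shows a regularization-style continual-learning penalty in the spirit of elastic weight consolidation, which discourages parameters that were important on earlier graph snapshots from drifting:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, GATConv, SAGEConv

class MixedGraphOp(nn.Module):
    """DARTS-style mixed operation for graph NAS: a softmax-weighted sum of
    candidate GNN operators. The architecture weights `alpha` are trained
    jointly with the model and later discretized to select one operator."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.ops = nn.ModuleList([
            GCNConv(in_dim, out_dim),
            GATConv(in_dim, out_dim, heads=1),
            SAGEConv(in_dim, out_dim),
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x, edge_index):
        w = F.softmax(self.alpha, dim=0)
        return sum(wi * op(x, edge_index) for wi, op in zip(w, self.ops))

def ewc_penalty(model, fisher, old_params, lam=1.0):
    """EWC-style continual-learning penalty: quadratically penalize drift of
    parameters with high Fisher importance on previous graph snapshots."""
    loss = 0.0
    for name, p in model.named_parameters():
        if name in fisher:
            loss = loss + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return lam * loss
```

In practice the NAS mixture weights and the model weights are typically optimized on separate data splits, and the EWC penalty is simply added to the task loss when training on each new graph snapshot.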
Generalization of GNNs and graph LLMs under distribution shifts
Beyond designing out-of-distribution (OOD) generalized graph models, some works exploit training schemes with tailored optimization objectives and constraints to promote OOD generalization on graphs, including graph invariant learning, graph adversarial training, and graph self-supervised learning. Graph invariant learning methods build on the invariance principle to address the OOD generalization problem in a principled way: they aim to exploit relationships between features and labels that remain invariant across different distributions while disregarding variant, spurious correlations. Graph adversarial training and graph self-supervised learning methods have also been demonstrated to improve model robustness against distribution shifts and OOD generalization ability. We will introduce these graph training strategies and present recent approaches for handling OOD generalization.
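As one concrete example of an invariance-based objective, the sketch below implements an IRMv1-style penalty (the squared gradient of the per-environment risk with respect to a dummy classifier scale), which can be averaged over training environments such as graphs collected under different conditions. This is a generic sketch of the idea, not the formulation of any single graph paper:

```python
import torch
import torch.nn.functional as F

def irm_penalty(logits, labels):
    """IRMv1-style invariance penalty: the squared gradient of the risk with
    respect to a dummy scale on the classifier. A small gradient means the
    shared classifier is (locally) optimal within this environment."""
    scale = torch.ones(1, device=logits.device, requires_grad=True)
    loss = F.cross_entropy(logits * scale, labels)
    grad = torch.autograd.grad(loss, [scale], create_graph=True)[0]
    return (grad ** 2).sum()

# Overall objective across environments e = 1..E (lam trades off invariance):
#   total = mean_e risk_e + lam * mean_e irm_penalty(logits_e, labels_e)
```

A predictor that minimizes this combined objective is pushed toward features whose relationship to the labels is stable across environments, which is exactly the invariance property described above.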
The tutorial will be presented live. However, in case of technical problems, we can also provide a pre-recorded video of the tutorial.