
BASIC RESEARCH:

Knowledge-Aware Graph Neural Networks for TRPV1 Drug-Gene Associations in Periodontal Pain

Redes neuronales de grafos con conciencia del conocimiento para asociaciones fármaco-gen TRPV1 en el dolor periodontal

Pradeep K. Yadalam PhD¹ https://orcid.org/0000-0002-6653-4123

Saravagya Sharma PhD¹ https://orcid.org/0009-0004-8883-5041

Carlos M. Ardila PhD¹,² https://orcid.org/0000-0002-3663-1416

¹Department of Periodontics, Saveetha Dental College and Hospitals, Saveetha Institute of Medical and Technical Sciences (SIMATS), Chennai - 600 077, Tamil Nadu, India.

²Basic Sciences Department, Faculty of Dentistry, Biomedical Stomatology Research Group, Universidad de Antioquia U de A, Medellín, 050010, Colombia.

Correspondence to: Pradeep K. Yadalam - pradeepkumar.sdc@saveetha.com

Carlos M. Ardila - martin.ardila@udea.edu.co

Received: 2-V-2025 Accepted: 8-VII-2025

ABSTRACT: TRPV1 (Transient Receptor Potential Vanilloid 1) is a critical protein in the pathogenesis of periodontal pain, activated by noxious stimuli and inflammatory mediators associated with periodontitis. This study investigates drug-gene interactions involving TRPV1 to elucidate its role in periodontal pain mechanisms. Knowledge Graph Neural Networks (KGNNs) were employed to model and analyze the complex relationships between drugs, genes, and pain receptors in periodontal tissues. By leveraging biological datasets, including TRPV1 channel activity, pain receptor interactions, and gene expression profiles, the study aims to identify potential therapeutic targets and strategies for personalized pain management in periodontal treatment. Differentially expressed genes (DEGs) were integrated with drug and gene associations to model biological systems and inform therapeutic development. The study utilized a gene expression dataset encompassing features such as gene similarity scores, adjusted p-values, and biochemical activity. A semantic similarity-based fusion approach was applied to enhance model performance by incorporating biological information layers, improving interaction modeling, and promoting efficient information propagation. Three graph-based models were employed: Graph Convolutional Network (GCN) as a baseline, Residual GCN (ResGCN) for stability, and Attention-based GCN (AttGCN) for dynamic node weighting. Among the models, ResGCN demonstrated superior performance with an accuracy of 93.75% and the lowest final loss, highlighting its robustness in predicting drug-gene associations. This outcome supports the potential utility of ResGCN in accurately modeling TRPV1-mediated pain mechanisms and guiding therapeutic decisions. The application of KGNNs has provided valuable insights into TRPV1 drug-gene interactions in the context of periodontal pain. The findings emphasize the potential for using ResGCN in therapeutic discovery and optimization. However, challenges such as data quality and biological complexity remain.

KEYWORDS: Periodontal pain; TRPV1; Graphs; Neural networks; Knowledge graphs.

RESUMEN: TRPV1 (Receptor Potencial Transitorio Vanilloide 1) es una proteína crítica en la patogénesis del dolor persistente, activado por estímulos nocivos e intermediarios inflamatorios asociados con la periodontitis. Este estudio investiga las interacciones fármaco-gen que involucran a TRPV1 para dilucidar su papel en el dolor periodontal. Se emplearon Redes Neuronales de Grafos con Conciencia del Conocimiento (KGNs) para modelar y analizar las complejas relaciones entre fármacos, genes y receptores de dolor en tejidos periodontales. Al aprovechar conjuntos de datos biológicos, incluyendo el canal iónico TRPV1, interacciones de receptores de dolor y perfiles de expresión génica, el estudio tiene como objetivo identificar posibles dianas terapéuticas y estrategias para el manejo personalizado del dolor periodontal. Se integraron genes diferencialmente expresados (DEGs) con datos de fármacos y genes para modelar sistemas biológicos e informar la priorización terapéutica. Se emplearon tres modelos basados en grafos: Red Neuronal Convolucional de Grafos (GCN) como línea base, GCN Residual (ResGCN) para estabilidad y GCN basado en atención (AttGCN) para ponderación dinámica de nodos. Entre los modelos, ResGCN demostró un rendimiento superior con una precisión del 93.75% y la menor pérdida final, destacando su robustez en la predicción de asociaciones fármaco-gen. Este resultado apoya la utilidad potencial de ResGCN en el modelado preciso de los mecanismos de dolor mediados por TRPV1 y en la guía de la priorización terapéutica. La aplicación de KGNs ha proporcionado valiosos conocimientos sobre las interacciones fármaco-gen TRPV1 en el contexto del dolor periodontal. Los hallazgos enfatizan el potencial de usar ResGCN en la optimización y el tratamiento terapéutico. Sin embargo, persisten desafíos como la calidad de los datos y la complejidad biológica.

PALABRAS CLAVE: Dolor periodontal; TRPV1; Grafos; Redes neuronales; Grafos de conocimiento.

INTRODUCTION

Periodontal disease, an inflammatory condition affecting the supporting tissues of the teeth, is a significant health concern. TRPV1, a key player in pain sensation, influences its pathophysiology (1). Understanding TRPV1's interactions with drugs and genes can help treat periodontal pain more effectively. TRPV1 (2) is a crucial protein in transmitting pain signals in periodontal diseases. It is expressed in sensory neurons and can be triggered by noxious stimuli, causing depolarization and pain signal propagation. Inflammatory mediators can sensitize TRPV1, amplifying pain (3, 4). Lipopolysaccharide (LPS) increases TRPV1 expression and proinflammatory cytokine levels in human periodontal ligament cells and induces periodontitis in rats. Furthermore, experimental models demonstrate that TRPV1 is a significant mediator of inflammation and nociception in periodontal tissue, underscoring its potential as a therapeutic target. In vivo research shows that capsazepine reduces bone loss from periodontal ligation, while capsaicin exacerbates the condition (4).

Recent studies have highlighted TRPV1 and TRPA1 as key channels in trigeminal pain pathways. Orthodontic force-induced inflammation sensitizes these channels, with TRPV1 activated early and TRPA1 activated late, leading to CGRP release. Blocking both TRPV1 and TRPA1 may therefore be a primary therapeutic strategy for orthodontic pain relief (5). Similarly, chemical ablation of nociceptive neurons in the trigeminal ganglia or functional inhibition of TRPV1-expressing trigeminal afferents decreases bone loss, osteoclast numbers, leukocyte/neutrophil infiltration, and proinflammatory cytokines in mouse models of experimental periodontitis. Importantly, these interventions do not alter the periodontal microbiome, suggesting that targeting neuroimmune and neuroskeletal regulation could offer promising therapeutic approaches for periodontitis (6, 7).

High TRPV1 expression in trigeminal ganglia is vital for orofacial pain transmission and modulation (8). Gene therapy targeting TRPV1 has demonstrated efficacy in reducing tooth-movement pain in Sprague-Dawley rats, suggesting potential clinical applications for orthodontic pain relief. Moreover, TRPV1 is crucial for osteoclastogenesis in periodontal tissues, as shown by severe bone loss in TRPV1-knockout mice. Capsaicin, a TRPV1 agonist, has been found to suppress ligature-induced bone loss in mice, with reduced numbers of TRAP-positive cells (9, 10). These findings underscore the potential of small molecules to modulate TRPV1 activity for treating periodontitis.

Given these insights, there is an unmet need for advanced computational models, such as AI-driven frameworks, to systematically analyze and predict TRPV1-mediated interactions in periodontal pain. Such models can facilitate the identification of novel therapeutic targets and optimize personalized treatment strategies. Knowledge graphs integrated with deep learning methods are well-suited for addressing this challenge, as they can solve complex biological problems, reveal patterns and clusters, and elucidate disease variations (11-13). Knowledge graphs offer structured representations of biological entities, enhancing feature extraction and enabling deep learning models to capture intricate interactions between genes, proteins, and drugs (14). This structured approach promotes a deeper understanding of disease progression and supports personalized medicine.

Recent advancements in AI and graph-based methods have laid a robust foundation for addressing biomedical challenges. A novel drug sensitivity predictive AI model using deep learning and similarity network fusion has demonstrated accurate sensitivity prediction for targeted and non-specific cancer drugs, offering a promising framework for precision medicine (15-18). Another noteworthy development is the use of knowledge graphs such as BOCK, which integrates clinical data with biomedical networks to predict pathogenic gene interactions and explain them via subgraph visualizations. These frameworks are highly relevant for oligogenic diseases and drug-drug interaction prediction (19, 20). Additionally, BioKGLM, a contextualized language model, consistently outperforms state-of-the-art methods in biomedical information extraction by capturing underlying relationships between biological concepts (20-22). Another model, DeepLGF, uses local and global information to enhance drug-drug interaction prediction, demonstrating the power of integrating graph neural networks and natural language processing (23).

Building on these advancements, this study explores Knowledge-Aware Graph Neural Networks (KAGNNs) to investigate fused drug-gene associations in TRPV1-mediated periodontal pain. TRPV1 is a key ion channel involved in nociception, particularly in inflammatory pain conditions such as periodontitis (24). This research aims to develop personalized periodontal pain management strategies using KAGNNs, identify novel drug-gene interactions, and promote overall health and well-being in patients with periodontal disease.

Despite the promise of AI-based models, studying TRPV1 presents challenges due to the complexity of dynamic biological systems, multifactorial disease mechanisms, and intricate drug-gene interactions (25-27). Traditional statistical methods often fail to capture external factors, necessitating advanced computational tools for high-throughput genomic analyses. Graph neural networks (GNNs) integrated with knowledge graphs offer a powerful solution, dynamically updating to reflect the latest biomedical research findings (28). This study leverages these cutting-edge methodologies to map and analyze the extensive network of TRPV1-associated drug-gene interactions, focusing on their implications for pain mechanisms and therapeutic development. By addressing data sparsity and efficiently exploring high-order information, the proposed KAGNN framework aims to advance our understanding of TRPV1-mediated periodontal pain and inform future therapeutic strategies.

Materials and Methods

This section outlines the methodologies and processes utilized to understand drug-gene associations in periodontal pain. The research emphasizes the TRPV1 pathway and explores its significance in the context of periodontal inflammation and pain management. Graph-based models and knowledge-aware neural networks form the backbone of this study, leveraging semantic similarity fusion and differential gene expression analysis.

Figure 1 provides a visual representation of the drug-gene association workflow in periodontal pain, highlighting key components such as the TRPV1 pathway, periodontal pain mechanisms, gene associations, drug interactions, and the integration of knowledge-aware graph neural networks. This comprehensive workflow is essential for identifying potential therapeutic targets and understanding molecular interactions.

Figure 1. A Knowledge-Aware Workflow for Drug-Gene Interactions in Periodontal Pain.

Data Retrieval

TRPV1-associated drugs and genes were retrieved from the Probes & Drugs portal (29). This dataset includes drugs, genes, biochemical activity, mode of action, and other relevant features. Omics datasets were retrieved from the NCBI GEO database (30) under accession number GSE2373. Differential gene expression analysis was conducted with the GEO2R tool, with the data processed as bulk RNA sequencing, to identify genes that differed significantly between groups. After the differentially expressed genes (DEGs) were identified, the drug-gene interaction data were fused with the DEGs, and the top 300 entries were selected from both data frames. Graph neural networks were implemented using Python. To ensure reproducibility, the data preprocessing steps included quality checks, normalization of expression levels, and filtering of low-expression genes before differential analysis.
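The fusion step can be illustrated with a minimal sketch. It assumes that both tables share a gene symbol, that "top" DEGs means the smallest adjusted p-values, and that the drug-gene table is already ordered by relevance; the field names (`gene`, `padj`, `drug`), the function name, and the toy rows are hypothetical and are not taken from the study's data.

```python
def fuse_degs_with_drug_genes(degs, drug_genes, top_n=300):
    """Join DEG rows with drug-gene rows on gene symbol (hypothetical schema)."""
    # Keep the top_n most significant DEGs (smallest adjusted p-value first).
    top_degs = sorted(degs, key=lambda row: row["padj"])[:top_n]
    # Keep the first top_n drug-gene associations in the order provided.
    top_pairs = drug_genes[:top_n]
    padj_by_gene = {row["gene"]: row["padj"] for row in top_degs}
    fused = []
    for pair in top_pairs:
        gene = pair["gene"]
        if gene in padj_by_gene:
            fused.append({"drug": pair["drug"], "gene": gene, "padj": padj_by_gene[gene]})
    return fused


# Toy rows for illustration only; the p-values are made up.
degs = [
    {"gene": "TRPV1", "padj": 0.001},
    {"gene": "CNR1", "padj": 0.02},
    {"gene": "GENE_X", "padj": 0.5},
]
drug_genes = [
    {"drug": "capsaicin", "gene": "TRPV1"},
    {"drug": "capsazepine", "gene": "TRPV1"},
    {"drug": "drug_y", "gene": "CNR1"},
]
fused = fuse_degs_with_drug_genes(degs, drug_genes, top_n=2)
```

With `top_n=2`, only the first two drug-gene rows survive the cut, so the fused table contains the two TRPV1 associations.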

Semantic Similarity Fusion-Knowledge Graph

Integrating differentially expressed genes (DEGs) with drug and gene associations is critical for understanding biological systems and developing therapeutic strategies. A knowledge graph represents biological and pharmacological data, aiding in understanding and discovery in drug-gene interactions. It employs layout algorithms, dynamic visualizations, semantic similarity, hypothesis generation, and predictive modeling with regular updates.

Semantic similarity-based fusion (31) is an effective approach for combining drug and gene information due to its ability to capture contextual relationships and meanings within biological and pharmacological data. This method handles data sparsity and noise, allowing continuous updates without re-analysis. Semantic similarity algorithms (32) measure biological context between genes and drugs. The fusion process integrates multiple data sources to enhance the predictive power of models.

The primary dataset used comprises a gene expression dataset fused with drug and gene data, featuring attributes such as gene similarity scores, adjusted p-values (padj), and biochemical activity. Data integrity and consistency were verified using custom scripts in Python.

The semantic similarity-based fusion process improves model performance by integrating biological information layers, enabling complex gene interactions, and facilitating information propagation. Graph construction involves defining nodes representing entities like genes, drugs, and pathways, creating edges that represent relationships, and assigning semantic similarity scores as edge weights to quantify the strength of these relationships.
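The graph construction described above can be sketched as follows. Jaccard overlap between annotation sets is used here only as a stand-in for a semantic similarity score; it is not the measure used in the study (31, 32). The annotation terms and the function names are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two annotation sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_weighted_graph(annotations, threshold=0.0):
    """Return {node: {neighbor: weight}} with similarity scores as edge weights."""
    graph = {node: {} for node in annotations}
    for u, v in combinations(annotations, 2):
        weight = jaccard(annotations[u], annotations[v])
        # Only pairs whose similarity exceeds the threshold become edges.
        if weight > threshold:
            graph[u][v] = weight
            graph[v][u] = weight
    return graph


# Toy annotation sets for illustration only.
annotations = {
    "TRPV1": {"ion channel", "nociception", "calcium influx"},
    "TRPC1": {"ion channel", "calcium influx"},
    "capsaicin": {"nociception", "TRPV1 agonist"},
}
graph = build_weighted_graph(annotations)
```

Nodes with no shared annotations stay unconnected, which is how a similarity threshold controls the sparsity of the resulting graph.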

Model Architectures and Hyperparameters

Figure 2 illustrates the comprehensive workflow architecture implemented in this study to analyze drug-gene interactions. The process begins with data processing, where raw input data undergoes preprocessing and normalization to ensure consistency and reliability for downstream tasks. Semantic analysis is then performed, extracting and refining features while fusing them to enhance the interpretability of the dataset. A knowledge graph is subsequently constructed by integrating the processed data, leveraging semantic similarity and feature weighting to map meaningful relationships between drugs and genes. This graph serves as the foundation for training advanced graph neural networks, including Graph Convolutional Networks (GCNs) and Attention-based Graph Convolutional Networks (AttGCNs), which are optimized for predicting drug-gene associations. Finally, the results are analyzed using performance metrics such as precision, recall, and F1-score, providing insights into the model's predictive capabilities and its application in understanding periodontal pain management.

The analysis employs three distinct graph-based models: the Graph Convolutional Network (GCN) as the base model (33), compared with two state-of-the-art (SOTA) models, the Residual Graph Convolutional Network (ResGCN) and the Attention-based Graph Convolutional Network (AttGCN) (34, 35). These models leverage the graph structure of the data, where nodes represent genes and edges depict interactions or similarities. The GCN serves as the baseline model, utilizing convolutional layers to aggregate information from neighboring nodes. The ResGCN introduces residual connections to improve gradient flow and model stability, while the AttGCN incorporates attention mechanisms to dynamically weigh the importance of different node connections (Figure 1, Figure 2).

Figure 2. Workflow Architecture for Knowledge-Aware Neural Networks in Drug-Gene Interactions.

Base Graph Convolutional Network (GCN)

The graph neural network balances complexity and performance through regularization techniques that prevent overfitting while optimizing predictions for binary classification tasks. The input layer corresponds to the number of features in the dataset. The hidden layer comprises 32 neurons with a ReLU activation function, enabling the model to learn complex representations. The output layer directly corresponds to the input features, using a sigmoid activation for binary classification tasks. Regularization includes a dropout rate of 0.3, and the learning rate is adjusted dynamically. Weight decay penalizes complex models to prevent overfitting. The Adam optimizer ensures faster convergence and better performance.

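The formula for a Graph Convolutional Network (GCN) layer is:

$$H^{(l+1)} = \sigma\left(\hat{A}\, H^{(l)}\, W^{(l)}\right)$$

where:

$\hat{A} = D^{-1/2} (A + I) D^{-1/2}$: Normalized adjacency matrix with self-loops, where $A$ is the adjacency matrix, $I$ is the identity matrix, and $D$ is the diagonal degree matrix of $A + I$.

$H^{(l)}$: Input feature matrix for layer $l$.

$W^{(l)}$: Trainable weight matrix for layer $l$.

$\sigma$: Activation function (typically ReLU).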
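The propagation rule can be written out as a short NumPy sketch. This is an illustration of the formula only, not the study's PyTorch implementation; the toy three-node graph, the feature dimension of 4, and the function names are assumptions, while the 32 hidden units follow the architecture described above.

```python
import numpy as np


def normalized_adjacency(adj):
    """Compute D^(-1/2) (A + I) D^(-1/2), with D the degree matrix of A + I."""
    a_tilde = adj + np.eye(adj.shape[0])
    deg = a_tilde.sum(axis=1)
    d_inv_sqrt = np.diag(deg ** -0.5)
    return d_inv_sqrt @ a_tilde @ d_inv_sqrt


def gcn_layer(a_hat, h, w):
    """One propagation step: ReLU(A_hat @ H @ W)."""
    return np.maximum(a_hat @ h @ w, 0.0)


# Toy path graph 0 - 1 - 2 with random features and weights.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
h0 = rng.normal(size=(3, 4))    # 3 nodes, 4 input features
w0 = rng.normal(size=(4, 32))   # 32 hidden units
a_hat = normalized_adjacency(adj)
h1 = gcn_layer(a_hat, h0, w0)   # shape (3, 32)
```

The symmetric normalization keeps feature magnitudes comparable across nodes with different degrees, so high-degree nodes do not dominate the aggregation.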

State-of-the-Art Model Architectures

Residual Graph Convolutional Network (ResGCN)

The ResGCN's input layer accommodates diverse datasets, retaining full complexity. The hidden layer has 32 neurons with a ReLU activation function, promoting convergence and mitigating the vanishing gradient problem. A residual connection facilitates gradient flow during backpropagation. Regularization includes a dropout rate of 0.3. A learning rate of 0.01 balances convergence speed and stability, while weight decay encourages simpler representations. The Adam optimizer ensures adaptive learning rates. The architecture incorporates a skip connection, ensuring robust information flow from input to output.
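A residual graph convolution can be sketched as below. The exact form of the skip connection is not given in the text, so this assumes the common formulation in which the layer input is added to the convolution output, with an optional linear projection when the dimensions differ; `res_gcn_layer` and `w_skip` are hypothetical names.

```python
import numpy as np


def res_gcn_layer(a_hat, h, w, w_skip=None):
    """ReLU(A_hat @ H @ W) plus a skip path carrying the layer input forward."""
    out = np.maximum(a_hat @ h @ w, 0.0)
    # Identity skip when dimensions match; otherwise a linear projection.
    skip = h if w_skip is None else h @ w_skip
    return out + skip


# Toy 3-node path graph, normalized as D^(-1/2) (A + I) D^(-1/2).
a_tilde = np.array([[1, 1, 0],
                    [1, 1, 1],
                    [0, 1, 1]], dtype=float)
d_inv_sqrt = np.diag(a_tilde.sum(axis=1) ** -0.5)
a_hat = d_inv_sqrt @ a_tilde @ d_inv_sqrt

rng = np.random.default_rng(0)
h = rng.normal(size=(3, 32))
w = np.zeros((32, 32))          # zero weights: the convolution branch is silent
out = res_gcn_layer(a_hat, h, w)
```

With zero weights the layer reduces to the identity mapping, which illustrates why residual connections make it easy for a layer to pass its input through unchanged.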

Attention-based Graph Convolutional Network (AttGCN)

The AttGCN combines attention mechanisms with robust training methodologies for binary classification. Its architecture includes an input layer matching the feature dimension, followed by attention layers. Training parameters include the Adam optimizer, a learning rate of 0.01, and weight decay of 5e-4. Implemented in the PyTorch Geometric library, the model employs Xavier/Glorot initialization to maintain balanced variance. Hardware includes a standard CPU setup.
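The text does not specify the form of the attention layers. As one plausible illustration, the sketch below follows the widely used graph attention formulation, in which each node weights its neighbors by a softmax over learned pairwise scores. All names, dimensions, and the LeakyReLU slope are assumptions, and the code is written in plain NumPy rather than PyTorch Geometric.

```python
import numpy as np


def attention_aggregate(adj, h, w, a_src, a_dst, slope=0.2):
    """Aggregate neighbor features with softmax-normalized attention weights."""
    z = h @ w                                            # projected features
    # score[i, j] = LeakyReLU(a_src . z_i + a_dst . z_j)
    scores = (z @ a_src)[:, None] + (z @ a_dst)[None, :]
    scores = np.where(scores > 0, scores, slope * scores)
    # Restrict attention to graph neighbors plus the node itself.
    mask = (adj + np.eye(adj.shape[0])) > 0
    scores = np.where(mask, scores, -np.inf)
    # Row-wise softmax, shifted by the row maximum for numerical stability.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    alpha = weights / weights.sum(axis=1, keepdims=True)
    return alpha @ z, alpha


adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
h = rng.normal(size=(3, 4))
w = rng.normal(size=(4, 8))
a_src = rng.normal(size=8)
a_dst = rng.normal(size=8)
out, alpha = attention_aggregate(adj, h, w, a_src, a_dst)
```

Each row of `alpha` sums to one, and entries for non-adjacent node pairs are exactly zero, so the weights act as a learned replacement for the fixed normalization used in the base GCN.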

Training stability was maintained without gradient clipping due to optimized hyperparameters, ensuring model convergence.

Training Process and Cross-Validation

The training process ensures robust performance and generalization. Models were trained using full-batch gradient descent, suitable for graph-based data structures where the entire graph is processed simultaneously, capturing the global structure of the data.

Cross-validation was implemented using a 5-fold strategy, splitting the dataset into five subsets. Each iteration used one subset for validation while training on the remaining four. This method ensures that each data point is used for both training and validation, which gives a more reliable estimate of generalization and helps detect overfitting. Key performance metrics (accuracy, ROC-AUC, and loss) were monitored throughout the training process.

These metrics provide insights into the models' ability to classify gene interactions accurately and evaluate their predictive power across diverse datasets.
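The 5-fold procedure can be sketched in plain Python. The sample count, seed, and function name are illustrative assumptions; in a full-batch graph setting, the returned indices would typically be used as node masks rather than as separate datasets.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val


splits = list(k_fold_indices(n_samples=20, k=5))
for train_idx, val_idx in splits:
    # Each sample appears in exactly one validation fold across the 5 splits.
    assert not set(train_idx) & set(val_idx)
```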

Results

Knowledge graph-based semantic similarity identified 405,301 interactions between drugs and genes. The clustering coefficient measures how strongly genes form tight-knit groups, with higher values indicating denser local clusters. The average clustering coefficient was 0.0305, suggesting relatively low clustering among genes. These metrics are crucial for interpreting the structure and dynamics of biological networks.

Table 1. Top 12 Genes Associated with Knowledge Graph Construction.

Gene	Betweenness Centrality	Closeness Centrality
CNR1	0.00255	1
SLC6A3	0.00255	1
ALOX5AP	0.00255	1
AMPC	0.002078	0.727273
SIGMAR1	0.002078	0.727273
TRPC1	0.002078	0.727273
TRPC2	0.002078	0.727273
TRPC3	0.002078	0.727273
TRPC4	0.002078	0.727273
TRPC6	0.002078	0.727273
TRPC7	0.002078	0.727273
PRNP	0.002078	0.727273

The table highlights the top 12 genes identified from the constructed knowledge graphs, ranked by betweenness and closeness centrality values.

The knowledge graph contained 950 nodes and 405,300 edges, indicating a densely connected network. The average number of neighbors per node was 110.385, showcasing strong interconnectivity. The small network diameter suggested efficient connectivity across nodes, while the radius, as the minimum distance from a central node to the farthest node, highlighted a centralized structure. The characteristic path length, representing the average shortest path between node pairs, underscored the network's compactness. The clustering coefficient further quantified the degree of clustering, with a value of 0 reflecting no clustering, emphasizing the sparsity of tightly knit subgroups. Despite a relatively sparse network density of 11.6%, the heterogeneity score highlighted a highly diverse network, with some nodes exhibiting significantly more connections than others. The centralization score confirmed the presence of key nodes but indicated a moderate level of dominance.
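The reported density can be related to the average number of neighbors. For a simple undirected graph with N nodes and E edges, density is 2E / (N(N - 1)), which equals the mean degree divided by N - 1. The sketch below applies this identity to the node count and average number of neighbors given above; the function name is hypothetical.

```python
def density_from_mean_degree(n_nodes, mean_degree):
    """Density of a simple undirected graph: mean degree / (n_nodes - 1)."""
    return mean_degree / (n_nodes - 1)


density = density_from_mean_degree(n_nodes=950, mean_degree=110.385)
print(f"{density:.1%}")  # prints 11.6%
```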

The centrality analysis underscored the significance of certain genes in the network. Betweenness centrality, measuring how often a gene acts as a bridge, revealed that CNR1, SLC6A3, and ALOX5AP were the most critical connectors. Closeness centrality, reflecting how quickly a node can access others, showed that only three genes (CNR1, SLC6A3, ALOX5AP) had a value of 1, marking them as highly central and directly connected to most other genes. The remaining genes, with lower closeness centrality, were likely more peripheral in the network's hierarchy (Table 1).
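Closeness centrality can be made concrete with a small example. The sketch below uses the standard definition for unweighted graphs: the number of reachable nodes divided by the sum of shortest-path distances to them. The star graph and node names are hypothetical. A node that is directly connected to every other node it can reach scores exactly 1.

```python
from collections import deque


def closeness_centrality(graph, source):
    """Reachable nodes divided by the sum of shortest-path lengths from source."""
    dist = {source: 0}
    queue = deque([source])
    # Breadth-first search gives shortest path lengths in an unweighted graph.
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


# Toy star graph: the hub touches every leaf directly.
star = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
hub_score = closeness_centrality(star, "hub")   # 3 / 3 = 1.0
leaf_score = closeness_centrality(star, "a")    # 3 / (1 + 2 + 2) = 0.6
```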

Three graph neural network models (GCN, ResGCN, and AttGCN) were employed for predictive analysis. Both GCN and ResGCN achieved an accuracy of 93.75%, signifying strong performance. In contrast, AttGCN achieved an accuracy of 92.08%, indicating slightly lower predictive efficacy. Among these models, ResGCN exhibited the lowest final loss of 0.114, affirming its reliability in generating accurate predictions. Conversely, AttGCN recorded the highest final loss of 0.182, reflecting less precise predictions. These results suggest that ResGCN matches GCN in accuracy and outperforms both other models in final loss, making it the preferred choice for applications requiring dependable predictions.

In addition to accuracy and loss, the performance metrics of these models were evaluated based on their generalization ability. A final accuracy of 93.75% demonstrates the ability to make correct predictions on unseen data, indicating the robustness of the ResGCN model. Furthermore, the small deviation from actual values, as evidenced by the low final loss, underscores its potential utility in diverse applications involving knowledge graph-based predictions.

Figure 3.A illustrates the training performance of a machine learning model over 100 epochs. The loss decreases substantially during the initial stages, and accuracy stabilizes at approximately 90% after 20 epochs. This pattern suggests effective learning in the early phases but indicates potential room for improvement. Techniques such as regularization, learning rate tuning, or implementing a more sophisticated architecture may further enhance the model's performance and generalization.

Figure 3.B provides insight into the binary classification model's performance. The model correctly predicted 240 instances as class 0 (true negatives) and incorrectly classified 16 class 1 instances as class 0 (false negatives). Notably, no instances were predicted as class 1, so there were no true positives or false positives, indicating a model bias towards class 0. This suggests that while the model performs well in identifying negative cases, it struggles with detecting positive cases. Further calibration or adjustments, such as addressing class imbalance, may improve its ability to predict class 1.
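The counts reported for Figure 3.B can be turned into standard metrics with a short worked example. The function below is an illustrative helper, not the study's evaluation code. The resulting accuracy of 240/256 coincides with the 93.75% figure, while recall for class 1 is zero.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics; None where the denominator is zero."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "recall": tp / (tp + fn) if (tp + fn) else None,
        "specificity": tn / (tn + fp) if (tn + fp) else None,
        "precision": tp / (tp + fp) if (tp + fp) else None,
    }


# Counts as described for Figure 3.B: 240 true negatives, 16 false negatives.
metrics = binary_metrics(tp=0, fp=0, tn=240, fn=16)
# accuracy = 0.9375, recall = 0.0, specificity = 1.0, precision undefined (None)
```

Because accuracy alone hides the missed positive class, reporting recall and precision alongside it gives a fuller picture under class imbalance.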

Figure 3. A. Training Performance of the Machine Learning Model Over 100 Epochs. B. Confusion Matrix of the Binary Classification Model.

Figure 4. Receiver Operating Characteristic (ROC) Curve for Binary Classification.

The ROC curve evaluates the binary classification model's performance, plotting the True Positive Rate (TPR) against the False Positive Rate (FPR). The curve begins at the origin (0,0) and trends toward the top-left corner, reflecting a balance between sensitivity and specificity. The diagonal line represents random performance, while the model’s curve, situated above it, indicates superior predictive capability. The Area Under the Curve (AUC) score of 0.92 indicates strong discriminative ability, with 1.0 denoting a perfect model and 0.5 indicating random guessing.
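AUC has a direct probabilistic reading: it is the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it by pairwise comparison on toy labels and scores that are illustrative only.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive-negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    # A correctly ordered pair counts 1, a tie counts 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)  # 3 of 4 pairs are ordered correctly: 0.75
```

Unlike accuracy, this quantity depends on the ranking of the predicted scores rather than on a single decision threshold.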

Model Performance Summary

GCN Base (Graph Convolutional Network)

The Base GCN model achieved an accuracy of 93.75%, comparable to ResGCN, establishing it as a robust baseline model. However, it exhibited a higher final loss of 0.165 compared to ResGCN, suggesting potential areas for improvement. The model demonstrated consistent performance across different folds, indicating its reliability for gene interaction prediction.

ResGCN (Residual Graph Convolutional Network-SOTA)

The ResGCN model achieved an impressive accuracy of 93.75%, demonstrating its effectiveness in capturing data relationships and patterns. It exhibited a stable convergence pattern throughout the training process, with minimal prediction error. The incorporation of residual connections allowed for efficient feature extraction, enabling the model to learn identity functions and handle complex transformations with ease. Additionally, ResGCN's lower final loss of 0.114 underscores its superior optimization capability compared to other models.

AttGCN (Attention-based Graph Convolutional Network-SOTA)

The AttGCN model achieved an accuracy of 92.08%, slightly lower than both ResGCN and Base GCN. Its final loss of 0.182 indicates areas where improvements could be made. Despite its sophisticated architecture and attention mechanisms, AttGCN's accuracy did not exceed that of the simpler models, suggesting a trade-off between architectural complexity and performance. While the attention mechanisms in AttGCN allow for a finer focus on critical features, their added complexity does not yield proportionate gains in predictive accuracy.

Comparative Analysis

The ResGCN and Base GCN models both achieved an accuracy of 93.75%. However, ResGCN outperformed the Base GCN with a lower final loss of 0.114, as well as a more stable convergence during training. AttGCN, while employing a more complex architecture, achieved a slightly reduced accuracy of 92.08% and had the highest final loss of 0.182.

The implemented models, particularly ResGCN, demonstrate competitive performance compared to current state-of-the-art approaches in gene interaction prediction. The high accuracy and low loss values across the models indicate their effectiveness in learning complex gene relationships. The ResGCN model, which combines residual connections with robust regularization techniques, delivers strong performance by efficiently capturing complex gene interactions. This performance is achieved without extensive computational overhead, making ResGCN a practical choice for real-world applications.

Conclusion of Model Comparisons

The ResGCN model achieved the highest overall performance, with an accuracy of 93.75% and the lowest final loss of 0.114. It matched the Base GCN in accuracy and surpassed both the Base GCN and AttGCN in final loss, affirming its reliability and efficiency in minimizing prediction errors. The Base GCN lagged in optimization, as indicated by its higher final loss of 0.165. AttGCN, although leveraging an advanced attention-based architecture, fell short in performance with an accuracy of 92.08% and the highest final loss of 0.182.

In summary, ResGCN emerges as the best-performing model due to its high accuracy, low prediction error, and computational efficiency. These results highlight its potential as a critical tool for accurate gene interaction prediction and the exploration of complex biological relationships.

Figure 5. Training Loss Comparison of GCN, ResGCN, and AttGCN Models Over 100 Epochs.

This figure illustrates the training loss trajectories for the GCN, ResGCN, and AttGCN models across 100 epochs. The X-axis represents the number of training epochs, while the Y-axis depicts the training loss. The blue, red, and green lines correspond to the GCN, ResGCN, and AttGCN models, respectively.

The GCN model begins with a higher initial loss but stabilizes at approximately 0.2 towards the end of training. In contrast, ResGCN achieves the lowest final loss, demonstrating superior optimization and predictive performance. The AttGCN model, despite incorporating advanced attention mechanisms, converges to a similar loss value as the GCN. However, the ResGCN consistently maintains a distinct advantage, with a sharper and more efficient reduction in training loss throughout the process. This performance suggests that ResGCN is better equipped to minimize prediction errors and effectively model complex relationships in the dataset.

Table 2. Accuracy and Final Loss of Model Performances.

Model	Accuracy	Final Loss
GCN	0.9375	0.16533499
ResGCN	0.9375	0.11435559
AttGCN	0.9208333	0.18176558

Table 2 summarizes the performance metrics (accuracy and final loss) of the three models (GCN, ResGCN, and AttGCN) evaluated in this study.

The GCN and ResGCN models achieved an identical accuracy of 93.75%, effectively classifying a significant portion of the dataset. However, ResGCN outperformed GCN in terms of final loss, recording the lowest value of 0.11435559, which underscores its superior ability to minimize prediction errors and achieve a better fit to the training data.

AttGCN, although marginally less accurate at 92.08%, exhibited the highest final loss at 0.18176558, suggesting potential challenges in learning or effectively modeling relationships in the input data. The higher loss values indicate that its predictions were slightly farther from the true labels compared to the other two models.

Among the three models, ResGCN demonstrated the best balance of accuracy and low final loss, affirming its effectiveness and reliability for predictive modeling tasks in this study.
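The comparison can be expressed as a simple ranking rule: sort by accuracy in descending order and break ties by final loss in ascending order. The sketch below applies this rule to the values in Table 2; the rule itself is an illustrative reading of the comparison, not a procedure stated by the study.

```python
# Values copied from Table 2.
results = {
    "GCN": {"accuracy": 0.9375, "final_loss": 0.16533499},
    "ResGCN": {"accuracy": 0.9375, "final_loss": 0.11435559},
    "AttGCN": {"accuracy": 0.9208333, "final_loss": 0.18176558},
}

# Higher accuracy first; among equal accuracies, lower final loss first.
ranking = sorted(
    results,
    key=lambda name: (-results[name]["accuracy"], results[name]["final_loss"]),
)
best_model = ranking[0]
```

Because GCN and ResGCN tie on accuracy, the final loss is what separates them under this rule.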

Discussion

This study aims to advance understanding of TRPV1's role in periodontal pain, explore drug-gene interactions using innovative computational AI methods, and identify alternative treatments for patients suffering from this common health issue (24, 36). Current therapies remain inadequate or have significant side effects, emphasizing the importance of understanding the complexities of pain mechanisms (37). Additionally, this research promotes interdisciplinary collaboration between computational biology, pharmacology, and dentistry, fostering a multidisciplinary approach to addressing this pressing global health challenge (Figure 3).

The knowledge graph-based network constructed in this study comprised 950 nodes and 405,300 edges, indicating a well-connected structure. The network's small diameter and centralized architecture are reflected in its path length and clustering coefficient. Despite this, the network density remains sparse, with only 11.6% of possible connections present as edges. Centrality analysis, focusing on betweenness and closeness, highlighted the genes CNR1, SLC6A3, and ALOX5AP as having the highest betweenness centrality, suggesting their potential role as critical connectors in the network (2, 3, 38). The study identified a comprehensive list of genes associated with TRPV1 and pain signaling, including CNR1, SLC6A3, ALOX5AP, AMPC, SIGMAR1, TRPC1, TRPC2, TRPC3, TRPC4, TRPC6, TRPC7, and PRNP. CNR1, which binds to cannabinoids, has a known role in reducing pain signaling and inflammation. SLC6A3 and ALOX5AP contribute to pain modulation, while AMPC and SIGMAR1 are involved in regulating pain signaling. TRPC1, which interacts with TRPV1, facilitates calcium influx, potentially enhancing hyperalgesia and sensitization (39).

Comparable knowledge graph-based frameworks have been applied in other biomedical domains. The KGETCDA (Knowledge Graph Encoder from Transformer) framework, developed for processing non-coding RNA datasets, combines multiple databases, Transformer-based knowledge representation learning, and a multilayer perceptron to produce high-quality embeddings (16, 20). It has been reported to outperform other models and is accessible through HNRBase. Similarly, the DeepOmix-ICI (ICInet) framework, which predicts immune checkpoint inhibitor (ICI) responses in cancer, demonstrates the robustness of combining deep learning with prior biological knowledge graphs, and its performance in clinical settings has shown superior generalization across various cancer types (19).

Model performance was assessed through accuracy and final loss (34, 35). Both the GCN and ResGCN models achieved an accuracy of 93.75%, while AttGCN was slightly less accurate (92.08%). ResGCN had the lowest final loss, suggesting a better fit to the training data; GCN had a higher loss, and AttGCN had the highest. This pattern suggests that AttGCN may have difficulty learning from the input data or modeling complex relationships (Figures 4 and 5; Table 2). The study underscores the effectiveness of GCN-based models for investigating drug-gene associations related to TRPV1 and their potential extension to other oral diseases.
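To illustrate why two models can share the same accuracy yet differ in final loss, the sketch below computes both metrics for two hypothetical sets of predicted probabilities. It assumes a binary classification task and a binary cross-entropy loss, neither of which is confirmed as the exact setup of this study; the labels, probabilities, and function names are invented for illustration.

```python
import math


def accuracy(y_true, y_prob, threshold=0.5):
    """Share of samples whose thresholded prediction matches the label."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    correct = sum(int(p == t) for p, t in zip(preds, y_true))
    return correct / len(y_true)


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical labels and predicted probabilities (not data from the study).
y_true = [1, 0, 1, 0]
probs_a = [0.9, 0.1, 0.8, 0.6]  # confident where correct
probs_b = [0.6, 0.4, 0.6, 0.7]  # same decisions, less confident

acc_a, acc_b = accuracy(y_true, probs_a), accuracy(y_true, probs_b)
loss_a = binary_cross_entropy(y_true, probs_a)
loss_b = binary_cross_entropy(y_true, probs_b)
print(acc_a, acc_b, round(loss_a, 4), round(loss_b, 4))
```

Both hypothetical models make the same thresholded decisions and so have identical accuracy, but the first assigns probabilities closer to the true labels and therefore has the lower loss. Loss can thus separate models that accuracy alone cannot.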

The centrality analysis revealed CNR1 and SLC6A3 as high-betweenness nodes, suggesting their importance as integrative hubs within the TRPV1-related network. Biologically, this finding is supported by their known mechanistic roles: CNR1 (cannabinoid receptor 1) is a key regulator of pain perception and neuroinflammation, frequently modulating TRPV1-mediated pathways. Its central role in the network likely reflects its broad interaction potential in modulating nociceptive signaling. SLC6A3 (dopamine transporter) regulates dopamine reuptake, influencing neuroplasticity and pain processing. Its prominence in the network may highlight the involvement of dopaminergic modulation in chronic pain conditions, including periodontal pain. This alignment reinforces the biological plausibility of the network-derived insights.

The identification of key genes and drug interactions involving TRPV1 also opens opportunities for clinical translation. For instance, genes such as CNR1 and SLC6A3 could serve as biomarkers for pain susceptibility or treatment response in periodontal conditions. Moreover, repurposing agents like capsazepine, a known TRPV1 antagonist, may offer targeted therapeutic strategies, potentially reducing the reliance on conventional analgesics with broader side effect profiles.

While ResGCN achieved the lowest loss and an accuracy at least equal to that of GCN and AttGCN, this study did not include a direct comparison with external state-of-the-art models such as BioKGLM. BioKGLM and similar biomedical knowledge graph embedding models leverage domain-specific language models that incorporate rich contextual information. Although beyond the scope of the current study, future evaluations could benchmark ResGCN against such models to further contextualize its performance and validate its efficacy across broader biomedical applications.

Future directions for studying fused drug-gene interactions involving TRPV1 in periodontal pain (16, 20) include enhanced data integration, longitudinal studies, model fine-tuning, mechanistic investigations, and the development of novel therapeutics (39). Combining advanced computational AI algorithms with both in vitro and in vivo studies can validate predicted drug-gene interactions and their biological significance (40).

This study on drug-gene associations related to periodontal pain has several limitations, including issues with data quality and availability, model generalizability, the inherent complexity of biological systems, and the interpretability of GNN models (19, 22). The accuracy of the findings heavily depends on the quality and comprehensiveness of the input data; consequently, the study may not capture all biological nuances, potentially leading to an oversimplified view of drug-gene interactions. Additionally, the study may not fully represent other receptors and pathways involved in periodontal pain. It is important to note that drug-gene interaction research is still an emerging field, with ongoing developments in understanding its intricate mechanisms.

While the semantic similarity fusion approach used in this study effectively integrates biological knowledge, it is not without limitations. This method typically relies on predefined ontologies and existing annotations, which may constrain the model’s ability to capture complex or novel biological relationships. Moreover, such approaches may overlook nonlinear interactions and emergent properties present in high-dimensional biological data. As a potential improvement, future work could explore hybrid embedding methods that combine semantic similarity with data-driven representations, such as deep learning-based embeddings, to capture more intricate and context-specific associations.

Furthermore, a bias toward class 0 predictions observed in the confusion matrix (Figure 3B) suggests potential class imbalance in the data. This limitation could hinder the model’s ability to generalize to less represented classes. In future work, strategies such as oversampling, class weighting, or synthetic data generation techniques like SMOTE (Synthetic Minority Over-sampling Technique) could be implemented to address class imbalance and enhance model robustness.
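Two of the strategies named above, class weighting and random oversampling, can be sketched in a few lines. The example is a minimal illustration under stated assumptions: the label distribution is invented, the function names are hypothetical, and the weighting rule shown is the common inverse-frequency ("balanced") heuristic rather than a method used in this study. SMOTE is not shown because it requires feature-space interpolation.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in counts}


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority samples until classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Hypothetical imbalanced labels: eight samples of class 0, two of class 1.
labels = [0] * 8 + [1] * 2
samples = list(range(len(labels)))  # placeholder sample identifiers

weights = balanced_class_weights(labels)
resampled_x, resampled_y = random_oversample(samples, labels)
print(weights, Counter(resampled_y))
```

The weights can be passed to a weighted loss so that errors on the minority class cost more, whereas oversampling changes the training set itself. Either should be applied to the training split only, so that evaluation data keep their original distribution.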

Although this study used publicly available datasets, it is important to recognize potential ethical concerns in AI-driven biomedical research. Biases in omics datasets, especially those related to demographic underrepresentation or incomplete annotations, may affect model generalizability and fairness. Future work should emphasize dataset diversity and fairness-aware algorithms to ensure equitable applications of AI in healthcare settings.

To support the personalized application of TRPV1-related findings, future research could integrate electronic health records (EHRs), genomic profiles, and single-cell RNA sequencing (scRNA-seq) data. These integrations would allow deeper stratification of patient populations, identification of cell-type-specific pain mechanisms, and improved prediction of individual therapeutic responses, facilitating more precise and tailored interventions.

GCNs hold promise for advancing research into the molecular mechanisms of oral diseases, facilitating the identification of biomarkers and therapeutic targets, and potentially informing personalized treatment strategies that could improve therapeutic outcomes.

Conclusion

This study employed Graph Neural Networks to investigate drug-gene interactions involving TRPV1 in periodontal pain using drug, gene, and gene expression datasets, contributing to a deeper understanding of these interactions. Graph-based models, including GCN, ResGCN, and AttGCN, were used to identify significant interactions and potential treatment strategies, though limitations such as data quality and model generalizability must be considered. Future research should focus on integrating additional data sources, conducting longitudinal studies, and exploring personalized treatment approaches to further elucidate TRPV1's role in pain mechanisms and support the development of novel therapeutics. Such advancements could improve patient outcomes and enhance the quality of care for individuals suffering from periodontal pain.

Conflicts of Interest: The authors declare no conflicts of interest.

Funding: This research received no grant funding.

Ethics approval and consent to participate: Not applicable.

AUTHOR CONTRIBUTIONS STATEMENT

Conceptualization: P.K.Y., S.S. and C.M.A.

Data curation: P.K.Y., S.S. and C.M.A.

Formal analysis: P.K.Y., S.S. and C.M.A.

Funding acquisition: P.K.Y.

Investigation: P.K.Y., S.S. and C.M.A.

Methodology: P.K.Y., S.S. and C.M.A.

Project administration: P.K.Y.

Resources: P.K.Y.

Software: P.K.Y.

Supervision: P.K.Y., S.S. and C.M.A.

Validation: P.K.Y., S.S. and C.M.A.

Visualization: P.K.Y., S.S. and C.M.A.

Writing-original draft: P.K.Y., S.S. and C.M.A.

Writing-review & editing: P.K.Y., S.S. and C.M.A.

All authors have read and agreed to the published version of the manuscript.

References

1. Lillis K.V., Austah O., Grinceviciute R., Garlet G.P., Diogenes A. Nociceptors regulate osteoimmune transcriptomic response to infection. Sci Rep. 2023; 13: 17601.

2. Tao L., Yang G., Sun T., Tao J., Zhu C., Yu H., et al. Capsaicin receptor TRPV1 maintains quiescence of hepatic stellate cells in the liver via recruitment of SARM1. J Hepatol. 2023; 78: 805-19.

3. Munjuluri S., Wilkerson D.A., Sooch G., Chen X., White F.A., Obukhov A.G. Capsaicin and TRPV1 Channels in the Cardiovascular System: The Role of Inflammation. Cells. 2021; 11: 18.

4. Xiao T., Sun M., Zhao C., Kang J. TRPV1: A promising therapeutic target for skin aging and inflammatory skin diseases. Front Pharmacol. 2023; 14: 1037925.

5. Wang S., Ko C.C., Chung M.K. Nociceptor mechanisms underlying pain and bone remodeling via orthodontic forces: toward no pain, big gain. Front Pain Res (Lausanne). 2024; 5: 1365194.

6. Wang S., Nie X., Siddiqui Y., Wang X., Arora V., Fan X., et al. Nociceptor Neurons Magnify Host Responses to Aggravate Periodontitis. J Dent Res. 2022; 101: 812-20.

7. Thammanichanon P., Kaewpitak A., Binlateh T., Pavasant P., Leethanakul C. Varied temporal expression patterns of trigeminal TRPA1 and TRPV1 and the neuropeptide CGRP during orthodontic force-induced pain. Arch Oral Biol. 2021; 128: 105170.

8. Maximiano T.K.E., Carneiro J.A., Fattori V., Verri W.A. TRPV1: Receptor structure, activation, modulation and role in neuro-immune interactions and pain. Cell Calcium. 2024; 119: 102870.

9. Juárez-Contreras R., Méndez-Reséndiz K.A., Rosenbaum T., González-Ramírez R., Morales-Lázaro S.L. TRPV1 Channel: A Noxious Signal Transducer That Affects Mitochondrial Function. Int J Mol Sci. 2020; 21: 8882.

10. Iglesias L.P., Aguiar D.C., Moreira F.A. TRPV1 blockers as potential new treatments for psychiatric disorders. Behav Pharmacol. 2022; 33: 2-14.

11. Nicholson D.N., Greene C.S. Constructing knowledge graphs and their biomedical applications. Comput Struct Biotechnol J. 2020; 18: 1414-28.

12. Kumar S., Nanelia A., Mariappan R., Rajagopal A., Rajan V. Patient Representation Learning From Heterogeneous Data Sources and Knowledge Graphs Using Deep Collective Matrix Factorization: Evaluation Study. JMIR Med Inform. 2022; 10: e28842.

13. Peng C., Xia F., Naseriparsa M., Osborne F. Knowledge Graphs: Opportunities and Challenges. Artif Intell Rev. 2023; 1-32.

14. Silva M.C., Eugénio P., Faria D., Pesquita C. Ontologies and Knowledge Graphs in Oncology Research. Cancers (Basel). 2022; 14: 1906.

15. Sousa R.T., Silva S., Pesquita C. Explaining protein-protein interactions with knowledge graph-based semantic similarity. Comput Biol Med. 2024; 170: 108076.

16. Mohamed S.K., Nounu A., Nováček V. Biological applications of knowledge graph embedding models. Brief Bioinform. 2020; 22: 1679-93.

17. Wang H., Zu Q., Lu M., Chen R., Yang Z., Gao Y., et al. Application of Medical Knowledge Graphs in Cardiology and Cardiovascular Medicine: A Brief Literature Review. Adv Ther. 2022; 39: 4052-60.

18. Gan Z., Zhou D., Rush E., Panickan V.A., Ho Y.L., Ostrouchov G., et al. ARCH: Large-scale Knowledge Graph via Aggregated Narrative Codified Health Records Analysis. J Biomed Inform. 2025; 162: 104761.

19. Fei H., Ren Y., Zhang Y., Ji D., Liang X. Enriching contextualized language model from knowledge graph for biomedical information extraction. Brief Bioinform. 2020; 22: bbaa110.

20. Soman K., Rose P.W., Morris J.H., Akbas R.E., Smith B., Peetoom B., et al. Biomedical knowledge graph-optimized prompt generation for large language models. Bioinformatics. 2024; 40: btae560.

21. Renaux A., Terwagne C., Cochez M., Tiddi I., Nowé A., Lenaerts T. A knowledge graph approach to predict and interpret disease-causing gene interactions. BMC Bioinformatics. 2023; 24: 324.

22. Dai Y., Guo C., Guo W., Eickhoff C. Drug–drug interaction prediction with Wasserstein Adversarial Autoencoder-based knowledge graph embeddings. Brief Bioinform. 2020; 22: bbaa256.

23. Liu X.Y., Mei X.Y. Prediction of drug sensitivity based on multi-omics data using deep learning and similarity network fusion approaches. Front Bioeng Biotechnol. 2023; 11: 1156372.

24. Guo R., Zhou Y., Long H., Shan D., Wen J., Hu H., et al. Transient receptor potential Vanilloid 1-based gene therapy alleviates orthodontic pain in rats. Int J Oral Sci. 2019; 11: 11.

25. Yadalam P.K., Anegundi R.V., Ramadoss R., Joseph B., Veeramuthu A. Felodipine repurposed for targeting TRPV1 receptor to relieve oral cancer pain. Oral Oncol. 2022; 134: 106094.

26. Yadalam P.K., Natarajan P.M., Mosaddad S.A., Heboyan A. Graph neural networks-based prediction of drug gene association of P2X receptors in periodontal pain. J Oral Biol Craniofac Res. 2024; 14: 335-8.

27. Yadalam P.K., Natarajan P.M., Saeed M.H., Ardila C.M. Variational Approaches for Drug-Disease-Gene Links in Periodontal Inflammation. Int Dent J. 2025; 75: 185-94.

28. Wang C., Yang Y., Song J., Nan X. Research Progresses and Applications of Knowledge Graph Embedding Technique in Chemistry. J Chem Inf Model. 2024; 64: 7189-213.

29. Skuta C., Popr M., Muller T., Jindrich J., Kahle M., Sedlak D., et al. Probes & Drugs portal: an interactive, open data resource for chemical biology. Nat Methods. 2017; 14: 759-60.

30. Barrett T., Wilhite S.E., Ledoux P., Evangelista C., Kim I.F., Tomashevsky M., et al. NCBI GEO: archive for functional genomics data sets--update. Nucleic Acids Res. 2013; 41(Database issue): D991-5.

31. Czolbe S., Pegios P., Krause O., Feragen A. Semantic similarity metrics for image registration. Med Image Anal. 2023; 87: 102830.

32. Kulmanov M., Smaili F.Z., Gao X., Hoehndorf R. Semantic similarity and machine learning with ontologies. Brief Bioinform. 2021; 22: bbaa199.

33. Yang Y., Sun Y., Li F., Guan B., Liu J.X., Shang J. MGCNRF: Prediction of Disease-Related miRNAs Based on Multiple Graph Convolutional Networks and Random Forest. IEEE Trans Neural Netw Learn Syst. 2024; 35: 15701-9.

34. Jia C., Wang F., Xing B., Li S., Zhao Y., Li Y., et al. DGAMDA: Predicting miRNA-disease association based on dynamic graph attention network. Int J Numer Method Biomed Eng. 2024; 40: e3809.

35. Ning Q., Zhao Y., Gao J., Chen C., Li X., Li T., et al. AMHMDA: attention aware multi-view similarity networks and hypergraph learning for miRNA-disease associations identification. Brief Bioinform. 2023; 24: bbad094.

36. Xu X., Li Y., Yang Z., Zhou Z. Transient receptor potential vanilloid type-1 regulates periodontal disease damage via the PI3K/AKT signaling pathway. Iran J Basic Med Sci. 2022; 25: 635-42.

37. Loos B.G., Van Dyke T.E. The role of inflammation and genetics in periodontal disease. Periodontol 2000. 2020; 83: 26-39.

38. Spaull R.V.V., Kurian M.A. SLC6A3-Related Dopamine Transporter Deficiency Syndrome. In: Adam M.P., Feldman J., Mirzaa G.M., Pagon R.A., Wallace S.E., Amemiya A., editors. GeneReviews. Seattle (WA): University of Washington, Seattle; 1993.

39. Hu H., Zhao H., Zhong T., Dong X., Wang L., Han P., et al. Adaptive deep propagation graph neural network for predicting miRNA-disease associations. Brief Funct Genomics. 2023; 22: 453-62.

40. Jiao C.N., Zhou F., Liu B.M., Zheng C.H., Liu J.X., Gao Y.L. Multi-Kernel Graph Attention Deep Autoencoder for MiRNA-Disease Association Prediction. IEEE J Biomed Health Inform. 2024; 28: 1110-21.