Accurate indoor pathloss prediction is crucial for optimizing wireless communication in indoor settings, where diverse materials and complex electromagnetic interactions pose significant modeling challenges. This paper introduces TransPathNet, a novel two-stage deep learning framework that leverages transformer-based feature extraction and multiscale convolutional attention decoding to generate high-precision indoor radio pathloss maps. TransPathNet demonstrates state-of-the-art performance in the ICASSP 2025 Indoor Pathloss Radio Map Prediction Challenge, achieving an overall Root Mean Squared Error (RMSE) of 10.397 dB on the challenge full test set and 9.73 dB on the challenge Kaggle test set, showing excellent generalization capabilities across different indoor geometries, frequencies, and antenna patterns.
Fig 1: Overview of TransPathNet training process. The framework employs a two-stage architecture: a coarse stage and a fine stage.
TransPathNet follows a U-Net-like architecture with a transformer-based encoder and multi-scale convolutional attention-based decoder. The encoder is based on TransNeXt, a state-of-the-art backbone that extracts hierarchical features from complex environmental data. We incorporate the Efficient Multiscale Convolutional Attention Decoder (EMCAD), which refines and reconstructs pathloss maps at multiple scales through an attention mechanism.
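To capture the complexity of indoor propagation, TransPathNet extends the default three-channel inputs (reflectance, transmittance, distance) with enhanced features: free-space pathloss, transmission ray encoding, antenna embeddings, and spatial-frequency embeddings.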
These additional channels collectively aid the model in generalizing to diverse indoor layouts, materials, and operating conditions.
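As an illustration of one such channel, free-space pathloss (named in the feature list on this page) can be computed per pixel from the distance to the transmitter and the carrier frequency. The sketch below uses the standard Friis free-space formula; the function names, the 0.25 m pixel size, and the 2.4 GHz example frequency are assumptions made for illustration, not values taken from the text.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def fspl_db(distance_m, freq_hz, min_distance_m=0.25):
    """Free-space pathloss in dB between isotropic antennas (Friis formula)."""
    # Clamp the distance so the transmitter pixel itself does not hit log10(0).
    d = max(distance_m, min_distance_m)
    return 20.0 * math.log10(4.0 * math.pi * d * freq_hz / SPEED_OF_LIGHT_M_S)


def fspl_channel(height, width, tx_row, tx_col, freq_hz, pixel_size_m=0.25):
    """Build a height x width grid of FSPL values around a transmitter pixel.

    pixel_size_m is a hypothetical grid resolution, not a value from the text.
    """
    grid = []
    for row in range(height):
        line = []
        for col in range(width):
            dist = pixel_size_m * math.hypot(row - tx_row, col - tx_col)
            line.append(fspl_db(dist, freq_hz, min_distance_m=pixel_size_m))
        grid.append(line)
    return grid


# Example: a tiny 4x4 map with the transmitter in the top-left corner.
channel = fspl_channel(height=4, width=4, tx_row=0, tx_col=0, freq_hz=2.4e9)
print(round(fspl_db(1.0, 2.4e9), 2))
```

A channel like this gives the network a physics-based baseline, so it only needs to learn the deviation caused by walls and materials. How the actual model normalizes or combines this channel is not specified here.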
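Our system employs a two-stage coarse-to-fine training strategy to achieve high-precision predictions: a coarse stage first generates a rough approximation of the pathloss map, and a fine stage then refines its details.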
Implementation Details: The model is implemented in PyTorch and trained using the Adam optimizer with an initial learning rate of 10⁻⁴, halved at 50% and 75% of training progress. Input features are resized to 384×384 across all training and evaluation sets. Random flips and rotations are applied to input features to improve generalizability. Mean Squared Error (MSE) loss is used for training. All experiments were conducted on an NVIDIA RTX 4090 GPU with a batch size of 4 for 30 epochs.
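The learning-rate schedule described above can be sketched as a small pure-Python function. This is a minimal illustration that assumes epochs are counted from zero and that each halving takes effect at the first epoch whose progress reaches the milestone; the function name and arguments are hypothetical, and the exact epoch boundaries used in training are not stated in the text.

```python
def learning_rate(epoch, total_epochs=30, base_lr=1e-4,
                  milestones=(0.5, 0.75), factor=0.5):
    """Return the learning rate for a zero-indexed epoch.

    The rate is multiplied by `factor` once for every milestone
    (a fraction of total training) that has already been reached.
    """
    progress = epoch / total_epochs
    drops = sum(1 for m in milestones if progress >= m)
    return base_lr * factor ** drops


# Print the schedule around the two halving points.
for epoch in (0, 14, 15, 22, 23, 29):
    print(epoch, learning_rate(epoch))
```

In PyTorch, a step schedule of this kind could likely be expressed with `torch.optim.lr_scheduler.MultiStepLR` using `gamma=0.5`, with milestones chosen to match the intended epochs.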
Fig 2: Visual comparison of pathloss predictions across different stages of our pipeline for a particular input.
RMSE in dB is the main metric for evaluating the model. The test dataset is divided into three tasks, each of which aims to evaluate the adaptability of the model to new (1) geometric environments, (2) frequencies, and (3) antenna patterns, with weights of 30%, 30%, and 40%, respectively.
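To make the metric concrete, the sketch below computes an RMSE in dB over flattened pathloss values and combines three per-task RMSEs with the 30%/30%/40% weights. It assumes the overall score is a plain weighted sum of the per-task RMSEs, which the text does not spell out; the function names and the sample numbers are illustrative only.

```python
import math


def rmse_db(predicted, target):
    """Root mean squared error between two equal-length lists of dB values."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - t) ** 2 for p, t in zip(predicted, target)]
    return math.sqrt(sum(squared) / len(squared))


def weighted_score(task_rmse, weights=(0.3, 0.3, 0.4)):
    """Combine per-task RMSEs (geometry, frequency, antenna) into one score.

    Assumes a simple weighted sum; the challenge's exact aggregation
    rule is not given in the text.
    """
    if len(task_rmse) != len(weights):
        raise ValueError("one weight is required per task")
    return sum(w * r for w, r in zip(weights, task_rmse))


# Illustrative values only, not results from the paper.
print(rmse_db([100.0, 110.0], [103.0, 106.0]))
print(weighted_score((9.0, 10.0, 11.0)))
```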
Case | Two-Stage | Post-Process | Kaggle RMSE (dB) | Full RMSE (dB) |
---|---|---|---|---|
Coarse only | ✗ | ✗ | 9.93 | 10.327 |
+ Two-Stage Training | ✓ | ✗ | 9.75 | 10.430 |
Full pipeline | ✓ | ✓ | 9.73 | 10.397 |
The average inference time is about 43.8 ms per sample on the RTX 4090.
Our coarse-to-fine training strategy enables high-precision prediction results by first generating a rough approximation and then refining details.
State-of-the-art transformer-based backbone for robust feature extraction from complex environmental data.
Efficient Multiscale Convolutional Attention Decoder refines predictions at multiple scales through an attention mechanism.
Comprehensive feature set including Free Space PathLoss, transmission ray encoding, antenna embeddings, and spatial-frequency embeddings.
Winner of the ICASSP 2025 Indoor Pathloss Radio Map Prediction Challenge with state-of-the-art RMSE scores across different environments.
This paper presents TransPathNet, an advanced deep learning framework for indoor pathloss prediction that combines transformer-based feature extraction with multi-scale convolutional attention decoding. Our model achieves state-of-the-art performance in the ICASSP 2025 Indoor Pathloss Radio Map Prediction Challenge, demonstrating robust generalization across different geometries, frequencies, and antenna patterns. However, accurately predicting the pathloss contribution of reflections remains difficult. Future work will focus on network designs that improve the accuracy of these predictions.
@inproceedings{li2025transpathnet,
title={TransPathNet: A Novel Two-Stage Framework for Indoor Radio Map Prediction},
author={Li, Xin and Liu, Ran and Xu, Saihua and Razul, Sirajudeen Gulam and Yuen, Chau},
booktitle={Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
year={2025},
month={April}
}