✨ ICCV 2025 ✨

SL²A-INR

Single-Layer Learnable Activation for Implicit Neural Representation

A novel hybrid INR architecture combining Chebyshev-parameterized learnable activations with a lightweight ReLU fusion network, enabling adaptive spectral-bias tuning and achieving state-of-the-art performance across image representation, 3D shape reconstruction, and neural radiance fields.

*Equal contribution, †‡Corresponding authors

University of British Columbia · University of Tehran · RWTH Aachen University · University of Regensburg

SL²A-INR Overview

SL²A-INR enables flexible tuning of spectral bias through learnable activation. The architecture combines a Learnable Activation Block (Ψ) parameterized by Chebyshev polynomials with a feature fusion network.

As polynomial degree K increases (4→64), reconstruction quality improves significantly, demonstrating enhanced capacity for detailed representations. This design addresses the convergence-capacity gap found in comparable methods.

SL²A-INR method overview showing learnable activation block and reconstruction quality
Architecture and reconstruction quality progression with increasing polynomial degree K.

Abstract

Implicit Neural Representations (INRs) excel at modeling continuous signals but often struggle with spectral bias—learning low frequencies first and missing fine high-frequency structure. Current methods using hand-crafted activation functions (SIREN, WIRE, FINER) or positional encodings still face limitations in capturing diverse signal types and high-frequency components.

We introduce SL²A-INR, a hybrid architecture that combines a learnable Chebyshev activation block with a lightweight ReLU fusion network. The Chebyshev polynomials learn higher-order coefficients directly, expanding the representable frequency range without fragile periodic initialization. The fusion block modulates feature flow through skip connections, enabling adaptive spectral control, sharper reconstructions, and faster convergence.

Through comprehensive experiments, SL²A-INR sets new benchmarks in accuracy, quality, and robustness across image representation, 3D shape reconstruction, and novel view synthesis tasks.

SL²A-INR architecture diagram showing Learnable Activation Block and Fusion Block
Figure 1: SL²A-INR architecture. The Learnable Activation Block (Ψ) parameterized by Chebyshev polynomials is followed by a feature fusion block with skip modulation. As polynomial degree K increases (4→64), reconstruction quality improves significantly.

🎯 Learnable Activation

Chebyshev polynomial activations learn higher-order coefficients directly during training, expanding the representable frequency range without fragile periodic initialization schemes.

⚡ Efficient Fusion

Low-rank ReLU layers modulated by the learnable activation output balance computational efficiency with expressive high-frequency modeling capabilities.

🚀 Superior Performance

State-of-the-art results across images, 3D shapes, and NeRF scenes with faster convergence and stable training under varied hyperparameters.

Method Overview

A two-block architecture designed for flexible spectral-bias tuning

1. Learnable Activation (LA) Block

Each activation function is parameterized using Chebyshev polynomials:

ψ_{i,j}(x) = Σ_{k=0}^{K} a_{i,j,k} T_k(tanh(x))

where Tk are Chebyshev polynomials of the first kind and ai,j,k are learnable coefficients optimized via backpropagation. Layer normalization stabilizes training of high-order polynomials.
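
The following is a minimal PyTorch sketch of this parameterization; the module name ChebyshevActivation, the coefficient initialization, and the placement of layer normalization are illustrative choices, not the released implementation.

```python
import torch
import torch.nn as nn

class ChebyshevActivation(nn.Module):
    """Sketch of the Learnable Activation (LA) Block Ψ.

    Each input/output pair (i, j) gets its own function
        ψ_{i,j}(x) = Σ_{k=0}^{K} a_{i,j,k} T_k(tanh(x)),
    where T_k are Chebyshev polynomials of the first kind and the
    coefficients a_{i,j,k} are learned by backpropagation.
    """

    def __init__(self, in_dim: int, out_dim: int, degree: int = 64):
        super().__init__()
        self.degree = degree
        # Learnable Chebyshev coefficients a_{i,j,k} (illustrative init scale).
        self.coeffs = nn.Parameter(
            torch.randn(in_dim, out_dim, degree + 1) / (in_dim * (degree + 1)) ** 0.5
        )
        # Layer normalization stabilizes training with high-order polynomials.
        self.norm = nn.LayerNorm(out_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = torch.tanh(x)  # squash inputs into [-1, 1], the Chebyshev domain
        # Build T_0 .. T_K with the recurrence T_k = 2x·T_{k-1} - T_{k-2}.
        cheb = [torch.ones_like(x), x]
        for _ in range(2, self.degree + 1):
            cheb.append(2 * x * cheb[-1] - cheb[-2])
        cheb = torch.stack(cheb[: self.degree + 1], dim=-1)  # (batch, in_dim, K+1)
        # Sum over input dimension i and polynomial order k.
        out = torch.einsum("bik,iok->bo", cheb, self.coeffs)
        return self.norm(out)
```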

2. Fusion Block with Skip Modulation

The output of the LA Block modulates each layer via element-wise products:

z_1 = Ψ(x)
z_l = ReLU(W_l (z_{l-1} ⊙ z_1) + b_l),  l = 2, …, L

This persistent modulation preserves high-frequency information throughout the network while maintaining computational efficiency through low-rank linear layers.
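
A companion sketch of the fusion network follows; the low-rank factorization in LowRankLinear and the final linear head are illustrative stand-ins for the paper's lightweight ReLU layers.

```python
import torch
import torch.nn as nn

class LowRankLinear(nn.Module):
    """Linear map factored through a rank-r bottleneck to keep the fusion block light."""

    def __init__(self, dim: int, rank: int):
        super().__init__()
        self.down = nn.Linear(dim, rank, bias=False)
        self.up = nn.Linear(rank, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.up(self.down(x))


class FusionBlock(nn.Module):
    """Sketch of the ReLU fusion network with persistent skip modulation:
        z_1 = Ψ(x),   z_l = ReLU(W_l (z_{l-1} ⊙ z_1) + b_l).
    Multiplying by z_1 at every layer re-injects the high-frequency features
    produced by the learnable activation block."""

    def __init__(self, dim: int, depth: int, rank: int, out_dim: int):
        super().__init__()
        self.layers = nn.ModuleList([LowRankLinear(dim, rank) for _ in range(depth)])
        self.head = nn.Linear(dim, out_dim)  # final projection to the signal value

    def forward(self, z1: torch.Tensor) -> torch.Tensor:
        z = z1
        for layer in self.layers:
            z = torch.relu(layer(z * z1))    # modulate by z_1, project, activate
        return self.head(z)
```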

3. Adaptive Spectral Control

The polynomial degree K controls the spectral spread. Higher K values enable the network to capture finer details and higher-frequency components. Unlike fixed activation functions, our learnable approach adapts to the data, mitigating spectral bias without manual tuning.
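
Building on the two sketches above, a hypothetical wrapper makes the role of K concrete; the hidden width, depth, and rank defaults below are placeholders rather than the paper's settings.

```python
import torch.nn as nn

class SL2AINR(nn.Module):
    """Two-block model sketch: one learnable-activation layer, then the ReLU fusion network."""

    def __init__(self, in_dim=2, hidden=256, out_dim=3, degree=64, depth=4, rank=64):
        super().__init__()
        self.la_block = ChebyshevActivation(in_dim, hidden, degree)  # defined in the sketch above
        self.fusion = FusionBlock(hidden, depth, rank, out_dim)      # defined in the sketch above

    def forward(self, coords):
        return self.fusion(self.la_block(coords))

# The polynomial degree K is the knob that widens the representable spectrum:
model_lo = SL2AINR(degree=4)    # stronger low-frequency bias, smoother fits
model_hi = SL2AINR(degree=64)   # captures finer, higher-frequency detail
```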

Spectral Bias Analysis

Demonstrating reduced spectral bias on 1D function approximation

Frequency approximation error comparison across training steps
Figure 2: Convergence and frequency approximation error on a 1D periodic function with four dominant frequencies. SL²A-INR (e) shows significantly lower frequency approximation error across all frequencies compared to ReLU (b), SIREN (c), and FINER (d), demonstrating effective mitigation of spectral bias.

Key Observations

• ReLU exhibits strong spectral bias, learning higher frequencies very slowly
• SIREN mitigates some bias but still struggles with high-frequency approximation
• FINER shows improved performance but with slower convergence on some frequencies
• SL²A-INR maintains consistently low error across all frequencies from early training
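
As a point of reference, the per-frequency error plotted in Figure 2 can be estimated by comparing the spectra of the fitted and target signals; the sketch below uses a simple FFT-magnitude difference on selected frequency bins, which may differ from the paper's exact metric.

```python
import torch

@torch.no_grad()
def frequency_error(model, coords, target, freq_bins):
    """Approximation error per frequency bin for a fitted 1D INR.

    coords: (N, 1) sample locations, target: (N,) ground-truth signal,
    freq_bins: indices of the dominant frequencies to report.
    """
    pred = model(coords).squeeze(-1)
    spec_pred = torch.fft.rfft(pred)
    spec_true = torch.fft.rfft(target)
    err = (spec_pred - spec_true).abs()
    return {k: err[k].item() for k in freq_bins}
```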

Experimental Results

State-of-the-art performance across diverse signal representation tasks

2D Image Representation

Image representation comparison showing text detail preservation
Figure 3: Qualitative comparison on image representation. SL²A-INR produces sharper text and preserves fine details better than FINER, SIREN, Gauss, WIRE, and ReLU+P.E.

Table 1: PSNR (dB) / SSIM comparison on DIV2K images (512×512). Best results in bold, second-best underlined.

| Method | #Params | Image 00 | Image 05 | Image 10 | Image 15 | Average |
|---|---|---|---|---|---|---|
| FINER | 198.9K | 32.00 / 0.862 | 32.92 / 0.889 | 40.08 / 0.965 | 36.29 / 0.932 | 36.35 / 0.924 |
| Gauss | 198.9K | 30.08 / 0.847 | 31.33 / 0.862 | 39.74 / 0.961 | 35.59 / 0.938 | 34.96 / 0.914 |
| ReLU+P.E. | 204.0K | 30.59 / 0.851 | 31.22 / 0.854 | 40.27 / 0.973 | 34.59 / 0.947 | 35.27 / 0.916 |
| SIREN | 198.9K | 29.29 / 0.831 | 30.73 / 0.836 | 37.25 / 0.950 | 32.23 / 0.915 | 33.47 / 0.896 |
| WIRE | 91.6K | 28.00 / 0.773 | 29.26 / 0.821 | 33.77 / 0.862 | 30.49 / 0.805 | 30.63 / 0.818 |
| SL²A (Ours) | 330.2K | 33.40 / 0.892 | 34.02 / 0.903 | 41.04 / 0.974 | 36.70 / 0.951 | 36.88 / 0.933 |
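
Concretely, image representation reduces to regressing RGB values from normalized pixel coordinates. The loop below is a generic fitting sketch with placeholder hyperparameters (step count, learning rate), not the paper's exact training recipe.

```python
import torch

def fit_image(model, image, steps=2000, lr=1e-3, device="cuda"):
    """Fit an INR to a single image tensor of shape (H, W, 3) with values in [0, 1]."""
    h, w, _ = image.shape
    # Pixel-center coordinates normalized to [-1, 1]^2.
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij"
    )
    coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2).to(device)
    target = image.reshape(-1, 3).to(device)

    model = model.to(device)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        pred = model(coords)
        loss = ((pred - target) ** 2).mean()   # MSE; PSNR = -10 * log10(MSE)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model
```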

3D Shape Reconstruction (Occupancy Fields)

For 3D shape representation, we keep the same architectural settings as in image representation and map 3D coordinates to signed distance function (SDF) values. We evaluate on five shapes from the Stanford 3D Scanning Repository.

Figure 4 shows the Dragon model reconstruction. SL²A-INR (0.9989 IoU) achieves superior quality with well-preserved details in both smooth low-frequency regions (body curves) and rough high-frequency areas (face details).

Dragon occupancy field reconstruction comparison
Figure 4: Occupancy volume representation comparison on the Dragon model.
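
Reconstructions like Figure 4 can be rendered by querying the fitted network on a dense grid and extracting an iso-surface. The sketch below assumes a scalar SDF (or occupancy) output and uses scikit-image's marching cubes as an illustrative post-processing step, not the paper's exact pipeline.

```python
import torch
from skimage.measure import marching_cubes  # assumes scikit-image is installed

@torch.no_grad()
def extract_mesh(model, resolution=256, device="cuda"):
    """Query a fitted 3D INR on a dense grid and extract its zero level set."""
    lin = torch.linspace(-1, 1, resolution)
    grid = torch.stack(torch.meshgrid(lin, lin, lin, indexing="ij"), dim=-1)
    coords = grid.reshape(-1, 3).to(device)

    values = []
    for chunk in coords.split(65536):                  # chunk queries to bound memory
        values.append(model(chunk).squeeze(-1).cpu())
    volume = torch.cat(values).reshape(resolution, resolution, resolution).numpy()

    # Zero iso-surface for SDFs; use level=0.5 for binary occupancy fields.
    verts, faces, _, _ = marching_cubes(volume, level=0.0)
    return verts, faces
```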

Table 2: IoU comparison on signed distance field representation (Stanford 3D Scanning Repository)

| Method | Armadillo | Dragon | Lucy | Thai Statue | Bearded Man |
|---|---|---|---|---|---|
| FINER | 0.9899 | 0.9895 | 0.9832 | 0.9848 | 0.9943 |
| Gauss | 0.9768 | 0.9968 | 0.9601 | 0.9900 | 0.9932 |
| ReLU+P.E. | 0.9870 | 0.9763 | 0.9760 | 0.9406 | 0.9939 |
| SIREN | 0.9895 | 0.9409 | 0.9721 | 0.9799 | 0.9948 |
| WIRE | 0.9893 | 0.9921 | 0.9707 | 0.9900 | 0.9911 |
| SL²A (Ours) | 0.9983 | 0.9989 | 0.9988 | 0.9986 | 0.9987 |

Novel View Synthesis (NeRF)

Table 3: PSNR (dB) on Blender dataset with 25 training images (reduced from standard 100 to test high-frequency detail capture)

| Method | Chair | Drums | Ficus | Hotdog | Lego | Materials | Mic | Ship |
|---|---|---|---|---|---|---|---|---|
| ReLU+P.E. | 31.32 | 26.38 | 21.46 | 20.18 | 24.49 | 30.59 | 25.90 | 25.16 |
| Gauss | 32.68 | 33.59 | 22.28 | 23.16 | 26.10 | 32.17 | 28.29 | 26.19 |
| SIREN | 33.31 | 33.28 | 22.25 | 24.89 | 27.26 | 32.85 | 29.60 | 27.13 |
| WIRE | 29.31 | 32.35 | 21.15 | 22.22 | 25.91 | 30.11 | 25.76 | 25.05 |
| FINER | 33.90 | 33.96 | 22.47 | 24.90 | 28.70 | 33.05 | 30.04 | 27.05 |
| SL²A (Ours) | 34.70 | 33.88 | 23.43 | 24.33 | 28.31 | 33.83 | 30.63 | 28.62 |

Analysis & Ablations

Understanding the design choices and robustness of SL²A-INR

Heatmap showing PSNR across different batch sizes and learning rates
Figure 5: Hyperparameter robustness analysis. SL²A-INR demonstrates greater stability across learning rates and batch sizes compared to FINER, SIREN, Gauss, and other methods, requiring less careful hyperparameter tuning.

🔄 Block Synergy

Removing the ReLU fusion block results in a PSNR drop of up to 6.22 dB, demonstrating the critical importance of coupling learnable activations with modulated ReLU layers for maintaining expressive power.

📊 Polynomial Degree Effect

Increasing the Chebyshev polynomial degree K from 4 to 512 progressively improves performance. Skip connections (modulation) significantly enhance results by preserving high-frequency information throughout the network.

⚡ Computational Efficiency

Despite a modest parameter increase (0.33M vs 0.20M for baselines), SL²A-INR fits a 512² image in just 0.77 minutes, faster than Gauss (3.08 min) and ReLU+P.E. (3.43 min).

Neural Tangent Kernel (NTK) Perspective

The eigenvalue distribution of the Neural Tangent Kernel provides insights into training dynamics. Components corresponding to larger eigenvalues are learned faster, which is crucial for overcoming spectral bias.

Increasing K in SL²A-INR slows the decay of the NTK eigenvalues, yielding larger values that enhance the model's ability to capture high-frequency components. Figure 6 shows a clear hierarchy: ReLU exhibits the most rapid decay, followed by SIREN, then FINER, while SL²A-INR maintains the slowest decay and thus preserves spectral properties most effectively.

NTK eigenvalue distribution comparison
Figure 6: NTK eigenvalue distribution showing SL²A-INR's superior spectral properties with slower decay, enabling better high-frequency learning.
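
For a small batch of inputs, the empirical NTK can be formed directly from per-sample parameter gradients; the brute-force sketch below illustrates the eigenvalue computation behind Figure 6 and is not the paper's exact procedure.

```python
import torch

def empirical_ntk_eigenvalues(model, coords):
    """Eigenvalues of the empirical NTK K[a, b] = <J_a, J_b>, where J_a is the
    gradient of the (scalar) output at input a with respect to all parameters."""
    params = [p for p in model.parameters() if p.requires_grad]
    grads = []
    for i in range(coords.shape[0]):
        out = model(coords[i : i + 1]).sum()
        g = torch.autograd.grad(out, params)
        grads.append(torch.cat([gi.reshape(-1) for gi in g]))
    J = torch.stack(grads)              # (N, num_params)
    ntk = J @ J.T                       # (N, N) Gram matrix of per-sample gradients
    return torch.linalg.eigvalsh(ntk)   # eigenvalues in ascending order

# Slower eigenvalue decay (larger trailing eigenvalues) means the corresponding
# function components, typically higher frequencies, are learned faster.
```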

Single Image Super-Resolution

To demonstrate generalization to inverse problems, we evaluated SL²A-INR on single-image super-resolution (×2, ×4, ×6 upsampling). Our method consistently outperforms FINER with higher PSNR and SSIM while producing less noisy, sharper results—particularly visible in the ×6 setting where texture preservation is critical.

Super-resolution comparison showing better detail preservation
Figure 7: Single-image super-resolution comparison on a parrot image (1356×2040×3) demonstrates superior detail preservation and noise reduction.
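
Because the representation is continuous, upsampling amounts to querying the fitted INR on a denser coordinate grid. The sketch below shows only this rendering step and leaves out how the INR is fitted from the low-resolution observations.

```python
import torch

@torch.no_grad()
def upsample(model, h, w, scale=4, device="cuda"):
    """Render a fitted image INR at `scale`-times the original resolution."""
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h * scale),
        torch.linspace(-1, 1, w * scale),
        indexing="ij",
    )
    coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2).to(device)
    rgb = torch.cat([model(c) for c in coords.split(65536)])
    return rgb.reshape(h * scale, w * scale, 3).clamp(0, 1).cpu()
```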

🎲 Initialization Robustness

Unlike SIREN, which is highly sensitive to initialization schemes, SL²A-INR maintains stable performance across Xavier uniform, Kaiming uniform/normal, and orthogonal initializations (PSNR variance < 1.5 dB).

🔧 Design Rationale

Chebyshev polynomials offer superior convergence, numerical stability, and minimax approximation properties compared to B-splines. They efficiently capture high-frequency components with fewer parameters and better stability.

📈 Scalability

Using learnable activations only in the first layer with low-rank MLPs in subsequent layers provides an optimal trade-off between expressivity and computational efficiency, avoiding the scalability issues of full KAN architectures.

Resources

📄 Paper

Read the full paper with detailed experiments, theory, and supplementary materials.

arXiv PDF

💻 Code

PyTorch implementation with training scripts, SL²A module, and dataset loaders.

GitHub Repository

🎥 Video

Watch the video presentation explaining SL²A-INR and our experimental results.

YouTube

📧 Contact

For questions, collaborations, or discussions about the work.

moein.heidari@ubc.ca

Citation

@article{heidari2024sl2a,
  title={SL$^{2}$A-INR: Single-Layer Learnable Activation for Implicit Neural Representation},
  author={Heidari, Moein and Rezaeian, Reza and Azad, Reza and 
          Merhof, Dorit and Soltanian-Zadeh, Hamid and Hacihaliloglu, Ilker},
  journal={arXiv preprint arXiv:2409.10836},
  year={2024},
  note={Accepted to ICCV 2025}
}

Acknowledgements

This work was supported by the Canadian Foundation for Innovation-John R. Evans Leaders Fund (CFI-JELF) program grant number 42816, Mitacs Accelerate program grant number AWD024298-IT33280, and the Natural Sciences and Engineering Research Council of Canada (NSERC), RGPIN-2023-03575.

We thank the authors of ChebyKAN, WIRE, and FINER for their publicly available code.