Transformer-Based 3D Object Detection

Jiayin Li; Yixin Ma; Jiagu Pan; Xing Xu

doi:10.31686/ijier.vol12.iss4.4220

Authors

Jiayin Li, Shanghai University of Engineering Science

Yixin Ma, Shanghai University of Engineering Science

Jiagu Pan, Shanghai University of Engineering Science

Xing Xu, Shanghai University of Engineering Science

DOI:

https://doi.org/10.31686/ijier.vol12.iss4.4220

Abstract

This paper studies Transformer-based object detection methods. The Transformer, originally developed for natural language processing, is now widely used in computer vision tasks such as image classification and object detection. The paper introduces an object detection method based on a scale point cloud Transformer, which offers a new direction for future research on object detection.

Author Biographies

Jiayin Li, Shanghai University of Engineering Science
School of Electronic and Electrical Engineering

Yixin Ma, Shanghai University of Engineering Science
School of Electronic and Electrical Engineering

Jiagu Pan, Shanghai University of Engineering Science
School of Electronic and Electrical Engineering

Xing Xu, Shanghai University of Engineering Science
School of Electronic and Electrical Engineering

Published

2025-03-25

Issue

Vol. 12 No. 4 (2024): International Journal for Innovation Education and Research

Section

Research Articles

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Copyrights for articles published in IJIER journals are retained by the authors, with first publication rights granted to the journal. The journal/publisher is not responsible for subsequent uses of the work. It is the author's responsibility to bring an infringement action if so desired. For more information, see the journal's Copyright & License page.

How to Cite

Li, J., Ma, Y., Pan, J., & Xu, X. (2025). Transformer-Based 3D Object Detection. International Journal for Innovation Education and Research, 12(4), 1-5. https://doi.org/10.31686/ijier.vol12.iss4.4220
