Tsinghua Science and Technology  2021, Vol. 26 Issue (4): 475-483    doi: 10.26599/TST.2020.9010011
    
HSPOG: An Optimized Target Recognition Method Based on Histogram of Spatial Pyramid Oriented Gradients
Shaojun Guo, Feng Liu, Xiaohu Yuan, Chunrong Zou, Li Chen, Tongsheng Shen*
National Innovation of Defense Technology, Academy of Military Sciences PLA China, Beijing 100071, China.
Department of Automation, Tsinghua University, Beijing 100084, China.

Abstract  

Histograms of Oriented Gradients (HOG) can produce good results in image target recognition tasks, but the method requires all target images fed to the classifier to have the same size. To address this shortcoming, this paper performs spatial pyramid segmentation on target images of any size, computes the pixel size of each image block dynamically, and then calculates and normalizes the oriented gradient features of each block region in each image layer. The new feature is called the Histogram of Spatial Pyramid Oriented Gradients (HSPOG). This approach yields stable feature vectors for images of any size and significantly increases the target detection rate in image recognition. Finally, the algorithm is verified on VOC2012 image data and its performance is compared with that of HOG.
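The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `hspog_sketch`, the defaults `levels=3` and `bins=9`, the central-difference gradients, and the per-block L2 normalization are all assumptions made for illustration. The sketch also derives block boundaries by integer division instead of the zero-padding shown in Fig. 2, and it does not model the two bin types of Fig. 4.

```python
import math

def hspog_sketch(image, levels=3, bins=9):
    """Return a fixed-length descriptor for a 2-D grayscale image (list of rows).

    Hypothetical illustration: pyramid level l splits the image into
    2**l x 2**l blocks whose pixel size is derived from the image size,
    so inputs of any size give a vector of the same length.
    """
    h, w = len(image), len(image[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    # Central-difference gradients; border pixels reuse the nearest neighbor.
    for y in range(h):
        for x in range(w):
            gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
            mag[y][x] = math.hypot(gx, gy)
            ang[y][x] = math.atan2(gy, gx) % math.pi  # unsigned orientation

    feature = []
    for level in range(levels):
        n = 2 ** level  # blocks per side at this pyramid level
        for by in range(n):
            for bx in range(n):
                # Block bounds are computed dynamically from the image size.
                y0, y1 = by * h // n, (by + 1) * h // n
                x0, x1 = bx * w // n, (bx + 1) * w // n
                hist = [0.0] * bins
                for y in range(y0, y1):
                    for x in range(x0, x1):
                        b = min(int(ang[y][x] / math.pi * bins), bins - 1)
                        hist[b] += mag[y][x]  # magnitude-weighted vote
                # L2-normalize each block histogram (epsilon avoids 0/0).
                norm = math.sqrt(sum(v * v for v in hist)) + 1e-12
                feature.extend(v / norm for v in hist)
    return feature

small = [[float((x * y) % 7) for x in range(20)] for y in range(13)]
large = [[float((x + 2 * y) % 11) for x in range(64)] for y in range(48)]
print(len(hspog_sketch(small)), len(hspog_sketch(large)))
```

With these assumed defaults, both calls return vectors of length 9 × (1 + 4 + 16) = 189, which is the property that lets a fixed-input classifier accept images of different sizes.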



Key words: Histograms of Oriented Gradients (HOG); Histogram of Spatial Pyramid Oriented Gradients (HSPOG); object recognition; spatial pyramid segmentation
Received: 13 March 2020      Published: 12 January 2021
Fund: National Natural Science Foundation of China (51802348)
Corresponding Authors: Tongsheng Shen     E-mail: guoba2000@163.com; liufeng_cv@126.com; yxh96105@163.com; crzou_aitrc@163.com; 357237301@qq.com; 18710071768@163.com
About author: Shaojun Guo received the PhD degree from Navy Aeronautical University, Yantai, China in 2017. He is currently a researcher at National Innovation of Defense Technology, Academy of Military Sciences PLA China. His current research interests include metamaterial design and machine learning algorithms.
Feng Liu received the PhD degree from Navy Aeronautical University, Yantai, China in 2017. He is currently a researcher at National Innovation of Defense Technology, Academy of Military Sciences PLA China. His current research interests include image processing, target recognition, and machine learning algorithms.
Xiaohu Yuan received the PhD degree from Tsinghua University, Beijing, China in 2018. He is currently an engineer at the Department of Automation, Tsinghua University. His research interests include image processing and quantum information processing.
Chunrong Zou received the PhD degree from National University of Defense Technology in 2016. He has been working as a researcher at National Innovation of Defense Technology, Academy of Military Sciences PLA China since 2018, where he is mainly involved in the electromagnetic design and characterization of metamaterials and artificial dielectrics. He has authored or coauthored more than 20 papers in peer-reviewed journals and conference proceedings. His research interests include information and communication engineering and computer applications in metamaterial design.
Li Chen received the PhD degree from National Defence University of People’s Liberation Army in 2015. He has been working as a researcher at National Innovation of Defense Technology, Academy of Military Sciences PLA China since 2018. He is now engaged in research on scientific and technological innovation management and has published more than 20 academic articles in domestic newspapers and journals. His main research interest is the application of computational procedures to project management.
Tongsheng Shen received the PhD degree from Beijing University of Aeronautics and Astronautics, Beijing, China in 2002. He is currently a professor at National Institute of Defense Technology Innovation. His research interests include information and communication engineering and computer application.
Cite this article:

Shaojun Guo, Feng Liu, Xiaohu Yuan, Chunrong Zou, Li Chen, Tongsheng Shen. HSPOG: An Optimized Target Recognition Method Based on Histogram of Spatial Pyramid Oriented Gradients. Tsinghua Science and Technology, 2021, 26(4): 475-483.

URL:

http://tst.tsinghuajournals.com/10.26599/TST.2020.9010011     OR     http://tst.tsinghuajournals.com/Y2021/V26/I4/475

Fig. 1 Cropping or warping to fit a fixed size.
Fig. 2 Zooming method used in HSPOG. If the longer edge of an image cannot be divided by 2n, it is padded with zeros along that edge until it is exactly divisible by 2n.
Fig. 3 Flow chart of HSPOG (f is the feature vector of HSPOG).
Fig. 4 Two types of bins (Zi is the i-th bin).
Fig. 5 Spatial pyramid scales and the feature map of an image.
Fig. 6 Curves of recognition rate and false alarm of experiments about (a) airplane, (b) bicycle, (c) bird, (d) boat, (e) bottle, and (f) bus.
Fig. 7 Curves of recognition rate and false alarm of experiments about (a) car, (b) chair, (c) cow, (d) dog, (e) horse, and (f) motorbike.
Fig. 8 Curves of recognition rate and false alarm of experiments about (a) person, (b) potted plant, (c) sheep, (d) sofa, (e) cat, and (f) dining table.
Fig. 9 Part of positive and negative training samples. (a), (b), and (c) are samples of positive images and (d), (e), and (f) are samples of negative images.
Fig. 10 Comparison of HSPOG and HOG after increasing feature quantity (each point represents 50 sample images; the total number of testing images is 1684).