Doctoral thesis

화자 인식을 위한 배경 화자 음성의 대표 특징을 사용한 히스토그램 등화 기법 = Histogram Equalization Using Representative Features of Background Speakers’ Utterances for Speaker Recognition

김명재, 2015
Thesis details
Paper influence by topic for 'Histogram Equalization Using Representative Features of Background Speakers’ Utterances for Speaker Recognition'
Paper influence summary
Topics
  • i-vector
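  • PLDA
  • Gaussian mixture model
  • support vector machine
  • channel compensation
  • feature normalization
  • speaker identification
  • speaker recognition
  • speaker verification
  • histogram equalization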
Total papers on the same topics: 160
Total citations received: 0
Average paper influence by topic: 0.0%

References of 'Histogram Equalization Using Representative Features of Background Speakers’ Utterances for Speaker Recognition'

  • Wan, V., and Renals, S., “Evaluation of kernel methods for speaker verification and identification”, 2002.
  • Wan, V., and Campbell, W. M., “Support vector machines for speaker verification and identification”, Neural Networks for Signal Processing, vol. 2, pp. 775-784, 2000.
  • Viikki, O., and Laurila, K., “Cepstral domain segmental feature vector normalization for noise robust speech recognition”, Speech Communication, vol. 25.1, pp. 133-147, 1998.
  • The NIST year 2008 speaker recognition evaluation plan. [Online]. Available: http://www.itl.nist.gov/iad/mig/tests/spk/2008/index.html
  • The NIST year 2005 speaker recognition evaluation plan. [Online]. Available: http://www.itl.nist.gov/iad/mig/tests/spk/2005/index.html
  • The NIST year 2004 speaker recognition evaluation plan. [Online]. Available: http://www.itl.nist.gov/iad/mig/tests/spk/2004/index.html
  • Skosan, M., and Mashao, D., “Modified segmental histogram equalization for robust speaker verification”, Pattern Recognition Letters, vol. 27.5, pp. 479-486, 2006.
  • Segura, J. C., Benítez, C., De la Torre, A., Rubio, A. J., and Ramírez, J., “Cepstral domain segmental nonlinear feature transformations for robust speech recognition”, IEEE Signal Processing Letters, vol. 11.5, pp. 517-520, 2004.
  • Schölkopf, B., Smola, A., and Müller, K. R., “Kernel principal component analysis”, Artificial Neural Networks—ICANN'97, pp. 583-588, 1997.
  • SILK, Super wideband audio codec, [Online], Available: https://developer.skype.com/silk
  • Reynolds, D. A., Quatieri, T. F., and Dunn, R. B., “Speaker verification using adapted Gaussian mixture models”, Digital Signal Processing, vol. 10.1, pp. 19-41, 2000.
  • Reynolds, D. A., and Rose, R. C., “Robust Text-Independent Speaker Identification Using Gaussian Mixture Speaker Models”, IEEE Transactions on Speech and Audio Processing, vol. 3, pp. 72-83, 1995.
  • Prince, S. J., and Elder, J. H., “Probabilistic linear discriminant analysis for inferences about identity”, International Conference on Computer Vision, pp. 1-8, 2007.
  • Pelecanos, J., and Sridharan, S., “Feature warping for robust speaker verification”, Odyssey, pp. 213-218, 2001.
  • Opus, Opus interactive audio codec, [Online], Available: http://opuscodec.org
  • Kim, M. S., Yang, I. H., and Yu, H. J., “Robust speaker identification using greedy kernel PCA”, IEEE International Conference on Tools with Artificial Intelligence, vol. 2, pp. 143-146, 2008.
  • Kenny, P., “Bayesian Speaker Verification with Heavy-Tailed Priors”, Odyssey, pp. 14, 2010.
  • Kenny, P., Ouellet, P., Dehak, N., Gupta, V., and Dumouchel, P., “A study of interspeaker variability in speaker verification”, IEEE Transactions on Audio, Speech, and Language Processing, vol. 16.5, pp. 980-988, 2008.
  • Jiang, Y., Lee, K. A., Tang, Z., Ma, B., Larcher, A., and Li, H., “PLDA Modeling in I-Vector and Supervector Space for Speaker Verification”, Interspeech, 2012.
  • Huang, X., Acero, A., and Hon, H. W. (foreword by Reddy, R.), Spoken language processing: A guide to theory, algorithm, and system development, Upper Saddle River, NJ: Prentice Hall PTR, 2001.
  • Gonzalez, R. C., and Wintz, P., Digital image processing, Addison-Wesley Publishing Company, pp. 275-281, 1987.
  • Gauvain, J. L., and Lee, C. H., “Maximum a Posteriori Estimation for Multivariate Gaussian Mixture Observations of Markov Chains”, IEEE Transactions on Speech and Audio Processing, vol. 2, pp. 291-298, 1994.
  • Garcia-Romero, D., and Espy-Wilson, C. Y., “Analysis of i-vector Length Normalization in Speaker Recognition Systems”, Interspeech, pp. 249-252, 2011.
  • G.729, Coding of speech at 8 kbit/s using conjugate-structure algebraic-code-excited linear prediction (CS-ACELP), [Online], Available: http://www.itu.int/rec/T-REC-G.729-200701-S/en
  • Franc, V., “Optimization algorithms for kernel methods”, Prague: A PhD dissertation. Czech Technical University, 2005.
  • Franc, V., and Hlaváč, V., “Greedy kernel principal component analysis”, Cognitive Vision Systems, pp. 87-105, 2006.
  • Evermann, G., Gales, M., Hain, T., Kershaw, D., Liu, X., Moore, G., and Woodland, P., The HTK book, Cambridge: Entropic Cambridge Research Laboratory, vol. 2, 1997.
  • Duda, R. O., Hart, P. E., and Stork, D. G., Pattern classification, John Wiley & Sons, 2012.
  • Dehak, N., Kenny, P., Dehak, R., Dumouchel, P., and Ouellet, P., “Front-end factor analysis for speaker verification”, IEEE Transactions on Audio, Speech, and Language Processing, vol. 19.4, pp. 788-798, 2011.
  • De La Torre, A., Peinado, A. M., Segura, J. C., Pérez-Córdoba, J. L., Benítez, M. C., and Rubio, A. J., “Histogram equalization of speech representation for robust speech recognition”, IEEE Transactions on Speech and Audio Processing, vol. 13.3, pp. 355-366, 2005.
  • Cortes, C., and Vapnik, V., “Support-vector networks”, Machine learning, vol. 20.3, pp. 273-297, 1995.
  • Chang, C. C., and Lin, C. J., “LIBSVM: a library for support vector machines”, ACM Transactions on Intelligent Systems and Technology, vol. 2.3, pp. 27, 2011.
  • Cannon, R. L., Dave, J. V., and Bezdek, J. C., “Efficient implementation of the fuzzy c-means clustering algorithms”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 2, pp. 248-255, 1986.
  • Campbell, W. M., Sturim, D. E., and Reynolds, D. A., “Support Vector Machines using GMM Supervectors for Speaker Verification”, IEEE Signal Processing Letters, vol. 13, pp. 308-311, 2006.
  • Campbell, W. M., Campbell, J. P., Gleason, T. P., Reynolds, D. A., and Shen, W., “Speaker verification using support vector machines and high-level features”, IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, pp. 2085-2094, 2007.
  • Campbell Jr, J. P., “Testing with the YOHO CD-ROM voice verification corpus”, International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 341-344, 1995.
  • Bousquet, P. M., Matrouf, D., and Bonastre, J. F., “Intersession Compensation and Scoring Methods in the i-vectors Space for Speaker Recognition”, Interspeech, pp. 485-488, 2011.
  • Bousquet, P. M., Larcher, A., Matrouf, D., Bonastre, J. F., and Plchot, O., “Variance-spectra based normalization for i-vector standard and probabilistic linear discriminant analysis”, Odyssey, pp. 157-164, 2012.
  • Blanco, Y., Zazo, S., and Principe, J. C., “Alternative statistical gaussianity measure using the cumulative density function”, Proceedings of the Second International Workshop on Independent Component Analysis and Blind Signal Separation, pp. 537-542, 2000.
  • Atal, B. S., “Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification”, The Journal of the Acoustical Society of America, vol. 55.6, pp. 1304-1312, 1974.
  • Aksoy, S., and Haralick, R. M., “Feature normalization and likelihood-based similarity measures for image retrieval”, Pattern Recognition Letters, vol. 22.5, pp. 563-582, 2001.
  • AFE, “Transmission and Quality Aspects (STQ); Distributed speech recognition; Advanced front-end feature extraction algorithm; Compression algorithms”, ETSI ES 202.050 V1, 2002.