Ph.D. Dissertation

Multi-Channel Voice Activity Detection using Multi-Source

Hyeopwoo Lee, 2015
Topic-Based Influence of 'Multi-Channel Voice Activity Detection using Multi-Source'

Topics
  • speech recognition
  • speech-based interface
  • voice activity detection
Total papers on the same topics: 84
Total citations: 0
Average topic influence: 0.0%

References of 'Multi-Channel Voice Activity Detection using Multi-Source'

  • http://www.nist.gov/speech/tests/sre/index.html
  • V. P. Minotto, C. R. Jung, and B. Lee, Simultaneous-speaker voice activity detection and localization using mid-fusion of SVM and HMMs, IEEE Transactions on Multimedia, vol. 16, pp. 1032-1044, 2014.
  • V. P. Minotto, C. B. O. Lopes, J. Scharcanski, C. R. Jung, and B. Lee, Audiovisual voice activity detection based on microphone arrays and color information, IEEE Transactions on Multimedia, vol. 7, pp. 147-156, 2013.
  • T. Pirinen and A. Visa, Signal independent wideband activity detection features for microphone arrays, IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 1109-1112, 2006.
  • T. F. Cootes, G. J. Edwards, and C. J. Taylor, Active appearance models, in Proc. European Conference on Computer Vision, vol. 2, pp. 484-498, 1998.
  • S. G. Tanyer and H. Ozer, Voice activity detection in nonstationary noise, IEEE Transactions on Speech and Audio Processing, vol. 8, no. 4, pp. 478-482, 2000.
  • S. Gazor and W. Zhang, A soft voice activity detector based on a Laplacian-Gaussian model, IEEE Transactions on Speech and Audio Processing, vol. 11, no. 5, pp. 498-505, 2003.
  • S. Gannot, D. Burshtein, and E. Weinstein, Signal enhancement using beamforming and non-stationarity with applications to speech, IEEE Transactions on Signal Processing, vol. 49, no. 8, pp. 1614-1626, 2001.
  • R. O. Schmidt, Multiple emitter location and signal parameter estimation, IEEE Transactions on Antennas and Propagation, vol. AP-34, no. 3, pp. 276-280, 1986.
  • R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed., John Wiley and Sons, 2001.
  • P. Rose, Technical forensic speaker recognition: evaluation, types and testing of evidence, Computer Speech and Language, pp. 159-191, 2006.
  • O. Siohan, T. A. Myrvoll, and C.-H. Lee, Structural maximum a posteriori linear regression for fast HMM adaptation, Computer Speech and Language, vol. 16, pp. 5-24, 2002.
  • M. W. Hoffman, Z. Li, and D. Khataniar, GSC-based spatial voice activity detection for enhanced speech coding in the presence of competing speech, IEEE Transactions on Speech and Audio Processing, vol. 9, no. 2, pp. 175-178, 2001.
  • M. Ferras, L. Cheung Chi, C. Barras, and J. Gauvain, Constrained MLLR for speaker recognition, IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. IV-53-IV-56, 2007.
  • L. R. Rabiner and B. H. Juang, Fundamentals of Speech Recognition, Prentice-Hall, 1993.
  • K. Yu and M. J. F. Gales, Discriminative cluster adaptive training, IEEE Transactions on Speech and Audio Processing, vol. 14, no. 5, pp. 1694-1703, 2006.
  • K. Kiyohara, Y. Kaneda, S. Takahashi, H. Nomura, and J. Kojima, A microphone array system for speech recognition, IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 215-218, 1997.
  • J. M. Valin, S. Yamamoto, J. Rouat, F. Michaud, K. Nakadai, and H. Okuno, Robust recognition of simultaneous speech by a mobile robot, IEEE Transactions on Robotics, vol. 23, no. 4, pp. 742-752, 2007.
  • J. Gonzalez-Rodriguez, A. Drygajlo, D. Ramos-Castro, M. Garcia-Gomar, and J. Ortega-Garcia, Robust estimation, interpretation and assessment of likelihood ratios in forensic speaker recognition, Computer Speech and Language, pp. 331-355, 2006.
  • J. Gauvain and C. Lee, Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains, IEEE Transactions on Speech and Audio Processing, vol. 2, no. 2, pp. 291-298, 1994.
  • J. DiBiase, H. Silverman, and M. Brandstein, Microphone Arrays: Signal Processing Techniques and Applications, pp. 157-180, Springer, 2001.
  • J. Chen and W. Ser, Speech detection using microphone array, Electronics Letters, vol. 36, pp. 181-182, 2000.
  • I. Potamitis, Estimation of speech presence probability in the field of microphone array, IEEE Signal Processing Letters, vol. 11, no. 12, pp. 956-959, 2004.
  • I. Moon, K. Kim, J. Ryu, and M. Mun, Face direction-based human computer interface using image observation and EMG signal for the disabled, IEEE International Conference on Robotics and Automation, pp. 1515-1520, 2003.
  • I. Almajai and B. Milner, Using audio-visual features for robust voice activity detection in clean and noisy speech, EUSIPCO, pp. 123-126, 2008.
  • H. Lee and D. Yook, Unsupervised adaptation without estimated transcriptions, IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 7918-7921, 2013.
  • H. Lee and D. Yook, Space time voice activity detection, IEEE Transactions on Consumer Electronics, vol. 55, no. 3, pp. 1471-1476, 2009.
  • H. Lee and D. Yook, Feature adaptation for robust mobile speech recognition, IEEE Transactions on Consumer Electronics, pp. 1393-1398, 2012.
  • G. Kim and N. I. Cho, Voice activity detection using phase vector in microphone array, Electronics Letters, vol. 43, no. 14, 2007.
  • Digital Cellular Telecommunications System (Phase 2+); Voice Activity Detector (VAD) for Adaptive Multi Rate (AMR) Speech Traffic Channels, GSM 06.94 V7.1.1 (ETSI EN 301 708), 1999.
  • D. Reynolds, Speaker identification and verification using Gaussian mixture models, Speech Communication, vol. 17, pp. 91-108, 1995.
  • D. Povey, Discriminative training for large vocabulary speech recognition, Ph.D. thesis, Cambridge University, 2003.
  • C. Knapp and G. Carter, The generalized correlation method for estimation of time delay, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-24, no. 4, pp. 320-327, 1976.
  • C. J. Leggetter, Improved acoustic modeling for HMMs using linear transformations, Ph.D. dissertation, Cambridge University, 1995.
  • C. J. Leggetter and P. C. Woodland, Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models, Computer Speech and Language, vol. 9, pp. 171-185, 1995.
  • B. Wu and K. Wang, Robust endpoint detection algorithm based on the adaptive band-partitioning spectral entropy in adverse environments, IEEE Transactions on Speech and Audio Processing, vol. 13, no. 5, pp. 762-775, 2005.
  • B. Mak, J. T. Kwok, and S. Ho, Kernel eigenvoice speaker adaptation, IEEE Transactions on Speech and Audio Processing, vol. 13, no. 5, pp. 984-992, 2005.
  • A. Stolcke, L. Ferrer, S. Kajarekar, E. Shriberg, and A. Venkataraman, MLLR transforms as features in speaker recognition, in Proc. Interspeech, pp. 2425-2428, 2005.
  • A. Martin and L. Mauuary, Robust speech/non-speech detection based on LDA-derived parameter and voicing parameter for speech recognition in noisy environments, Speech Communication, vol. 48, pp. 191-206, 2006.
  • A. Davis, S. Nordholm, and R. Togneri, Statistical voice activity detection using low-variance spectrum estimation and an adaptive threshold, IEEE Transactions on Speech and Audio Processing, vol. 14, no. 2, pp. 412-424, 2006.
  • A Silence Compression Scheme for G.729 Optimized for Terminals Conforming to ITU-T V.70, ITU-T Rec. G.729 Annex B, 1996.