Doctoral thesis

Robust voice activity detection using formant frequencies

유인철, 2015
Thesis details
Paper influence by topic for 'Robust voice activity detection using formant frequencies'
Paper influence summary
Topic
  • voice activity detection
Total papers on the same topic: 12
Total citations: 0
Average paper influence by topic: 0.0%

References for 'Robust voice activity detection using formant frequencies'

  • Y. Ma and A. Nishihara, "Efficient voice activity detection algorithm using long-term spectral flatness measure," EURASIP Journal on Audio, Speech, and Music Processing, 2013.
  • Y. Ephraim and D. Malah, "Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 32, no. 6, pp. 1109-1121, 1984.
  • W. Q. Syed and H. C. Wu, "Speech waveform compression using robust adaptive voice activity detection for nonstationary noise in multimedia communications," in Proceedings of Global Telecommunications Conference, 2007.
  • S. S. Stevens, J. Volkman and E. B. Newman, "A scale for the measurement of the psychological magnitude pitch," The Journal of the Acoustical Society of America, vol. 8, no. 3, pp. 185-190, 1937.
  • S. Pigeon and P. Verlinde, "Fusing fast algorithms to achieve efficient speech detection in FM broadcasts," in Proceedings of Interspeech, 2009.
  • S. Mousazadeh and I. Cohen, "Voice activity detection in presence of transient noise using spectral clustering," IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 6, pp. 1261-1271, 2013.
  • S. Gazor and W. Zhang, "A soft voice activity detector based on a Laplacian-Gaussian model," IEEE Transactions on Speech and Audio Processing, vol. 11, no. 5, pp. 498-505, 2003.
  • R. O. Duda, P. E. Hart and D. G. Stork, Pattern classification, John Wiley & Sons, 2012.
  • P. Ladefoged, A Course in Phonetics, Thomson Wadsworth, 2006.
  • M. J. Hunt, "Spectral signal processing for ASR," Proceedings of Automatic Speech Recognition and Understanding (ASRU) Workshop, vol. 1, pp. 17-25, 1999.
  • M. Marzinzik and B. Kollmeier, "Speech pause detection for noise spectrum estimation by tracking power envelope dynamics," IEEE Transactions on Speech and Audio Processing, vol. 10, no. 2, pp. 109-118, 2002.
  • M. Fujimoto, K. Ishizuka and H. Kato, "Noise robust voice activity detection based on statistical model and parallel non-linear Kalman filtering," in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, 2007.
  • L. R. Rabiner and M. R. Sambur, "An algorithm for determining the endpoints of isolated utterances," Bell Syst. Tech. J., vol. 54, no. 2, pp. 297-315, 1975.
  • J. Kotus, K. Lopatka and A. Czyzewski, "Detection and localization of selected acoustic events in acoustic field for smart surveillance applications," Multimedia Tools and Applications, vol. 68, no. 1, pp. 5-21, 2014.
  • J. Sohn, N. S. Kim and W. Sung, "A statistical model-based voice activity detection," IEEE Signal Processing Letters, vol. 6, no. 1, pp. 1-3, 1999.
  • J. Sohn and W. Sung, "A voice activity detector employing soft decision based noise spectrum adaptation," in Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, 1998.
  • J. Ramirez, J. C. Segura, C. Benitez, A. De La Torre and A. Rubio, "Efficient voice activity detection algorithms using long-term speech information," Speech Communication, vol. 42, no. 3, pp. 271-287, 2004.
  • J. L. Shen, J. W. Hung and L. S. Lee, "Robust entropy-based endpoint detection for speech recognition in noisy environments," in Proceedings of International Conference on Spoken Language Processing, 1998.
  • J. H. Chang, N. S. Kim and S. K. Mitra, "Voice activity detection based on multiple statistical models," IEEE Transactions on Signal Processing, vol. 54, no. 6, pp. 1965-1976, 2006.
  • J. H. Chang and N. S. Kim, "Voice activity detection based on complex Laplacian model," Electronics Letters, vol. 39, no. 7, pp. 632-634, 2003.
  • J. Garofolo, L. Lamel, W. Fisher, J. Fiscus, D. Pallett, N. Dahlgren and V. Zue, "TIMIT: acoustic-phonetic continuous speech corpus," in Proceedings of Linguistic Data Consortium, 1993.
  • ITU, "A silence compression scheme for G.729 optimized for terminals conforming to ITU-T V.70," ITU-T Recommendation G.729 Annex B, 1996.
  • ITU, "G.729: Coding of speech at 8 kbit/s using conjugate-structure algebraic-code-excited linear prediction (CS-ACELP)," 2012. [Online]. Available: http://www.itu.int/rec/TREC-G.729/e.
  • I. V. McLoughlin, "Super-audible voice activity detection," IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 22, no. 9, pp. 1424-1433, 2014.
  • I. C. Yoo and D. Yook, "Robust voice activity detection using the spectral peaks of vowel sounds," ETRI Journal, vol. 31, no. 4, pp. 451-453, 2009.
  • H. Lim, I. Yoo, Y. Cho and D. Yook, "Speaker localization in noisy environments using steered response voice power," IEEE Transactions on Consumer Electronics, vol. 61, no. 1, pp. 112-118, 2015.
  • H. Sakoe and S. Chiba, "Dynamic programming algorithm optimization for spoken word recognition," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 26, no. 1, pp. 43-49, 1978.
  • H. Hirsch, "FaNT: Filtering and Noise Adding Tool," 2005. [Online]. Available: http://dnt.kr.hsnr.de/download.html.
  • H. G. Hirsch and D. Pierce, "The AURORA experimental framework for the performance evaluation of speech recognition systems under noise conditions," in Proceedings of ISCA ITRW ASR2000 Automatic Speech Recognition: Challenges for the Next Millenium, 2000.
  • F. de Wet, K. Weber, L. Boves, B. Cranen, S. Bengio and H. Bourlard, "Evaluation of formant-like features on an automatic vowel classification task," The Journal of the Acoustical Society of America, vol. 116, no. 3, pp. 1781-1792, 2004.
  • F. Faubel, M. Georges, K. Kumatani, A. Bruhn and D. Klakow, "Improving hands-free speech recognition in a car through audio-visual voice activity detection," in Proceedings of Joint Workshop on Hands-Free Speech Communication and Microphone Arrays, 2011.
  • ETSI, "Voice activity detector (VAD) for Adaptive Multi-Rate (AMR) speech traffic channels," ETSI EN 201 108 Recommendation, 2002.
  • ETSI, "Digital cellular telecommunications system (Phase 2+); Adaptive Multi Rate (AMR) speech; ANSI-C code for the AMR speech codec (3GPP TS 06.73 version 7.6.0 Release 1998)," 2001. [Online]. Available: http://webapp.etsi.org/workprogram/Report_WorkItem.asp?WKI_ID=15217.
  • E. Zwicker, "Subdivision of the audible frequency range into critical bands (Frequenzgruppen)," The Journal of the Acoustical Society of America, vol. 33, no. 2, p. 248, 1961.
  • E. Nemer, R. Goubran and S. Mahmoud, "Robust voice activity detection using higher-order statistics in the LPC residual domain," IEEE Transactions on Speech and Audio Processing, vol. 9, no. 3, pp. 217-231, 2001.
  • D. Vlaj, Z. Kacic and M. Kos, "Voice activity detection algorithm using nonlinear spectral weights, hangover and hangbefore criteria," Computers and Electrical Engineering, vol. 38, pp. 1820-1836, 2012.
  • D. A. Reynolds and R. C. Rose, "Robust text-independent speaker identification using Gaussian mixture speaker models," IEEE Transactions on Speech and Audio Processing, vol. 3, no. 1, pp. 72-83, 1995.
  • C. E. Shannon, "A mathematical theory of communication," Bell System Technical Journal, vol. 27, pp. 379-423, 1948.
  • B. F. Wu and K. C. Wang, "Robust endpoint detection algorithm based on the adaptive band-partitioning spectral entropy in adverse environments," IEEE Transactions on Speech and Audio Processing, vol. 13, no. 5, pp. 762-775, 2005.
  • A. M. Kondoz and B. G. Evans, "A high quality voice coder with integrated echo canceller and voice activity detector for VSAT systems," in Proceedings of 3rd European Conference on Satellite Communications, 1993.
  • A. Varga and H. J. Steeneken, "Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems," Speech Communication, vol. 12, no. 3, pp. 247-251, 1993.
  • A. Davis, S. Nordholm and R. Togneri, "Statistical voice activity detection using low-variance spectrum estimation and an adaptive threshold," IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, no. 2, pp. 412-424, 2006.