IMPLEMENTATION OF THE DYNAMIC TIME WARPING (DTW) TECHNIQUE IN A SPEECH TO TEXT APPLICATION

Candra Dinata; Diyah Puspitaningrum; Ernawati Erna

doi:10.15408/jti.v10i1.6816

Authors

Candra Dinata, Universitas Bengkulu
Diyah Puspitaningrum, Universitas Bengkulu
Ernawati Erna, Universitas Bengkulu

Keywords:

Speech Processing, Speech to Text, MFCC, DTW

Abstract

Speech is one of the primary ways humans communicate and express themselves. Speech to text (STT) belongs to speech processing, a field of computer science, and is the conversion of spoken words or sentences into text. An STT system processes a speech signal, extracts features from it, and compares them with features extracted from other speech signals to identify similarities. This research designs and builds a speech-to-text application capable of identifying a speech signal, using the simulation software MATLAB R2016a. Speech processing generally involves two processes: feature extraction and feature matching. In this system, mel-frequency cepstral coefficients (MFCC) are used for feature extraction and dynamic time warping (DTW) is used for feature matching. The DTW method computes the distance, or difference, between the two data sequences being compared. The average accuracy obtained in the experiments was 95.85% for word tests and 94% for sentence tests.
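To illustrate the feature-matching step, the following is a minimal Python sketch of a textbook DTW distance and a nearest-template lookup. It is not the authors' MATLAB implementation. The function names `dtw_distance` and `recognize`, the Euclidean local distance, the unconstrained step pattern, and the toy one-dimensional feature vectors (standing in for MFCC frames) are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Return the accumulated DTW cost between two sequences of feature vectors.

    Assumption: Euclidean distance between frames and the basic step
    pattern (insertion, deletion, match) with no window constraint.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])  # frame-to-frame distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def recognize(query, templates):
    """Return the label of the stored template closest to the query under DTW."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# Hypothetical toy data: each frame is a 1-D "feature vector".
templates = {
    "rising": [(0.0,), (1.0,), (2.0,)],
    "falling": [(2.0,), (1.0,), (0.0,)],
}
# A slower utterance of the "rising" pattern (frames repeated in time).
query = [(0.0,), (0.0,), (1.0,), (2.0,), (2.0,)]
print(recognize(query, templates))
```

Because DTW allows frames to be repeated during alignment, the time-stretched query matches the "rising" template with zero accumulated cost even though the two sequences have different lengths. This is the property that makes DTW suitable for comparing utterances spoken at different speeds.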

How to Cite: Dinata, C., Puspitaningrum, D., & Erna, E. (2017). IMPLEMENTASI TEKNIK DYNAMIC TIME WARPING (DTW) PADA APLIKASI SPEECH TO TEXT. Jurnal Teknik Informatika, 10(1), 49-58. doi:10.15408/jti.v10i1.6816

Author Biography

Candra Dinata, Universitas Bengkulu

Teknik Informatika (Informatics Engineering)
