manipulate speech signals

Eszter Strupka
Manuals for:
 Audacity
 Praat
 ProsodyPro
Based on: Eszter Strupka, Gender and Culture Effects in
Human-Robot Interaction: an acoustic-prosodic analysis,
Master Thesis, Syddansk Universitet (2015)
2016
Audacity
Audacity is free, open-source, cross-platform software for recording and editing audio. It provides tools for recording audio, editing sound files (WAV, AIFF, FLAC, MP2, MP3 or Ogg Vorbis), cutting,
copying, splicing or mixing sounds, changing the speed or pitch of recordings, etc. [1]
In my thesis I used Audacity to record the robots’ speech, because it can record the computer’s internal sound, so the recordings contain no background noise. I also used Audacity to
slow down the robots’ speech to make it easier to understand.
Furthermore, to use Praat and ProsodyPro, you need your sound data in wave (.wav) format. If you collected
your data by recording video, as I did, you first have to separate the audio from the video. I used
Audacity for this as well.
Audacity is available for download here: http://www.audacityteam.org/
Note: The Festival Speech Synthesis System was used to create the robots’ speech. It does not offer a way to download the synthesized speech, so recording it with Audacity was necessary. Find the Festival
Speech Synthesis System here: http://www.cstr.ed.ac.uk/projects/festival/ [2]
Recording internal sound with Audacity:
1. To record the computer’s internal sound, you have to change a few settings in Audacity:
set the “Audio host” to “Windows WASAPI”, the “Recording Device” to
“Speaker/HP (Realtek High Definition Audio) (loopback)”, the “Recording Channels” to “2 (Stereo)
Recording Channels” and the “Playback Device” to “Speaker/HP (Realtek High Definition Audio)”.
(See Picture 1.)
Note: The settings needed to record internal sound with Audacity may differ depending on the
sound card and the version of Audacity used. Many manuals for other versions are available online.
2. Once the settings are changed, press the “record” button to start recording,
then play the sound to be recorded. You can stop the recording by pressing the “stop” button,
and continue recording by pressing the “record” button again. Stopping and resuming in this way lets
you record longer utterances; for example, when I recorded the robots’ speech for my experiment,
the speech synthesis system could only play short utterances at a time.
Note: If the recording contains background noise, echo, or any other sound
that was not part of the original, the software did not record the internal audio but picked up the sound from the speakers through
the computer’s microphone. In that case the settings have to be changed.
3. After recording, unnecessary parts, such as silence, can be cut out easily: use the cursor to select
the unwanted part of the sound wave, then press the “Ctrl” and “X” keys at the same time on
the keyboard, or choose the “Cut” option from the “Edit” menu.
4. When you are satisfied with the recording and editing, save the file by choosing
“Export audio” under “File” or by using the shortcut “Ctrl+Shift+E”. A window will pop up where you
can name your audio file and choose the format to save it in.
Note: For experiments always use “wave” (.wav) file format.
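Before moving on to Praat, it can be useful to confirm that an exported file really is a readable wave file. The following is a minimal sketch using Python's standard `wave` module; the function name `describe_wav` is only illustrative and not part of Audacity or Praat. It assumes the file was exported as uncompressed PCM (for example 16-bit), since the `wave` module may reject other encodings such as 32-bit float.

```python
import wave


def describe_wav(path):
    """Return basic properties of an uncompressed PCM .wav file."""
    with wave.open(str(path), "rb") as wav_file:
        frame_count = wav_file.getnframes()
        sample_rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),               # 1 = mono, 2 = stereo
            "sample_rate_hz": sample_rate,
            "bits_per_sample": wav_file.getsampwidth() * 8,    # bytes -> bits
            "duration_s": frame_count / sample_rate,
        }
```

Calling `describe_wav("robot_utterance.wav")` (a hypothetical file name) returns a dictionary with the channel count, sample rate, bit depth and duration, which makes it easy to spot a file that was accidentally saved in the wrong format.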
Converting video to audio using Audacity:
To use Audacity for converting video to audio, you first have to install the LAME and FFmpeg
libraries for Audacity. To do so, follow the steps described here:
http://mikebeach.org/2012/11/26/install-lame-and-ffmpeg-libraries-for-audacity-in-windows/
After installing the LAME and FFmpeg libraries, restart Audacity and proceed as follows:
1. Open a video file by going to “File”, then “Open”, and selecting the video to be converted
to audio in the pop-up window.
2. Next, go to “File” again and choose “Export audio…” or use the shortcut “Ctrl+Shift+E”. In the window that pops up, name the file, choose the format and save it.
3. Your data is now saved in audio format and ready to be analyzed using Praat and ProsodyPro.
Note: Always save data that will be run in Praat and ProsodyPro in wave (.wav) format.
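If there are many videos, the same conversion can also be scripted. The sketch below is an alternative to the Audacity steps above, not what was done in the thesis: it assumes the stand-alone `ffmpeg` command-line program is installed and on the system path, and the function names are illustrative only. It builds an `ffmpeg` call that drops the video stream and writes 16-bit PCM audio to a .wav file.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path, wav_path=None):
    """Build the ffmpeg argument list for extracting audio as a .wav file."""
    video = Path(video_path)
    # Default output: same name as the video, with a .wav extension.
    target = Path(wav_path) if wav_path else video.with_suffix(".wav")
    return [
        "ffmpeg",
        "-i", str(video),         # input video file
        "-vn",                    # ignore the video stream
        "-acodec", "pcm_s16le",   # uncompressed 16-bit PCM audio
        str(target),
    ]


def extract_audio(video_path, wav_path=None):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_command(video_path, wav_path), check=True)
```

For example, `extract_audio("session01.mp4")` (a hypothetical file name) would write `session01.wav` next to the video.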
Praat
Praat is a speech analysis and synthesis program. I used Praat to annotate the wave files for later
analysis with ProsodyPro (see the ProsodyPro section below). Praat is available for free at
http://www.fon.hum.uva.nl/praat/ and it is easy to download and run. [3]
Preparing sound data (wave) for analysis with Praat:
1. First, open the wave file in Praat by clicking on “Open” and “Read from file…”, and choosing the file to be analyzed.
2. Once the file is open in Praat, click on the “Annotate” button and then on the “to TextGrid” button. A window will pop up in which “all tier names” has to be set to the unit you want to annotate; in my case I named it “sentence”.
Note: Point tiers do not have to be defined. Remember to delete the example given.
3. Next, select both the sound file and the TextGrid file, and click on “view and edit”. A TextGrid editor window will pop up. Here it is possible to zoom into specific parts of the sound wave, to mark them, and to listen to the selected parts separately, as described in the next step.
4. To annotate a specific part (unit), listen to the sound and use your own judgment to find its starting point. Click on the starting point and press the “Enter” key on the keyboard; the program will then mark the starting point. Next, name the part to be annotated, find its end point, click on it and press “Enter” again. Repeat this until you have marked and named every relevant unit.
How to play the audio? You can play the whole audio by clicking on the grey bar with “Total duration” written on it, the visible part of the audio by clicking on the grey bar with “Visible part” written on it, and the marked segment by clicking on the grey bar that shows the length of the marked segment. Furthermore, the “in” and “out” buttons in the bottom left corner let you zoom in and out.
5. Save the TextGrid file by clicking on “File” and
then “Save TextGrid as text file”.
Note: If you have more sound files to analyze,
repeat this process on every file before using ProsodyPro.
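Before running ProsodyPro on many files, it can help to check that each saved TextGrid actually contains the labels you expect. The sketch below is one way to do this outside Praat. It assumes the TextGrid was saved in Praat's default long text format with a single interval tier, as in the steps above; the function name `labelled_intervals` is illustrative. It returns every interval that has a non-empty label.

```python
import re

# One interval entry in Praat's long TextGrid text format:
#   intervals [2]:
#       xmin = 0.5
#       xmax = 2
#       text = "s1"
INTERVAL_PATTERN = re.compile(
    r'intervals \[\d+\]:\s*'
    r'xmin = (?P<xmin>[-\d.eE+]+)\s*'
    r'xmax = (?P<xmax>[-\d.eE+]+)\s*'
    r'text = "(?P<text>(?:[^"]|"")*)"'
)


def labelled_intervals(textgrid_text):
    """Return (start_s, end_s, label) for every non-empty interval."""
    found = []
    for match in INTERVAL_PATTERN.finditer(textgrid_text):
        # Praat writes a literal double quote inside a label as "".
        label = match.group("text").replace('""', '"').strip()
        if label:
            found.append(
                (float(match.group("xmin")), float(match.group("xmax")), label)
            )
    return found
```

The function works on the file contents as a string, so the file has to be read first, for example with `open(path, encoding="utf-8").read()`. Note that Praat may write files containing non-ASCII labels in UTF-16, in which case the encoding argument has to be changed.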
ProsodyPro
ProsodyPro is a Praat script for large-scale systematic prosody analysis. It allows the researcher to
systematically process large amounts of speech data with high precision. It makes the researcher’s work
easier by automating tasks that do not require human judgment, such as locating and
opening sound files, taking measurements, and saving raw results in formats ready for further graphical and
statistical analysis. [4]
ProsodyPro is available for download here: http://www.homepages.ucl.ac.uk/~uclyyix/ProsodyPro/
Note: If any problems occur with ProsodyPro, for example if the dialogue window is too big for the
computer screen and some functions are not visible, email Yi Xu at [email protected], who can provide you with a
solution; in this example, a ProsodyPro Praat script with a shortened menu.
Preparing data for further analysis using ProsodyPro:
1. After downloading ProsodyPro, put the
“ProsodyPro.praat” file in the folder containing
the sound files and TextGrids to be analyzed,
and launch Praat.
2. Next, open the “ProsodyPro.praat” file in Praat by choosing “Open”, then “Read from file…”, and selecting the “ProsodyPro.praat” file. A Script window will then pop up, showing all the instructions for using the ProsodyPro script.
3. To run “ProsodyPro.praat”, simply choose “Run”
from the “Run” menu. A dialogue window will
open.
4. In the dialogue window, change “Task” to “2.
Process all sounds without pause” and uncheck
the “get BID measures” box, then click “OK”.
Note: It will take quite some time for ProsodyPro
and Praat to run on all the wave files and TextGrids and to take the measurements.
When ProsodyPro is done, the folder that originally
contained only the TextGrids, wave files and the
“ProsodyPro.praat” file will contain many additional files. To see the extracted measures, do the following for each .mean
file:
1. Open Microsoft Excel and go to “File”, then
“Open” and choose the desired “.mean” file. A
dialogue window will pop up.
2. After clicking “Continue” twice, in the third window go to “Advanced…” and change “Decimal
separator” to “.” (dot) and “Thousand separator”
to “,” (comma) and choose “Finish”.
Note: If you wish to analyze the data further in the SPSS Data Editor, the data extracted
from the different .mean files have to be merged
and organized according to your needs.
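The merging step can also be scripted rather than done by hand. The sketch below is only one possible approach and rests on assumptions that should be checked against your own output: that each .mean file is tab-separated text whose first row holds the column names, and that the files use the “.mean” extension as named in this manual. The function name `merge_mean_files` and the added `source_file` column are illustrative. It stacks the rows of all files into one comma-separated table that a statistics package can import.

```python
import csv
from pathlib import Path


def merge_mean_files(folder, output_csv, extension=".mean"):
    """Stack all files with the given extension into one CSV; return the row count.

    Assumes tab-separated input files with a header row (an assumption
    about the ProsodyPro output, not something guaranteed here).
    """
    rows = []
    fieldnames = ["source_file"]  # extra column: which file a row came from
    for path in sorted(Path(folder).glob("*" + extension)):
        with open(path, newline="", encoding="utf-8") as handle:
            for record in csv.DictReader(handle, delimiter="\t"):
                for column in record:
                    if column not in fieldnames:
                        fieldnames.append(column)
                rows.append({"source_file": path.stem, **record})
    with open(output_csv, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Because Python reads the dot as the decimal separator directly, no separator settings need to be changed for this route; the merged file can then be organized further as the analysis requires.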
What’s the next step?
I further analyzed the data using IBM SPSS Data Editor, which is available for download here: http://www01.ibm.com/software/analytics/spss/downloads.html [5]
To get started with SPSS, I personally recommend Pete Greasley’s Quantitative Data Analysis using SPSS: An
Introduction for Health and Social Science [6]. It gives a clear introduction to SPSS
and shows how to use it with easy-to-understand examples. The book is available for download at the following link; note that opening the link will automatically start downloading the PDF file:
https://faculty.psau.edu.sa/filedownload/doc-4-pdf-413d1c02fadc3d07904bbc992b2e9195-original.pdf
References
[1] Audacityteam.org. 'Audacity: Free Audio Editor And Recorder'. N.p., 2015. Web. 30 Oct. 2015.
[2] Cstr.ed.ac.uk. 'Festival'. N.p., 2015. Web. 30 Oct. 2015.
[3] Fon.hum.uva.nl. 'Praat: Doing Phonetics By Computer'. N.p., 2015. Web. 30 Oct. 2015.
[4] Homepages.ucl.ac.uk. 'Prosodypro'. N.p., 2015. Web. 30 Oct. 2015.
[5] Ibm.com. 'IBM SPSS Software - Downloads'. N.p., 2015. Web. 30 Oct. 2015.
[6] Greasley, Pete (2008), Quantitative Data Analysis using SPSS: An Introduction for Health and Social Science, New York: Open University Press