2017 International Conference on Sampling Theory and Applications (SampTA) Comparison of Uniform and Random Sampling for Speech and Music Signals arXiv:1705.01457v2 [cs.SD] 15 May 2017 Nematollah Zarmehi, Sina Shahsavari, and Farokh Marvasti Advanced Communication Research Institute Department of Electrical Engineering Sharif University of Technology, Tehran, Iran Email: http://zarmehi.ir/contact.html Abstract—In this paper, we will provide a comparison between uniform and random sampling for speech and music signals. There are various sampling and recovery methods for audio signals. Here, we only investigate uniform and random schemes for sampling and basic low-pass filtering and iterative method with adaptive thresholding for recovery. The simulation results indicate that uniform sampling with cubic spline interpolation outperforms other sampling and recovery methods. TABLE I S AMPLING AND RECOVERY SCHEMES . I. I NTRODUCTION Various sampling and recovery methods proposed in the field of signal processing. The Nyquist-Shannon theorem proposes condition to recover a band-limited signal from its samples [1]–[3]. The uniform sampling using an antialiasing filter and low-pass (LP) filtering were used for some decades. After that, other sampling methods such as nonuniform sampling [4]–[6], periodic non-uniform sampling [7], [8], and random sampling [9], [10] were proposed. In this paper, we provide a comparison between uniform and random sampling for speech and music signals. We use basic LP filtering and spline interpolation for uniform sampling and Iterative Method with Adaptive Thresholding (IMAT) for random sampling. Short Name Sampling Description U-AF-FFT-Sp AF & Uniform FFT as AF & Spline interpolation U-AF-FFT AF & Uniform FFT as AF & LP-filtering U-AF-FIR-Sp AF & Uniform FIR as AF & Spline interpolation U-AF-FIR AF & Uniform FIR as AF & LP-filtering R-IMATI Random IMATI R-Sp Random Spline interpolation TABLE II O BJECTIVE PERFORMANCE METRIC CRITERI II. S AMPLING AND R ECOVERY M ETHODS This Section introduces the sampling and recovery methods that we are going to compare. We compare uniform and random sampling schemes for speech and music signals along with different recovery methods such as basic LP filtering, spline interpolation, and IMAT [11], [12]. IMATI is a version of IMAT algorithm that uses interpolation operator in each iteration [13]. Table I shows the sampling and recovery schemes used in this paper. The abbreviations AF stands for Anti-aliasing Filter. We have used some objective performance metric criteria for comparison of above methods. They are listed in Table II. The recovered “.WAV” files are also saved on memory disk for subjective evaluations. Name Quantity SNR dB More description kxk SN R(x, x̂) = 20 log kx−x̂k PESQ Dimensionless A score between 1.0 (worst) up to 4.5 (best) CPU Time second Using tic and toc commands in MATLAB A. Speech Signal All sampling and recovery methods are simulated on our speech dataset and the results are presented in Fig. 1 in terms of SNR. Fig. 1 shows the SNR of all methods vs. sampling rate. In uniform sampling scheme, we used periodic uniform sampling for sampling rates greater than 0.5. According to Fig. 1, uniform sampling with spline interpolation outperforms the other methods. Another observation is that spline interpolation works well with uniform samples but its performance degraded in case of random sampling. The Perceptual Evaluation of Speech Quality (PESQ) metric is employed to assess the quality of recovered speech signals. PESQ is approved as ITU-T Rec. P.862 [14]. The voice quality is rated by a value ranging from 1 (bad) to 5 (excellent). The results are shown in Fig 2. III. S IMULATION R ESULTS In this section, we present simulation results. We have used 44.1–48kHz speech and music “.WAV” signals. Frame size is 1024 and the simulations are done in MATLAB R2015a on Intel(R) Core(TM) i7-5960X @ 3GHz with 23GB-RAM. 1 2017 International Conference on Sampling Theory and Applications (SampTA) 50 60 50 40 U-NAF-Sp U-NAF U-AF-FFT-Sp U-AF-FFT R-IMATI R-Sp U-AF-FIR-Sp U-AF-FIR 30 20 10 0 0.1 SNR (dB) SNR (dB) 40 0.2 0.3 0.4 0.5 0.6 0.7 0.8 30 U-NAF-Sp U-NAF U-AF-FFT-Sp U-AF-FFT R-IMATI R-Sp U-AF-FIR-Sp U-AF-FIR 20 10 0 0.1 0.9 0.2 0.3 Sampling Rate 0.4 0.5 0.6 0.7 0.8 0.9 Sampling Rate Fig. 1. SNR vs. sampling rate. Fig. 3. SNR of all methods. 3.5 4.5 4 3 3.5 3 U-NAF-Sp U-NAF U-AF-FFT-Sp U-AF-FFT U-AF-FIR-Sp U-AF-FIR R-IMATI R-Sp 2 1.5 1 0.5 0.1 PESQ PESQ 2.5 0.2 0.3 0.4 0.5 0.6 0.7 0.8 U-NAF-Sp U-NAF U-AF-FFT-Sp U-AF-FFT U-AF-FIR-Sp U-AF-FIR R-IMATI R-Sp 2.5 2 1.5 1 0.5 0.1 0.9 Sampling Rate 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 Sampling Rate Fig. 2. PESQ vs. sampling rate. Fig. 4. PESQ of all methods. B. Music Signal Sampling rate = 0.5 35 We have also compared uniform and random sampling with music signals. The SNR of all methods is compared in Fig. 4. In uniform sampling scheme, the high frequency components will be filtered; Therefore, we expect spline does not work for music signal as well as for speech signal. Although the PESQ is not coincident with music, we measured PESQ for the recovered music signals. Fig. 4 presents the PESQ of all methods. It can be seen that uniform sampling with FIR filter as anti-aliasing filter has the best value of PESQ between these sampling and recovery methods. We also test different FIR and IIR filters as anti-aliasing filter before uniform sampling. The original signal is recovered by cubic spline interpolation. The results are shown in Figs. 5-8. It can be seen that using FIR filter as anti-aliasing filter leads to better performance in terms of SNR and PESQ. SNR (dB) 30 25 20 Maximally flat Window Least square Generalized Eq. ripple 15 10 50 60 70 80 90 100 110 120 Filter Order Fig. 5. Different FIR Filters. C. Complexity Comparison seen that the simulation time does not change as sampling rate changes. This is true almost for all methods. The normalized CPU time is also charted in Fig. 10. Random sampling with IMATI for reconstruction takes more We provide a complexity comparison between the methods of Table I by measuring the run time of simulations. Fig. 9 shows the run time for different sampling rate. It can be 2 2017 International Conference on Sampling Theory and Applications (SampTA) 0.14 Sampling rate = 0.5 X: 0.2 Y: 0.1185 2.85 Time (sec) Frame-Size = 1024 0.12 PESQ 2.8 2.75 2.7 2.65 2.6 50 Maximally flat Window Least square Generalized Eq. ripple 60 0.1 IMATI IMAT-Spline IMAT Spline Uniform 0.08 0.06 0.04 X: 0.2 Y: 0.01287 0.02 70 80 90 100 110 0 0.1 120 0.15 0.2X: 0.2 0.25 Y: 3.446e-05 Filter Order 0.3 0.35 0.4 0.45 0.5 Sampling Rate Fig. 9. CPU time of all methods. Fig. 6. Different FIR Filters. 1.5 13 Normalized CPU Time 12 11 SNR (dB) X: 0.2 Y: 0.04263 10 9 1 0.5 0.38 0.11 Butterworth Elliptic Chebyshev 8 1 0 IMATI IMAT-Spline IMAT 0.006 0.0003 Spline Uniform-LP 7 5 10 15 20 25 30 Fig. 10. Normalized CPU time of all methods. Filter Order Fig. 7. Different IIR Filters. well matches to the samples. Thus, in cases that the signal is band-limited, the cubic spline leads to smaller error. As the name suggests, in the cubic spline, the interpolated value at each missing sample is a cubic interpolation of the values at neighboring samples. Given a set of n + 1 data points (xi , yi ) where x0 < x1 < . . . < xn , the spline interpolator S(x) is a polynomial of degree 3 on each subinterval [xi−1 , xi ] where i = 1, . . . , n, i.e., C1 (x) x0 ≤ x ≤ x1 Ci (x) xi−1 ≤ x ≤ xi S(x) = (1) Cn (x) xn−1 ≤ x ≤ xn , 2.5 Butterworth Elliptic Chebyshev 2.4 PESQ 2.3 2.2 2.1 2 1.9 5 10 15 20 25 30 Filter Order where Ci (x) = ai + bi x + ci x2 + di x3 (di 6= 0) is a cubic function. An example of cubic spline interpolation is shown in Fig. 11. We have generated an artificial 64-sparse signal with length 1024 and sampled its inverse transformed version uniformly. The original, sampled, and reconstructed signal is shown in Fig. 12. It can be seen that the original signal is not smooth at all. In other words, it is not a nearly band-limited or LP signal. Consider two samples highlighted in Fig. 12. We expect the original signal have a value near zero that is between the value of these two samples. But it is about -30. Fig. 8. Different IIR Filters. time than the uniform sampling with basic LP filtering or spline interpolation. IV. C UBIC S PLINE I NTERPOLATION We observed that the spline interpolation does not work well in random sampling case. Among all spline interpolations, the cubic spline gives smoother interpolating polynomial that 3 2017 International Conference on Sampling Theory and Applications (SampTA) [4] A. J. Jerri, “The shannon sampling theorem: Its various extensions and applications: A tutorial review,” Proceedings of the IEEE, vol. 65, no. 11, pp. 1565–1596, November 1977. [5] M. B. Mashhadi, N. Salarieh, E. S. Farahani, and F. A. Marvasti, “Level crossing speech sampling and its sparsity promoting reconstruction using an iterative method with adaptive thresholding,” IET Signal Processing, 2017. [6] F. Marvasti, “Spectrum of nonuniform samples,” Electronics Letters, vol. 20, no. 21, pp. 896–897, October 1984. [7] J. L. Yen, “On nonuniform sampling of bandwidth-limited signals,” IRE Transaction on Circuit Theory, vol. CT-3, pp. 251–257, December 1956. [8] A. Papoulis, Signal Analysis. New York: McGraw-Hill, 1997. [9] E. Masry, “Random sampling and reconstruction of spectra,” Inf. Contr., vol. 19, pp. 275–288, 1971. [10] F. Marvasti, “Spectral analysis of random sampling and error free recovery by an iterative method,” IECE Transaction of Inst. Electron. Commun. Engs. Japan, vol. E 69, no. 2, 1986. [11] F. Marvasti, A. Amini, F. Haddadi, M. Soltanolkotabi, B. Khalaj, A. Aldroubi, S. Sanei, and J. Chambers, “A unified approach to sparse signal processing,” EURASIP Journal on Advances in Signal Processing, 2012. [12] IMAT algorithm, “Iterative method with adaptive thresholding algorithm for sparse signal reconstruction,” 2017, [Online]. Available: http://ee. sharif.ir/$\sim$imat. [Accessed: 07-January-2017]. [13] M. Azghani and F.Marvasti, “Iterative methods for random sampling and compressed sensing recovery,” in 10th International Conference on Sampling Theory and Applications (SAMPTA 2013), July 2013. [14] A. W. Rix, J. G. Beerends, M. P. Hollier, and A. P. Hekstra, “Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs,” in 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221), vol. 2, 2001, pp. 749–752. Original Signal 0.04 Random Sampled Signal Reconstructed Signal by Cubic Spline 0.03 0.02 0.01 0 -0.01 -0.02 237 238 239 240 241 242 243 Fig. 11. An example of cubic spline interpolation. Fig. 12. An example of cubic spline interpolation for pure sparse signal. V. C ONCLUSION In this paper, we compared uniform and random sampling schemes for speech and music signals. We also used different reconstruction techniques for both sampling schemes. In case the speech signal is divided into frames, both objective performance metric criteria and subjective evaluations proposed that the uniform sampling with spline interpolation outperforms other methods. This is true due to the fact that speech and music signals are LP signals. However, this is not true if the signal has high frequency components or it is sparse. In case, the signal is not purely low-pass or sparse, one can use subband coding in which the low and high frequency components are sampled uniformly and randomly, respectively. R EFERENCES [1] C. E. Shannon, “Communication in the presence of noise,” in Proc. IRE, vol. 37, January 1949, pp. 10–21. [2] F. Marvasti, Nonuniform Sampling: Theory and Practice. Springer, formerly Kluwer Academic/Plenum Publishers, 2001. [3] H. J. Landau, “Necessary density conditions for sampling and interpolation of certain entire functions,” Acta Mathematica, vol. 117, no. 1, pp. 37–52, 1967. 4
© Copyright 2026 Paperzz