Resource Introduction
Machine Learning for Signal Processing, original English-language textbook. [Screenshots] [Table of Contents]

Contents

Preface
List of Algorithms
List of Figures

1 Mathematical foundations
  1.1 Abstract algebras
      Groups
      Rings
  1.2 Metrics
  1.3 Vector spaces
      Linear operators
      Matrix algebra
      Square and invertible matrices
      Eigenvalues and eigenvectors
      Special matrices
  1.4 Probability and stochastic processes
      Sample spaces, events, measures and distributions
      Joint random variables: independence, conditionals, and marginals
      Bayes' rule
      Expectation, generating functions and characteristic functions
      Empirical distribution function and sample expectations
      Transforming random variables
      Multivariate Gaussian and other limiting distributions
      Stochastic processes
      Markov chains
  1.5 Data compression and information theory
      The importance of the information map
      Mutual information and Kullback-Leibler (K-L) divergence
  1.6 Graphs
      Special graphs
  1.7 Convexity
  1.8 Computational complexity
      Complexity order classes and big-O notation
      Tractable versus intractable problems: NP-completeness

2 Optimization
  2.1 Preliminaries
      Continuous differentiable problems and critical points
      Continuous optimization under equality constraints: Lagrange multipliers
      Inequality constraints: duality and the Karush-Kuhn-Tucker conditions
      Convergence and convergence rates for iterative methods
      Non-differentiable continuous problems
      Discrete (combinatorial) optimization problems
  2.2 Analytical methods for continuous convex problems
      L2-norm objective functions
      Mixed L2-L1 norm objective functions
  2.3 Numerical methods for continuous convex problems
      Iteratively reweighted least squares (IRLS)
      Gradient descent
      Adapting the step sizes: line search
      Newton's method
      Other gradient descent methods
  2.4 Non-differentiable continuous convex problems
      Linear programming
      Quadratic programming
      Subgradient methods
      Primal-dual interior-point methods
      Path-following methods
  2.5 Continuous non-convex problems
  2.6 Heuristics for discrete (combinatorial) optimization
      Greedy search
      (Simple) tabu search
      Simulated annealing
      Random restarting

3 Random sampling
  3.1 Generating (uniform) random numbers
  3.2 Sampling from continuous distributions
      Quantile function (inverse CDF) and inverse transform sampling
      Random variable transformation methods
      Rejection sampling
      Adaptive rejection sampling (ARS) for log-concave densities
      Special methods for particular distributions
  3.3 Sampling from discrete distributions
      Inverse transform sampling by sequential search
      Rejection sampling for discrete variables
      Binary search inversion for (large) finite sample spaces
  3.4 Sampling from general multivariate distributions
      Ancestral sampling
      Gibbs sampling
      Metropolis-Hastings
      Other MCMC methods

4 Statistical modelling and inference
  4.1 Statistical models
      Parametric versus nonparametric models
      Bayesian and non-Bayesian models
  4.2 Optimal probability inferences
      Maximum likelihood and minimum K-L divergence
      Loss functions and empirical risk estimation
      Maximum a-posteriori and regularization
      Regularization, model complexity and data compression
      Cross-validation and regularization
      The bootstrap
  4.3 Bayesian inference
  4.4 Distributions associated with metrics and norms
      Least squares
      Least Lq-norms
      Covariance, weighted norms and Mahalanobis distance
  4.5 The exponential family (EF)
      Maximum entropy distributions
      Sufficient statistics and canonical EFs
      Conjugate priors
      Prior and posterior predictive EFs
      Conjugate EF prior mixtures
  4.6 Distributions defined through quantiles
  4.7 Densities associated with piecewise linear loss functions
  4.8 Nonparametric density estimation
  4.9 Inference by sampling
      MCMC inference
      Assessing convergence in MCMC methods

5 Probabilistic graphical models
  5.1 Statistical modelling with PGMs
  5.2 Exploring conditional independence in PGMs
      Hidden versus observed variables
      Directed connection and separation
      The Markov blanket of a node
  5.3 Inference on PGMs
      Exact inference
      Approximate inference

6 Statistical machine learning
  6.1 Feature and kernel functions
  6.2 Mixture modelling
      Gibbs sampling for the mixture model
      E-M for mixture models
  6.3 Classification
      Quadratic and linear discriminant analysis (QDA and LDA)
      Logistic regression
      Support vector machines (SVM)
      Classification loss functions and misclassification count
      Which classifier to choose?
  6.4 Regression
      Linear regression
      Bayesian and regularized linear regression
      Linear-in-parameters regression
      Generalized linear models (GLMs)
      Nonparametric, nonlinear regression
      Variable selection
  6.5 Clustering
      K-means and variants
      Soft K-means, mean shift and variants
      Semi-supervised clustering and classification
      Choosing the number of clusters
      Other clustering methods
  6.6 Dimensionality reduction
      Principal components analysis (PCA)
      Probabilistic PCA (PPCA)
      Nonlinear dimensionality reduction

7 Linear-Gaussian systems and signal processing
  7.1 Preliminaries
      Delta signals and related functions
      Complex numbers, the unit root and complex exponentials
      Marginals and conditionals of linear-Gaussian models
  7.2 Linear, time-invariant (LTI) systems
      Convolution and impulse response
      The discrete-time Fourier transform (DTFT)
      Finite-length, periodic signals: the discrete Fourier transform (DFT)
      Continuous-time LTI systems
      Heisenberg uncertainty
      Gibbs phenomena
      Transfer function analysis of discrete-time LTI systems
      Fast Fourier transforms (FFT)
  7.3 LTI signal processing
      Rational filter design: FIR, IIR filtering
      Digital filter recipes
      Fourier filtering of very long signals
      Kernel regression as discrete convolution
  7.4 Exploiting statistical stability for linear-Gaussian DSP
      Discrete-time Gaussian processes (GPs) and DSP
      Nonparametric power spectral density (PSD) estimation
      Parametric PSD estimation
      Subspace analysis: using PCA in DSP
  7.5 The Kalman filter (KF)
      Junction tree algorithm (JT) for KF computations
      Forward filtering
      Backward smoothing
      Incomplete data likelihood
      Viterbi decoding
      Baum-Welch parameter estimation
      Kalman filtering as signal subspace analysis
  7.6 Time-varying linear systems
      Short-time Fourier transform (STFT) and perfect reconstruction
      Continuous-time wavelet transforms (CWT)
      Discretization and the discrete wavelet transform (DWT)
      Wavelet design
      Applications of the DWT

8 Discrete signals: sampling, quantization and coding
  8.1 Discrete-time sampling
      Bandlimited sampling
      Uniform bandlimited sampling: Shannon-Whittaker interpolation
      Generalized uniform sampling
  8.2 Quantization
      Rate-distortion theory
      Lloyd-Max and entropy-constrained quantizer design
      Statistical quantization and dithering
      Vector quantization
  8.3 Lossy signal compression
      Audio companding
      Linear predictive coding (LPC)
      Transform coding
  8.4 Compressive sensing (CS)
      Sparsity and incoherence
      Exact reconstruction by convex optimization
      Compressive sensing in practice

9 Nonlinear and non-Gaussian signal processing
  9.1 Running window filters
      Maximum likelihood filters
      Change point detection
  9.2 Recursive filtering
  9.3 Global nonlinear filtering
  9.4 Hidden Markov models (HMMs)
      Junction tree (JT) for efficient HMM computations
      Viterbi decoding
      Baum-Welch parameter estimation
      Model evaluation and structured data classification
      Viterbi parameter estimation
      Avoiding numerical underflow in message passing
  9.5 Homomorphic signal processing

10 Nonparametric Bayesian machine learning and signal processing
  10.1 Preliminaries
      Exchangeability and de Finetti's theorem
      Representations of stochastic processes
      Partitions and equivalence classes
  10.2 Gaussian processes (GP)
      From basis regression to kernel regression
      Distributions over function spaces: GPs
      Bayesian GP kernel regression
      GP regression and Wiener filtering
      Other GP-related topics
  10.3 Dirichlet processes (DP)
      The Dirichlet distribution: canonical prior for the categorical distribution
      Defining the Dirichlet and related processes
      Infinite mixture models (DPMMs)
      Can DP-based models actually infer the number of components?

Bibliography
Index