LOWER BOUNDS OF NONZERO ENTRIES IN SOLUTIONS

XIAOJUN CHEN∗, LEI GUO†, ZHAOSONG LU‡, AND JANE J. YE§

Abstract. This is supplementary material for the paper [2].

1. Lower bound for nonzero entries of local minimizers. In [2], we consider the following non-Lipschitz nonlinear programming problem:

\[
\begin{array}{ll}
\min & f(x) + \Phi(x) \\
\mathrm{s.t.} & c(x) = 0, \\
& d(x) \le 0,
\end{array}
\tag{1.1}
\]

where f : IR^n → IR, c : IR^n → IR^m and d : IR^n → IR^p are continuously differentiable functions, and Φ : IR^n → IR is a continuous function (possibly nonconvex and non-Lipschitz). As observed in the literature and in our numerical experiments in [2], the local minimizers of problem (1.1) are often sparse when Φ is nonconvex and non-Lipschitz. This has made problem (1.1) a popular model for seeking sparse solutions (see, e.g., [1, 3]). In this report, we derive a lower bound on the nonzero entries of any local minimizer of problem (1.1) under suitable assumptions on its objective and constraint functions, which shows that the magnitude of every nonzero entry of a local minimizer must lie above a certain positive number. This provides a possible explanation of why local minimizers of problem (1.1) are often sparse. To this end, we assume throughout this report that f is twice continuously differentiable, c(x) := Ax − b with A ∈ IR^{m×n}, b ∈ IR^m, and

\[
d(x) := \begin{pmatrix} l - x \\ x - u \end{pmatrix}
\quad \text{with } l, u \in \mathbb{R}^n \text{ and } l < u.
\tag{1.2}
\]

It is clear that d(x) ≤ 0 is equivalent to x ∈ [l, u]. Lower bounds on local minimizers have been derived in the literature for some special cases of problem (1.1) with the regularization term Φ(x) = ∥x∥_q^q, q ∈ (0, 1). In particular, Chen et al. [3] derived lower bounds for ℓq-regularized unconstrained convex quadratic programming. Lu [6] studied lower bounds for general ℓq-regularized unconstrained nonlinear programming. Recently, Chen et al. [1] developed lower bounds for the case where f is a convex quadratic function, A = eᵀ or [eᵀ −eᵀ] where e is the vector of all ones, b = 1, m = 1, l = 0 and u = ∞.
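The "small but nonzero entries do not occur" phenomenon can be seen already in the simplest unconstrained scalar instance of (1.1). The sketch below (a toy illustration, not taken from the paper; the data c, λ and the grid are hypothetical choices) minimizes 0.5(x − c)² + λ|x|^{1/2} by brute force: since a nonzero local minimizer must satisfy f''(x) = 1 − λ/(4x^{3/2}) ≥ 0, i.e. x ≥ (λ/4)^{2/3}, the computed minimizers are either exactly zero or bounded away from zero.

```python
import numpy as np

# Toy illustration (assumed data): the scalar model
#     min_x  0.5*(x - c)^2 + lam*|x|^{1/2}
# is the simplest l_{1/2}-regularized instance of (1.1) (no constraints).
# A nonzero local minimizer must satisfy f''(x) = 1 - lam/(4*x^{3/2}) >= 0,
# i.e. x >= (lam/4)^(2/3), so minimizers are never "small but nonzero".
lam = 1.0
grid = np.linspace(0.0, 3.0, 3001)      # x >= 0 suffices when c >= 0

def argmin_on_grid(c):
    # brute-force global minimizer of the toy objective on the grid
    vals = 0.5 * (grid - c) ** 2 + lam * np.sqrt(grid)
    return grid[np.argmin(vals)]

minimizers = [argmin_on_grid(c) for c in np.linspace(0.0, 3.0, 61)]
bound = (lam / 4.0) ** (2.0 / 3.0)      # predicted lower bound, about 0.397
```

Sweeping c over [0, 3], every computed minimizer is either (numerically) zero or exceeds the predicted bound, mirroring the dichotomy established for problem (1.1) below.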
For convenience of presentation, we first introduce some notation that will be used in this section only. We set the infimum or minimum over an empty set to be +∞. We let S∗ be the set of local minimizers of problem (1.1) and assume that L_f := sup_{x∈S∗} ∥∇²f(x)∥ is finite, which clearly holds provided that either the gradient of f is globally Lipschitz continuous over [l, u] or S∗ is bounded.

∗ Department of Applied Mathematics, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong. Email: [email protected]. This author's work was supported in part by Hong Kong Research Grants Council grant PolyU5002/13p.
† Sino-US Global Logistics Institute, Shanghai Jiao Tong University, Shanghai 200030, China. Email: [email protected]. This author's work was supported by NSFC Grant No. 11401379 and the China Postdoctoral Science Foundation (No. 2015T80428).
‡ Department of Mathematics, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada. Email: [email protected]. This author's work was supported in part by NSERC.
§ Department of Mathematics and Statistics, University of Victoria, Victoria, BC, V8W 2Y2, Canada. Email: [email protected]. This author's work was supported in part by NSERC.

Given a function φ : IR → IR and a ∈ IR, we define

\[
\varphi^{\diamond}(a) := \inf\{\, |t| \;:\; \varphi(t) \ge a \,\}.
\]

Given ∅ ≠ I ⊆ {1, . . . , n}, let I^c be the complement of I with respect to {1, . . . , n}, and define

\[
\Gamma := (A_I)^+ A_I,
\qquad
\theta(I) := \Big\{\, i \in I \;\Big|\; \sum_{j\in I} \big[(e_i - \Gamma_i)^T (e_j - \Gamma_j)\big]^2 > 0 \,\Big\},
\tag{1.3}
\]

\[
\sigma(I) := \Big\{\, i \in I \;\Big|\; \Gamma_i = e_i \text{ and } e_i^T (A_I)^+ \Big(b - \sum_{j\in I^c} A_j x_j\Big) \neq 0 \text{ for some } x_j \in \{l_j, 0, u_j\},\ j \in I^c \,\Big\},
\tag{1.4}
\]

\[
\bar\lambda_i(I) := -\,\frac{L_f \sum_{j\in I} \big[(e_i - \Gamma_i)^T (e_j - \Gamma_j)\big]^2}{\|e_i - \Gamma_i\|^4},
\quad i \in \theta(I) \neq \emptyset,
\qquad
\alpha(I) := \min_{i\in\theta(I)} (\phi_i'')^{\diamond}\big(\bar\lambda_i(I)\big),
\tag{1.5}
\]

\[
\beta(I) := \min_{i\in\sigma(I)} \Big\{\, \Big|e_i^T (A_I)^+ \Big(b - \sum_{j\in I^c} A_j x_j\Big)\Big| \;:\; e_i^T (A_I)^+ \Big(b - \sum_{j\in I^c} A_j x_j\Big) \neq 0,\ x_j \in \{l_j, 0, u_j\},\ j \in I^c \,\Big\},
\tag{1.6}
\]

\[
\delta := \min\Big\{ \min_{I\subseteq\{1,\dots,n\}} \{\alpha(I), \beta(I)\},\ \min_{u_i\neq 0} |u_i|,\ \min_{l_i\neq 0} |l_i| \Big\},
\qquad
\lambda_0 := \min_{I\subseteq\{1,\dots,n\}} \min_{i\in\theta(I)} \bar\lambda_i(I).
\tag{1.7}
\]
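As a concrete sanity check on these definitions, the quantities Γ, θ(I) and λ̄_i(I) can be computed numerically for a small instance. In the sketch below, the matrix A_I and the Hessian bound L_f are hypothetical illustrative choices, not data from the paper; since Γ is an orthogonal projector, the columns of I − Γ are exactly the vectors e_i − Γ_i used above.

```python
import numpy as np

# Hypothetical small instance: A_I is 1 x 3, so Gamma = (A_I)^+ A_I is a
# rank-1 orthogonal projector onto the row space of A_I.
A_I = np.array([[1.0, 2.0, 2.0]])
Gamma = np.linalg.pinv(A_I) @ A_I       # symmetric and idempotent

n = A_I.shape[1]
L_f = 1.0                               # assumed bound sup ||nabla^2 f||

R = np.eye(n) - Gamma                   # column i of R is e_i - Gamma_i
S = R.T @ R                             # S[i, j] = (e_i - Gamma_i)^T (e_j - Gamma_j)

# theta(I) from (1.3): indices with sum_j S[i, j]^2 > 0, and the associated
# lambda_bar_i(I) from (1.5).
theta = [i for i in range(n) if np.sum(S[i, :] ** 2) > 1e-12]
lam_bar = {i: -L_f * np.sum(S[i, :] ** 2) / np.linalg.norm(R[:, i]) ** 4
           for i in theta}
```

For this A_I one finds θ(I) = {0, 1, 2} (no e_i lies in the row space of A_I) and, for instance, λ̄_0(I) = −9/8; every λ̄_i(I) is negative, consistent with the remark below that λ0 < 0 whenever λ0 ≠ ∞.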
Here, (A_I)^+ denotes the Moore–Penrose pseudoinverse of the matrix A_I. It is not hard to observe that δ and λ0 depend only on the data of problem (1.1). Moreover, we can see that λ0 < 0 if λ0 ≠ ∞. Before establishing the main result of this section, we make some further assumptions on Φ that will be used in this section only.

Assumption 1.1. Assume that Φ(x) := ∑_{i=1}^n ϕi(xi), where each ϕi : IR → IR is lower semicontinuous and, for each i = 1, . . . , n, ϕi is twice continuously differentiable everywhere except at 0 with

\[
\phi_i''(t) < 0 \quad \forall t \neq 0,
\qquad
\limsup_{t\to 0} \phi_i''(t) < \lambda_0,
\]

where λ0 is defined in (1.7).

It is easy to verify that the following widely used regularization functions satisfy Assumption 1.1. Note that the bridge penalty function is non-Lipschitz: it is locally Lipschitz everywhere except at 0, while both the logistic penalty function and the fraction penalty function are locally Lipschitz everywhere.
(i) Bridge penalty [4] with ϕi(t) = λ|t|^q for any q ∈ (0, 1) and λ > 0;
(ii) Logistic penalty [5] with ϕi(t) = log(1 + λ|t|) for any λ > √(−λ0) if λ0 < 0, and any λ ≥ 0 if λ0 = ∞;
(iii) Fraction penalty [7] with ϕi(t) = λ|t|/(1 + λ|t|) for any λ > √(−λ0/2) if λ0 < 0, and any λ ≥ 0 if λ0 = ∞.

Under Assumption 1.1, we can show that δ > 0 as follows.

Proposition 1.1. Let δ be defined in (1.7). If Assumption 1.1 holds, then δ > 0.

Proof. By the definition of δ, one can easily see that δ ≥ 0. It suffices to show that δ ≠ 0. Suppose for contradiction that δ = 0. Notice that α(I) ≥ 0 and β(I) > 0 for any I ⊆ {1, . . . , n}. These facts, together with the definition of δ and the assumption δ = 0, imply that α(I) = 0 for some I ⊆ {1, . . . , n}. It then follows from (1.5) that there exists some i0 ∈ θ(I) such that (ϕ''_{i0})⋄(λ̄_{i0}(I)) = 0, which along with the definitions of (ϕ''_{i0})⋄ and λ0 implies that there exists a sequence {tk} converging to 0 such that ϕ''_{i0}(tk) ≥ λ̄_{i0}(I) ≥ λ0. This contradicts the second relation in Assumption 1.1.
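To make the quantity (ϕ''_i)⋄ concrete, note that for the bridge penalty it admits a closed form: ϕ''_i(t) = λq(q−1)|t|^{q−2} is negative for t ≠ 0, tends to −∞ as t → 0, and increases toward 0 as |t| grows, so for a < 0 the infimum in the definition is attained where ϕ''_i(t) = a. The sketch below uses illustrative values of λ, q and a (hypothetical, not from the paper).

```python
# Sketch: the threshold (phi'')^{diamond}(a) = inf{|t| : phi''(t) >= a} has a
# closed form for the bridge penalty phi(t) = lam*|t|**q.  For t != 0,
#   phi''(t) = lam*q*(q-1)*|t|**(q-2) < 0,  and phi''(t) -> -inf as t -> 0.
def bridge_phi_pp(t, lam, q):
    # second derivative of the bridge penalty away from the origin
    return lam * q * (q - 1.0) * abs(t) ** (q - 2.0)

def bridge_diamond(a, lam, q):
    # solve lam*q*(q-1)*|t|**(q-2) = a for |t|  (valid for a < 0)
    return (lam * q * (1.0 - q) / (-a)) ** (1.0 / (2.0 - q))

# Illustrative (hypothetical) values: lam = 1, q = 1/2, a = -2.
lam, q, a = 1.0, 0.5, -2.0
t_star = bridge_diamond(a, lam, q)      # phi''(t_star) = a; smaller |t| violates it
```

In this instance t_star = 1/4, and indeed ϕ''(t) < a for all 0 < |t| < t_star, so the infimum defining (ϕ'')⋄(a) is attained at t_star.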
We are now ready to establish the main result of this section. The result shows that δ defined in (1.7) is a lower bound for the nonzero entries of local minimizers of problem (1.1).

Theorem 1.1. Suppose that Assumption 1.1 holds. Let δ be defined in (1.7) and let x∗ be a local minimizer of problem (1.1). Then for any i with x∗_i ≠ 0, it holds that |x∗_i| ≥ δ > 0.

Proof. Without loss of generality, we assume for simplicity that l, u ∈ IR^n. Let Il = {i | x∗_i = li}, Iu = {i | x∗_i = ui} and

\[
I = \{\, i \mid x^*_i \notin \{l_i, 0, u_i\} \,\},
\qquad
\Theta = (I - \Gamma)\,[\nabla^2 f(x^*)_{II}]\,(I - \Gamma),
\]

where, with a slight abuse of notation, I in the definition of Θ is the |I| × |I| identity matrix and Γ is defined as in (1.3). In addition, let the associated θ(I), α(I) and β(I) be defined according to (1.3), (1.5) and (1.6), respectively. Without loss of generality, assume that x∗ = (l_{Il}, x∗_I, 0, u_{Iu}). Since x∗ is a local minimizer of problem (1.1), one can observe that x∗_I is a local minimizer of the problem

\[
\min\; f(l_{I_l}, x_I, 0, u_{I_u}) + \sum_{i\in I} \phi_i(x_i)
\quad\text{s.t.}\quad
A_I x_I = b - \sum_{j\in I_l} A_j l_j - \sum_{j\in I_u} A_j u_j.
\tag{1.8}
\]

Notice that the objective function of problem (1.8) is twice continuously differentiable at x∗_I. By the second-order necessary optimality condition, we have

\[
d^T\big[\nabla^2 f(x^*)_{II} + \mathrm{Diag}(\phi''(x^*_I))\big] d \ge 0
\quad \forall d \in \{d \mid A_I d = 0\},
\tag{1.9}
\]

where ϕ''(x∗_I) := (ϕ''_i(x∗_i) : i ∈ I). Since I − (A_I)^+ A_I is the orthogonal projector onto the null space of A_I, we have {d | A_I d = 0} = {d | d = (I − (A_I)^+ A_I) w, w ∈ IR^{|I|}}, which together with (1.9) implies

\[
w^T \big(I - (A_I)^+ A_I\big)\big[\nabla^2 f(x^*)_{II} + \mathrm{Diag}(\phi''(x^*_I))\big]\big(I - (A_I)^+ A_I\big) w \ge 0
\quad \forall w \in \mathbb{R}^{|I|}.
\]

This inequality can be rewritten as

\[
w^T \Theta w + w^T (I - \Gamma)\,\mathrm{Diag}(\phi''(x^*_I))\,(I - \Gamma) w \ge 0
\quad \forall w \in \mathbb{R}^{|I|}.
\]

Letting w = e_i − Γ_i and using this inequality together with ϕ''_i(x∗_i) < 0 for i ∈ I, we see that

\[
(e_i - \Gamma_i)^T \Theta (e_i - \Gamma_i) + \phi_i''(x^*_i)\, \|e_i - \Gamma_i\|^4 \ge 0,
\quad i \in I.
\tag{1.10}
\]

Suppose that |x∗_i| > 0 for some i. If x∗_i ∈ {li, ui}, it immediately follows from the definition of δ that |x∗_i| ≥ δ. Otherwise, we have i ∈ I.
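The null-space parametrization used in the step above can be checked numerically. The sketch below uses random data (purely illustrative; the matrix A_I is an assumed stand-in) to confirm that P = I − (A_I)^+ A_I is symmetric, idempotent, and maps every w into the null space of A_I, which is exactly what licenses replacing {d | A_I d = 0} by {P w | w ∈ IR^{|I|}}.

```python
import numpy as np

# Sketch with assumed random data: P = I - (A_I)^+ A_I is the orthogonal
# projector onto null(A_I), justifying the substitution d = P w above.
rng = np.random.default_rng(0)
A_I = rng.standard_normal((2, 5))       # a generic 2 x 5 "constraint" matrix
P = np.eye(5) - np.linalg.pinv(A_I) @ A_I

# Symmetric + idempotent + A_I P = 0  <=>  orthogonal projector onto null(A_I);
# its trace equals dim null(A_I) = 5 - rank(A_I) = 3 for generic A_I.
```

Since P is a projector, (I − Γ)(e_i − Γ_i) = e_i − Γ_i, which is the identity used implicitly when specializing the quadratic inequality to w = e_i − Γ_i.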
We next show that |x∗_i| ≥ δ also holds in this case by considering two separate cases.

Case 1: Γ_i ≠ e_i. Noting that ∥∇²f(x∗)_{II}∥ ≤ L_f, it follows from (1.10) that

\[
0 < (e_i - \Gamma_i)^T \Theta (e_i - \Gamma_i) \le L_f \sum_{j\in I} \big[(e_i - \Gamma_i)^T (e_j - \Gamma_j)\big]^2.
\tag{1.11}
\]

This together with (1.3) implies that i ∈ θ(I) ≠ ∅. Moreover, (1.10) and (1.11) imply

\[
\phi_i''(x^*_i) \ge -\,\frac{L_f \sum_{j\in I} \big[(e_i - \Gamma_i)^T (e_j - \Gamma_j)\big]^2}{\|e_i - \Gamma_i\|^4}.
\]

By the definitions of (ϕ''_i)⋄ and α(I), we obtain

\[
|x^*_i| \ge (\phi_i'')^{\diamond}\!\left(-\,\frac{L_f \sum_{j\in I} \big[(e_i - \Gamma_i)^T (e_j - \Gamma_j)\big]^2}{\|e_i - \Gamma_i\|^4}\right) \ge \alpha(I).
\]

It then follows from this inequality and the definition of δ that |x∗_i| ≥ δ.

Case 2: Γ_i = e_i. It follows from the relation Ax∗ = b that

\[
A_I x^*_I = b - \sum_{j\in I_l} A_j l_j - \sum_{j\in I_u} A_j u_j.
\]

Pre-multiplying this relation by e_i^T (A_I)^+ on both sides and using Γ = (A_I)^+ A_I, we have

\[
e_i^T \Gamma x^*_I = e_i^T (A_I)^+ \Big(b - \sum_{j\in I_l} A_j l_j - \sum_{j\in I_u} A_j u_j\Big),
\]

which together with Γ_i = e_i and x∗_i ≠ 0 yields

\[
x^*_i = e_i^T (A_I)^+ \Big(b - \sum_{j\in I_l} A_j l_j - \sum_{j\in I_u} A_j u_j\Big) \neq 0,
\]

and hence i ∈ σ(I) ≠ ∅, where σ(I) is defined in (1.4). It follows from these relations and (1.6) that |x∗_i| ≥ β(I), which together with the definition of δ implies that |x∗_i| ≥ δ. The proof is complete since δ > 0 by Proposition 1.1.

Acknowledgments. The first and the last two authors thank the American Institute of Mathematics, where this work was initiated, for its support and hospitality during the summers of 2010 and 2011. The authors are also grateful to Jiming Peng at the University of Houston and Ting Kei Pong at The Hong Kong Polytechnic University for helpful discussions, and to Caihua Chen at Nanjing University for providing the Matlab code of the IP method in [1] for solving the ℓ1/2-norm regularized models.

REFERENCES

[1] C. Chen, X. Li, C. Tolman, S. Wang and Y. Ye, Sparse portfolio selection via quasi-norm regularization, ArXiv preprint, arXiv:1312.6350, 2014.
[2] X. Chen, L. Guo, Z. Lu and J. Ye, A feasible augmented Lagrangian method for non-Lipschitz nonconvex programming, Preprint, 2016.
[3] X. Chen, F. Xu and Y. Ye, Lower bound theory of nonzero entries in solutions of ℓ2-ℓp minimization, SIAM J. Sci. Comput., 32 (2010), pp. 2832–2852.
[4] W. J. Fu, Penalized regressions: The bridge versus the lasso, J. Comput. Graph. Stat., 7 (1998), pp. 397–416.
[5] D. Geman and G. Reynolds, Constrained restoration and the recovery of discontinuities, IEEE Trans. Pattern Anal. Mach. Intell., 14 (1992), pp. 357–383.
[6] Z. Lu, Iterative reweighted minimization methods for ℓp regularized unconstrained nonlinear programming, Math. Program., 147 (2014), pp. 277–307.
[7] M. Nikolova, M. K. Ng, S. Zhang and W. Ching, Efficient reconstruction of piecewise constant images using nonsmooth nonconvex minimization, SIAM J. Imaging Sci., 1 (2008), pp. 2–25.