Volume 28, Issue 5
Deep Network Approximation Characterized by Number of Neurons

Zuowei Shen, Haizhao Yang, Shijun Zhang

Commun. Comput. Phys., 28 (2020), pp. 1768-1811.

Published online: 2020-11

  • Abstract

This paper quantitatively characterizes the approximation power of deep feed-forward neural networks (FNNs) in terms of the number of neurons. It is shown by construction that ReLU FNNs with width $\mathcal{O}(\max\{d\lfloor N^{1/d}\rfloor, N+1\})$ and depth $\mathcal{O}(L)$ can approximate an arbitrary Hölder continuous function of order $\alpha\in(0,1]$ on $[0,1]^d$ with a nearly tight approximation rate $\mathcal{O}(\sqrt{d}\,N^{-2\alpha/d}L^{-2\alpha/d})$ measured in the $L^p$-norm for any $N,L\in\mathbb{N}^+$ and $p\in[1,\infty]$. More generally, for an arbitrary continuous function $f$ on $[0,1]^d$ with a modulus of continuity $\omega_f(\cdot)$, the constructive approximation rate is $\mathcal{O}(\sqrt{d}\,\omega_f(N^{-2/d}L^{-2/d}))$. We also extend our analysis to $f$ on irregular domains or those localized in an $\varepsilon$-neighborhood of a $d_\mathcal{M}$-dimensional smooth manifold $\mathcal{M}\subseteq[0,1]^d$ with $d_\mathcal{M}\ll d$. In particular, in the case of an essentially low-dimensional domain, we show an approximation rate $\mathcal{O}\big(\omega_f(\frac{\varepsilon}{1-\delta}\sqrt{\frac{d}{d_\delta}}+\varepsilon)+\sqrt{d}\,\omega_f(\frac{\sqrt{d}}{(1-\delta)\sqrt{d_\delta}}N^{-2/d_\delta}L^{-2/d_\delta})\big)$ for ReLU FNNs to approximate $f$ in the $\varepsilon$-neighborhood, where $d_\delta=\mathcal{O}(d_\mathcal{M}\frac{\ln(d/\delta)}{\delta^2})$ for any $\delta\in(0,1)$, which serves as the relative error of a projection that approximates an isometry when mapping $\mathcal{M}$ to a $d_\delta$-dimensional domain.
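To make the scaling in the Hölder case concrete, the following is a minimal sketch that evaluates the width expression $\max\{d\lfloor N^{1/d}\rfloor, N+1\}$ and the rate expression $\sqrt{d}\,N^{-2\alpha/d}L^{-2\alpha/d}$ for given $d$, $N$, $L$ and $\alpha$. The function names are hypothetical and not from the paper, and the constants hidden in the $\mathcal{O}(\cdot)$ notation (including the Hölder constant of $f$) are simply dropped, so the returned values indicate scaling only and are not error bounds.

```python
import math


def integer_root(n: int, d: int) -> int:
    """Return floor(n ** (1/d)), corrected with exact integer checks."""
    r = round(n ** (1.0 / d))  # floating-point guess, may be off by one
    while r ** d > n:
        r -= 1
    while (r + 1) ** d <= n:
        r += 1
    return r


def width_scale(d: int, N: int) -> int:
    """max{d * floor(N^(1/d)), N + 1}: the width, up to the hidden constant."""
    return max(d * integer_root(N, d), N + 1)


def holder_rate_scale(d: int, N: int, L: int, alpha: float) -> float:
    """sqrt(d) * N^(-2*alpha/d) * L^(-2*alpha/d), up to the hidden constant."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    return math.sqrt(d) * (N * L) ** (-2.0 * alpha / d)


if __name__ == "__main__":
    # Assumed illustrative values: same N and L, increasing input dimension d.
    for d in (2, 8, 32):
        print(d, width_scale(d, 16), holder_rate_scale(d, N=16, L=16, alpha=1.0))
```

For fixed $N$ and $L$ the rate expression deteriorates quickly as $d$ grows, which is the motivation for the low-dimensional manifold setting, where $d$ in the exponent is replaced by $d_\delta$.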

  • Keywords

Deep ReLU neural networks, Hölder continuity, modulus of continuity, approximation theory, low-dimensional manifold, parallel computing.

  • AMS Subject Headings

68T01, 65D99, 68U99

  • Copyright

COPYRIGHT: © Global Science Press

  • Citation (BibTeX, RIS, TXT)

@Article{CiCP-28-1768,
  author  = {Shen, Zuowei and Yang, Haizhao and Zhang, Shijun},
  title   = {Deep Network Approximation Characterized by Number of Neurons},
  journal = {Communications in Computational Physics},
  year    = {2020},
  volume  = {28},
  number  = {5},
  pages   = {1768--1811},
  issn    = {1991-7120},
  doi     = {10.4208/cicp.OA-2020-0149},
  url     = {http://global-sci.org/intro/article_detail/cicp/18396.html}
}

TY  - JOUR
T1  - Deep Network Approximation Characterized by Number of Neurons
AU  - Shen, Zuowei
AU  - Yang, Haizhao
AU  - Zhang, Shijun
JO  - Communications in Computational Physics
VL  - 28
IS  - 5
SP  - 1768
EP  - 1811
PY  - 2020
DA  - 2020/11
SN  - 1991-7120
DO  - 10.4208/cicp.OA-2020-0149
UR  - https://global-sci.org/intro/article_detail/cicp/18396.html
KW  - Deep ReLU neural networks
KW  - Hölder continuity
KW  - modulus of continuity
KW  - approximation theory
KW  - low-dimensional manifold
KW  - parallel computing
ER  -

Zuowei Shen, Haizhao Yang & Shijun Zhang. (2020). Deep Network Approximation Characterized by Number of Neurons. Communications in Computational Physics. 28 (5). 1768-1811. doi:10.4208/cicp.OA-2020-0149