BernNet Learning Arbitrary Graph Spectral Filters via Bernstein Approximation

07-19 1585阅读

发表于:neurips21

推荐指数: #paper/⭐⭐

BernNet Learning Arbitrary Graph Spectral Filters via Bernstein Approximation

设定:在本文中,h是过滤器.

bernstein 多项式逼近(这个证明有点稀里糊涂的,反正我觉得一点点问题,可能因为我水平低)

p K ( t ) : = ∑ k = 0 K θ k ⋅ b k K ( t ) = ∑ k = 0 K f ( k K ) ⋅ ( K k ) ( 1 − t ) K − k t k . p_K(t):=\sum_{k=0}^K\theta_k\cdot b_k^K(t)=\sum_{k=0}^Kf\left(\frac kK\right)\cdot\binom Kk(1-t)^{K-k}t^k. pK​(t):=k=0∑K​θk​⋅bkK​(t)=k=0∑K​f(Kk​)⋅(kK​)(1−t)K−ktk.

(其实类似于二项分布.如上K,k即二项分布的前缀)

推论2.1:给定一个连续函数f(t) t ∈ [ 0 , 1 ] t \in[0,1] t∈[0,1],我们有:当 K → ∞ K\to\infty K→∞时, p K ( t ) → f ( t ) p_{K}(t)\to f(t) pK​(t)→f(t).

后者容易理解,当K趋近于无穷时,将f后面的即为二项分布,求和为0.而 f ( k K ) f\left( \frac{k}{K} \right) f(Kk​)又和t相关,用t取代(感觉理解的有问题)

对于过滤函数 h : [ 0 , 2 ] → [ 0 , 1 ] h:[0,2]\to[0,1] h:[0,2]→[0,1],我们有 t = λ 2 t=\frac{\lambda}{2} t=2λ​,我们就有: f ( t ) = h ( 2 t ) f(t)=h(2t) f(t)=h(2t), θ k = f ( k / K ) = h ( 2 k / K ) \theta_k = f(k/K) = h(2k/K) θk​=f(k/K)=h(2k/K). b k K ( t ) = b k K ( λ 2 ) = ( K k ) ( 1 − λ 2 ) K − k ( λ 2 ) k b_{k}^K(t)=b_k^K(\frac\lambda2) = \binom Kk(1-\frac\lambda2)^{K-k}(\frac\lambda2)^k bkK​(t)=bkK​(2λ​)=(kK​)(1−2λ​)K−k(2λ​)k.最终,我们可以得到如下近似: p K ( λ / 2 )   =   ∑ k = 0 K θ k ( K k ) ( 1 − λ 2 ) K − k ( λ 2 ) k   =   ∑ k = 0 K θ k 1 2 K ( K k ) ( 2 − λ ) K − k λ k p_K(\lambda/2)~=~\sum_{k=0}^K\theta_k\binom Kk(1-\frac\lambda2)^{K-k}\left(\frac\lambda2\right)^k~=~\sum_{k=0}^K\theta_k\frac1{2^K}\binom Kk(2-\lambda)^{K-k}\lambda^k pK​(λ/2) = ∑k=0K​θk​(kK​)(1−2λ​)K−k(2λ​)k = ∑k=0K​θk​2K1​(kK​)(2−λ)K−kλk.

z = U d i a g [ p K ( λ 1 / 2 ) , . . . , p K ( λ n / 2 ) ] U T ⏟ R e m N e t x = ∑ k = 0 K θ k 1 2 K ( K k ) ( 2 I − L ) K − k L k x \mathbf{z}=\underbrace{\mathbf{U}diag[p_K(\lambda_1/2),...,p_K(\lambda_n/2)]\mathbf{U}^T}_{\mathrm{RemNet}}\mathbf{x}=\sum_{k=0}^K\theta_k\frac1{2^K}\binom Kk(2\mathbf{I}-\mathbf{L})^{K-k}\mathbf{L}^k\mathbf{x} z=RemNet Udiag[pK​(λ1​/2),...,pK​(λn​/2)]UT​​x=k=0∑K​θk​2K1​(kK​)(2I−L)K−kLkx

实现常见的过滤器通过BernNet

BernNet Learning Arbitrary Graph Spectral Filters via Bernstein Approximation

证明好麻烦啊,烦烦烦

附录:组合数性质

∙ C n k = C n n − k ∙ C n k + 1 = C n k × n − k k + 1 ∙ C n k = C n − 1 k − 1 × n k ∙ C n k = C n − 1 k − 1 + C n − 1 k \begin{aligned}&\bullet C_n^k = C_n^{n-k}\\&\bullet C_n^{k+1} = C_n^k \times \frac{n-k}{k+1}\\&\bullet C_n^k = C_{n-1}^{k-1} \times \frac{n}{k}\\&\bullet C_n^k = C_{n-1}^{k-1} + C_{n-1}^k\end{aligned} ​∙Cnk​=Cnn−k​∙Cnk+1​=Cnk​×k+1n−k​∙Cnk​=Cn−1k−1​×kn​∙Cnk​=Cn−1k−1​+Cn−1k​​

C n k = A n k A k k = n k ‾ k ! = n ! k ! ( n − k ) ! C_n^k=\frac{A_n^k}{A_k^k}=\frac{n^{\underline{k}}}{k!}=\frac{n!}{k!(n-k)!} Cnk​=Akk​Ank​​=k!nk​​=k!(n−k)!n!​

图过滤

min ⁡ z f ( z ) = ( 1 − α ) z T γ ( L ) z + α ∥ z − x ∥ 2 2 \min_\mathbf{z}f(\mathbf{z})=(1-\alpha)\mathbf{z}^T\gamma(\mathbf{L})\mathbf{z}+\alpha\|\mathbf{z}-\mathbf{x}\|_2^2 zmin​f(z)=(1−α)zTγ(L)z+α∥z−x∥22​

令其倒数为0, α = 0.5 \alpha=0.5 α=0.5, γ ( L ) = e t L − I \gamma(\mathbf{L})=e^{t\mathbf{L}}-\mathbf{I} γ(L)=etL−I. ∂ f ( z ) ∂ z = ( e t L − I ) z + z − x = 0 , \frac{\partial f(\mathbf{z})}{\partial\mathbf{z}}=\left(e^{t\mathbf{L}}-\mathbf{I}\right)\mathbf{z}+\mathbf{z}-\mathbf{x}=\mathbf{0}, ∂z∂f(z)​=(etL−I)z+z−x=0,

z ∗ = e − t L x = e − t ( I − P ) x = ∑ k = 0 ∞ e − t t k k ! P k x . \mathbf{z}^*=e^{-t\mathbf{L}}\mathbf{x}=e^{-t(\mathbf{I}-\mathbf{P})}\mathbf{x}=\sum_{k=0}^\infty e^{-t}\frac{t^k}{k!}\mathbf{P}^k\mathbf{x}. z∗=e−tLx=e−t(I−P)x=∑k=0∞​e−tk!tk​Pkx.

这就是基于图热核的GNN例如GDC和GraphHeat采用的核

过滤器的非负性(保证凸优化)

0 ≤ g ( λ ) = ∑ k = 0 K w k λ k ≤ 1 , ∀ λ ∈ [ 0 , 2 ] . 0\leq g(\lambda)=\sum_{k=0}^Kw_k\lambda^k\leq1, \forall \lambda\in[0,2]. 0≤g(λ)=∑k=0K​wk​λk≤1,∀λ∈[0,2].证明:

α ( α I + ( 1 − α ) γ ( L ) ) − 1 x = U d i a g [ α α + ( 1 − α ) γ ( λ 1 ) , . . . , α α + ( 1 − α ) γ ( λ n ) ] U T x . \alpha\left(\alpha\mathbf{I}+(1-\alpha)\gamma(\mathbf{L})\right)^{-1}\mathbf{x}=\mathbf{U}diag\left[\frac\alpha{\alpha+(1-\alpha)\gamma(\lambda_1)},...,\frac\alpha{\alpha+(1-\alpha)\gamma(\lambda_n)}\right]\mathbf{U}^T\mathbf{x}. α(αI+(1−α)γ(L))−1x=Udiag[α+(1−α)γ(λ1​)α​,...,α+(1−α)γ(λn​)α​]UTx.

λ ∈ [ 0 , 2 ] ,  we have  0 ≤ h ( λ ) ≤ α α + ( 1 − α ) ⋅ 0 = 1  for  λ ∈ [ 0 , 2 ] . \lambda\in[0,2],\text{ we have }0\leq h(\lambda)\leq\frac\alpha{\alpha+(1-\alpha)\cdot0}=1\text{ for }\lambda\in[0,2]. λ∈[0,2], we have 0≤h(λ)≤α+(1−α)⋅0α​=1 for λ∈[0,2].

结果:貌似挺高的,但是别人跑的就没那么高.

结构: Z = ∑ k = 0 K θ k 1 2 K ( K k ) ( 2 I − L ) K − k L k f ( X ) , \mathbf{Z}=\sum_{k=0}^K\theta_k\frac1{2^K}\binom Kk(2\mathbf{I}-\mathbf{L})^{K-k}\mathbf{L}^kf\left(\mathbf{X}\right), Z=∑k=0K​θk​2K1​(kK​)(2I−L)K−kLkf(X),

其中:f(X)是二层的MLP

VPS购买请点击我

文章版权声明:除非注明,否则均为主机测评原创文章,转载或复制请以超链接形式并注明出处。

目录[+]