用maxima學Google PageRank
我當初是從這兩個影片中學到如何將馬可夫鏈應用在Google PageRank上,所以這裡不再詳述內容,而是將重點放在maxima程式上。[url]https://www.ptt.cc/bbs/tutor/M.1397643887.A.288.html[/url]
[url]https://www.ptt.cc/bbs/tutor/M.1397910311.A.C77.html[/url] 範例出自這裡
[url]https://www.youtube.com/watch?v=rMp1adeQigo[/url]
[color=green]A網頁連到C,D
B網頁連到C
C網頁連到A
D網頁連到B,C[/color]
[color=red](%i1)[/color]
[color=blue]A:matrix([0,0,1,0],
[0,0,0,1],
[1,1,0,1],
[1,0,0,0]);[/color]
[color=red](%o1)[/color] \( \left[ \matrix{0&0&1&0 \cr 0&0&0&1 \cr 1&1&0&1 \cr 1&0&0&0} \right] \)
[color=green]將每行數字加起來代表該網頁對外連結個數
A網頁連到C,D => 2個
B網頁連到C => 1個
C網頁連到A => 1個
D網頁連到B,C => 2個[/color]
[color=red](%i2)[/color] [color=blue]n:apply("+",args(A));[/color]
[color=red](%o2)[/color] \( [2,1,1,2] \)
[color=green]將每一行除以對外連結個數,變成轉移矩陣[/color]
[color=red](%i3)[/color] [color=blue]A:transpose(transpose(A)/n);[/color]
[color=red](%o3)[/color] \( \left[ \matrix{\displaystyle 0&0&1&0 \cr 0&0&0&\frac{1}{2} \cr \frac{1}{2}&1&0&\frac{1}{2} \cr \frac{1}{2}&0&0&0} \right] \)
[color=green]單位矩陣I[/color]
[color=red](%i9)[/color] [color=blue]I:ident(length(A));[/color]
[color=red](%o9)[/color] \( \left[ \matrix{1&0&0&0 \cr 0&1&0&0 \cr 0&0&1&0 \cr 0&0&0&1} \right] \)
[color=green]變數X矩陣[/color]
[color=red](%i5)[/color] [color=blue]X:matrix([a],[ b ],[c],[d]);[/color]
[color=red](%o5)[/color] \( \left[ \matrix{a \cr b \cr c \cr d} \right] \)
[color=green]計算穩定狀態時的X
AX=X , (A-I)X=0[/color]
[color=red](%i6)[/color] [color=blue]equation: (A-I).X;[/color]
[color=red](%o6)[/color] \( \left[ \matrix{\displaystyle c-a \cr \frac{d}{2}-b \cr \frac{d}{2}-c+b+\frac{a}{2} \cr \frac{a}{2}-d} \right] \)
[color=green]穩定狀態時各網頁PageRank總和為1
a+b+c+d=1[/color]
[color=red](%i7)[/color] [color=blue]equation:addrow(equation,apply("+",args(X))-1);[/color]
[color=red](%o7)[/color] \( \left[ \matrix{\displaystyle c-a \cr \frac{d}{2}-b \cr \frac{d}{2}-c+b+\frac{a}{2} \cr \frac{a}{2}-d \cr d+c+b+a-1} \right] \)
[color=green]解方程式得到各網頁穩定狀態時的PageRank值[/color]
[color=red](%i8)[/color] [color=blue]solve(flatten(args(equation)),flatten(args(X)));[/color]
solve: dependent equations eliminated: (4)
[color=red](%o8)[/color] \( \displaystyle [[a=\frac{4}{11},b=\frac{1}{11},c=\frac{4}{11},d=\frac{2}{11}]] \) 但全世界的網頁連結太多了,大到無法放在一個矩陣內解方程組。於是Google退而求其之,改求\( X \)近似值,這個方法稱為power method就是迭代計算\( X_{(i+1)}=A \cdot X_{(i)} \),直到\( X \)矩陣值小於誤差為止。
[color=green]要先載入draw才能使用draw2d指令[/color]
[color=red](%i1)[/color] [color=blue]load(draw);[/color]
[color=red](%o1)[/color] C:/PROGRA~2/MAXIMA~1.1/share/maxima/5.36.1/share/draw/draw.lisp
[color=green]A網頁連到C,D
B網頁連到C
C網頁連到A
D網頁連到B,C[/color]
[color=red](%i2)[/color]
[color=blue]A:matrix([0,0,1,0],
[0,0,0,1],
[1,1,0,1],
[1,0,0,0]);[/color]
[color=red](%o2)[/color] \( \left[ \matrix{0&0&1&0 \cr 0&0&0&1 \cr 1&1&0&1 \cr 1&0&0&0} \right] \)
[color=green]將每行數字加起來代表該網頁對外連結個數
A網頁連到C,D => 2個
B網頁連到C => 1個
C網頁連到A => 1個
D網頁連到B,C => 2個[/color]
[color=red](%i3)[/color] [color=blue]sum:apply("+",args(A));[/color]
[color=red](%o3)[/color] \( [2,1,1,2] \)
[color=green]將每一行除以對外連結個數,變成轉移矩陣[/color]
[color=red](%i4)[/color] [color=blue]A:transpose(transpose(A)/sum);[/color]
[color=red](%o4)[/color] \( \left[ \matrix{\displaystyle 0&0&1&0 \cr 0&0&0& \frac{1}{2} \cr \frac{1}{2}&1&0&\frac{1}{2} \cr \frac{1}{2}&0&0&0} \right] \)
[color=green]初始值為1/n的矩陣X,n為網頁個數[/color]
[color=red](%i5)[/color]
[color=blue]n:length(A)$
X:genmatrix(lambda([i,j],1/n),n,1);[/color]
[color=red](%o6)[/color] \( \left[ \matrix{\displaystyle \frac{1}{4}\cr\frac{1}{4}\cr\frac{1}{4}\cr\frac{1}{4}} \right] \)
[color=green]迭代計算X=A.X,直到X誤差小於ε為止[/color]
[color=red](%i7)[/color]
[color=blue]PowerMethod(A,X,epsilon):=block
([AllX:X,NextX:X],
do(NextX:A.X,
AllX:addcol(AllX,NextX),
if lmax(flatten(args(abs(NextX-X))))<epsilon then
(return(AllX)),
X:NextX
)
)$[/color]
[color=green]將每次迭代計算X儲存起來[/color]
[color=red](%i8)[/color] [color=blue]AllX: PowerMethod(A,X,0.01);[/color]
[color=red](%o8)[/color] \( \displaystyle \left[ \matrix{ \displaystyle
\frac{1}{4}&\frac{1}{4}&\frac{1}{2}&\frac{5}{16}&\frac{3}{8}&\frac{11}{32}&\frac{25}{64}&\frac{11}{32}&\frac{3}{8}&\frac{91}{256}&\frac{95}{256}&\frac{183}{512}&\frac{377}{1024}&\frac{369}{1024} \cr
\frac{1}{4}&\frac{1}{8}&\frac{1}{16}&\frac{1}{16}&\frac{1}{8}&\frac{5}{64}&\frac{3}{32}&\frac{11}{128}&\frac{25}{256}&\frac{11}{128}&\frac{3}{32}&\frac{91}{1024}&\frac{95}{1024}&\frac{183}{2048} \cr
\frac{1}{4}&\frac{1}{2}&\frac{5}{16}&\frac{3}{8}&\frac{11}{32}&\frac{25}{64}&\frac{11}{32}&\frac{3}{8}&\frac{91}{256}&\frac{95}{256}&\frac{183}{512}&\frac{377}{1024}&\frac{369}{1024}&\frac{375}{1024} \cr
\frac{1}{4}&\frac{1}{8}&\frac{1}{8}&\frac{1}{4}&\frac{5}{32}&\frac{3}{16}&\frac{11}{64}&\frac{25}{128}&\frac{11}{64}&\frac{3}{16}&\frac{91}{512}&\frac{95}{512}&\frac{183}{1024}&\frac{377}{2048} } \right] \)
[color=green]迭代次數[/color]
[color=red](%i9)[/color] [color=blue]n:length(AllX[1]);[/color]
[color=red](%o9)[/color] \( 14 \)
[color=green]經過多次迭代計算後發現X趨近穩定狀態[/color]
[color=red](%i10)[/color]
[color=blue]draw2d(points_joined = true,
color=red,
points(makelist(k,k,1,n),AllX[1]),
points_joined = true,
color=green,
points(makelist(k,k,1,n),AllX[2]),
points_joined = true,
color=blue,
points(makelist(k,k,1,n),AllX[3]),
points_joined = true,
color=black,
points(makelist(k,k,1,n),AllX[4])
)$[/color]
[img]https://math.pro/db/attachment.php?aid=2990&k=93e692798215b428b2f34ad5241461e1&t=1435757841&noupdate=yes[/img] 接續上面的例子,但這次刪除B網頁連向C網頁的連結,當使用者進到B網頁之後就不再連到其他網頁,這會導致其他網頁的PageRank值都流到B網頁,最後穩定狀態時B網頁PageRank值為1,A,C,D網頁PageRank值為0。
[color=green]要先載入draw才能使用draw2d指令[/color]
[color=red](%i1)[/color] [color=blue]load(draw);[/color]
[color=red](%o1)[/color] C:/PROGRA~2/MAXIMA~1.1/share/maxima/5.36.1/share/draw/draw.lisp
[color=green]A網頁連到C,D
B網頁沒有連到其他網頁(但為了符合轉移矩陣的定義,設定B網頁連到B)
C網頁連到A
D網頁連到B,C[/color]
[color=red](%i2)[/color]
[color=blue]A:matrix([0,0,1,0],
[0,1,0,1],
[1,0,0,1],
[1,0,0,0]);[/color]
[color=red](%o2)[/color] \( \left[ \matrix{0&0&1&0 \cr 0&1&0&1 \cr 1&0&0&1 \cr 1&0&0&0} \right] \)
[color=green]將每行數字加起來代表該網頁對外連結個數
A網頁連到C,D => 2個
B網頁連到B => 1個
C網頁連到A => 1個
D網頁連到B,C => 2個[/color]
[color=red](%i3)[/color] [color=blue]sum:apply("+",args(A));[/color]
[color=red](%o3)[/color] \( [2,1,1,2] \)
[color=green]將每一行除以對外連結個數,變成轉移矩陣[/color]
[color=red](%i4)[/color] [color=blue]A:transpose(transpose(A)/sum);[/color]
[color=red](%o4)[/color] \( \left[ \matrix{\displaystyle 0&0&1&0 \cr 0&1&0& \frac{1}{2} \cr \frac{1}{2}&0&0&\frac{1}{2} \cr \frac{1}{2}&0&0&0} \right] \)
[color=green]初始值為1/n的矩陣X,n為網頁個數[/color]
[color=red](%i5)[/color]
[color=blue]n:length(A)$
X:genmatrix(lambda([i,j],1/n),n,1);[/color]
[color=red](%o6)[/color] \( \left[ \matrix{\displaystyle \frac{1}{4}\cr\frac{1}{4}\cr\frac{1}{4}\cr\frac{1}{4}} \right] \)
[color=green]迭代計算X=A.X,直到X誤差小於ε為止[/color]
[color=red](%i7)[/color]
[color=blue]PowerMethod(A,X,epsilon):=block
([AllX:X,NextX:X],
do(NextX:A.X,
AllX:addcol(AllX,NextX),
if lmax(flatten(args(abs(NextX-X))))<epsilon then
(return(AllX)),
X:NextX
)
)$[/color]
[color=green]將每次迭代計算X儲存起來[/color]
[color=red](%i8)[/color] [color=blue]AllX: PowerMethod(A,X,0.01);[/color]
[color=red](%o8)[/color] \( \displaystyle \left[ \matrix{ \displaystyle
\frac{1}{4}&\frac{1}{4}&\frac{1}{4}&\frac{3}{16}&\frac{3}{16}&\frac{5}{32}&\frac{9}{64}&\frac{1}{8}&\frac{7}{64}&\frac{25}{256}&\frac{11}{128}&\frac{39}{512}&\frac{69}{1024}&\frac{61}{1024}&\frac{27}{512}&\frac{191}{4096}&\frac{169}{4096}&\frac{299}{8192}&\frac{529}{16384}&\frac{117}{4096} \cr
\frac{1}{4}&\frac{3}{8}&\frac{7}{16}&\frac{1}{2}&\frac{9}{16}&\frac{39}{64}&\frac{21}{32}&\frac{89}{128}&\frac{187}{256}&\frac{195}{256}&\frac{101}{128}&\frac{833}{1024}&\frac{855}{1024}&\frac{1749}{2048}&\frac{3567}{4096}&\frac{907}{1024}&\frac{1841}{2048}&\frac{14919}{16384}&\frac{943}{1024}&\frac{30475}{32768} \cr
\frac{1}{4}&\frac{1}{4}&\frac{3}{16}&\frac{3}{16}&\frac{5}{32}&\frac{9}{64}&\frac{1}{8}&\frac{7}{64}&\frac{25}{256}&\frac{11}{128}&\frac{39}{512}&\frac{69}{1024}&\frac{61}{1024}&\frac{27}{512}&\frac{191}{4096}&\frac{169}{4096}&\frac{299}{8192}&\frac{529}{16384}&\frac{117}{4096}&\frac{207}{8192} \cr
\frac{1}{4}&\frac{1}{8}&\frac{1}{8}&\frac{1}{8}&\frac{3}{32}&\frac{3}{32}&\frac{5}{64}&\frac{9}{128}&\frac{1}{16}&\frac{7}{128}&\frac{25}{512}&\frac{11}{256}&\frac{39}{1024}&\frac{69}{2048}&\frac{61}{2048}&\frac{27}{1024}&\frac{191}{8192}&\frac{169}{8192}&\frac{299}{16384}&\frac{529}{32768}} \right] \)
[color=green]迭代次數[/color]
[color=red](%i9)[/color] [color=blue]n:length(AllX[1]);[/color]
[color=red](%o9)[/color] \( 20 \)
[color=green]經過多次迭代計算後發現B網頁PageRank趨近於1,A,C,D網頁PageRank趨近於0[/color]
[color=red](%i10)[/color]
[color=blue]draw2d(points_joined = true,
color=red,
points(makelist(k,k,1,n),AllX[1]),
points_joined = true,
color=green,
points(makelist(k,k,1,n),AllX[2]),
points_joined = true,
color=blue,
points(makelist(k,k,1,n),AllX[3]),
points_joined = true,
color=black,
points(makelist(k,k,1,n),AllX[4])
)$[/color]
[img]https://math.pro/db/attachment.php?aid=3041&k=3a134c86959e200bc3c1b483001aeb01&t=1438855206&noupdate=yes[/img] 為了改善其它網頁PageRank值都流到B網頁的問題,Google加上另一個轉移矩陣B,模擬使用者會直接輸入網址隨機跳到其它網頁去,並設定\(\alpha=0.85\)的機率使用者循著連結連到其它網頁,\(1-\alpha=0.15\)的機率使用者直接輸入網址跳到其它網頁。
\( X_{(i+1)}=(\alpha A+(1-\alpha)B)\cdot X_{(i)} \)
最後穩定狀態時雖然B網頁PageRank值還是最大,但A,C,D網頁也能比較出PageRank值大小(\(A=0.183,B=0.533,C=0.168,D=0.117\))。
[color=green]要先載入draw才能使用draw2d指令[/color]
[color=red](%i1)[/color] [color=blue]load(draw);[/color]
[color=red](%o1)[/color] C:/PROGRA~2/MAXIMA~1.1/share/maxima/5.36.1/share/draw/draw.lisp
[color=green]A網頁連到C,D
B網頁沒有連到其他網頁(但為了符合轉移矩陣的定義,設定B網頁連到B)
C網頁連到A
D網頁連到B,C[/color]
[color=red](%i2)[/color]
[color=blue]A:matrix([0,0,1,0],
[0,1,0,1],
[1,0,0,1],
[1,0,0,0]);[/color]
[color=red](%o2)[/color] \( \left[ \matrix{0&0&1&0 \cr 0&1&0&1 \cr 1&0&0&1 \cr 1&0&0&0} \right] \)
[color=green]將每行數字加起來代表該網頁對外連結個數
A網頁連到C,D => 2個
B網頁連到B => 1個
C網頁連到A => 1個
D網頁連到B,C => 2個[/color]
[color=red](%i3)[/color] [color=blue]sum:apply("+",args(A));[/color]
[color=red](%o3)[/color] \( [2,1,1,2] \)
[color=green]將每一行除以對外連結個數,變成轉移矩陣[/color]
[color=red](%i4)[/color] [color=blue]A:transpose(transpose(A)/sum);[/color]
[color=red](%o4)[/color] \( \left[ \matrix{\displaystyle 0&0&1&0 \cr 0&1&0& \frac{1}{2} \cr \frac{1}{2}&0&0&\frac{1}{2} \cr \frac{1}{2}&0&0&0} \right] \)
[color=green]轉移矩陣B,使用者可以直接輸入網址跳到其它網頁(機率1/4)[/color]
[color=red](%i6)[/color]
[color=blue]n:length(A)$
B:genmatrix(lambda([i,i],1/n),n,n);[/color]
[color=red](%o6)[/color] \( \left[ \matrix{\displaystyle \frac{1}{4}&\frac{1}{4}&\frac{1}{4}&\frac{1}{4}\cr\frac{1}{4}&\frac{1}{4}&\frac{1}{4}&\frac{1}{4}\cr\frac{1}{4}&\frac{1}{4}&\frac{1}{4}&\frac{1}{4}\cr\frac{1}{4}&\frac{1}{4}&\frac{1}{4}&\frac{1}{4}} \right] \)
[color=green]使用者有0.85的機率循著連結連到其它網頁(轉移矩陣A)
有0.15的機率直接輸入網址跳到其它網頁(轉移矩陣B)[/color]
[color=red](%i8)[/color]
[color=blue]alpha:85/100$
A:alpha*A+(1-alpha)*B;[/color]
[color=red](%o8)[/color] \( \left[ \matrix{\displaystyle
\frac{3}{80}&\frac{3}{80}&\frac{71}{80}&\frac{3}{80}\cr
\frac{3}{80}&\frac{71}{80}&\frac{3}{80}&\frac{37}{80}\cr
\frac{37}{80}&\frac{3}{80}&\frac{3}{80}&\frac{37}{80}\cr
\frac{37}{80}&\frac{3}{80}&\frac{3}{80}&\frac{3}{80}} \right] \)
[color=green]初始值為1/n的矩陣X,n為網頁個數[/color]
[color=red](%i10)[/color]
[color=blue]n:length(A)$
X:genmatrix(lambda([i,j],1/n),n,1);[/color]
[color=red](%o10)[/color] \( \left[ \matrix{\displaystyle \frac{1}{4}\cr\frac{1}{4}\cr\frac{1}{4}\cr\frac{1}{4}} \right] \)
[color=green]迭代計算X=A.X,直到X誤差小於ε為止[/color]
[color=red](%i11)[/color]
[color=blue]PowerMethod(A,X,epsilon):=block
([AllX:X,NextX:X],
do(NextX:A.X,
AllX:addcol(AllX,NextX),
if lmax(flatten(args(abs(NextX-X))))<epsilon then
(return(AllX)),
X:NextX
)
)$[/color]
[color=green]將每次迭代計算X儲存起來[/color]
[color=red](%i12)[/color] [color=blue]AllX: PowerMethod(A,X,0.01);[/color]
[color=red](%o12)[/color] \( \displaystyle \left[ \matrix{ \displaystyle
\frac{1}{4}&\frac{1}{4}&\frac{1}{4}&\frac{27087}{128000}&\frac{27087}{128000}&\frac{20249743}{102400000}&\frac{785852151}{4096000000}&\frac{15306704347}{81920000000}&\frac{299158329499}{1638400000000}\cr
\frac{1}{4}&\frac{57}{160}&\frac{2569}{6400}&\frac{56293}{128000}&\frac{1209381}{2560000}&\frac{101010051}{204800000}&\frac{2092613727}{4096000000}&\frac{17151248489}{32768000000}&\frac{3493031514769}{6553600000000}\cr
\frac{1}{4}&\frac{1}{4}&\frac{1311}{6400}&\frac{1311}{6400}&\frac{965279}{5120000}&\frac{37191303}{204800000}&\frac{719688491}{4096000000}&\frac{13983431147}{81920000000}&\frac{1097747219437}{6553600000000}\cr
\frac{1}{4}&\frac{23}{160}&\frac{23}{160}&\frac{23}{160}&\frac{652479}{5120000}&\frac{652479}{5120000}&\frac{497845631}{4096000000}&\frac{19503486567}{163840000000}&\frac{383093973899}{3276800000000}} \right] \)
[color=green]迭代次數[/color]
[color=red](%i13)[/color] [color=blue]n:length(AllX[1]);[/color]
[color=red](%o13)[/color] \( 9 \)
[color=green]經過多次迭代計算後發現B網頁PageRank值還是最大,但A,C,D網頁也能比較出PageRank值大小[/color]
[color=red](%i23)[/color]
[color=blue]draw2d(points_joined = true,
color=red,
points(makelist(k,k,1,n),AllX[1]),
points_joined = true,
color=green,
points(makelist(k,k,1,n),AllX[2]),
points_joined = true,
color=blue,
points(makelist(k,k,1,n),AllX[3]),
points_joined = true,
color=black,
points(makelist(k,k,1,n),AllX[4])
)$[/color]
[img]https://math.pro/db/attachment.php?aid=3063&k=7f7bf36a19767e5798bedba20c1ee6a1&t=1440814874&noupdate=yes[/img] \( A=\left[\matrix{0&0&1&0\cr0&1&0&1\cr1&0&0&1\cr1&0&0&0}\right] \)
\(A\)矩陣中大部分的元素都是0,故\(A\)矩陣又稱為稀疏矩陣(Sparse Matrix),只要儲存非0元素的位置及其元素値,不僅可以節省記憶體空間,還可以節省不必要的運算。
以下介紹三種基本稀疏矩陣的儲存格式及其對應的矩陣乘法
(1)Coordinate-wise(COO)
就是將\(A\)矩陣中非零元素的行、列、値儲存起來
\( \matrix{第3行第1列値為1 \cr 第2行第2列値為1 \cr 第4行第2列値為1 \cr 第1行第3列値為1 \cr 第4行第3列値為1 \cr 第1行第4列値為1} \) 轉換成 \( \matrix{Col=[\;3,2,4,1,4,1]\; \cr Row=[\;1,2,2,3,3,4]\; \cr Val=[\;1,1,1,1,1,1]\;} \)
COO稀疏矩陣以列或以列存取元素同樣簡單
再乘上\(b\)矩陣\( =[\;1,2,3,4]\;^T \)
\( \left[\matrix{0&0&1&0\cr0&1&0&1\cr1&0&0&1\cr1&0&0&0}\right] \cdot \left[ \matrix{1\cr 2 \cr 3 \cr 4} \right]=\left[ \matrix{1\times 3 \cr 1 \times 2+1 \times 4 \cr 1 \times 1+1 \times 4 \cr 1 \times 4} \right] \) 轉換成 \( \left[ \matrix{Val[1]\cdot b[Col[1]] \cr Val[2]\cdot b[Col[2]]+Val[3]\cdot b[Col[3]] \cr Val[4]\cdot b[Col[4]]+Val[5]\cdot b[Col[5]] \cr Val[6]\cdot b[Col[6]]} \right] \)
程式碼為
for (i = 0; i < N; i = i + 1)
result[ i ] = 0;
for (i = 1; i <= nnz; i = i + 1)
result[Row[ i ]] = result[Row[ i ]] + Val[ i ]*b[Col[ i ]];
\(N\)代表\(A\)矩陣列個數
\(nnz\)代表\(A\)矩陣非零元素的個數
[color=red](%i1)[/color]
[color=blue]A:matrix([0,0,1,0],
[0,1,0,1],
[1,0,0,1],
[1,0,0,0]);[/color]
[color=red](%o1)[/color] \( \left[\matrix{0&0&1&0\cr0&1&0&1\cr1&0&0&1\cr1&0&0&0}\right] \)
[color=green]A矩陣轉換成COO稀疏矩陣[/color]
[color=red](%i5)[/color]
[color=blue]Col:[]$
Row:[]$
Val:[]$
for i:1 thru length(A) do
(for j:1 thru length(A[1]) do
(if A[i,j]#0 then
(print("第",j,"行第",i,"列値為",A[i,j]),
Row:append(Row,[ i ]),
Col:append(Col,[j]),
Val:append(Val,[A[i,j]])
)
)
)$[/color]
第3行第1列値為1
第2行第2列値為1
第4行第2列値為1
第1行第3列値為1
第4行第3列値為1
第1行第4列値為1
[color=green]行,列,値[/color]
[color=red](%i8)[/color]
[color=blue]Col;
Row;
Val;[/color]
[color=red](%o6)[/color] \( [\;3,2,4,1,4,1]\; \)
[color=red](%o7)[/color] \( [\;1,2,2,3,3,4]\; \)
[color=red](%o8)[/color] \( [\;1,1,1,1,1,1]\; \)
[color=red](%i9)[/color] [color=blue]b:[1,2,3,4];[/color]
[color=red](%o9)[/color] \( [\;1,2,3,4]\; \)
[color=green]COO稀疏矩陣相乘,可以發現0元素都不會被乘到[/color]
[color=red](%i13)[/color]
[color=blue]N:apply(max,Row)$
result:create_list(0,i,1,N)$
for i:1 thru length(Val) do
(print("result[",Row[ i ],"]再加上",Val[ i ],"*",b[Col[ i ]]),
result[Row[ i ]]:result[Row[ i ]]+Val[ i ]*b[Col[ i ]]
)$[/color]
result[1]再加上1*3
result[2]再加上1*2
result[2]再加上1*4
result[3]再加上1*1
result[3]再加上1*4
result[4]再加上1*1
[color=green]COO稀疏矩陣相乘結果[/color]
[color=red](%i14)[/color] [color=blue]result;[/color]
[color=red](%o14)[/color] \( [\;3,6,5,1]\; \)
[color=green]和矩陣實際相乘結果相同[/color]
[color=red](%i15)[/color] [color=blue]A.b;[/color]
[color=red](%o15)[/color] \( \left[ \matrix{3 \cr 6 \cr 5 \cr 1} \right] \)
-------------------------------
(2)Compressed Sparse Row (CSR)
就是將\(Row\)陣列壓縮成\(RowPtr\)陣列
\( \matrix{Col=[\;3,2,4,1,4,1]\; \cr Row=[\;1,2,2,3,3,4]\; \cr Val=[\;1,1,1,1,1,1]\;} \) 轉換成 \( \matrix{Col=[\;3,2,4,1,4,1]\; \cr RowPtr=[\;1,2,4,6,7]\; \cr Val=[\;1,1,1,1,1,1]\;} \)
\(RowPtr\)意思是
第1列包含値\(Val[1]\)
第2列包含値\(Val[2],Val[3]\)
第3列包含値\(Val[4],Val[5]\)
第4列包含値\(Val[6]\)
若稀疏矩陣行>列則\(RowPtr\)壓縮率較高,CSR格式以列存取元素非常簡單,以行存取元素非常困難
故矩陣乘法以\( A.b \)表示,其中\( b=\left[ \matrix{1 \cr 2 \cr 3 \cr 4} \right] \)
\( \left[\matrix{0&0&1&0\cr0&1&0&1\cr1&0&0&1\cr1&0&0&0}\right] \cdot \left[ \matrix{1\cr 2 \cr 3 \cr 4} \right]=\left[ \matrix{1\times 3 \cr 1 \times 2+1 \times 4 \cr 1 \times 1+1 \times 4 \cr 1 \times 4} \right] \) 轉換成 \( \left[ \matrix{Val[1]\cdot b[Col[1]] \cr Val[2]\cdot b[Col[2]]+Val[3]\cdot b[Col[3]] \cr Val[4]\cdot b[Col[4]]+Val[5]\cdot b[Col[5]] \cr Val[6]\cdot b[Col[6]]} \right] \)
程式碼為
for (i = 0; i < N; i = i + 1)
result[ i ] = 0;
for (i = 1; i < N; i = i + 1)
{for (j = RowPtr[ i ]; j < RowPtr[i+1]; j = j + 1)
result[ i ] = result[ i ] + Val[j]*b[Col[j]];
}
\(N\)代表\(RowPtr\)陣列長度
[color=red](%i1)[/color]
[color=blue]A:matrix([0,0,1,0],
[0,1,0,1],
[1,0,0,1],
[1,0,0,0]);[/color]
[color=red](%o1)[/color] \( \left[\matrix{0&0&1&0\cr0&1&0&1\cr1&0&0&1\cr1&0&0&0}\right] \)
[color=green]A矩陣轉換成CSR稀疏矩陣[/color]
[color=red](%i6)[/color]
[color=blue]Col:[]$
RowPtr:[1]$
Val:[]$
index:1$
for i:1 thru length(A) do
(for j:1 thru length(A[1]) do
(if A[i,j]#0 then
(print("第",j,"行値為",A[i,j]),
Col:append(Col,[j]),
Val:append(Val,[A[i,j]]),
index:index+1
)
),
RowPtr:append(RowPtr,[index]),
print("第",i,"列包含Val[",RowPtr[ i ],"~",RowPtr[i+1]-1,"]")
)$[/color]
第3行値為1
第1列包含Val[1~1]
第2行値為1
第4行値為1
第2列包含Val[2~3]
第1行値為1
第4行値為1
第3列包含Val[4~5]
第1行値為1
第4列包含Val[6~6]
[color=green]行,列壓縮,値[/color]
[color=red](%i9)[/color]
[color=blue]Col;
RowPtr;
Val;[/color]
[color=red](%o7)[/color] \( [\;3,2,4,1,4,1]\; \)
[color=red](%o8)[/color] \( [\;1,2,4,6,7]\; \)
[color=red](%o9)[/color] \( [\;1,1,1,1,1,1]\; \)
[color=red](%i10)[/color] [color=blue]b:[1,2,3,4];[/color]
[color=red](%o10)[/color] \( [\;1,2,3,4]\; \)
[color=green]CSR稀疏矩陣相乘,可以發現0元素都不會被乘到[/color]
[color=red](%i13)[/color]
[color=blue]N:length(RowPtr)-1$
result:create_list(0,i,1,N)$
for i:1 thru length(RowPtr)-1 do
(for j:RowPtr[ i ] thru RowPtr[i+1]-1 do
(print("result[",i,"]再加上",Val[j],"*",b[Col[j]]),
result[ i ]:result[ i ]+Val[j]*b[Col[j]]
)
)$[/color]
result[1]再加上1*3
result[2]再加上1*2
result[2]再加上1*4
result[3]再加上1*1
result[3]再加上1*4
result[4]再加上1*1
[color=green]CSR稀疏矩陣相乘結果[/color]
[color=red](%i14)[/color] [color=blue]result;[/color]
[color=red](%o14)[/color] \( [\;3,6,5,1]\; \)
[color=green]和矩陣實際相乘結果相同[/color]
[color=red](%i15)[/color] [color=blue]A.b;[/color]
[color=red](%o15)[/color] \( \left[ \matrix{3 \cr 6 \cr 5 \cr 1}\right] \)
-------------------------------
(2)Compressed Sparse Column (CSC)
就是將\(Col\)陣列壓縮成\(ColPtr\)陣列
\( \matrix{Col=[\;1,1,2,3,4,4]\; \cr Row=[\;3,4,2,1,2,3]\; \cr Val=[\;1,1,1,1,1,1]\;} \) 轉換成 \( \matrix{ColPtr=[\;1,3,4,5,7]\; \cr Row=[\;3,4,2,1,2,3]\; \cr Val=[\;1,1,1,1,1,1]\;} \)
\(ColPtr\)意思是
第1行包含値\(Val[1],Val[2]\)
第2行包含値\(Val[3]\)
第3行包含値\(Val[4]\)
第4行包含値\(Val[5],Val[6]\)
若稀疏矩陣列>行則\(RowPtr\)壓縮率較高,CSC格式以行存取元素非常簡單,以列存取元素非常困難
故矩陣乘法以\( b.A \)表示,其中\( b=\left[ 1,2,3,4 \right] \)
\( \left[ 1,2,3,4 \right] \cdot \left[\matrix{0&0&1&0\cr0&1&0&1\cr1&0&0&1\cr1&0&0&0}\right]=\left[ 3 \cdot 1+4 \cdot 1 , 2 \cdot 1 , 1 \cdot 1 , 2 \cdot 1+3 \cdot 1 \right] \) 轉換成 \( \left[ b[Row[1]]\cdot Val[1]+b[Row[2]]\cdot Val[2] , b[Row[3]]\cdot Val[3] , b[Row[4]] \cdot Val[4] , b[Row[5]]\cdot Val[5]+b[Row[6]]\cdot Val[6] \right] \)
程式碼為
for (i = 0; i < N; i = i + 1)
result[ i ] = 0;
for (i = 1; i < N; i = i + 1)
{for (j = ColPtr[ i ]; j < ColPtr[i+1]; j = j + 1)
result[ i ] = result[ i ] + b[Row[j]]*Val[j];
}
\(N\)代表\(ColPtr\)陣列長度
[color=red](%i1)[/color]
[color=blue]A:matrix([0,0,1,0],
[0,1,0,1],
[1,0,0,1],
[1,0,0,0]);[/color]
[color=red](%o1)[/color] \( \left[\matrix{0&0&1&0\cr0&1&0&1\cr1&0&0&1\cr1&0&0&0}\right] \)
[color=green]A矩陣轉換成CSC稀疏矩陣[/color]
[color=red](%i6)[/color]
[color=blue]ColPtr:[1]$
Row:[]$
Val:[]$
index:1$
for j:1 thru length(A[1]) do
(for i:1 thru length(A) do
(if A[i,j]#0 then
(print("第",j,"行値為",A[i,j]),
Row:append(Row,[ i ]),
Val:append(Val,[A[i,j]]),
index:index+1
)
),
ColPtr:append(ColPtr,[index]),
print("第",j,"行包含Val[",ColPtr[j],"~",ColPtr[j+1]-1,"]")
)$[/color]
第1行値為1
第1行値為1
第1行包含Val[1~2]
第2行値為1
第2行包含Val[3~3]
第3行値為1
第3行包含Val[4~4]
第4行値為1
第4行値為1
第4行包含Val[5~6]
[color=green]行壓縮,列,値[/color]
[color=red](%i9)[/color]
[color=blue]ColPtr;
Row;
Val;[/color]
[color=red](%o7)[/color] \( [\;1,3,4,5,7]\; \)
[color=red](%o8)[/color] \( [\;3,4,2,1,2,3]\; \)
[color=red](%o9)[/color] \( [\;1,1,1,1,1,1]\; \)
[color=red](%i10)[/color] [color=blue]b:[1,2,3,4];[/color]
[color=red](%o10)[/color] \( [\;1,2,3,4]\; \)
[color=green]CSC稀疏矩陣相乘,可以發現0元素都不會被乘到[/color]
[color=red](%i12)[/color]
[color=blue]N:length(ColPtr)-1$
result:create_list(0,i,1,N)$
for i:1 thru length(ColPtr)-1 do
(for j:ColPtr[ i ] thru ColPtr[i+1]-1 do
(print("result[",i,"]再加上",b[Row[j]],"*",Val[j]),
result[ i ]:result[ i ]+b[Row[j]]*Val[j]
)
)$[/color]
result[1]再加上3*1
result[1]再加上4*1
result[2]再加上2*1
result[3]再加上1*1
result[4]再加上2*1
result[4]再加上3*1
[color=green]CSC稀疏矩陣相乘結果[/color]
[color=red](%i13)[/color] [color=blue]result;[/color]
[color=red](%o13)[/color] \( [\;7,2,1,5]\; \)
[color=green]和矩陣實際相乘結果相同[/color]
[color=red](%i14)[/color] [color=blue]b.A;[/color]
[color=red](%o14)[/color] \( [\;7,2,1,5]\; \)
-------------------------------
將以上程式改寫成副程式
[color=red](%i1)[/color]
[color=blue]A:matrix([0,0,1,0],
[0,1,0,1],
[1,0,0,1],
[1,0,0,0]);[/color]
[color=red](%o1)[/color] \( \left[\matrix{0&0&1&0\cr0&1&0&1\cr1&0&0&1\cr1&0&0&0}\right] \)
[color=red](%i2)[/color] [color=blue]b:[1,2,3,4];[/color]
[color=red](%o2)[/color] \( [1,2,3,4] \)
[color=green]A矩陣轉換成COO稀疏矩陣[/color]
[color=red](%i3)[/color]
[color=blue]COO(A):=block
([Col:[],Row:[],Val:[]],
for i:1 thru length(A) do
(for j:1 thru length(A[1]) do
(if A[i,j]#0 then
(Row:append(Row,[ i ]),
Col:append(Col,[j]),
Val:append(Val,[A[i,j]])
)
)
),
return([Col,Row,Val])
)$[/color]
[color=green]行,列,値[/color]
[color=red](%i4)[/color] [color=blue][Col,Row,Val]:COO(A);[/color]
[color=red](%o4)[/color] \( [[3,2,4,1,4,1],[1,2,2,3,3,4],[1,1,1,1,1,1]] \)
[color=green]COO稀疏矩陣相乘[/color]
[color=red](%i5)[/color]
[color=blue]COOmul(Col,Row,Val,b):=block
([N,result],
N:apply(max,Row),
result:create_list(0,i,1,N),
for i:1 thru length(Val) do
(result[Row[ i ]]:result[Row[ i ]]+Val[ i ]*b[Col[ i ]]
),
return(result)
)$[/color]
[color=green]COO稀疏矩陣相乘結果[/color]
[color=red](%i6)[/color] [color=blue]COOmul(Col,Row,Val,b);[/color]
[color=red](%o6)[/color] \( [3,6,5,1] \)
[color=green]和矩陣實際相乘結果相同[/color]
[color=red](%i7)[/color] [color=blue]A.b;[/color]
[color=red](%o7)[/color] \( \left[ \matrix{3 \cr 6 \cr 5 \cr 1} \right] \)
[color=green]A矩陣轉換成CSR稀疏矩陣[/color]
[color=red](%i8)[/color]
[color=blue]CSR(A):=block
([Col:[],RowPtr:[1],Val:[],index:1],
for i:1 thru length(A) do
(for j:1 thru length(A[1]) do
(if A[i,j]#0 then
(Col:append(Col,[j]),
Val:append(Val,[A[i,j]]),
index:index+1
)
),
RowPtr:append(RowPtr,[index])
),
return([Col,RowPtr,Val])
)$[/color]
[color=green]行,列壓縮,値[/color]
[color=red](%i9)[/color] \( [[3,2,4,1,4,1],[1,2,4,6,7],[1,1,1,1,1,1]] \)
[color=green]CSR稀疏矩陣相乘[/color]
[color=red](%i10)[/color]
[color=blue]CSRmul(Col,RowPtr,Val,b):=block
([N,result],
N:length(RowPtr)-1,
result:create_list(0,i,1,N),
for i:1 thru length(RowPtr)-1 do
(for j:RowPtr[ i ] thru RowPtr[i+1]-1 do
(result[ i ]:result[ i ]+Val[j]*b[Col[j]]
)
),
return(result)
)$[/color]
[color=green]CSR稀疏矩陣相乘結果[/color]
[color=red](%i11)[/color] [color=blue]CSRmul(Col,RowPtr,Val,b);[/color]
[color=red](%o11)[/color] \( [3,6,5,1] \)
[color=green]和矩陣實際相乘結果相同[/color]
[color=red](%i12)[/color] [color=blue]A.b;[/color]
[color=red](%o12)[/color] \( \left[ \matrix{3 \cr 6 \cr 5 \cr 1} \right] \)
[color=green]A矩陣轉換成CSC稀疏矩陣[/color]
[color=red](%i13)[/color]
[color=blue]CSC(A):=block
([ColPtr:[1],Row:[],Val:[],index:1],
for j:1 thru length(A[1]) do
(for i:1 thru length(A) do
(if A[i,j]#0 then
(Row:append(Row,[ i ]),
Val:append(Val,[A[i,j]]),
index:index+1
)
),
ColPtr:append(ColPtr,[index])
),
return([ColPtr,Row,Val])
)$[/color]
[color=green]行壓縮,列,値[/color]
[color=red](%i14)[/color] [color=blue][ColPtr,Row,Val]:CSC(A);[/color]
[color=red](%o14)[/color] \( [[1,3,4,5,7],[3,4,2,1,2,3],[1,1,1,1,1,1]] \)
[color=green]CSC稀疏矩陣相乘[/color]
[color=red](%i15)[/color]
[color=blue]CSCmul(b,ColPtr,Row,Val):=block
([N,result],
N:length(ColPtr)-1,
result:create_list(0,i,1,N),
for i:1 thru length(ColPtr)-1 do
(for j:ColPtr[ i ] thru ColPtr[i+1]-1 do
(result[ i ]:result[ i ]+b[Row[j]]*Val[j]
)
),
return(result)
)$[/color]
[color=green]CSC稀疏矩陣相乘結果[/color]
[color=red](%i16)[/color] [color=blue]CSCmul(b,ColPtr,Row,Val);[/color]
[color=red](%o16)[/color] \( [7,2,1,5] \)
[color=green]CSC稀疏矩陣相乘結果[/color]
[color=red](%i17)[/color] [color=blue]b.A;[/color]
[color=red](%o17)[/color] \( [7,2,1,5] \)
參考資料
有COO,CSR矩陣相乘程式碼
[url]http://www.mathcs.emory.edu/~cheung/Courses/561/Syllabus/3-C/sparse.html[/url] 為了因應稀疏矩陣,PageRank公式要稍微改一下
\( X_{(i+1)}=(\alpha A+(1-\alpha)B)\cdot X_{(i)} \)
\( X_{(i+1)}=\left( 0.85 \left[\matrix{\displaystyle 0&0&1&0\cr 0&1&0&\frac{1}{2}\cr \frac{1}{2}&0&0&\frac{1}{2}\cr \frac{1}{2}&0&0&0}\right]+0.15 \left[ \matrix{\displaystyle \frac{1}{4}&\frac{1}{4}&\frac{1}{4}&\frac{1}{4}\cr \frac{1}{4}&\frac{1}{4}&\frac{1}{4}&\frac{1}{4}\cr \frac{1}{4}&\frac{1}{4}&\frac{1}{4}&\frac{1}{4}\cr \frac{1}{4}&\frac{1}{4}&\frac{1}{4}&\frac{1}{4}} \right] \right) \cdot \left[ \matrix{\displaystyle \frac{1}{4} \cr \frac{1}{4} \cr \frac{1}{4} \cr \frac{1}{4}} \right] \)
\( X_{(i+1)}=0.85 \left[\matrix{\displaystyle 0&0&1&0\cr 0&1&0&\frac{1}{2}\cr \frac{1}{2}&0&0&\frac{1}{2}\cr \frac{1}{2}&0&0&0}\right] \cdot \left[ \matrix{\displaystyle \frac{1}{4} \cr \frac{1}{4} \cr \frac{1}{4} \cr \frac{1}{4}} \right]+0.15 \left[ \matrix{\displaystyle \frac{1}{4}&\frac{1}{4}&\frac{1}{4}&\frac{1}{4}\cr \frac{1}{4}&\frac{1}{4}&\frac{1}{4}&\frac{1}{4}\cr \frac{1}{4}&\frac{1}{4}&\frac{1}{4}&\frac{1}{4}\cr \frac{1}{4}&\frac{1}{4}&\frac{1}{4}&\frac{1}{4}} \right] \cdot \left[ \matrix{\displaystyle \frac{1}{4} \cr \frac{1}{4} \cr \frac{1}{4} \cr \frac{1}{4}} \right] \)
轉換成CSR稀疏矩陣
\( \displaystyle X_{(i+1)}=0.85 \left( \matrix{Col=[3,2,4,1,4,1] \cr RowPtr=[1,2,4,6,7] \cr Val=[1,1,\frac{1}{2},\frac{1}{2},\frac{1}{2},\frac{1}{2}]} \right) \left[ \matrix{\displaystyle \frac{1}{4} \cr \frac{1}{4} \cr \frac{1}{4} \cr \frac{1}{4}} \right]+0.15 \left[ \matrix{\displaystyle \frac{1}{4} \cr \frac{1}{4} \cr \frac{1}{4} \cr \frac{1}{4}} \right] \)
\( \displaystyle X_{(i+1)}=\alpha \cdot CSR \cdot X_{(i)}+\frac{1-\alpha}{n} \)
[color=green]A網頁連到C,D
B網頁連到B
C網頁連到A
D網頁連到B,C[/color]
[color=red](%i1)[/color]
[color=blue]A:matrix([0,0,1,0],
[0,1,0,1],
[1,0,0,1],
[1,0,0,0]);[/color]
[color=red](%o1)[/color] \( \left[\matrix{0&0&1&0\cr 0&1&0&1\cr 1&0&0&1\cr 1&0&0&0}\right] \)
[color=green]將每行數字加起來代表該網頁對外連結個數
A網頁連到C,D => 2個
B網頁連到B => 1個
C網頁連到A => 1個
D網頁連到B,C => 2個[/color]
[color=red](%i2)[/color] [color=blue]sum:apply("+",args(A));[/color]
[color=red](%o2)[/color] \( [2,1,1,2] \)
[color=green]將每一行除以對外連結個數,變成轉移矩陣[/color]
[color=red](%i3)[/color] [color=blue]A:transpose(transpose(A)/sum);[/color]
[color=red](%o3)[/color] \( \left[ \matrix{\displaystyle 0&0&1&0\cr 0&1&0&\frac{1}{2}\cr \frac{1}{2}&0&0&\frac{1}{2}\cr \frac{1}{2}&0&0&0}\right] \)
[color=green]轉移矩陣轉換成CSR稀疏矩陣[/color]
[color=red](%i4)[/color]
[color=blue]CSR(A):=block
([Col:[],RowPtr:[1],Val:[],index:1],
for i:1 thru length(A) do
(for j:1 thru length(A[1]) do
(if A[i,j]#0 then
(Col:append(Col,[j]),
Val:append(Val,[A[i,j]]),
index:index+1
)
),
RowPtr:append(RowPtr,[index])
),
return([Col,RowPtr,Val])
)$[/color]
[color=green]行,列壓縮,値[/color]
[color=red](%i5)[/color] [color=blue][Col,RowPtr,Val]:CSR(A);[/color]
[color=red](%o5)[/color] \( \displaystyle [[3,2,4,1,4,1],[1,2,4,6,7],[1,1,\frac{1}{2},\frac{1}{2},\frac{1}{2},\frac{1}{2}]] \)
[color=green]CSR稀疏矩陣相乘[/color]
[color=red](%i6)[/color]
[color=blue]CSRmul(Col,RowPtr,Val,b):=block
([N,result],
N:length(RowPtr)-1,
result:create_list(0,i,1,N),
for i:1 thru length(RowPtr)-1 do
(for j:RowPtr[ i ] thru RowPtr[i+1]-1 do
(result[ i ]:result[ i ]+Val[j]*b[Col[j]]
)
),
return(result)
)$[/color]
[color=green]初始值為1/n的矩陣X,n為網頁個數[/color]
[color=red](%i8)[/color]
[color=blue]n:length(A)$
X:create_list(1/n,i,1,n);[/color]
[color=red](%o8)[/color] \( \displaystyle [\frac{1}{4},\frac{1}{4},\frac{1}{4},\frac{1}{4}] \)
[color=green]迭代計算X=A.X,直到X誤差小於ε為止[/color]
[color=red](%i9)[/color]
[color=blue]PowerMethod(A,X,epsilon):=block
([AllX:[X],NextX:X,alpha:85/100],
do(NextX:CSRmul(Col,RowPtr,Val,X)*alpha,
NextX:NextX+(1-alpha)/n,
AllX:append(AllX,[NextX]),
if lmax(flatten(args(abs(NextX-X))))<epsilon then
(return(transpose(funmake(matrix,AllX)))),
X:NextX
)
)$[/color]
[color=green]結果和前面一樣[/color]
[color=red](%i10)[/color] [color=blue]AllX: PowerMethod(A,X,0.01);[/color]
[color=red](%o10)[/color] \( \displaystyle \left[ \matrix{ \displaystyle
\frac{1}{4}&\frac{1}{4}&\frac{1}{4}&\frac{27087}{128000}&\frac{27087}{128000}&\frac{20249743}{102400000}&\frac{785852151}{4096000000}&\frac{15306704347}{81920000000}&\frac{299158329499}{1638400000000}\cr
\frac{1}{4}&\frac{57}{160}&\frac{2569}{6400}&\frac{56293}{128000}&\frac{1209381}{2560000}&\frac{101010051}{204800000}&\frac{2092613727}{4096000000}&\frac{17151248489}{32768000000}&\frac{3493031514769}{6553600000000}\cr
\frac{1}{4}&\frac{1}{4}&\frac{1311}{6400}&\frac{1311}{6400}&\frac{965279}{5120000}&\frac{37191303}{204800000}&\frac{719688491}{4096000000}&\frac{13983431147}{81920000000}&\frac{1097747219437}{6553600000000}\cr
\frac{1}{4}&\frac{23}{160}&\frac{23}{160}&\frac{23}{160}&\frac{652479}{5120000}&\frac{652479}{5120000}&\frac{497845631}{4096000000}&\frac{19503486567}{163840000000}&\frac{383093973899}{3276800000000}} \right] \)
頁:
[1]