Saturday, July 4, 2015

Relax Laplace! I

In a region with no free charge, the electrostatic potential obeys the Laplace equation:

`grad^2 phi = 0`

Jacobi iteration is a simple numerical technique to find solutions to problems like this. For example, if you have a 2D equal-spaced cartesian mesh of points, the numerical gradient

`(del^2 phi)/(del x^2) + (del^2 phi)/(del y^2)` can be approximated as:

(phi[i+1,j] + phi[i-1,j] - 2*phi[i,j]) + (phi[i,j+1] + phi[i,j-1] - 2*phi[i,j])

If that is set to zero, then one can iterate the expression:

next_phi[i,j] = 0.25 * (phi[i+1,j] + phi[i-1,j] +  phi[i,j+1] + phi[i,j-1])

It is slow, but here's a basic example. Scroll down to see cool ways to make it much faster!


Shown as a contour using mostly Matplotlib defaults:



1
2
CS = ax.contour(phi, origin = 'lower')
ax.clabel(CS, inline=1, fontsize=11)

The script to set up the problem (phido_me, and sphere) and then iterate is quite short. Instead of looping over all the indices in the array as you would in a compiled language like C or FORTRAN, in Python we try to just use existing NumPy methods like np.roll in line 29.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
import numpy as np
import matplotlib.pyplot as plt

nx, ny = 101, 101

phi   = np.zeros((ny, nx), dtype = 'float')
do_me = np.ones_like(phi, dtype='bool')

x0, y0, r0 = 40, 65, 12

x = np.arange(nx, dtype = 'float')[None,:]
y = np.arange(ny, dtype = 'float')[:,None]
rsq = (x-x0)**2 + (y-y0)**2

circle = rsq <= r0**2

phi[circle] = 1.0
do_me[circle] = False

do_me[0,:], do_me[-1,:], do_me[:,0], do_me[:,-1] = False, False, False, False

n, nper = 100, 100
phi_hold = np.zeros((n+1, ny, nx))
phi_hold[0] = phi

for i in range(n):
        
    for j in range(nper):
        phi2 = 0.25*(np.roll(phi,  1, axis=0) +
                     np.roll(phi, -1, axis=0) +
                     np.roll(phi,  1, axis=1) +
                     np.roll(phi, -1, axis=1) )

        phi[do_me] = phi2[do_me]

    phi_hold[i+1] = phi

change = phi_hold[1:] - phi_hold[:-1]

cmin, cmax = [], []
for thing in change:
    cmin.append(thing[do_me].min())
    cmax.append(thing[do_me].max())

The actual calculation (lines 26 to 36) takes about 2.2 seconds on my laptop. That's 100*100*101*101 multiplications plus three times that many sums, among other things, or about 6ns per flop. Since a lap-top-CPU-flop is more like about 1 or 1.5ns, there's definitely some room for improvement. (That GPU is just sitting there hungry for work, and could probably do it much faster!)

There's an upcoming post on lap-top-CPU-flops with NumPy; stay tuned!

 So, let's make the problem a little harder by making it 3D, Instead of ~10,000 mesh points we'll have ~1,000,000. At 8 bytes per point, that's beyond the cache, so it's going to be slower still.


and two slices below the sphere:


This took 525 seconds. That's 100*100*101*101*101 multiplications plus five summations, or about 9ns per flop. It's creeping up there! ,And we're still changing at about 3E-04 per 100 iterations. The next Relax Laplace! post will cover some ways to bring that back down.

The 3D version of the script for reference:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
nx, ny, nz = 101, 101, 101

phi   = np.zeros((nz, ny, nx), dtype = 'float')
do_me = np.ones_like(phi, dtype='bool')

x0, y0, z0, r0 = 40, 65, 50, 12

x = np.arange(nx, dtype = 'float')[None,None,:]
y = np.arange(ny, dtype = 'float')[None,:,None]
z = np.arange(ny, dtype = 'float')[:,None,None]
rsq = (x-x0)**2 + (y-y0)**2 + (z-z0)**2

sphere = rsq <= r0**2

phi[sphere] = 1.0
do_me[sphere] = False

do_me[0,:,:], do_me[-1,:,:] = False, False
do_me[:,0,:], do_me[:,-1,:] = False, False
do_me[:,:,0], do_me[:,:,-1] = False, False

n, nper = 100, 100
phi_hold = np.zeros((n+1, nz, ny, nx))
phi_hold[0] = phi

one_sixth = (1./6.)

for i in range(n):
        
    for j in range(nper):
        phi2 = one_sixth*(np.roll(phi,  1, axis=0) +
                          np.roll(phi, -1, axis=0) +
                          np.roll(phi,  1, axis=1) +
                          np.roll(phi, -1, axis=1) +
                          np.roll(phi,  1, axis=2) +
                          np.roll(phi, -1, axis=2) )

        phi[do_me] = phi2[do_me]

    print i

    phi_hold[i+1] = phi






No comments:

Post a Comment

Note: Only a member of this blog may post a comment.