A belated Happy Pi Day!

Honestly, is there any better way to spend a day (or two) on Pi Day (March 14 = “3.14”) than to write some code to compute the value of Pi?!

You may remember from your math coursework Newton’s Formula, which is easy to implement:

\pi = 6 \sum\limits_{n=0}^{\infty}\frac{(2n)!}{(2^{4n+1})(n!)^{2} (2n+1)}

Or, you may have seen the Monte Carlo method. You could “speed this up” using a CUDA GPU implementation, e.g., http://cacs.usc.edu/education/cs596/src/cuda/pi.cu:

// Using CUDA device to calculate pi
#include <stdio.h>
#include <cuda.h>

#define NBIN 10000000  // Number of bins
#define NUM_BLOCK  30  // Number of thread blocks
#define NUM_THREAD  8  // Number of threads per block
int tid;
float pi = 0;

// Kernel that executes on the CUDA device
__global__ void cal_pi(float *sum, int nbin, float step, int nthreads, int nblocks) {
	int i;
	float x;
	int idx = blockIdx.x*blockDim.x+threadIdx.x;  // Sequential thread index across the blocks
	for (i=idx; i< nbin; i+=nthreads*nblocks) {
		x = (i+0.5)*step;
		sum[idx] += 4.0/(1.0+x*x);
	}
}

// Main routine that executes on the host
int main(void) {
	dim3 dimGrid(NUM_BLOCK,1,1);  // Grid dimensions
	dim3 dimBlock(NUM_THREAD,1,1);  // Block dimensions
	float *sumHost, *sumDev;  // Pointer to host & device arrays

	float step = 1.0/NBIN;  // Step size
	size_t size = NUM_BLOCK*NUM_THREAD*sizeof(float);  //Array memory size
	sumHost = (float *)malloc(size);  //  Allocate array on host
	cudaMalloc((void **) &sumDev, size);  // Allocate array on device
	// Initialize array in device to 0
	cudaMemset(sumDev, 0, size);
	// Do calculation on device
	cal_pi <<<dimGrid, dimBlock>>> (sumDev, NBIN, step, NUM_THREAD, NUM_BLOCK); // call CUDA kernel
	// Retrieve result from device and store it in host array
	cudaMemcpy(sumHost, sumDev, size, cudaMemcpyDeviceToHost);
	for(tid=0; tid<NUM_THREAD*NUM_BLOCK; tid++)
		pi += sumHost[tid];
	pi *= step;

	// Print results
	printf("PI = %f\n",pi);

	// Cleanup
	free(sumHost); 
	cudaFree(sumDev);

	return 0;
}

Unfortunately, while these are easy to implement, they are terrible algorithms, converging on the value of Pi very, very slowly indeed–even using the parallelism of GPUs. Although I’ve read about Chudnovsky’s Formula, I never tried it, probably because I had my priorities wrong. Beside, all of these require an arbitrary precision math package, which I didn’t have.

But, it turns out there are some: MPIR (Multiple Precision Integers and Rationals), MPIR.NET, and LongCalc

So, I wish everybody a belated Pi Day with a simple C# program that uses LongCalc (Mpir.NET/MPIR).

namespace Pi
{
    using System;
    using LongCalc;

    class Pi_ChudnovskyFormula
    {
        public static bf Compute(int times)
        {
            // https://en.wikipedia.org/wiki/Pi#Rapidly_convergent_series
            bf bigK = 6;
            bf bigM = 1;
            bf bigL = 13591409;
            bf bigX = 1;
            bf bigS = 13591409;
            bf mul = new bf("-262537412640768000");
            System.Console.WriteLine("iterations " + times);
            for (int k = 1; k < times; ++k)
            {
                bf bfk = k;
                if (k % 1000 == 0) System.Console.Write(" " + k);
                bf bfk3 = bfk * bfk * bfk;
                bf bigK3 = bigK * bigK * bigK;
                bf bigK4 = (bigK * 16);
                bigM = (bigK3 - bigK4) * bigM / bfk3;
                bigL += 545140134;
                bigX = bigX.Multiply(mul);
                bf ml = bigM * bigL;
                bigS += ml / bigX;
                bigK += 12;
            }
            System.Console.WriteLine();
            bf sq = new bf(10005).Sqrt();
            bf pi = 426880 * sq / bigS;
            return pi;
        }
    }
    class Program
    {
        static void Main(string[] args)
        {
            int digits = 50000;
            LongCalc.bf.GlobalPrecision = (int)(digits * Math.Log(10, 2) * 2); // bits
            int times = LongCalc.bf.GlobalPrecision / 14; // approximate accuracy added to significand per iteration.
            bf r2 = Pi_ChudnovskyFormula.Compute(times);
            System.Console.WriteLine(r2.toString(10, false, digits));
        }
    }
}


3.1415926535897932384626433...
Posted in Tip