Thursday, July 18, 2013

A guide to Install and use Trilinos for beginners part II

In my previous blog I explained how I installed the Amesos package with mpi support.
Here I would like to show how to compile a simple program.

Lets say that in the working directory the main file is test_amesos.cpp. To compile this we have to create an empty Makefile file, then copy the content of this file to the empty file.
In the first line replace the environmental variable $(MYAPP_TRILINOS_DIR) with the path where the trilinos-11.0.3-Source for example was extracted.

Next you need to replace the MyApp.exe name and src_file with the correct ones. For example if you want to build an executable test_amesos then you have to change the following lines as:

default: print_info test_amesos
.
.
.
test: test_amesos input.xml

./test_amesos

test_amesostest_amesos.o
$(CXX) $(CXX_FLAGS) test_amesos o -o test_amesos $(LINK_FLAGS) $(INCLUDE_DIRS) $(DEFINES) $(LIBRARY_DIRS) $(LIBRARIES)
.
.
.
test_amesos.o:

$(CXX) -c $(CXX_FLAGS) $(INCLUDE_DIRS) $(DEFINES) test_amesos.cpp

Note that you can comment the lines 
libmyappLib.a: src_file.o
 $(Trilinos_AR) cr libmyappLib.a src_file.o

Once the Makefile is properly set you can compile the program by typing:
$make clean 
(Optionally to remove any existing object files) and finally compile with
$make test_amesos




A guide to Install and use Trilinos for beginners part I

Trilinos is an amazing collection of numerical algorithms for large scale simulations.
It allows people who have very little programming experience in C/C++ to use advanced numerical techniques and powerful solvers.

Being myself an inexperienced C/C++ programmer and unfamiliar with the linux environment I found extremely difficult to get started. In this article I'd like to describe how to install and compile programs using the Trilinos library.
Using the following steps I have been able to install Trilinos under Ubuntu 12.04 32-bit, Ubuntu 12.10 64-bit and two cluster systems that run some kind of linux distribution.

First download the .gz file from their website. (You need to enter your email address).
unzip it in your hard drive.
Then make sure that you have cmake 2.8 or greater installed by typing
$cmake --version
If cmake is not installed in your system you can get it from here.
(The installation of cmake was straightforward).

Untar the file: (This has to be done only once)
$ tar -zxvf  trilinos-11.0.3-Source.tar.gz
You need of course to replace the name with the name of the file you have downloaded. This will create the directory, let's say: ../trilinos-11.0.3-Source

Trilinos contains a large number of packages. It is very unlikely that you need all of them, so I highly recommend not to try building all of them at once, but choose only the packages you need. One package I use very often is the Amesos, which provides a nice interface to solve linear systems AX=B using direct methods. In the following I'll describe how to install Amesos and all the packages that Amesos depends on.

For each combination of packages you want to build do the following steps:
Make sure that your working directory is the ../trilinos-11.0.3-Source
Then create a new directory under /trilinos-11.0.3-Source:
Since we want to build the Amesos package with mpi support we will create the AMESOS_MPI directory:
$ mkdir AMESOS_MPI 
Go into the new directory
$ cd AMESOS_MPI

Create an empty file and name it for example "do-configure". Then copy the content from this example to the newly created "do-configure" file, which should be located under the ../trilinos-11.0.3-Source/AMESOS_MPI/.
You may have already guessed that the lines starting with # are comments.
Modify the second non-comment line as follows:
rm -f CMakeCache.txt  ; find . -name CMakeFiles -exec rm -rf {} \;
After the line cmake \ give the follow options:
-D CMAKE_INSTALL_PREFIX:PATH=../trilinos-11.0.3-Source/AMESOS_MPI \
This line tells where to install the package.

-D CMAKE_BUILD_TYPE:STRING=DEBUG \
You can choose here DEBUG or RELEASE (start always with debug)

-D CMAKE_CXX_COMPILER:FILEPATH=/usr/bin/mpiCC \
Here you provide the mpi c++ compiler (If you don't know where this is try to locate it with the command $locate mpiCC )

-D CMAKE_C_COMPILER:FILEPATH=/usr/bin/mpicc \
This is the mpi C compiler

-D CMAKE_Fortran_COMPILER:FILEPATH=/usr/bin/gfortran \
And this is the fortran compiler

-D HAVE_GCC_ABI_DEMANGLE:BOOL=ON \
-D Trilinos_WARNINGS_AS_ERRORS_FLAGS:STRING="" \

-D CMAKE_VERBOSE_MAKEFILE:BOOL=TRUE \
The options I'm using here are only a small subset of the available options. You can see the full list and the explanation of the above options here.

-D Trilinos_ENABLE_ALL_PACKAGES:BOOL=FALSE \
Next you tell trilinos that you don't want to build all the packages.

-D Trilinos_ENABLE_Amesos:BOOL=ON \
And now you tell that you want to enable the Amesos package. If you want to build any other package, you have to replace the Amesos with the name of the package e.g. ML, Optika, Teuchos etc.

-D Trilinos_ENABLE_ALL_OPTIONAL_PACKAGES:BOOL=ON \
-D Trilinos_ENABLE_TESTS:BOOL=ON \
-D Trilinos_ENABLE_EXAMPLES:BOOL=ON \
Please see here what these options are.

-D TPL_ENABLE_MPI:BOOL=ON \
Here you tell that you want MPI support and next you provide the includes and library paths for the mpi. If you dont know where these are you can use the locate command
-D MPI_BASE_DIR:PATH="/usr/lib/openmpi" \
-D MPI_BIN_DIR:PATH="/usr/bin" \

In my laptop I only have 2 processors so I'm setting the following to 2
-D MPI_EXEC_MAX_NUMPROCS:STRING=2 \
-D MPI_EXEC:FILEPATH="mpirun" \
-D MPI_EXEC_NUMPROCS_FLAG:STRING=-np \
The above options are related to the mpi.

Finally we need to provide the names and paths of blas and lapack libraries
-D BLAS_LIBRARY_NAMES:STRING="blas" \
-D BLAS_LIBRARY_DIRS:PATH=/usr/lib/libblas \
-D LAPACK_LIBRARY_NAMES:STRING="lapack" \

-D LAPACK_LIBRARY_DIRS:PATH=/usr/lib/lapack \
Again if you are not sure where these are, use the locate command.

At the very end add these two lines
$EXTRA_ARGS \

../
The last line tells the script to go one level up so you need to make sure that the AMESOS_MPI directory is under the trilinos-11.0.3-Source directory.

Make the do-configure file executable and execute:
$chmod 755 do_configure
$./do-configure
If there are no errors, then you can build the package by typing
$make -j2
where -j2 is the number of available processors.
If there are no errors then you can try some tests:
$ctest -j2
And you can find out how many tests have failed. (Hopefully none)
The last step is to install the package by typing:
$make install

Now the package is ready for use! 

In the following I'm listing the do-configure file without comments:
cmake \
-D CMAKE_INSTALL_PREFIX:PATH=/home/giorgos/Downloads/trilinos-11.0.3-Source/AMESOS_MPI \
-D CMAKE_BUILD_TYPE:STRING=DEBUG \
-D CMAKE_CXX_COMPILER:FILEPATH=/usr/bin/mpiCC \
-D CMAKE_C_COMPILER:FILEPATH=/usr/bin/mpicc \
-D CMAKE_Fortran_COMPILER:FILEPATH=/usr/bin/gfortran \
-D HAVE_GCC_ABI_DEMANGLE:BOOL=ON \
-D Trilinos_WARNINGS_AS_ERRORS_FLAGS:STRING="" \
-D CMAKE_VERBOSE_MAKEFILE:BOOL=TRUE \
-D Trilinos_ENABLE_ALL_PACKAGES:BOOL=FALSE \
-D Trilinos_ENABLE_Amesos:BOOL=ON \
-D Trilinos_ENABLE_ALL_OPTIONAL_PACKAGES:BOOL=ON \
-D Trilinos_ENABLE_TESTS:BOOL=ON \
-D Trilinos_ENABLE_EXAMPLES:BOOL=ON \
-D TPL_ENABLE_MPI:BOOL=ON \
-D MPI_BASE_DIR:PATH="/usr/lib/openmpi" \
-D MPI_BIN_DIR:PATH="/usr/bin" \
-D MPI_EXEC_MAX_NUMPROCS:STRING=2 \
-D MPI_EXEC:FILEPATH="mpirun" \
-D MPI_EXEC_NUMPROCS_FLAG:STRING=-np \
-D BLAS_LIBRARY_NAMES:STRING="blas" \
-D BLAS_LIBRARY_DIRS:PATH=/usr/lib/libblas \
-D LAPACK_LIBRARY_NAMES:STRING="lapack" \
-D LAPACK_LIBRARY_DIRS:PATH=/usr/lib/lapack \
$EXTRA_ARGS \
../






Monday, October 29, 2012

My first steps in CUDA

c=-0.8+0.156i


Very recently I got a new Nvidia graphic card (the cheapest I could find with compute capability greater than 1.3), and I started learning the very basic of GPU programming, in hope of using this for my projects.

I considered myself as an experienced Matlab programmer but I have practically no programming experience in c++, so in reality I have to learn simultaneously c++, CUDA and openGL (for some of the exercises)

My first challenge was to setup the environment to compile the CUDA programs. I had already  Visual Studio 2010 express (VS 2010) installed, so I just followed the three basic steps to install the necessary CUDA stuff, which are described here.
Next I configured the VS2010 so that I can compile CUDA programs. Although I couldn't find any official info about that (from nvidia or VS), there are many nice people, which describe these steps, for example here.

I'm repeating them briefly emphasizing on few minor things that can be a headache if they are missed.

1) Open a new project and select win32 console application and at the very last step of the wizard, before you hit finish CHOOSE EMPTY PROJECT (I have spent hours trying to find why I cant see the CUDA C/C++ compiler just because I forgot to check that box)

2) Right after you insert your first file RENAME it from *.cpp to *.cu (again I have spent few more hours to understand why my new project complains about CUDA commands, while the previous one compiled perfectly few minutes ago)

3) Next, right click on the project folder, choose the build customizations and tick the 4.2 CUDA box (or the CUDA version you use).

4) Right click on the project folder again and choose properties. Expand Likner and highlight input. Click on the box next to additional dependencies and add the cudart.lib library.

5) Last, right click on the file you want to compile and choose properties. From the general menu select for item type CUDA C/C++ and you are ready to be in a parallel universe.

I started reading this  book which I highly recommend for beginners (note that I'm beginner in both CUDA and C++. Note also that I'm not affiliated with the author in any way).
One of the book exercises is to program the Julia Set . Unfortunately, for some still unexplained reasons I couldn’t compile the provided code, so I decided to start from scratch and write a more simplistic version of the code without using classes and fancy data structures, although I realize that their programing style is actually match better than mine.

Of course to run this program you need to install openGL.
To compile it with VS2010 you need download the freeglut. (For example I choose the MSVC version from here). Unzip it anywhere and link the VS to that folder (right click on the solution, properties -> Linker ->General->Additional Library Directories add the .../freeglut/lib. Last you need to copy the freeglut.dll file into the folder where the executable is created, for example into the .../Debug folder (Unfortunately I haven't found any better way to overcome this, so I copy this file every time I need to use openGL)
 
The following program requires three inputs. (Note that if you dont give all the three inputs you get an error). First an integer number with the image resolution, second the real part of the constant complex and third the imaginary part of the constant complex. You have to be careful when choosing the complex constant number as you don't always get something nice and many times you get nothing but a white screen. For the figure on top for example I used the famous constant -0.8+0.156i.


c=-0.6+0.4i

GPU Version of Julia Set


#include "windows.h"
#include <GL/gl.h>
#include <GL/glu.h>
#include <GL/glut.h>

#include <iostream>
#include <cuda_runtime.h>
#include <cuda.h>
#include <math.h>
#include <ctime>

/* because I'm primarily a matlab user matrices always start with 1. so I use the ij function to convert to zero based*/
#define ij(i,j,ni) (i-1+ni*(j-1))
#define sgn(x) (x > 0) - (x < 0)

using namespace std;
int *host_BITMAP;
int DIM;
int host_kmin = 10000;
int host_kmax=-10000;

// OpenGl functions
void display()
{
   glClear(GL_COLOR_BUFFER_BIT);
    glColor3f (1.0, 1.0, 1.0);
    glPointSize(1);
    glBegin(GL_POINTS);
    float pnt;

    for (int i=0;i<DIM;i++){
        for (int j=0;j<DIM;j++){
            pnt=(float)host_BITMAP[ij(i+1,j+1,DIM)]*(1.0-0.0)/((float)host_kmax-(float)host_kmin);
            glColor3f((GLfloat)pnt, (GLfloat)pnt, (GLfloat)pnt);
            glVertex2i((GLint)j,(GLint)i);
       }
    }
    glEnd();
    glFlush();
}

void reshape (int w, int h)
{
   glViewport (0, 0, (GLsizei) w, (GLsizei) h);
   glMatrixMode (GL_PROJECTION);
   glLoadIdentity ();
   gluOrtho2D (0.0, (GLdouble) w, 0.0, (GLdouble) h);
}

__device__ int Julia(float x, float y, float *devc1, float *devc2){
    float xnew, ynew, Z;
    int k, TF1, TF;
    TF1=0;
    for (k=0;k<200;k++){
        xnew=pow(x,2)-pow(y,2)+devc1[0];
        ynew=2*x*y+devc2[0];
        x=xnew;
        y=ynew;
        Z=sqrt(x*x+y*y);
        TF1=TF1+ max(sgn(Z-5),0)*(200-k);
        TF=max(sgn(5-Z),0);
        x=x*TF;
        y=y*TF;
    }
    return TF1;
}

__global__ void makeBitmap(int *devBITMAP, int *devDIM,
    float *devc1, float *devc2, int *dev_kmin, int *dev_kmax){//
    int tf;
    float scale=1.5;
    float x;
    float y;
    int jj = blockIdx.x;
    int ii = blockIdx.y;
    x=scale*(float)(devDIM[0]/2.0-jj)/(devDIM[0]/2.0);
    y=scale*(float)(devDIM[0]/2.0-ii)/(devDIM[0]/2.0);
    tf=Julia(x,y,devc1,devc2);
    if (tf>dev_kmax[0])
        dev_kmax[0]=tf;
    if (tf<dev_kmin[0])
        dev_kmin[0]=tf;
    devBITMAP[ij(ii+1,jj+1,devDIM[0])]=tf;
}


int main(int nNumberofArgs, char* pszArgs[]){
    cudaDeviceProp prop;
    int count;
    cudaGetDeviceCount( &count );

    for (int i=0; i< count; i++) {
        printf( "--- General Information for device %d ---\n", i );
        cudaGetDeviceProperties( &prop, i );
        printf( "Name: %s\n", prop.name );
        printf( "Max threads per block: %d\n",prop.maxThreadsPerBlock );
    }

    int dev;
    cudaGetDevice( &dev );
    printf( "ID of current CUDA device: %d\n", dev );
    memset( &prop, 0, sizeof( cudaDeviceProp ) );
    prop.major = 2;
    prop.minor = 0;
    cudaChooseDevice( &dev, &prop );
    printf( "ID of CUDA device closest to revision 2.0: %d\n", dev );
    cudaSetDevice( dev );

    float c1, c2;
    DIM=atoi(pszArgs[1]);
    c1=atof(pszArgs[2]);
    c2=atof(pszArgs[3]);

    cout << "The image dimension is: " << DIM << "\n";
    cout << "The real part is: " << c1 << "\n";
    cout << "The imaginary part is: " << c2 << "\n";

    host_BITMAP = new int[DIM*DIM];

    int *dev_BITMAP;
    int *dev_DIM;
    float *dev_c1;
    float *dev_c2;
    int *dev_kmin;
    int *dev_kmax;

    cudaMalloc((void**)&dev_BITMAP, DIM*DIM*sizeof(int));
    cudaMalloc((void**)&dev_DIM, 1*sizeof(int));
    cudaMalloc((void**)&dev_c1, 1*sizeof(float));
    cudaMalloc((void**)&dev_c2, 1*sizeof(float));
    cudaMalloc((void**)&dev_kmin, 1*sizeof(float));
    cudaMalloc((void**)&dev_kmax, 1*sizeof(float));

    cudaMemcpy(dev_DIM, &DIM, 1*sizeof(int),cudaMemcpyHostToDevice);
    cudaMemcpy(dev_c1, &c1, 1*sizeof(float),cudaMemcpyHostToDevice);
    cudaMemcpy(dev_c2, &c2, 1*sizeof(float),cudaMemcpyHostToDevice);
    cudaMemcpy(dev_kmin, &host_kmin, 1*sizeof(int),cudaMemcpyHostToDevice);
    cudaMemcpy(dev_kmax, &host_kmax, 1*sizeof(int),cudaMemcpyHostToDevice);

    dim3 grid(DIM,DIM);

    std::clock_t start;
    double duration;
    start = std::clock();
    makeBitmap <<< grid, 1 >>> (dev_BITMAP, dev_DIM, dev_c1, dev_c2, dev_kmin, dev_kmax);
    duration = ( std::clock() - start ) / (double) CLOCKS_PER_SEC;
    std::cout<<"Julia Set in: " << duration << " sec\n";

    start = std::clock();
    cudaMemcpy(host_BITMAP,dev_BITMAP,DIM*DIM*sizeof(int),cudaMemcpyDeviceToHost);
    cudaMemcpy(&host_kmin,dev_kmin,1*sizeof(int),cudaMemcpyDeviceToHost);
    cudaMemcpy(&host_kmax,dev_kmax,1*sizeof(int),cudaMemcpyDeviceToHost);
    duration = ( std::clock() - start ) / (double) CLOCKS_PER_SEC;
    std::cout<<"Data Copied from GPU in: " << duration << " sec\n";

    cout << host_kmin << "\n";
    cout << host_kmax << "\n";

    glutInit(&nNumberofArgs, pszArgs);
    glutInitDisplayMode(GLUT_SINGLE | GLUT_RGB);
    glutInitWindowSize(DIM, DIM);
    glutInitWindowPosition(100, 100);
    glutCreateWindow(pszArgs[0]);
    glClearColor (0.0, 0.0, 0.0, 0.0);
    glShadeModel(GL_FLAT);
    glutDisplayFunc(display);
    glutReshapeFunc(reshape);
    glutMainLoop();

    delete [] host_BITMAP;
    cudaFree (dev_BITMAP);
    cudaFree (dev_DIM);
    cudaFree (dev_c1);
    cudaFree (dev_c2);
    cudaFree (dev_kmin);
    cudaFree (dev_kmax);

    return 0;

}


(I dont remember the number)

CPU version of Julia Set

/* For ubuntu compile using : g++ filename.cpp -o julia -lGL -lglut -lGLU afer you remove the window include*/
#include "windows.h"
#include <GL/gl.h>
#include <GL/glu.h>
#include <GL/glut.h>

#include <iostream>

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <ctime>

#define ij(i,j,ni) (i-1+ni*(j-1))

#define sgn(x) (x > 0) - (x < 0)

using namespace std;

int *myBITMAP;
int DIM;
int kmin = 10000;
int kmax=-10000;

int Julia(double x, double y, double c1, double c2){

    double xnew, ynew, Z;
    int k, TF1, TF;
    TF1=0;
    for (k=0;k<200;k++){
        xnew=pow(x,2)-pow(y,2)+c1;
        ynew=2*x*y+c2;
        x=xnew;
        y=ynew;
        Z=sqrt(x*x+y*y);
        TF1=TF1+ max(sgn(Z-5),0)*(200-k);
        TF=max(sgn(5-Z),0);
        x=x*TF;
        y=y*TF;
        //cout << TF1 << "\n";

    }

    return TF1;
}

void makeBitmap(int DIM, double c1, double c2){

    int tf;
    double scale=1.5;
    double x;
    double y;

    for (int i=0;i<DIM;i++){

        //cout << i << "\n";
        for (int j=0;j<DIM;j++){
            x=scale*(double)(DIM/2-j)/(DIM/2);
            y=scale*(double)(DIM/2-i)/(DIM/2);
            tf=Julia(x,y,c1,c2);
            if (tf>kmax)
                kmax=tf;
            if (tf<kmin)
                kmin=tf;
            myBITMAP[ij(i+1,j+1,DIM)]=tf;
        }
    }

}


void display()

{
    glClear(GL_COLOR_BUFFER_BIT);
    glColor3f (1.0, 1.0, 1.0);
    glPointSize(1);
    glBegin(GL_POINTS);
    float pnt;

    for (int i=0;i<DIM;i++){

        for (int j=0;j<DIM;j++){
            pnt=(float)myBITMAP[ij(i+1,j+1,DIM)]*(1.0-0.0)/((float)kmax-(float)kmin);
            glColor3f((GLfloat)pnt, (GLfloat)pnt, (GLfloat)pnt);
            glVertex2i((GLint)j,(GLint)i);
       }
    }
    glEnd();
    glFlush();
}

void reshape (int w, int h)

{
   glViewport (0, 0, (GLsizei) w, (GLsizei) h);
   glMatrixMode (GL_PROJECTION);
   glLoadIdentity ();
   gluOrtho2D (0.0, (GLdouble) w, 0.0, (GLdouble) h);
}

int main(int nNumberofArgs, char* pszArgs[]){
    double c1;
    double c2;

    DIM=atoi(pszArgs[1]);
    c1=atof(pszArgs[2]);
    c2=atof(pszArgs[3]);
    cout << "Number of inputs:" << nNumberofArgs << "\n";
    cout << "The resolution is " << DIM << " \n";
    cout << "The real part is " << c1 << " \n";
    cout << "The imaginary part is " << c2 << " \n";

    myBITMAP = new int [DIM*DIM];

    std::clock_t start;

    double duration;
    start = std::clock();
    makeBitmap(DIM,c1,c2);
    duration = ( std::clock() - start ) / (double) CLOCKS_PER_SEC;
    std::cout<<"Julia Set in: " << duration << " sec\n";

    cout << kmin << "\n";
    cout << kmax << "\n";

    glutInit(&nNumberofArgs, pszArgs);

    glutInitDisplayMode(GLUT_SINGLE | GLUT_RGB);
    glutInitWindowSize(DIM, DIM);
    glutInitWindowPosition(100, 100);
    glutCreateWindow(pszArgs[0]);

    glClearColor (0.0, 0.0, 0.0, 0.0);

    glShadeModel(GL_FLAT);
    glutDisplayFunc(display);
    glutReshapeFunc(reshape);
    glutMainLoop();

    delete [] myBITMAP;

    return 0;

}

c=0.285+0.01i

PS.
I would like to mention that initially, every time I was running the GPU code I used to get the following warning "Display driver stopped responding and has recovered"

while the CPU code was working fine. I read in few forums that the OS gives only 2 seconds to the GPU to carry out any task and after that stops and recovers, so instead of getting nice pictures I was getting blank screens and few times I was getting partial images.
 To overcome this I followed the advices from  this forum and in particular I add these values in my registry
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Contro l\GraphicsDrivers\DCI]
"Timeout"=dword:00000014
"TdrDdiDelay"=dword:00000014
"TdrDelay"=dword:00000014
"TdrLevel"=dword:00000000

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Contro l\GraphicsDrivers]

"TdrDdiDelay"=dword:00000014
"TdrDelay"=dword:00000014
"TdrLevel"=dword:00000000

Thursday, August 23, 2012

Inverse transformation of parametric elements

When one solves numerically partial differential equations, using finite element methods, soon or later will encounter with the concept of isoparametric elements. 
My goal here is not to explain the concept (I assume you have an idea), but to provide the solution to a problem that I faced up recently: I had to convert parametric or natural or local coordinates to global and vice versa.

A bit of background:
Every point in a domain is characterized by its coordinates (x, y ,z). These coordinates are called global. However in real applications it is not easy to work with the real or global coordinates, therefore it is common to use another coordinate system, where the location and the values of each point is a function of the location or values of the nodes of the element that contain the point. In other words we interpolate the value of a point  from the values of the point corners.

For example, in a triangle where v1, v2, v3 are the values on the three corners, we can express the value of a  point using the following formula:
vp=v1*N1(u,w)+v2*N2(u,w)+v3*N3(u,w) 
where, N1, N2, N3 are called shape functions, u, w, are the parametric variables and vp is the interpolated value (it can be simply the coordinates or hydraulic conductivity, or even a vector e.g velocity)
The parametric variables usually span within [0 1] or [-1 1].

Below is a small matlab/octave code that may help you to understand the concept of parametric variables.
(In case you don't know, the shape functions of a 2D triangle are N1=u, N2=w, and N3=1-u-w)

hold on % create a window plot
[x y]=ginput(3); %pick manually the three corners of your triangle
z=[0.7; 0.1; 0.3];%Assign some random property on the nodes (eg. height)
p=[]; %create an empty variable
for u=0:0.1:1 %loop through the parametric variable
   for w=0:0.1:1 %same here
      if u+w<=1% for triangles u+w must be less than one
               %otherwise the point is outside of the triangle
           p=[p;u*x(1)+w*x(2)+(1-u-w)*x(3)... %find the global
                u*y(1)+w*y(2)+(1-u-w)*y(3)... %coordinates for
                u*z(1)+w*z(2)+(1-u-w)*z(3)];  %each u,w pair
      end
   end
end
plot3(x([1:3 1]), y([1:3 1]), [0 0 0 0]);
hold on;
plot3(x([1:3 1]), y([1:3 1]), z([1:3 1]),'r'); 
plot3(p(:,1), p(:,2), p(:,3),'o');

 
The above code will produce a red triangle in 3D, the projection of the triangle in the x-y plane and a series of points that lay on the plane

We see that we can  interpolate the height value of every point within the triangle using very simple calculations.


So far we showed how you can find the global coordinates of a point if you know the local ones. Generally this is a straightforward task, yet the opposite can be cumbersome. 
Bellow I will explain a general method where using the symbolic toolbox of matlab this becomes also a trivial task.    

For a 2D triangle the problem is stated as follows, given the three corners of a triangle (x1,y1), (x2,y2), (x3,y3) and the coordinates of a point (xp,yp) find the parametric coordinates (u,w).
In case of triangles this is quite easy because the shape functions have relatively simple form. First we write the two equations:
xp=x1*u+x2*w+x3*(1-u-w)
yp=y1*u+y2*w+y3*(1-u-w)
and then we solve a system of two equations with two unknowns (u and w). This is actually something that matlab symbolic toolbox is very good at.
 Lets first create few symbolic variables
>> syms x1 x2 x3 y1 y2 y3 xp yp u w
>> eq1 = x1*u+x2*w+x3*(1-u-w) - xp
>> eq2 = y1*u+y2*w+y3*(1-u-w) - yp
Finally, all we have to do is to tell matlab to solve this system for us
>>  S=solve(eq1==0,eq2==0,u,w);
The outcome of the last command is a structured variable with fields u and w, which contains the solutions for u and w.
S.u=(x2*y3 - x3*y2 - x2*yp + xp*y2 + x3*yp - xp*y3)/(x1*y2 - x2*y1 - x1*y3 + x3*y1 + x2*y3 - x3*y2)
S.w=-(x1*y3 - x3*y1 - x1*yp + xp*y1 + x3*yp - xp*y3)/(x1*y2 - x2*y1 - x1*y3 + x3*y1 + x2*y3 - x3*y2)

Of course for the 2D triangle one can derive these formulas using a different approach.

Wednesday, August 22, 2012

Reading Modflow output files (again)

In many cases its very useful to be able to read and process in batch mode the output file of Modflow (Optimization, inverse modeling, Monte Carlo simulation etc...)

The USGS has provided a utility program to convert the binary output to ascii, which is easier to read, yet as you will see below it is rather straightforward to read the binary files without using an intermediate step.

In general, reading binary files requires precise knowledge of how the data are written on the file. Finding this information was actually more difficult than writing the program itself. Go to http://water.usgs.gov/nrp/gwsoftware/modflow2000/MFDOC/ and then and then under the FAQ look for "How can I read data from a binary file generated by MODFLOW?". At the bottom of the description click the plus next to Array Data and there they are!!.

Let's suppose we want to read the "mymodel.hds" binary file that contains the head field of a multi-layer aquifer with Nlay layers.
To read the file  we will use few Matlab/Octave commands, fopen/fclose, fread.

>>fid=fopen("mymodel.hds",'r'); % open the file for reading

Next we will start reading the data but as the description suggest we should know whether the data are written in single-precision or double-precision format. As we would never know beforehand the correct format, we just make an assumption and check this later.

 First we need to read an integer of 4 bytes, which is the the time step number.
>>KSPT = fread(fid, 1, 'uint');
similarly:
>>KPER = fread(fid, 1, 'uint');% read the stress period number
>>PERTIM = fread(fid, 1, 'float'); % the time in the current stress period, which is a real number and that's why we use 'float' format.
>> TOTIM = fread(fid, 1, 'float'); %the total elapsed time, which again is a real number.

Next we have to read a word, which is actually the moment of truth. If the assumptions about the single/double precision format or the number of bytes (4 or 8) for the real numbers are correct, we should read something that makes sense. When we read a head field, for example, we should simply read : '            HEAD'.
According to the description  we have to read 16 ANSI characters
>> DESC = fread(fid, 16, 'char');
The DESC now is an array that looks like this:
[32 32 32 32 32 72 69 65 68], which are the ANSI characters of the word we just read.
To convert it to a readable format use:
>>DESC=char(DESC');
DESC  is now the word '     HEAD' in ascii characters.
If DESC matches one of the values of the table, on the FAQ then our assumptions are correct.
Next we read 3 integer numbers
>>NCOL = fread(fid, 1, 'uint'); % number of columns
>>NROW = fread(fid, 1, 'uint'); % number of rows
>>ILAY = fread(fid, 1, 'uint'); % layer number
Finally we have to read the head field itself using a loop
>> for ir = 1 : NROW
       for ic = 1 : NCOL
           H(ir,ic,ILAY)=fread(fid, 1, float);
       end
   end

In case of multi layer we need to put all the above commands into a loop.

Last we need to close the file (which is something I always forget)
>>fclose(fid);

(I have used many times the above implementation without any problems. Therefore I cannot give recommendations if the file uses a different format).


Thursday, February 3, 2011

Read Modflow Head file using Matlab

When I first started working with Modflow, I used to use a utility program to convert the output data from binary to ascii. However, I have writen a function in Matlab that extracts the head field directly from the binary file to matlab workspace.

function H=readheadfile(filename,NR,NC,NL)
%NR, NC, NL: number of rows, columns, layers
fid=fopen(filename,'r');
A=fread(fid,'float'); %reads the entire file and stores it to the variable A
fclose(fid);
A=reshape(A,11+NC*NR,NL);
%{
The constant number 11 depends on the preprocessor platform. For Groundwater vistas, version 4 and 5, this number should be 11. For an old version of PmWin I had to change that to 14.
%}
A(1:11,:)=[];
for i=1:NL
H(:,:,i)=reshape(A(:,NL+1-i),NC,NR)';
end

Note that the output of the function readheadfile is a 3D array where the top layer corresponds to the bottom layer of the aquifer and vice versa.