SRC/get_perm_c_parmetis.c File Reference

Gets matrix permutation. More...

#include <limits.h>
#include <math.h>
#include "superlu_ddefs.h"

Functions/Subroutines

static float a_plus_at_CompRow_loc (int iam,int_t *perm_r,int nprocs_i,int_t *vtxdist_i,int_t n,int_t *rowptr,int_t *colind,int nprocs_o,int_t *vtxdist_o,int_t *p_bnz,int_t **p_b_rowptr,int_t **p_b_colind,gridinfo_t *grid)
float get_perm_c_parmetis (SuperMatrix *A, int_t *perm_r, int_t *perm_c, int nprocs_i, int noDomains, int_t **sizes, int_t **fstVtxSep, gridinfo_t *grid, MPI_Comm *metis_comm)


Detailed Description

 -- Distributed symbolic factorization auxialiary routine  (version 2.1) --
 Lawrence Berkeley National Lab, Univ. of California Berkeley - July 2003
 INRIA France - January 2004
 Laura Grigori

 November 1, 2007
 

Function Documentation

static float a_plus_at_CompRow_loc ( int  iam,
int_t perm_r,
int  nprocs_i,
int_t vtxdist_i,
int_t  n,
int_t rowptr,
int_t colind,
int  nprocs_o,
int_t vtxdist_o,
int_t p_bnz,
int_t **  p_b_rowptr,
int_t **  p_b_colind,
gridinfo_t grid 
) [static]

 Purpose
 =======

 Form the structure of Pr*A +A'Pr'. A is an n-by-n matrix in
 NRformat_loc format, represented by (rowptr, colind). The output
 B=Pr*A +A'Pr' is in NRformat_loc format (symmetrically, also row
 oriented), represented by (b_rowptr, b_colind).

 The input matrix A is distributed in block row format on nprocs_i
 processors.  The output matrix B is distributed in block row format
 on nprocs_o processors, where nprocs_o <= nprocs_i.  On output, the
 matrix B has its rows permuted according to perm_r.

 Sketch of the algorithm
 =======================

 Let iam by my process number.  Let fst_row, lst_row = m_loc +
 fst_row be the first/last row stored on iam.

 Compute Pr' - the inverse row permutation, stored in iperm_r.

 Compute the transpose  of the block row of Pr*A that iam owns:
    T[:,Pr(fst_row:lst_row)] = Pr' * A[:,fst_row:lst_row] * Pr'

 All to all communication such that every processor iam receives all
 the blocks of the transpose matrix that it needs, that is
           T[fst_row:lst_row, :]

 Compute B = A[fst_row:lst_row, :] + T[fst_row:lst_row, :]

 If Pr != I or nprocs_i != nprocs_o then permute the rows of B (that
 is compute Pr*B) and redistribute from nprocs_i to nprocs_o
 according to the block row distribution in vtxdist_i, vtxdist_o.
 

float get_perm_c_parmetis ( SuperMatrix A,
int_t perm_r,
int_t perm_c,
int  nprocs_i,
int  noDomains,
int_t **  sizes,
int_t **  fstVtxSep,
gridinfo_t grid,
MPI_Comm *  metis_comm 
)

 Purpose
 =======

 GET_PERM_C_PARMETIS obtains a permutation matrix Pc, by applying a
 graph partitioning algorithm to the symmetrized graph A+A'.  The
 multilevel graph partitioning algorithm used is the
 ParMETIS_V3_NodeND routine available in the parallel graph
 partitioning package parMETIS.

 The number of independent sub-domains noDomains computed by this
 algorithm has to be a power of 2.  Hence noDomains is the larger
 number power of 2 that is smaller than nprocs_i, where nprocs_i = nprow
 * npcol is the number of processors used in SuperLU_DIST.

 Arguments
 =========

 A       (input) SuperMatrix*
         Matrix A in A*X=B, of dimension (A->nrow, A->ncol). The number
         of the linear equations is A->nrow.  Matrix A is distributed
         in NRformat_loc format.

 perm_r  (input) int_t*
         Row permutation vector of size A->nrow, which defines the 
         permutation matrix Pr; perm_r[i] = j means row i of A is in 
         position j in Pr*A.

 perm_c  (output) int_t*
	   Column permutation vector of size A->ncol, which defines the 
         permutation matrix Pc; perm_c[i] = j means column i of A is 
         in position j in A*Pc.

 nprocs_i (input) int*
         Number of processors the input matrix is distributed on in a block
         row format.  It corresponds to number of processors used in
         SuperLU_DIST.

 noDomains (input) int*, must be power of 2
         Number of independent domains to be computed by the graph
         partitioning algorithm.  ( noDomains <= nprocs_i )

 sizes   (output) int_t**, of size 2 * noDomains
         Returns pointer to an array containing the number of nodes
         for each sub-domain and each separator.  Separators are stored 
         from left to right.
         Memory for the array is allocated in this routine.

 fstVtxSep (output) int_t**, of size 2 * noDomains
         Returns pointer to an array containing first node for each
         sub-domain and each separator.
         Memory for the array is allocated in this routine.

 Return value
 ============
   < 0, number of bytes allocated on return from the symbolic factorization.
   > 0, number of bytes allocated when out of memory.
 


Generated on Wed Nov 24 18:17:32 2010 for SuperLUDistributed by  doxygen 1.5.5