Fox MPI
- Imports
- Check Operating System and Define Constants
- Command Line Arguments
- MPI Initialization
- Matrix Initialization on Rank 0
- Broadcast Data to All Processes
- Matrix Multiplication Loop
- MPI Allreduce
- Print Results
This program uses MPI to parallelize the multiplication of two randomly generated square matrices, A and B, and compute their product C. The work is divided among the processes: each one computes part of the final result, and the MPI Allreduce function combines the partial results from all processes.
Imports
import os
import numpy as np
from sys import argv
from mpi4py import MPI
from time import perf_counter
The code imports the libraries it needs: os, numpy, sys, mpi4py, and time. mpi4py is used for MPI (Message Passing Interface) communication in parallel computing.
Check Operating System and Define Constants
isLinux = os.name == 'posix'
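Checks whether the script is running on a POSIX system. os.name is 'posix' on Linux, but also on macOS and other Unix-like systems, so isLinux is a slight misnomer.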
if isLinux:
    from resource import getrusage, RUSAGE_SELF
When that check passes, the script imports getrusage and RUSAGE_SELF from the resource module, which exists only on Unix-like platforms, to measure resource usage.
Command Line Arguments
exponent = int(argv[1])
isInt = bool(argv[2])
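Reads two command line arguments: exponent, which sets the matrix size (2**exponent rows and columns), and isInt, which selects integer or floating-point matrix elements. Note that bool(argv[2]) is True for any non-empty string, including "0" and "False", so only an empty string selects floats.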
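Since bool() applied to a string only tests whether it is empty, a stricter parser may be preferable. The following is a minimal sketch, assuming the flag is passed as text such as 1/0 or true/false; parse_flag and the accepted spellings are hypothetical and not part of the original script:

```python
def parse_flag(text):
    """Interpret a command-line string as a boolean (hypothetical helper)."""
    value = text.strip().lower()
    if value in ("1", "true", "yes"):
        return True
    if value in ("0", "false", "no", ""):
        return False
    raise ValueError(f"unrecognized boolean flag: {text!r}")

# bool() on a string only checks for emptiness
print(bool("0"), bool("False"), bool(""))   # True True False
# the stricter parser reads the text itself
print(parse_flag("0"), parse_flag("True"))  # False True
```

With such a helper, the assignment could read isInt = parse_flag(argv[2]); whether that matches how the script is actually invoked is an assumption.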
MPI Initialization
comm = MPI.COMM_WORLD # get the communicator object
size = comm.Get_size() # total number of processes
rank = comm.Get_rank() # rank of this process
Initializes MPI communication, retrieves the total number of processes, and the rank of the current process.
Matrix Initialization on Rank 0
if rank == 0:
    MATRIX_SIZE = 2**exponent
    # generate two random matrices of size MATRIX_SIZE
    # initialize the matrices A, B, and C
    # matrices A and B contain random integers or floats based on 'isInt'
    matrix_A = ...
    matrix_B = ...
    matrix_C = np.zeros((MATRIX_SIZE, MATRIX_SIZE), dtype=int) if isInt else np.zeros((MATRIX_SIZE, MATRIX_SIZE))
    data = (MATRIX_SIZE, matrix_A, matrix_B, matrix_C)
else:
    data = None
On the master process (rank 0), it initializes matrices A, B, and C and packs them, together with the matrix size, into a tuple (data). On every other process, data is set to None as a placeholder until the broadcast.
Broadcast Data to All Processes
data = comm.bcast(data, root=0) # broadcast the data to all processes
MATRIX_SIZE, matrix_A, matrix_B, matrix_C = data # unpack the data
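comm.bcast sends the tuple from rank 0 to every process, so after this call each process holds its own full copy of the matrix size and of A, B, and C. The lowercase bcast method serializes arbitrary Python objects with pickle.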
Matrix Multiplication Loop
start_time = perf_counter()
for x in range(MATRIX_SIZE):
    if rank == x % size:
        for i in range(MATRIX_SIZE):
            y = (x + i) % MATRIX_SIZE
            matrix_C[i] += matrix_A[i, y] * matrix_B[y]
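The outer loop assigns iteration x to the process whose rank equals x % size. In each iteration it owns, a process updates every row i of its local copy of C, adding row y = (x + i) % MATRIX_SIZE of B scaled by the element A[i, y]. Each process therefore accumulates a partial sum of the whole matrix C rather than a set of finished rows, and no communication is needed inside the loop.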
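The loop above can be written as a sum, which makes it easier to see that the shifted index still yields the ordinary matrix product. This is a sketch in notation that does not appear in the original: N stands for MATRIX_SIZE, p for the number of processes, and C^(r) for the local matrix_C held by rank r after the loop.

```latex
C^{(r)}_{ij} \;=\; \sum_{\substack{0 \le x < N \\ x \bmod p \,=\, r}}
    A_{i,\,(x+i) \bmod N}\; B_{(x+i) \bmod N,\; j}

\sum_{r=0}^{p-1} C^{(r)}_{ij}
  \;=\; \sum_{x=0}^{N-1} A_{i,\,(x+i) \bmod N}\; B_{(x+i) \bmod N,\; j}
  \;=\; \sum_{y=0}^{N-1} A_{iy}\, B_{yj}
  \;=\; (AB)_{ij}
```

The middle step holds because every x belongs to exactly one rank, and for a fixed row i the map from x to (x + i) mod N is a permutation of 0, ..., N-1, so each y is visited exactly once.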
MPI Allreduce
comm.Allreduce(MPI.IN_PLACE, matrix_C, op=MPI.SUM)
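Allreduce with MPI.SUM adds the partial matrices from all processes element by element. Because MPI.IN_PLACE is passed as the send buffer, each process's matrix_C is overwritten with the total, so every process ends up holding the complete product C.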
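The loop and the reduction can be checked together without launching MPI. The following serial sketch simulates each rank in turn and then sums the per-rank matrices, which stands in for Allreduce with MPI.SUM. Plain Python lists replace the NumPy arrays so that it runs with the standard library alone; the function names and the sizes n = 4 and size = 3 are assumptions chosen for illustration.

```python
import random

def partial_product(A, B, rank, size):
    """Partial C accumulated by one simulated rank, mirroring the loop above."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for x in range(n):
        if rank == x % size:            # this rank owns iteration x
            for i in range(n):
                y = (x + i) % n         # shifted index, as in the original loop
                for j in range(n):      # row update: C[i] += A[i][y] * B[y]
                    C[i][j] += A[i][y] * B[y][j]
    return C

def simulated_allreduce_sum(partials):
    """Element-wise sum of the per-rank matrices (stand-in for MPI.SUM)."""
    n = len(partials[0])
    return [[sum(p[i][j] for p in partials) for j in range(n)]
            for i in range(n)]

def reference_product(A, B):
    """Textbook triple-loop product used as the ground truth."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

random.seed(0)
n, size = 4, 3                          # hypothetical matrix size and process count
A = [[random.randint(0, 9) for _ in range(n)] for _ in range(n)]
B = [[random.randint(0, 9) for _ in range(n)] for _ in range(n)]

partials = [partial_product(A, B, r, size) for r in range(size)]
C = simulated_allreduce_sum(partials)
print(C == reference_product(A, B))     # True
```

Integer entries are used so that the comparison is exact; with floats, the order of summation can introduce small rounding differences.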
Print Results
if rank == 0:
    print(perf_counter() - start_time)
    print(getrusage(RUSAGE_SELF).ru_maxrss if isLinux else 0)
The master process prints the elapsed time in seconds, followed by the peak resident set size reported by getrusage (ru_maxrss, in kilobytes on Linux) when isLinux is true, or 0 otherwise.
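The unit of ru_maxrss depends on the platform: Linux reports kilobytes, while macOS reports bytes. A minimal sketch of a helper that normalizes the value follows; peak_rss_mib is a hypothetical name, and falling back to 0.0 where the resource module is missing is an assumption modeled on the script's own fallback.

```python
import sys

def peak_rss_mib():
    """Peak resident set size in MiB, or 0.0 where `resource` is unavailable."""
    try:
        from resource import getrusage, RUSAGE_SELF
    except ImportError:                 # e.g. Windows has no resource module
        return 0.0
    peak = getrusage(RUSAGE_SELF).ru_maxrss
    # Linux reports kilobytes, macOS reports bytes
    divisor = 1024 ** 2 if sys.platform == "darwin" else 1024
    return peak / divisor

print(f"{peak_rss_mib():.1f} MiB")
```

Note that this measures only the calling process, so in an MPI run each rank would report its own peak.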