ROOT-Sim core  3.0.0-rc.2
A General-Purpose Multi-threaded Parallel/Distributed Simulation Library
Enumerations | Functions | Variables
mpi.c File Reference

MPI Support Module. More...

#include <distributed/mpi.h>
#include <mm/mm.h>
#include <datatypes/msg_queue.h>
#include <mm/msg_allocator.h>
Include dependency graph for mpi.c:

Enumerations

enum  { RS_MSG_TAG = 0 , RS_DATA_TAG }
 

Functions

static void comm_error_handler (MPI_Comm *comm, int *err_code_p,...)
 Handles an MPI error. More...
 
void mpi_global_init (int *argc_p, char ***argv_p)
 Initializes the MPI environment. More...
 
void mpi_global_fini (void)
 Finalizes the MPI environment.
 
void mpi_remote_msg_send (struct lp_msg *msg, nid_t dest_nid)
 Sends a model message to an LP residing on another node. More...
 
void mpi_remote_anti_msg_send (struct lp_msg *msg, nid_t dest_nid)
 Sends a model anti-message to an LP residing on another node. More...
 
void mpi_control_msg_broadcast (enum msg_ctrl_code ctrl)
 Sends a platform control message to all the nodes, including self. More...
 
void mpi_control_msg_send_to (enum msg_ctrl_code ctrl, nid_t dest)
 Sends a platform control message to a specific node. More...
 
void mpi_remote_msg_handle (void)
 Empties the queue of incoming MPI messages, doing the right thing for each one of them. More...
 
void mpi_remote_msg_drain (void)
 Empties the queue of incoming MPI messages, ignoring them. More...
 
void mpi_reduce_sum_scatter (const uint32_t values[n_nodes], uint32_t *result)
 Computes the sum-reduction-scatter operation across all nodes. More...
 
bool mpi_reduce_sum_scatter_done (void)
 Checks if a previous mpi_reduce_sum_scatter() operation has completed. More...
 
void mpi_reduce_min (double *node_min_p)
 Computes the min-reduction operation across all nodes. More...
 
bool mpi_reduce_min_done (void)
 Checks if a previous mpi_reduce_min() operation has completed. More...
 
void mpi_node_barrier (void)
 A node barrier.
 
void mpi_blocking_data_send (const void *data, int data_size, nid_t dest)
 Sends a byte buffer to another node. More...
 
void * mpi_blocking_data_rcv (int *data_size_p, nid_t src)
 Receives a byte buffer from another node. More...
 

Variables

static enum msg_ctrl_code ctrl_msgs []
 Array of control code values, kept so that their addresses can be passed to MPI_Send(). More...
 
static MPI_Request reduce_sum_scatter_req = MPI_REQUEST_NULL
 The MPI request associated with the non-blocking reduce-scatter collective.
 
static MPI_Request reduce_min_req = MPI_REQUEST_NULL
 The MPI request associated with the non-blocking all-reduce collective.
 

Detailed Description

MPI Support Module.

This module implements all basic MPI facilities to let the distributed execution of a simulation model take place consistently.
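The following is a minimal usage sketch, not taken from the library sources: it assumes the surrounding platform code drives the functions of this module in roughly this order, and simulation_done() is a hypothetical placeholder for the platform's termination check.

#include <distributed/mpi.h>

int main(int argc, char **argv)
{
	mpi_global_init(&argc, &argv);      /* set up MPI and register the error handler */

	/* ... platform and model initialization ... */

	while (!simulation_done()) {        /* hypothetical termination check */
		mpi_remote_msg_handle();    /* pump incoming remote and control messages */
		/* ... process local events, send remote messages and anti-messages ... */
	}

	mpi_remote_msg_drain();             /* discard any leftover incoming messages */
	mpi_node_barrier();                 /* wait for every node to reach this point */
	mpi_global_fini();
	return 0;
}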

Function Documentation

◆ comm_error_handler()

static void comm_error_handler(MPI_Comm *comm, int *err_code_p, ...)

Handles an MPI error.

Parameters
comm: the MPI communicator involved in the error
err_code_p: a pointer to the error code
...: an implementation-specific list of additional arguments in which we are not interested

This is registered in mpi_global_init() to print meaningful MPI errors.
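A plausible registration sketch using the standard MPI error-handler API; the exact calls performed by mpi_global_init() may differ.

MPI_Errhandler err_handler;
MPI_Comm_create_errhandler(comm_error_handler, &err_handler);
MPI_Comm_set_errhandler(MPI_COMM_WORLD, err_handler);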

◆ mpi_blocking_data_rcv()

void *mpi_blocking_data_rcv(int *data_size_p, nid_t src)

Receives a byte buffer from another node.

Parameters
data_size_p: where to write the size of the received data
src: the id of the sender node
Returns
the buffer allocated with mm_alloc() containing the received data

This operation blocks execution until the sender node actually sends the data with mpi_blocking_data_send().

◆ mpi_blocking_data_send()

void mpi_blocking_data_send(const void *data, int data_size, nid_t dest)

Sends a byte buffer to another node.

Parameters
data: a pointer to the buffer to send
data_size: the buffer size
dest: the id of the destination node

This operation blocks execution until the destination node receives the data with mpi_blocking_data_rcv().
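An illustrative pairing of the two blocking primitives; the node ids and the payload are made up, and nid is assumed to be a variable holding the local node id (it is not documented in this file).

if (nid == 0) {
	const char payload[] = "hello";
	mpi_blocking_data_send(payload, sizeof(payload), 1); /* blocks until node 1 receives */
} else if (nid == 1) {
	int size;
	char *buf = mpi_blocking_data_rcv(&size, 0);         /* blocks until node 0 sends */
	/* ... use buf[0 .. size - 1], then release it through the mm allocator,
	 *     since it was obtained with mm_alloc() ... */
}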

◆ mpi_control_msg_broadcast()

void mpi_control_msg_broadcast(enum msg_ctrl_code ctrl)

Sends a platform control message to all the nodes, including self.

Parameters
ctrl: the control message to send

◆ mpi_control_msg_send_to()

void mpi_control_msg_send_to(enum msg_ctrl_code ctrl, nid_t dest)

Sends a platform control message to a specific node.

Parameters
ctrl: the control message to send
dest: the id of the destination node
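
Illustration only, assuming node 0 acts as the master and that the control codes documented for ctrl_msgs below are used this way.

if (nid == 0)
	mpi_control_msg_broadcast(MSG_CTRL_GVT_START); /* master starts a new GVT round on every node */
else
	mpi_control_msg_send_to(MSG_CTRL_GVT_DONE, 0); /* a slave reports GVT completion to the master */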

◆ mpi_global_init()

void mpi_global_init(int *argc_p, char ***argv_p)

Initializes the MPI environment.

Parameters
argc_p: a pointer to the OS supplied argc
argv_p: a pointer to the OS supplied argv

◆ mpi_reduce_min()

void mpi_reduce_min(double *node_min_p)

Computes the min-reduction operation across all nodes.

Parameters
node_min_p: a pointer to the value supplied by the calling node, which will also be used to store the computed minimum

Each node supplies a single simtime_t value. The minimum of all these values is computed and stored back in node_min_p itself. Only a single thread per node is expected to call this function at a time, and every node has to call it, otherwise the result can't be computed. At most one mpi_reduce_min() operation can be pending at a time. The argument must point to a valid memory region until mpi_reduce_min_done() returns true.
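
A sketch of the intended call pattern: start the reduction, then poll mpi_reduce_min_done() while the node keeps doing useful work. compute_local_minimum() is a hypothetical helper, and the static storage is just one way to keep the argument valid until completion.

static double local_min;                /* must stay valid until the operation completes */

local_min = compute_local_minimum();    /* hypothetical helper */
mpi_reduce_min(&local_min);
while (!mpi_reduce_min_done()) {
	/* ... do other useful work, e.g. handle incoming messages ... */
}
/* local_min now holds the minimum across all nodes */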

◆ mpi_reduce_min_done()

bool mpi_reduce_min_done(void)

Checks if a previous mpi_reduce_min() operation has completed.

Returns
true if the previous operation has been completed, false otherwise.

◆ mpi_reduce_sum_scatter()

void mpi_reduce_sum_scatter(const uint32_t values[n_nodes], uint32_t *result)

Computes the sum-reduction-scatter operation across all nodes.

Parameters
values: the vector of addends supplied by the calling node (n_nodes entries)
result: a pointer where the nid-th component of the sum will be stored

Each node supplies a vector of n_nodes components. The element-wise sum of all these vectors is computed, and its nid-th component is stored in result. Only a single thread per node is expected to call this function at a time, and every node has to call it, otherwise the result can't be computed. At most one mpi_reduce_sum_scatter() operation can be pending at a time. Both arguments must point to valid memory regions until mpi_reduce_sum_scatter_done() returns true.
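
A sketch under the same assumptions: n_nodes is the global node count appearing in the declaration above, and allocating the vector with mm_alloc() is only for illustration.

static uint32_t *sent_counts;           /* n_nodes entries, must stay valid until completion */
static uint32_t my_share;

sent_counts = mm_alloc(n_nodes * sizeof(*sent_counts));
/* ... fill sent_counts[i] with the value destined to node i ... */
mpi_reduce_sum_scatter(sent_counts, &my_share);
while (!mpi_reduce_sum_scatter_done()) {
	/* ... do other useful work ... */
}
/* my_share now holds the nid-th component of the element-wise sum */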

◆ mpi_reduce_sum_scatter_done()

bool mpi_reduce_sum_scatter_done(void)

Checks if a previous mpi_reduce_sum_scatter() operation has completed.

Returns
true if the previous operation has been completed, false otherwise.

◆ mpi_remote_anti_msg_send()

void mpi_remote_anti_msg_send(struct lp_msg *msg, nid_t dest_nid)

Sends a model anti-message to an LP residing on another node.

Parameters
msg: the message to roll back
dest_nid: the id of the node where the targeted LP resides

This function also calls the relevant handlers in order to keep, for example, the non-blocking GVT algorithm running. Note that when this function returns, the anti-message may not have been sent yet. There is no need to actively check for send completion: during fossil collection, the platform leverages the GVT to make sure the message has indeed been sent and processed before freeing it.

◆ mpi_remote_msg_drain()

void mpi_remote_msg_drain(void)

Empties the queue of incoming MPI messages, ignoring them.

This routine uses the MPI probing mechanism to check for new remote messages and discards them. It is used at simulation completion to clear the MPI state.

◆ mpi_remote_msg_handle()

void mpi_remote_msg_handle(void)

Empties the queue of incoming MPI messages, doing the right thing for each one of them.

This routine uses the MPI probing mechanism to check for new remote messages and handles each of them accordingly: control messages are dispatched to the respective platform handler, simulation messages are unpacked and put in the incoming message queue, and anti-messages are matched and processed by the message map.

◆ mpi_remote_msg_send()

void mpi_remote_msg_send(struct lp_msg *msg, nid_t dest_nid)

Sends a model message to an LP residing on another node.

Parameters
msg: the message to send
dest_nid: the id of the node where the targeted LP resides

This function also calls the relevant handlers in order to keep, for example, the non-blocking GVT algorithm running. Note that when this function returns, the message may not have been actually sent yet. There is no need to actively check for send completion: during fossil collection, the platform leverages the GVT to make sure the message has indeed been sent and processed before freeing it.

Variable Documentation

◆ ctrl_msgs

static enum msg_ctrl_code ctrl_msgs[]
Initial value:
= {
	MSG_CTRL_GVT_START,
	MSG_CTRL_GVT_DONE,
	MSG_CTRL_TERMINATION
}
MSG_CTRL_GVT_START: used by the master to start a new GVT reduction operation (control_msg.h:19)
MSG_CTRL_GVT_DONE: used by slaves to signal their completion of the GVT protocol (control_msg.h:21)
MSG_CTRL_TERMINATION: used in broadcast to signal that local LPs can terminate (control_msg.h:23)

Array of control code values, kept so that their addresses can be passed to MPI_Send().