Architecture of Continuity Parallel

  • Client
    • In gui_client.py, start_server() launches ContinuityParallel.py on the server (which runs on the front end) and waits for a config_dyn_*.mpich file to be created (a polling sketch follows this item)
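
A minimal sketch of the client-side wait, assuming the working directory is visible to the client (e.g. over a shared filesystem); start_server_sketch, launch_cmd, and the timeout value are illustrative, not the actual gui_client.py API.

    import glob
    import os
    import subprocess
    import time

    def start_server_sketch(launch_cmd, workdir, timeout=300):
        # Launch ContinuityParallel.py on the front end (launch_cmd is a
        # placeholder, e.g. an ssh invocation), then poll for the file it
        # eventually produces.
        proc = subprocess.Popen(launch_cmd, shell=True)
        deadline = time.time() + timeout
        while time.time() < deadline:
            matches = glob.glob(os.path.join(workdir, "config_dyn_*.mpich"))
            if matches:
                return matches[0]    # config file is ready
            time.sleep(1)            # poll once per second
        proc.terminate()
        raise RuntimeError("timed out waiting for config_dyn_*.mpich")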

  • ContinuityParallel.py

    • Runs on the front end; however, all real work is dispatched to the compute nodes, so the load it places on the front end is negligible.
    • Sets up the SGE_ROOT and PATH environment variables
    • Creates ep_run, a wrapper script (sketched after this list) that
      • sets up some additional environment variables
      • runs mpirun -p4pg config_dyn* python ContinuityDaemon.py

    • Creates sge_cont (also sketched after this list)
      • the job script submitted with qsub; it basically specifies the number of required nodes and
      • runs sgeStart.py (which simply creates the config_dyn_* file and then waits)
    • Gets the current list of jobs
    • Submits the job: qsub sge_cont (see the submission sketch below)
    • Waits for sge_cont (via sgeStart.py) to create config_dyn*
    • Once config_dyn* is ready, identifies the new job_id (the job id that was not in the previous list of jobs)
    • Creates a tunnel to the root node (which it identifies from config_dyn*)
    • Runs ep_run on the root node
    • When it is done, cleans everything up, including qdel of the new job_id (see the final sketch below)
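
A sketch of how ContinuityParallel.py might generate the ep_run wrapper. The SGE_ROOT value, the PATH handling, and the exact filename passed to -p4pg are assumptions; only the mpirun -p4pg ... python ContinuityDaemon.py form comes from the list above.

    import os
    import stat

    def write_ep_run_sketch(workdir, procgroup="config_dyn_*"):
        # Write the ep_run wrapper: export the environment, then start
        # ContinuityDaemon.py under mpirun with a ch_p4 procgroup file.
        path = os.path.join(workdir, "ep_run")
        with open(path, "w") as f:
            f.write("#!/bin/sh\n")
            f.write("export SGE_ROOT=/opt/sge\n")            # assumed location
            f.write("export PATH=$SGE_ROOT/bin:$PATH\n")
            f.write("mpirun -p4pg %s python ContinuityDaemon.py\n" % procgroup)
        os.chmod(path, os.stat(path).st_mode | stat.S_IEXEC)  # mark executable
        return path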
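A matching sketch for sge_cont. The parallel-environment name (mpich) is an assumption; the node count and the sgeStart.py invocation come from the list above.

    import os

    def write_sge_cont_sketch(workdir, n_nodes):
        # Write the qsub job script: request n_nodes slots, then run
        # sgeStart.py, which creates config_dyn_* and waits.
        path = os.path.join(workdir, "sge_cont")
        with open(path, "w") as f:
            f.write("#!/bin/sh\n")
            f.write("#$ -pe mpich %d\n" % n_nodes)  # assumed PE name
            f.write("#$ -cwd\n")
            f.write("python sgeStart.py\n")
        return path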
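The submission sketch referenced above: snapshot the job list, qsub sge_cont, wait for config_dyn*, and diff the job lists to find the new job_id. Using qstat for the snapshot and parsing its first column are assumptions about how the job list is obtained.

    import glob
    import os
    import subprocess
    import time

    def job_ids_sketch():
        # Snapshot the job ids qstat currently reports; assumes the id is
        # the first column and the first two lines are the header.
        out = subprocess.check_output(["qstat"]).decode()
        return set(l.split()[0] for l in out.splitlines()[2:] if l.strip())

    def submit_and_identify_sketch(workdir):
        before = job_ids_sketch()                  # job list before submission
        subprocess.check_call(["qsub", "sge_cont"], cwd=workdir)
        while not glob.glob(os.path.join(workdir, "config_dyn_*")):
            time.sleep(1)                          # sgeStart.py has not run yet
        new_ids = job_ids_sketch() - before        # the job we just submitted
        return new_ids.pop()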
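The final sketch: identify the root node from the procgroup file, tunnel to it, run ep_run there, and qdel the job on exit. The port number, the ssh-based tunnel, and the assumption that the root host is the first field of the first line of config_dyn_* are all illustrative.

    import subprocess

    def run_on_root_sketch(config_path, job_id):
        # Assumes the root entry is the first line of the procgroup file
        # and that the hostname is its first field.
        with open(config_path) as f:
            root = f.readline().split()[0]
        # Hypothetical tunnel: forward a local port to the same port on root.
        tunnel = subprocess.Popen(
            ["ssh", "-N", "-L", "9000:%s:9000" % root, root])
        try:
            subprocess.check_call(["ssh", root, "./ep_run"])  # run on root node
        finally:
            tunnel.terminate()                 # tear down the tunnel
            subprocess.call(["qdel", job_id])  # clean up the SGE job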