
JobTypes and Types layered relationships. Could it be more flexible? #74

@victorsndvg

JobTypes can currently describe how applications are called depending on the underlying software layers used to run them.

Based on our experience, the layers involved are:

  • Workload manager (e.g. Slurm)
  • Process manager (e.g. OpenMPI)
  • Container technology (e.g. Singularity)
  • Software call (e.g. echo hola)

Currently, workload manager types like Slurm or Torque describe how a batch script is submitted. In addition, the SingularityJob type describes how to call a Singularity container inside the workload manager. In particular, a SingularityJob always uses mpirun as the process manager, but I think this is not always necessary (e.g. for a sequential Singularity job). Finally, the particular software call is expressed through the command job option.

In my opinion, the top (workload manager) and bottom (software call) layers of the hierarchy are mandatory, while the other layers could be optional and interrelated, so that the requirements and execution mode of every application can be expressed in a more flexible way. As a suggestion, a contained_in relationship could express this kind of hierarchical structure.

As a simplified example, the following options could all be acceptable (see the sketch after this list):

  • srun command
  • srun [mpirun] command
  • srun [singularity] command
  • srun [mpirun] [singularity] command
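
To make the composition concrete, here is a minimal sketch (in Python, with hypothetical names; this is not the plugin's API) of how each optional layer simply prepends its wrapper to the final software call. The image name img.sif is illustrative:

    # Minimal sketch: each layer prepends its wrapper command to the
    # software call. Names and prefixes are illustrative only.
    def compose(command, layers):
        # layers are ordered outermost-first, e.g. ['srun', 'mpirun']
        return ' '.join(layers + [command])

    print(compose('flow inputfile outputfile', ['srun']))
    print(compose('flow inputfile outputfile', ['srun', 'mpirun']))
    print(compose('flow inputfile outputfile', ['srun', 'singularity exec img.sif']))
    print(compose('flow inputfile outputfile', ['srun', 'mpirun', 'singularity exec img.sif']))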

This flexibility could be expressed by a blueprint like the following example:

    computational_resources:
        type: hpc.nodes.WorkloadManager    # top layer: workload manager (e.g. Slurm)
        properties:
            ...

    mpi:
        type: hpc.nodes.MPIJob             # optional layer: process manager
        properties:
            job_options:
                modules:
                    - openmpi
                    ...
                flags:
                    - '--mca orte_tmpdir_base /tmp'
                    - '--mca pmix_server_usock_connections 1'
        relationships:
            - type: contained_in  # alternatively: job_managed_by_wm
              target: computational_resources

    container:
        type: hpc.nodes.SingularityJob     # optional layer: container technology
        properties:
            job_options:
                modules:
                    - singularity
                    ...
        relationships:
            - type: contained_in
              target: mpi

    software:
        type: hpc.nodes.ShellJob           # bottom layer: the software call itself
        properties:
            job_options:
                command: 'flow inputfile outputfile'
        relationships:
            - type: contained_in
              target: container
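
To illustrate how an orchestrator could resolve such a blueprint, here is a sketch that walks the contained_in chain bottom-up and prepends each layer's wrapper. The node names mirror the blueprint above, but the prefix strings (including the image name) are hypothetical, not the plugin's actual behavior:

    # Hypothetical resolution of the contained_in chain above.
    # Each node contributes a command prefix; walking from the software
    # node up to the workload manager and reversing gives the final call.
    nodes = {
        'computational_resources': {'prefix': 'srun', 'contained_in': None},
        'mpi':       {'prefix': 'mpirun --mca orte_tmpdir_base /tmp',
                      'contained_in': 'computational_resources'},
        'container': {'prefix': 'singularity exec image.sif',  # image name is illustrative
                      'contained_in': 'mpi'},
        'software':  {'prefix': 'flow inputfile outputfile',
                      'contained_in': 'container'},
    }

    def render(name):
        parts = []
        while name is not None:
            parts.append(nodes[name]['prefix'])
            name = nodes[name]['contained_in']
        return ' '.join(reversed(parts))

    print(render('software'))
    # srun mpirun --mca orte_tmpdir_base /tmp singularity exec image.sif flow inputfile outputfile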

Even if we think about other container solutions (Docker) or single-node parallel jobs, we can swap the MPI and container layers in the contained_in chain, which in some cases avoids having to match the MPI vendor and version inside and outside the container:

  • srun [singularity | docker] [mpirun] command
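
With the sketch above, swapping the two middle layers is just a matter of retargeting the contained_in links, for example:

    # Swap the container and MPI layers by retargeting contained_in.
    nodes['container']['contained_in'] = 'computational_resources'
    nodes['mpi']['contained_in'] = 'container'
    nodes['software']['contained_in'] = 'mpi'

    print(render('software'))
    # srun singularity exec image.sif mpirun --mca orte_tmpdir_base /tmp flow inputfile outputfile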

Something to think about.
