What's New

Major Updates in Slurm Version 2.6

SLURM Version 2.6 was released in July 2013. Major enhancements include:

  • Support for job arrays, which increases performance and ease of use for sets of similar jobs.
  • Job profiling capability added to record a wide variety of job characteristics for each task at a user-configurable interval. Data currently available includes CPU use, memory use, energy use, InfiniBand network use, Lustre file system use, etc.
  • Support for MPICH2 using PMI2 communications interface with much greater scalability.
  • Prolog and epilog support for advanced reservations.
  • Much higher throughput for job step execution with the --exclusive option. The srun process is now notified when resources become available rather than polling periodically.
  • Improved support for the Intel MIC (Many Integrated Core) processor.
  • Advanced reservations with hostname and core counts now support asymmetric reservations (e.g. a different core count for each node).
  • External sensor plugin infrastructure added to record power consumption, temperature, etc.
  • Improved performance for high-throughput computing.
  • MapReduce+ support (launches ~1000x faster, runs ~10x faster).
  • Added "MaxCPUsPerNode" partition configuration parameter. This can be especially useful for scheduling GPUs. For example, a node can be associated with two Slurm partitions (e.g. "cpu" and "gpu"), and the partition/queue "cpu" could be limited to only a subset of the node's CPUs, ensuring that one or more CPUs remain available to jobs in the "gpu" partition/queue.
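
The MaxCPUsPerNode example in the last item comes down to simple arithmetic, sketched here in Python. This is an illustration only, not Slurm code: the node size (16 CPUs), the limit (12) and the function name cpus_left_for_other_partitions are assumptions made for the example.

```python
def cpus_left_for_other_partitions(node_cpus, max_cpus_per_node):
    """Return how many CPUs on a node a capped partition can never occupy.

    node_cpus         -- total CPUs on the node (hypothetical value)
    max_cpus_per_node -- the cap applied to one partition, in the spirit of
                         the MaxCPUsPerNode parameter (illustrative only)
    """
    if node_cpus < 0 or max_cpus_per_node < 0:
        raise ValueError("CPU counts must be non-negative")
    # The capped partition can use at most the smaller of the two values.
    usable_by_capped_partition = min(node_cpus, max_cpus_per_node)
    return node_cpus - usable_by_capped_partition


# Hypothetical node with 16 CPUs shared by partitions "cpu" and "gpu".
# Capping "cpu" at 12 CPUs per node keeps 4 CPUs free for "gpu" jobs.
reserved_for_gpu = cpus_left_for_other_partitions(node_cpus=16, max_cpus_per_node=12)
print(reserved_for_gpu)
```

A cap equal to or larger than the node's CPU count leaves nothing in reserve, so the limit only helps when it is set below the node size.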

Major Updates in Slurm Version 13.12

NOTICE: Starting with 13.12 we will be numbering Slurm versions with a YEAR.MONTH format.
The release of SLURM Version 13.12 is planned for December 2013. Major enhancements include:

  • Integration with FLEXlm (Flexnet Publisher) license management.
  • Layouts framework, which will be the basis for further developments toward optimizing scheduling with respect to additional parameters such as temperature and power consumption.
  • Energy consumption added as a factor in fair-share scheduling.
  • Energy aware scheduling added with respect to power caps.
  • Improved user support for fault tolerance (e.g. "hot spare" resources).
  • New partition configuration parameters: AllowAccounts, AllowQOS, DenyAccounts and DenyQOS.
  • Scalability improvements for MPI initialization including communication of the compute node network interface details.
  • Defer sending SIGKILL signal to processes while a core dump is in progress.
  • Other important enhancements that cannot be made public at this time...

Major Updates in Slurm Version 14.06 and beyond

Detailed plans for release dates and contents of additional SLURM releases have not been finalized. Anyone wishing to perform SLURM development should notify slurm-dev@schedmd.com to coordinate activities. Future development plans include:

  • Distributed architecture to support the management of resources with Intel MIC processors.
  • Support for I/O as a new resource, including proxy I/O nodes with data locality.
  • Improved scheduling support for job dependencies (e.g. pre-processing, post-processing, co-processing on I/O nodes, etc.) to optimize overall system utilization.
  • IP communications over InfiniBand network for improved performance.
  • Support for heterogeneous GPU environments (i.e. user specification of desired GPU types).
  • Fault tolerance and dynamic job adaptation through a communication protocol between Slurm, MPI libraries and the application.
  • Improved support for high-throughput computing (e.g. multiple slurmctld daemons on a single cluster).
  • Scheduling fully optimized for energy efficiency.
  • Numerous enhancements to advanced resource reservations (e.g. start or end the reservation early depending upon the workload).
  • Add Kerberos credential support including credential forwarding and refresh.
  • Improved support for provisioning and virtualization.
  • Provide a web-based SLURM administration tool.
  • Finer-grained BlueGene resource management (partitions/queues and advanced reservations containing less than a whole midplane).

Security Patches

Common Vulnerabilities and Exposures (CVE) information is available at
http://cve.mitre.org/.

  • CVE-2009-0128
    There is a potential security vulnerability in SLURM where a user could build an invalid job credential in order to execute a job (under their own correct UID and GID) on resources not allocated to that user. This vulnerability exists only when the crypto/openssl plugin is used and was fixed in SLURM version 1.3.0.
  • CVE-2009-2084
    SLURM failed to properly set supplementary groups before invoking (1) sbcast from the slurmd daemon or (2) strigger from the slurmctld daemon, which might allow local SLURM users to modify files and gain privileges. This was fixed in SLURM version 1.3.14.
  • CVE-2010-3308
    There is a potential security vulnerability when the init.d scripts are executed by user root or SlurmUser to start the SLURM daemons. If LD_LIBRARY_PATH is not set, the operating system interprets a blank entry in the path as "." (the current working directory), and that directory contains a trojan library, then the SLURM daemon will load that library, with unpredictable results. This was fixed in SLURM version 2.1.14.
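
The blank-entry condition described for CVE-2010-3308 can be illustrated with a short Python sketch. It is not taken from the SLURM init.d scripts: it assumes a script that prepends a directory to an unset LD_LIBRARY_PATH, and the directory /opt/slurm/lib and all function names are hypothetical.

```python
def blank_entries(search_path):
    """Return the positions of empty entries in a colon-separated path."""
    if not search_path:
        return []
    return [i for i, entry in enumerate(search_path.split(":")) if entry == ""]


def prepend_unsafely(new_dir, current):
    # Mimics joining new_dir and the existing value with ":" even when the
    # existing value is unset, which leaves a trailing blank entry.
    return new_dir + ":" + (current or "")


def prepend_safely(new_dir, current):
    # Adds the separator only when there is an existing value to keep.
    return new_dir + ":" + current if current else new_dir


# Hypothetical library directory; the variable is treated as unset (None).
unsafe = prepend_unsafely("/opt/slurm/lib", None)
safe = prepend_safely("/opt/slurm/lib", None)

print(repr(unsafe), blank_entries(unsafe))
print(repr(safe), blank_entries(safe))
```

On a system that treats a blank entry as ".", only the first form would cause the current working directory to be searched for libraries.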

Last modified 31 July 2013