MATLAB

MATLAB is very popular as a scientific computing tool because of its IDE, ease of programming, and comprehensive library of high-level functions. It is used extensively on clusters for post-processing simulation results, analyzing large amounts of experimental data, and similar tasks.

Matlab is available as a software module on Oscar. The default version of Matlab is loaded automatically when you log in.

Do not run Matlab on a login node. Use a compute node instead, through either an interactive session or a batch job.
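
One way to guard against accidentally starting Matlab on a login node is to check for a SLURM allocation first. The sketch below assumes that SLURM sets the SLURM_JOB_ID environment variable inside interactive and batch jobs (standard SLURM behavior) and that the variable is unset on login nodes. The function name in_slurm_job is hypothetical and not part of Oscar.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only when this shell runs inside a SLURM job,
# i.e. when SLURM has set SLURM_JOB_ID (assumed to be unset on login nodes).
in_slurm_job() {
    [ -n "${SLURM_JOB_ID:-}" ]
}

if in_slurm_job; then
    echo "Inside SLURM job ${SLURM_JOB_ID}: OK to start Matlab here."
else
    echo "No SLURM job detected: request a compute node before starting Matlab." >&2
fi
```

A check like this could be placed at the top of a personal launch script so that it refuses to start Matlab outside an allocation.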


matlab-threaded command

On Oscar, the command matlab is actually a wrapper that sets up MATLAB to run as a single-threaded, command-line program, which is the optimal way to pack multiple Matlab scripts onto the Oscar compute nodes.

To run the actual multi-threaded version with the JVM and display enabled, use:

$ matlab-threaded

Similarly, to run this without the display enabled:

$ matlab-threaded -nodisplay

MATLAB GUI

VNC

The VNC client provided by CCV is the best way to launch GUI applications on Oscar, including Matlab. From the terminal emulator in VNC, first load the module corresponding to the intended version of Matlab. Then use the matlab-threaded command to launch the Matlab GUI. For example,

$ module load matlab/R2016a
$ matlab-threaded

X11 Forwarding

You can also run the MATLAB GUI in an X-forwarded interactive session. This requires installing an X server on your workstation/PC and logging in to Oscar with X forwarding enabled (see https://www.ccv.brown.edu/doc/x-forwarding). Use the interact command to get interactive access to a compute node. As with VNC, launch the GUI with the matlab-threaded command, which enables the display and JVM. The Matlab GUI may respond sluggishly in an X-forwarded session. Note that if Matlab does not find the X window system available, it will launch in command line mode (see the Matlab Command Line section below).
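
Before launching the GUI, it can help to confirm that X forwarding is actually active. The sketch below relies on the fact that an X-forwarded SSH session sets the DISPLAY environment variable on the remote side. The helper name matlab_launch_cmd is hypothetical; it only prints the command that would suit the session and does not start Matlab.

```shell
#!/bin/bash
# Hypothetical helper: print the Matlab launch command suited to this session.
# With X forwarding active, DISPLAY is set; otherwise fall back to text mode.
matlab_launch_cmd() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "matlab-threaded"
    else
        echo "matlab-threaded -nodisplay"
    fi
}

echo "DISPLAY is '${DISPLAY:-}'; would run: $(matlab_launch_cmd)"
```

If DISPLAY is empty on the compute node, X forwarding was probably not enabled when logging in.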

CIFS

A workaround in some situations is to use CIFS to mount the Oscar filesystem on your PC and use the Matlab installation on your own computer. For example, if your simulation results reside on Oscar, this can be a quick way to post-process the data without moving it to your computer or using the Matlab GUI on Oscar. Note that users can connect to CIFS only from Brown computers or on Brown WiFi.


Matlab Command Line

Instead of the GUI, Matlab’s interpreter can be launched interactively on the command line (text-based interface):

$ matlab-threaded -nodisplay

This way, you do not have to worry about launching the display or about sluggish response from the GUI, and startup is much faster. The command line interface can take some time to get used to. Unless you need tools such as the debugger or profiler, which are more convenient in the GUI, or need to see live plots, we recommend the command line version. Ultimately, it is a personal choice.

Notes:

Set the $EDITOR environment variable before launching Matlab to be able to use the edit command, e.g.

$ export EDITOR=nano

nano is a basic command line editor. There are other command line editors like vim and emacs that users can choose.

From the Matlab command line (represented by the >> symbol below), you can directly type the command to run a script or function after changing the directory to where it is located:

>> cd path/to/work/dir
>> myscript

To check the Matlab version and license information, and to list all available toolboxes with their versions:

>> ver

To run a Matlab function myfunc.m from the shell:

$ matlab-threaded -nodisplay -r "myfunc(arg1,arg2)"

Batch Jobs

The GUI and command line interpreter are suitable mainly for visualization, debugging, or optimization. Batch jobs should be the preferred way of running programs (actual production runs) on a cluster, because such programs typically need more resources, which means longer wait times, and also run for longer. Moreover, batch jobs are much more convenient for running many programs simultaneously. Jobs are submitted to the scheduler (SLURM) on Oscar using batch scripts, which are described in detail in the Oscar documentation on batch jobs.

Example Batch Script

Here is an example batch script for running a serial Matlab program on an Oscar compute node:

#!/bin/bash

# Request an hour of runtime:
#SBATCH --time=1:00:00

# Default resources are 1 core with 2.8GB of memory.

# Use more memory (4GB):
#SBATCH --mem=4G

# Specify a job name:
#SBATCH -J MyMatlabJob

# Specify an output file
#SBATCH -o MyMatlabJob-%j.out
#SBATCH -e MyMatlabJob-%j.out

# Run a matlab function called 'foo.m' in the same directory as this batch script.
matlab -r "foo(1), exit"

This is also available in your home directory as the file:

~/batch_scripts/matlab-serial.sh

Note the exit command at the end, which is very important to include either there or in the Matlab function/script itself. If you don't make Matlab exit the interpreter, it will keep waiting for the next command until SLURM cancels the job when the requested walltime runs out. For example, if you requested 4 hours of walltime and your program completes in 1 hour, the SLURM job will not finish until the full 4 hours have elapsed. This leaves cores idle, wastes resources, and holds up your other jobs.
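
A related caveat: if foo raises an error, Matlab started with -r stops at its prompt and never reaches the trailing exit, so the job idles until the walltime runs out. A common pattern is to wrap the call in try/catch so that Matlab exits either way. The sketch below is one way to do this. The helper name matlab_safe_cmd is hypothetical, and getReport is the standard method for formatting a caught Matlab error object.

```shell
#!/bin/bash
# Hypothetical helper: wrap a Matlab statement so the interpreter always exits.
# On success Matlab exits with status 0; on error it prints the error report
# and exits with status 1, so the failure is visible in the job's exit code.
matlab_safe_cmd() {
    printf 'try, %s, catch err, disp(getReport(err)), exit(1), end, exit(0)' "$1"
}

cmd="$(matlab_safe_cmd 'foo(1)')"
echo "$cmd"

# Run only where the matlab wrapper is available (e.g. inside a batch job).
if command -v matlab >/dev/null 2>&1; then
    matlab -r "$cmd"
fi
```

In a batch script, the last part would simply replace the matlab -r line of the example above.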

If the name of your batch script file is matlab-serial.sh, the batch job can be submitted using the following command:

$ sbatch matlab-serial.sh
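
When submitting from a script, it is often handy to capture the job ID for later status checks. The sketch below assumes the standard sbatch --parsable option, which prints the job ID alone (optionally followed by a semicolon and the cluster name), and the standard squeue -j option. The helper name parse_job_id is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: extract the job ID from `sbatch --parsable` output,
# which has the form "jobid" or "jobid;cluster".
parse_job_id() {
    echo "${1%%;*}"
}

# Submit only where SLURM is available, and only report on success.
if command -v sbatch >/dev/null 2>&1; then
    if out="$(sbatch --parsable matlab-serial.sh)"; then
        job_id="$(parse_job_id "$out")"
        echo "Submitted job ${job_id}"
        squeue -j "${job_id}"   # show the job's current state in the queue
    fi
fi
```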

Job Arrays

SLURM job arrays can be used to submit multiple jobs using a single batch script, for example when a single Matlab script is used to run analyses on multiple input files or with different input parameters. An example batch script for submitting a Matlab job array:

#!/bin/bash

# Job Name
#SBATCH -J arrayjob

# Walltime requested
#SBATCH -t 0:10:00

# Provide index values (TASK IDs)
#SBATCH --array=1-4

# Use '%A' for array-job ID, '%j' for job ID and '%a' for task ID
#SBATCH -e arrayjob-%a.err
#SBATCH -o arrayjob-%a.out

# single core
#SBATCH -n 1

# Use the $SLURM_ARRAY_TASK_ID variable to provide different inputs for each job

echo "Running job array number: $SLURM_ARRAY_TASK_ID"

module load matlab/R2016a

matlab-threaded -nodisplay -nojvm -r "foo($SLURM_ARRAY_TASK_ID), exit"

Index values are assigned to each job in the array. The $SLURM_ARRAY_TASK_ID variable holds this value and can be used to provide a different input to each job in the array. The variable can also be read from within Matlab using the getenv function, which returns the value as text (convert it with str2double if a number is needed):

getenv('SLURM_ARRAY_TASK_ID')

The above script can be found in your home directory as the file:

~/batch_scripts/matlab-array.sh
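
When each array task should process a different input file rather than a different number, one option is to keep a text file with one input name per line and pick the line matching the task ID. The following is a minimal sketch of that idea. The helper nth_line and the data_*.mat file names are hypothetical, and it assumes foo accepts a file name as its argument.

```shell
#!/bin/bash
# Hypothetical helper: print line N of a text file listing one input per line.
nth_line() {
    sed -n "${1}p" "$2"
}

# Illustrative list of inputs (hypothetical file names), one per line.
list_file="$(mktemp)"
printf '%s\n' data_a.mat data_b.mat data_c.mat data_d.mat > "$list_file"

task_id="${SLURM_ARRAY_TASK_ID:-1}"   # defaults to 1 outside a job array
input_file="$(nth_line "$task_id" "$list_file")"
echo "Task ${task_id} would process ${input_file}"

# In the batch script, the file name would then be passed to Matlab, e.g.:
#   matlab-threaded -nodisplay -nojvm -r "foo('${input_file}'), exit"
rm -f "$list_file"
```

With --array=1-4 as in the script above, each of the four tasks would pick up one of the four listed files.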

Improving Performance & Memory Management

Matlab programs often suffer from poor performance and run out of memory. Among other things, you can refer to the following web pages for best practices for writing efficient code:

The first step in speeding up a Matlab application is identifying the part that takes up most of the run time. Matlab's "Profiling" tool can be very helpful in doing that:

Further reading from MathWorks:


Parallel Programming in Matlab

You can explore GPU computing through Matlab if you think your program can benefit from massively parallel computations:

Finally, parallel computing features like parfor and spmd can be used by launching a pool of workers on a node. Note that the Parallel Computing Toolbox by itself cannot span multiple nodes, so requesting more than one node for such a job wastes resources.
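
A batch script for a single-node parallel pool might look like the sketch below. It is an illustration under stated assumptions, not an Oscar-provided script: the core count and walltime are arbitrary, my_parfor_script is a hypothetical Matlab script containing a parfor loop, and the pool is sized from SLURM_CPUS_PER_TASK, which SLURM sets when --cpus-per-task is requested.

```shell
#!/bin/bash
# Request 4 cores on a single node; a parallel pool cannot span nodes.
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --time=1:00:00

# Match the pool size to the cores SLURM granted (1 when run outside a job).
workers="${SLURM_CPUS_PER_TASK:-1}"

# my_parfor_script is a hypothetical Matlab script that uses parfor.
cmd="parpool(${workers}); my_parfor_script; exit"
echo "$cmd"

# parpool generally needs the JVM, so use matlab-threaded rather than matlab.
if command -v matlab-threaded >/dev/null 2>&1; then
    matlab-threaded -nodisplay -r "$cmd"
fi
```

Sizing the pool from the SLURM variable keeps the number of workers in step with the allocation if the core request is changed later.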

HPCmatlab is a framework for fast prototyping of parallel applications in Matlab. If your application has enough parallelism to use multiple nodes, you can use the Message Passing Interface (MPI) through HPCmatlab to send and receive messages among different Matlab processes. It uses MEX functions to wrap the C language MPI functions. It is installed as a module on Oscar:

$ module load hpcmatlab/1.0